Articles tagged python


Installing MySQL for Python 3

sudo apt-get install build-essential python3-dev libmysqlclient-dev

sudo apt-get install python3-pip

**Install the MySQL connector using pip:**

sudo pip3 install mysql-connector-python

or

sudo apt-get install python3-mysql.connector

demo_mysql_connection.py:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword"
)

print(mydb)

Creating a Database
To create a database in MySQL, use the "CREATE DATABASE" statement:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword"
)

mycursor = mydb.cursor()

mycursor.execute("CREATE DATABASE mydatabase")

Creating a Table

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("CREATE TABLE customers (name VARCHAR(255), address VARCHAR(255))")

Insert Into Table

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
val = ("John", "Highway 21")
mycursor.execute(sql, val)

mydb.commit()

print(mycursor.rowcount, "record inserted.")
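The %s markers in the INSERT statement are placeholders that the driver fills in from the tuple, so values are never pasted into the SQL string by hand. As a rough sketch of the same pattern that runs without a MySQL server, the snippet below swaps in the standard-library sqlite3 module with an in-memory database. This substitution is an assumption made purely for illustration; note that sqlite3 uses ? placeholders instead of %s.

```python
import sqlite3

# In-memory SQLite database: a stand-in for the MySQL server used above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (name VARCHAR(255), address VARCHAR(255))")

# Parameterized insert: the driver binds the values, so a quote inside the
# data cannot break the statement.
sql = "INSERT INTO customers (name, address) VALUES (?, ?)"
rows = [("John", "Highway 21"), ("O'Brien", "Main St 1")]
cur.executemany(sql, rows)
conn.commit()

cur.execute("SELECT name, address FROM customers ORDER BY name")
result = cur.fetchall()
print(result)
```

With mysql.connector the same idea applies unchanged apart from the placeholder syntax.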

Select From a Table

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT * FROM customers")

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Using MySQLdb with Python
Check whether the MySQLdb module is installed:

python -c "import MySQLdb"

ImportError: No module named MySQLdb

MySQLdb supports only Python 2; it does not support Python 3.

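python mysql db global variable

import mysql.connector

# A name assigned at module level is already global; the global statement
# is only needed inside a function that assigns to conn.
conn = mysql.connector.connect(
      host="localhost",
      user="yourusername",
      password="yourpassword"
    )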
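A module-level connection only needs the global statement inside functions that rebind it. The sketch below shows one possible pattern, not the only one. The names get_conn, connect and fake_connect are hypothetical; in real use the factory passed in would be something like a lambda that calls mysql.connector.connect with your credentials.

```python
_conn = None  # module-level ("global") connection, created on first use


def get_conn(connect):
    """Return the shared connection, creating it once via connect()."""
    global _conn  # required because this function rebinds the module-level name
    if _conn is None:
        _conn = connect()
    return _conn


# Demo with a stand-in factory instead of a real database driver.
calls = []


def fake_connect():
    calls.append(1)
    return object()


first = get_conn(fake_connect)
second = get_conn(fake_connect)
print(first is second, len(calls))
```

Both calls return the same object, and the factory runs only once.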

python html parse

from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag, type(tag))
        for attr in attrs:
            # print("     attr:", attr, type(attr))
            if(tag == "a" and attr[0] == 'href'):
                print("href = ", attr[1])


    def handle_endtag(self, tag):
        print("Encountered an end tag :", tag)

    def handle_data(self, data):
        print("Encountered some data  :", data)


parser = MyHTMLParser()
parser.feed('<html><head><title>Test</title></head>'
            '<body><h1>Parse me!</h1><a href="https://www.baidu.com/">baidu</a></body></html>')
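The parser above prints what it sees. When the links are needed later, it is usually more convenient to collect them. A minimal sketch, assuming only the href attributes of a tags are of interest (the class name LinkCollector is made up for this example):

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # attributes written without a value.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<body><a href="https://www.baidu.com/">baidu</a>'
               '<a name="top">no link</a></body>')
print(collector.links)
```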

python file read

    # filebat holds the path of the file to read
    try:
        with open(filebat) as f:
            strfile = f.read()  # the with block closes the file afterwards
    except FileNotFoundError:
        print("file not found :", filebat)

python bytes to string

>>> b"abcde"
b'abcde'

# utf-8 is used here because it is a very common encoding, but you
# need to use the encoding your data is actually in.
>>> b"abcde".decode("utf-8") 
'abcde'

python encoding

import os
import sys

ch = '中国'

print('GBK:', ch.encode('gbk'))
print('UTF8:', ch.encode('utf8'))

print('GBK:', ch.encode('gbk').decode('gbk'))
print('UTF8:', ch.encode('utf8').decode('utf8'))

GBK: b'\xd6\xd0\xb9\xfa'
UTF8: b'\xe4\xb8\xad\xe5\x9b\xbd'
GBK: 中国
UTF8: 中国
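The round trips above work because each byte string is decoded with the codec that produced it. The following sketch shows what tends to happen when the codecs are mixed up, reusing the same two characters; the variable names are only for illustration.

```python
ch = '中国'
gbk_bytes = ch.encode('gbk')

# Decoding GBK bytes as UTF-8 fails loudly ...
try:
    gbk_bytes.decode('utf8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# ... unless an error handler is given, in which case undecodable bytes
# are turned into U+FFFD replacement characters and the text is lost.
lossy = gbk_bytes.decode('utf8', errors='replace')

print(wrong_codec_failed, '\ufffd' in lossy)
print(gbk_bytes.decode('gbk') == ch)
```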

python tcp socket

import socket

socket.setdefaulttimeout(2)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.connect(("127.0.0.1", 5000))
except Exception as e:
    print( str(e))
    exit(0)

s.send(b'GET / HTTP/1.1\r\n\r\n')
while True:
    data = s.recv(1024)
    if data:
        print(data.decode("utf8"))
    else:
        break

Output:

HTTP/1.0 200 OK

Content-Type: text/html; charset=utf-8
Content-Length: 10
Server: Werkzeug/2.0.3 Python/3.8.10
Date: Wed, 09 Mar 2022 01:46:40 GMT


Index Page
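The same request can be wrapped in a small helper that always closes the socket and returns the raw bytes instead of printing them. This is a sketch under a few assumptions: the helper name fetch_raw is invented here, and the request uses HTTP/1.0 so that the server closes the connection when the response is complete.

```python
import socket


def fetch_raw(host, port, request=b"GET / HTTP/1.0\r\n\r\n", timeout=2.0):
    """Send one request and return everything the server sends back."""
    chunks = []
    # create_connection resolves the address and applies the timeout;
    # the with block closes the socket even if an error occurs.
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(request)
        while True:
            data = s.recv(1024)
            if not data:  # empty bytes means the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

A call such as fetch_raw("127.0.0.1", 5000).decode("utf8") would then give the response text, provided a server is listening on that port.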

Installing Bluetooth

sudo apt-get update
sudo apt-get install python-pip python-dev ipython

sudo apt-get install bluetooth libbluetooth-dev
sudo pip install pybluez

python3 requests post octet-stream

import requests
with open('./x.png', 'rb') as f:
    data = f.read()
res = requests.post(url='http://httpbin.org/post',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})

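Computing an SM3 hash in Python

import subprocess

text = "abc"  # placeholder input string
p1 = subprocess.Popen(["echo",  "-n", text], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["openssl", "dgst", "-sm3"], stdin=p1.stdout, stdout=subprocess.PIPE)
# Assumes openssl prints "(stdin)= <hex>": drop that 9-byte prefix and the newline
sm3 = p2.stdout.readline()[9:-1]
print('sm3 = ', sm3)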
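Depending on how Python's OpenSSL backend was built, hashlib may be able to compute SM3 directly, without spawning a process. The sketch below assumes nothing about the build and falls back to None when the algorithm is missing; sm3_hex is a hypothetical helper name.

```python
import hashlib


def sm3_hex(data):
    """Return the SM3 digest of data as a hex string, or None if unsupported."""
    try:
        return hashlib.new("sm3", data).hexdigest()
    except ValueError:
        # Raised when the underlying OpenSSL build does not provide SM3.
        return None


print(sm3_hex(b"abc"))
```

When this returns None, the openssl command-line approach remains the fallback.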

python hex string to string

a = 'aabbccddeeff'
a_bytes = bytes.fromhex(a)
print(a_bytes)  # b'\xaa\xbb\xcc\xdd\xee\xff'
aa = a_bytes.hex()
print(aa)       # aabbccddeeff
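bytes.fromhex only gets as far as a bytes object. To end up with an actual str, the bytes still have to be decoded with the encoding the text was stored in. A small sketch, assuming the hex string encodes ASCII text (the sample value is made up for the example):

```python
hex_text = '68656c6c6f'  # hex for the ASCII bytes of a short word

raw = bytes.fromhex(hex_text)   # hex string -> bytes
text = raw.decode('ascii')      # bytes -> str
print(text)

# And back again: str -> bytes -> hex string
round_trip = text.encode('ascii').hex()
print(round_trip == hex_text)
```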

python3 read file line by line

# The with block closes the file once the lines have been read.
with open('myfile.txt', 'r') as file1:
    Lines = file1.readlines()

count = 0
for line in Lines:
    count += 1
    print("Line{}: {}".format(count, line.strip()))
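readlines() loads the whole file into memory first. For large files it can be preferable to iterate over the file object itself, which yields one line at a time. A sketch using enumerate for the counter; the helper name numbered_lines and the temporary demo file are assumptions made for this example.

```python
import os
import tempfile


def numbered_lines(path):
    """Yield 'LineN: text' for each line, reading the file lazily."""
    with open(path, 'r', encoding='utf-8') as f:
        for count, line in enumerate(f, start=1):
            yield "Line{}: {}".format(count, line.strip())


# Demo on a throwaway file so the snippet runs anywhere.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("first\nsecond\n")
    demo_path = tmp.name

demo_output = list(numbered_lines(demo_path))
os.remove(demo_path)
print(demo_output)
```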

python3 ubuntu beep

import os
import time
duration = 1  # seconds
freq = 440  # Hz

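start = time.time()
while True:
    # Measure the elapsed time from a fixed starting point.
    elapsed_time = time.time() - start
    if elapsed_time >= 2.5:  # adjust the delay as needed
        # play is provided by the sox package
        os.system('play -nq -t alsa synth {} sine {}'.format(duration, freq))
        break
    time.sleep(0.05)  # avoid a busy loop while waiting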

pyOpenSSL is a Python wrapper for the OpenSSL library.
Install it with pip:

pip install pyOpenSSL

Generating a key pair

from OpenSSL.crypto import PKey
from OpenSSL.crypto import TYPE_RSA, FILETYPE_PEM
from OpenSSL.crypto import dump_privatekey, dump_publickey

pk = PKey()
print(pk)
pk.generate_key(TYPE_RSA, 1024)
dpub = dump_publickey(FILETYPE_PEM, pk)
print(dpub)
dpri = dump_privatekey(FILETYPE_PEM, pk)
print(dpri)

Output:
<OpenSSL.crypto.PKey object at 0x76c3b090>
b'-----BEGIN PUBLIC KEY-----\nMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCyNCTZuEzZrX2OaaPgcdCsd3VI\nPXVGyWKzCc0rUdmmrD7+czdeCgoeHuCwwkig+pGhYFYZvFNZFaEzxKmmJOTxrklB\nxOk2K2mTvqsviPMFG780qG69zM+Zm+tYPy+aU4taRoPhlSY9hy2YWubKiLqUkGWX\nfoJOElkGFD+O4IwsWwIDAQAB\n-----END PUBLIC KEY-----\n'
b'-----BEGIN PRIVATE KEY-----\nMIICdgIBADANBgkqhkiG9w0BAQEFAASCAmAwggJcAgEAAoGBALI0JNm4TNmtfY5p\no+Bx0Kx3dUg9dUbJYrMJzStR2aasPv5zN14KCh4e4LDCSKD6kaFgVhm8U1kVoTPE\nqaYk5PGuSUHE6TYraZO+qy+I8wUbvzSobr3Mz5mb61g/L5pTi1pGg+GVJj2HLZha\n5sqIupSQZZd+gk4SWQYUP47gjCxbAgMBAAECgYEAhqYNvhCayNNlDmlV8O4uvVIZ\n5TbC2XSrRhq+0t+qtFxr0Llf+Ydec7njDswOMsyBo0z2YcXBuIs2XbZYdXhlH9Ag\nWWrkhRPVLN0mvs/XPpXRkQh233orznIgoz8UBuoUlcXppA/KOnSSJ2RuwVtiqAl5\nQf71oRSVZ7AmKy6PLOECQQDofMyLSx9KWoQ/HSiw7w9lgX7+olyB3ybg+Hi9Ydrh\nyxWJ2VJoY5TTCCsREsAyF86fRSNuaF3PpU/6oYkT+6brAkEAxDnp8jAKmrfX8qKT\nnNOwFJaAJBAihtrgeURPWGiCTSWR9w0S5w+AKeeGU9oEhuPaUK1rUdtqYlFMhAgj\n2iYUUQJAEw9wQZdGGHV1VCtS07a1v2+vdrbe+LLP4C/ezkAAjvR0bpnHnNFVOTv5\nM+win7i98ubbMckSr9xwwy6NK3s9QwJAU2wjr5kJCRnbrwW7J9M/aqFJPQu3AgoP\noL6P1RApRU8RrSxbuuv2GtqZWxC3F/nKmL4BgD1+DuptUzx6sYW64QJAC5jNhQw7\nd1AzDBc4X8fkcOH1+fn3sNGT5UBZ5+1l8jy44QR0ZaDsbGKyWEizDiJVvC01eNJd\nTDj99Venyyug6Q==\n-----END PRIVATE KEY-----\n'

Signing and verifying

from OpenSSL.crypto import PKey
from OpenSSL.crypto import TYPE_RSA, FILETYPE_PEM
from OpenSSL.crypto import sign, verify
from OpenSSL.crypto import X509

pk = PKey()
pk.generate_key(TYPE_RSA, 1024)
 
# Pass the data as bytes
signature = sign(pk, b'hello, world!', 'sha256')
print(signature)
 
x509 = X509()
x509.set_pubkey(pk)
verify(x509, signature, b'hello, world!', 'sha256')

Output:
b'txe1xb1rxc1}x82x9dxbexa2x97x14x88xdbxf7x19x835xeb=xc0x87xa5xe9xe7x10xcdxaax90Qx11xee;oxf4Axafxa0xfcj3Xtxd9=x10xf3xbdxe9xc3>@xc1xafxffx8dxfbtxd9x81xfaxdexa2QLxc2xf0t+_wxfex1bx86x0f\xebJ\x17xcaxf4x11xb0lxd6x17`xfdx194xa6x0cxe3yx93Exd2x92Bx984-(xc8qxdax1e:,xd4x83jxca(jxe4xb5Gxa6(xfaxffx97xa2xabxa9xd6'

If verification fails, the following error is raised:
OpenSSL.crypto.Error: [('rsa routines', 'int_rsa_verify', 'bad signature')]

Installing the rsa module

pip install rsa

RSA encryption and decryption:

python

Python 3.5.3 (default, Nov 4 2021, 15:29:10)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.

import rsa
(pubkey, privkey) = rsa.newkeys(512)
print(pubkey, privkey)

PublicKey(7923383863263798057086493131602855244106043226226580784778365931183912547588793153219251578468413935121113886104438479941506950504197502414270136297859107, 65537) PrivateKey(7923383863263798057086493131602855244106043226226580784778365931183912547588793153219251578468413935121113886104438479941506950504197502414270136297859107, 65537, 4082772678981620464589634451595715726894137353703581688236651319042384096914189930827359104990339125600702388770355631338209042409912997066322913565755233, 5111290471875396921428642977019854300275057344938560249265410597941457399078028889, 1550172878427042048581175711004305712704687713604245356094500484144351963)

rsa.encrypt("hello".encode('utf-8'), pubkey)

b'x16rx17xbbOxdcQxa0xffxdfjxad]x1axc7x96xbcx94LxcfBx83GOxa9x18Syx94x13xcaxafN_xd2xd25xa9Etxa1xb6lxc9xb1~xc8xc1+x10x9bx90x06xc6xddxb4xeax86x00x13xf8x0bN~'

crypto = rsa.encrypt('hello'.encode('utf-8'), pubkey)
print(crypto)

b'x16b~x81x0bxd9xb8>x1ex0fKxd9KY5xf1nx80x:xc2w/QFxa6!&xd0+q!xf5x14xe7xe9=Nx1dx0cxdd"6x80xa4xabxb8xf5=xcbxb05x07xf4xb0xa6xe9xe4DbXx99x17x8d'

rsa.decrypt(crypto, privkey)

b'hello'
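The two encrypt calls above give different ciphertexts for the same message because the rsa module adds random padding before encrypting. Underneath, RSA itself is plain modular exponentiation. The following is a textbook sketch with deliberately tiny primes and no padding, so it is only suitable for illustrating the arithmetic; all values are chosen for the example, and the three-argument pow with a negative exponent needs Python 3.8 or later.

```python
# Textbook RSA with tiny numbers (insecure, illustration only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi

m = 65                       # message as an integer smaller than n
c = pow(m, e, n)             # encrypt: c = m^e mod n
recovered = pow(c, d, n)     # decrypt: m = c^d mod n

print(n, d, c, recovered)
```

Decryption recovers m because e * d leaves remainder 1 when divided by phi.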

The installation and deployment steps follow the official documentation at https://paddlepaddle.github.io/PaddleOCR/latest/ppocr/quick_start.html#211

1. Set up a Python virtual environment.

mkdir paddleocr
cd paddleocr
python -m venv .
cd bin
source activate

The shell prompt then shows the (paddleocr) prefix.
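Besides looking at the prompt, the active environment can be checked from Python itself. A small sketch, assuming the environment was created with the standard venv module (in_virtualenv is a name made up for this example):

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(in_virtualenv(), sys.prefix)
```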

2. Install the required dependencies

pip install --upgrade pip
pip install pysocks -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install paddlepaddle -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install paddleocr -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pymupdf -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pyfftw -i https://pypi.tuna.tsinghua.edu.cn/simple
sudo apt install -y ccache
sudo apt install libgomp1

3. Verify the installation

paddleocr -h

4. Convert a PDF file

paddleocr --image_dir ./2.pdf --use_angle_cls true --use_gpu false
paddleocr --image_dir ./2.pdf --use_angle_cls true --use_gpu false --savefile true

The output is as follows:

[2025/04/22 11:11:53] ppocr INFO: for usage help, please use paddleocr --help
[2025/04/22 11:11:53] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, use_gcu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='./2.pdf', page_num=0, det_algorithm='DB', det_model_dir='/home/hesy/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/home/hesy/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/media/hesy/Elements/python-all/paddleocr/lib/python3.11/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/home/hesy/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, 
use_onnx=False, onnx_providers=False, onnx_sess_options=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, formula_algorithm='LaTeXOCR', formula_model_dir=None, formula_char_dict_path=None, formula_batch_num=1, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, formula=False, ocr=True, recovery=False, recovery_to_markdown=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=True, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2025/04/22 11:11:54] ppocr INFO: ./2.pdf

The generated files end up in the output directory, for example:

head 2.txt 

[[[132.0, 132.0], [552.0, 132.0], [552.0, 211.0], [132.0, 211.0]],
('财富的真相', 0.9918341636657715)] [[[129.0, 232.0], [555.0, 233.0],
[555.0, 262.0], [129.0, 261.0]], ('一种学校不教却人人需要的知识',
0.9969227910041809)] [[[300.0, 326.0], [384.0, 326.0], [384.0, 348.0], [300.0, 348.0]], ('李笑来著', 0.9979466795921326)] [[[248.0, 942.0],
[309.0, 942.0], [309.0, 965.0], [248.0, 965.0]], ('WGS',
0.6117124557495117)] [[[320.0, 948.0], [436.0, 948.0], [436.0, 966.0], [320.0, 966.0]], ('广东经济出版社', 0.9972420930862427)]
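Each entry in 2.txt is written as a Python literal: a list holding the four corner points of the text box, followed by a (text, confidence) tuple. Assuming one entry per line (the listing above appears to be wrapped by the page), ast.literal_eval can turn an entry back into Python objects. The helper name parse_result is hypothetical, and the sample line is the first entry shown above.

```python
import ast

line = ("[[[132.0, 132.0], [552.0, 132.0], [552.0, 211.0], [132.0, 211.0]], "
        "('财富的真相', 0.9918341636657715)]")


def parse_result(entry):
    """Split one OCR entry into (box, text, score)."""
    # literal_eval only evaluates literals, so it is safe on untrusted text.
    box, (text, score) = ast.literal_eval(entry)
    return box, text, score


box, text, score = parse_result(line)
print(text, score, box[0])
```

Low-confidence entries, such as the 'WGS' one at about 0.61, can then be filtered out by comparing score against a threshold.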

Fixes for a few errors encountered:
Error 1:
ERROR: Could not install packages due to an OSError: Missing dependencies for SOCKS support.
WARNING: There was an error checking the latest version of pip.

Fix:

unset all_proxy
unset ALL_PROXY
pip install pysocks -i https://pypi.tuna.tsinghua.edu.cn/simple

Error 2:
Running paddleocr -h reports the following:
/home/hesy/paddleocr/lib/python3.11/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
warnings.warn(warning_message)

Fix:

sudo apt install -y ccache

Error 3:
Running paddleocr -h fails with the following error:
ImportError: libgomp-24e2ab19.so.1.0.0: cannot open shared object file: No such file or directory

Fix:

pip install pyfftw
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/paddleocr/lib/python3.11/site-packages/pyFFTW.libs