

Preprocessing an image with OpenCV for pytesseract OCR


牧羊人nacy 2023-07-27 10:02:27
I want to use OCR (pytesseract) to recognize the text in images like the one shown below. I have thousands of these arrows. The process so far is: I first resize the image (it is used in another process). Then I crop the image to remove most of the arrow. Next, I draw a white rectangle as a frame to remove more noise, while keeping some distance between the text and the image border for better recognition. I resize the image again so that capital letters are about 30 px high, and finally I binarize the image with a threshold of 150.

Full code:

import cv2

image_file = '001.jpg'

# load the input image and grab the image dimensions
image = cv2.imread(image_file, cv2.IMREAD_GRAYSCALE)
(h_1, w_1) = image.shape[:2]

# resize the image and grab the new image dimensions
image = cv2.resize(image, (int(w_1*320/h_1), 320))
(h_1, w_1) = image.shape

# crop image
image_2 = image[70:h_1-70, 20:w_1-20]

# get image_2 height, width
(h_2, w_2) = image_2.shape

# draw white rectangle as a frame around the number -> remove noise
cv2.rectangle(image_2, (0, 0), (w_2, h_2), (255, 255, 255), 40)

# resize image, so that capital letters are ~ 30 px in height
image_2 = cv2.resize(image_2, (int(w_2*50/h_2), 50))

# image binarization
ret, image_2 = cv2.threshold(image_2, 150, 255, cv2.THRESH_BINARY)

# save image to file
cv2.imwrite('processed_' + image_file, image_2)

# tesseract part can be commented out
import pytesseract

config_7 = ("-c tessedit_char_whitelist=0123456789AB --oem 1 --psm 7")
text = pytesseract.image_to_string(image_2, config=config_7)
print("OCR TEXT: " + "{}\n".format(text))

The problem is that the text inside the arrow is never centered, and sometimes the method above cuts off part of the text (for example in the image 50A). Is there a way in image processing to remove the arrow more elegantly, for example with contour detection and removal? I am more interested in the OpenCV part than in the tesseract part that recognizes the text. Any help is appreciated.

2 Answers

慕尼黑的夜晚無繁華

Contributed 1864 experience points · earned 6+ upvotes

If you look at the picture, you will see a white arrow in the image, which is also the largest contour (especially if you draw a black border around the image). If you create a blank mask, draw the arrow (the largest contour) onto it and erode it slightly, you can then perform a per-element bitwise conjunction of the actual image and the eroded mask. If that is not clear, look at the code and comments at the bottom and you will see it is actually quite simple.


# imports
import cv2
import numpy as np

img = cv2.imread("number.png")  # read image
# you can resize the image here if you like - it should still work for both sizes
h, w = img.shape[:2]  # get the actual image's height and width
img = cv2.resize(img, (int(w*320/h), 320))
h, w = img.shape[:2]

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # transform to grayscale
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]  # perform Otsu threshold
cv2.rectangle(thresh, (0, 0), (w, h), (0, 0, 0), 2)  # draw a thin black border so the arrow becomes the largest external contour
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[0]  # search for contours
max_cnt = max(contours, key=cv2.contourArea)  # select the biggest one
mask = np.zeros((h, w), dtype=np.uint8)  # create a black mask
cv2.drawContours(mask, [max_cnt], -1, (255, 255, 255), -1)  # draw the biggest contour on the mask
kernel = np.ones((15, 15), dtype=np.uint8)  # make a kernel with appropriate values - in both cases (resized and original) 15 is ok
erosion = cv2.erode(mask, kernel, iterations=1)  # erode the mask with the given kernel

reverse = cv2.bitwise_not(img.copy())  # inverted copy of the actual image: 0 becomes 255 and 255 becomes 0
img = cv2.bitwise_and(reverse, reverse, mask=erosion)  # per-element bitwise conjunction of the inverted image and eroded mask
img = cv2.bitwise_not(img)  # invert the image again

# save image to file and display
cv2.imwrite("res.png", img)
cv2.imshow("img", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:

http://img1.sycdn.imooc.com//64c1d06f0001a66c02350199.jpg
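To run the OCR step from the question on this cleaned output, a minimal sketch could look like the following; it reuses the whitelist/psm config from the question and assumes the file name res.png written by the snippet above:

import cv2
import pytesseract

# load the cleaned image produced by the snippet above (assumed file name)
cleaned = cv2.imread("res.png", cv2.IMREAD_GRAYSCALE)

# same whitelist / page-segmentation settings as in the question
config_7 = "-c tessedit_char_whitelist=0123456789AB --oem 1 --psm 7"
text = pytesseract.image_to_string(cleaned, config=config_7)
print("OCR TEXT: {}".format(text.strip()))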

慕尼黑5688855

Contributed 1848 experience points · earned 2+ upvotes

You can try this simple Python script:


import cv2
import numpy as np

img = cv2.imread('mmubS.png', cv2.IMREAD_GRAYSCALE)

# inverse threshold: dark text and arrow outline become white on a black background
thresh = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)[1]

im_flood_fill = thresh.copy()
h, w = thresh.shape[:2]

# draw a white frame so the arrow outline is connected to the border
im_flood_fill = cv2.rectangle(im_flood_fill, (0, 0), (w-1, h-1), 255, 2)

# flood fill from the corner to erase the frame and everything connected to it (the arrow)
mask = np.zeros((h + 2, w + 2), np.uint8)
cv2.floodFill(im_flood_fill, mask, (0, 0), 0)

# invert back: black text on a white background
im_flood_fill = cv2.bitwise_not(im_flood_fill)

cv2.imshow('clear text', im_flood_fill)
cv2.imwrite('text.png', im_flood_fill)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:

http://img1.sycdn.imooc.com//64c1d07c0001405700560047.jpg
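Since the question mentions thousands of such arrows, the flood-fill cleanup above can be wrapped in a batch loop. The following is only a sketch under the assumption that the input images sit in a folder named arrows/ (a hypothetical path) and that the OCR config from the question is reused:

import glob
import os

import cv2
import numpy as np
import pytesseract

# hypothetical input folder with the arrow images
for path in glob.glob(os.path.join("arrows", "*.png")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

    # same steps as above: inverse threshold, seal the border, flood fill the arrow away
    thresh = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)[1]
    h, w = thresh.shape[:2]
    filled = cv2.rectangle(thresh.copy(), (0, 0), (w - 1, h - 1), 255, 2)
    mask = np.zeros((h + 2, w + 2), np.uint8)
    cv2.floodFill(filled, mask, (0, 0), 0)
    cleaned = cv2.bitwise_not(filled)

    # OCR config taken from the question
    config_7 = "-c tessedit_char_whitelist=0123456789AB --oem 1 --psm 7"
    text = pytesseract.image_to_string(cleaned, config=config_7)
    print(path, "->", text.strip())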
