python - Highly inconsistent OCR result for tesseract

- September 15, 2014

This is the original screenshot. I cropped the image into 4 parts and cleared the background of the image to the extent that I possibly can, but Tesseract only detects the last column here and ignores the rest.

The output from Tesseract is shown as is; there are blank spaces, which I remove while processing the result.

  femme—fatale.      darklordeia   achinesen1gg4    noob_diablo_

kicked.  nosnoel chikizd death_eag|e_42  chai—.

3579 10 1 7 148  2962 3 o 7 101  2214 2 2 7 99  2205 1 3 6 78

8212  7198  6307  5640  4884  15  40  40  6o  80  80
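A minimal sketch of the whitespace clean-up mentioned above, assuming the numeric columns should come out as rows of integers. The helper names and the character mapping (`o` read as `0`, `|` read as `1`, and so on) are illustrative assumptions, not part of the original processing code.

```python
# Hypothetical mapping of characters that OCR may confuse with digits.
DIGIT_FIXES = str.maketrans({"o": "0", "O": "0", "l": "1", "I": "1", "|": "1"})


def tokens(raw):
    """Split raw OCR text on any run of whitespace, dropping empty strings."""
    return raw.split()


def numeric_rows(raw, width):
    """Group the tokens of a numeric OCR line into rows of `width` integers."""
    fixed = [t.translate(DIGIT_FIXES) for t in tokens(raw)]
    bad = [t for t in fixed if not t.isdigit()]
    if bad:
        raise ValueError("non-numeric tokens after clean-up: %r" % bad)
    return [
        [int(t) for t in fixed[i:i + width]]
        for i in range(0, len(fixed), width)
    ]


if __name__ == "__main__":
    line = "3579 10 1 7 148  2962 3 o 7 101  2214 2 2 7 99  2205 1 3 6 78"
    for row in numeric_rows(line, 5):
        print(row)
```

The mapping should only be applied to columns known to be numeric; applying it to the player names would corrupt them.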

I am dumping the output of

`result = pytesseract.image_to_string(Image.open("d:/newapproach/b&w"+str(i)+".jpg"), lang="new_language")`

But I do not know how to proceed from here to get a consistent result. Is there any way I can force Tesseract to recognize the text area? In the trainer, Tesseract's default recognition scan does not detect these areas, but once I select the area to be scanned, the text is recognized correctly.


My suggestion is to perform OCR on the full image.

I have preprocessed the image to get a grayscale image.

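    import cv2

    # Load the screenshot (OpenCV reads it in BGR channel order).
    image_obj = cv2.imread('1d4bb.jpg')
    # Convert it to a single-channel grayscale image.
    gray = cv2.cvtColor(image_obj, cv2.COLOR_BGR2GRAY)
    # Save the result as a PNG for Tesseract to read.
    cv2.imwrite("gray.png", gray)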
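For reference, the grayscale conversion is a weighted sum of the colour channels. The sketch below applies the formula documented by OpenCV for this conversion, Y = 0.299 R + 0.587 G + 0.114 B, in plain Python. The helper name `bgr_to_gray` is hypothetical, and OpenCV's own rounding may differ from this by one gray level.

```python
def bgr_to_gray(pixels):
    """Convert rows of (B, G, R) tuples into rows of 0-255 gray levels."""
    return [
        [int(round(0.114 * b + 0.587 * g + 0.299 * r)) for (b, g, r) in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # Pure red, pure green, pure blue and white, in BGR order.
    sample = [[(0, 0, 255), (0, 255, 0)], [(255, 0, 0), (255, 255, 255)]]
    print(bgr_to_gray(sample))
```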

I ran Tesseract on the image from the terminal, and the accuracy seems to be over 90% in this case.

    tesseract gray.png out

    3579 10 1 7 148
    3142 9 o 5 10
    2962 3 o 7 101
    2214 2 2 7 99
    2205 1 3 6 78
    score kills assists deaths connection
    8212 15 1 4 4o
    7198 7 3 6 40
    6307 6 1 5 60
    5640 2 3 6 80
    4884 1 1 5

Below are a few suggestions:

Do not use the image_to_string method directly, as it converts the image to BMP and saves it at 72 DPI.
If you want to use image_to_string, override it to save the image at 300 DPI.
You can use the run_tesseract method and then read the output file.
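The suggestions above can be sketched roughly as follows. This is only an illustration under some assumptions: `run_tesseract` is an internal pytesseract helper whose signature may differ between versions, so the sketch calls the `tesseract` command-line program through `subprocess` instead, which follows the same idea of running the binary and reading its output file. The function names are hypothetical, the `tesseract` binary is assumed to be on the PATH, and Pillow is assumed to be installed for the optional 300 DPI step.

```python
import subprocess
from pathlib import Path


def save_at_300_dpi(src_path, dst_path):
    """Re-save an image with 300 DPI metadata (requires Pillow)."""
    from PIL import Image  # imported lazily so the rest works without Pillow

    Image.open(src_path).save(dst_path, dpi=(300, 300))


def tesseract_command(image_path, output_base, lang=None):
    """Build the argument list: tesseract <image> <outputbase> [-l lang]."""
    cmd = ["tesseract", str(image_path), str(output_base)]
    if lang:
        cmd += ["-l", lang]
    return cmd


def ocr_via_cli(image_path, output_base="out", lang=None):
    """Run Tesseract on the image and return the text it wrote to disk."""
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    save_at_300_dpi("gray.png", "gray_300.png")
    print(ocr_via_cli("gray_300.png", "out"))
```

Keeping the command construction in its own function makes it easy to check which arguments are passed before anything is executed.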

The image on which I ran OCR.

Another approach to this problem is to crop the digits and use a deep neural network for prediction.
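The answer does not say how the digits would be cropped. One possible approach, shown here purely as an assumption, is a vertical projection profile: on a binarized image, every run of columns that contains ink is treated as one glyph. The function names are hypothetical, the image is represented as plain lists of 0/1 values, and the neural network itself is left out.

```python
def column_spans(binary):
    """Return (start, end) column ranges that contain at least one ink pixel.

    `binary` is a list of rows; each row is a list of 0 (background) or 1 (ink).
    `end` is exclusive, so a span can be used directly as a slice.
    """
    if not binary:
        return []
    width = len(binary[0])
    ink = [any(row[x] for row in binary) for x in range(width)]
    spans = []
    start = None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x  # a glyph begins
        elif not has_ink and start is not None:
            spans.append((start, x))  # a glyph ends at the first blank column
            start = None
    if start is not None:
        spans.append((start, width))  # glyph touching the right edge
    return spans


def crop_columns(binary, spans):
    """Cut the image into one sub-image per span."""
    return [[row[a:b] for row in binary] for a, b in spans]


if __name__ == "__main__":
    image = [
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 0],
    ]
    spans = column_spans(image)
    print(spans)
    print(crop_columns(image, spans))
```

Each cropped glyph could then be resized to a fixed size and passed to a digit classifier. Touching or overlapping characters would need a more careful segmentation than this.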
