python - Highly inconsistent OCR result for tesseract -


enter image description here

This is the original screenshot. I cropped the image into 4 parts and cleared the background as far as I possibly could, but Tesseract detects only the last column here and ignores the rest.

enter image description here

The Tesseract output is shown below. It contains blank spaces, which I remove while processing the result.

  femme—fatale.      darklordeia   achinesen1gg4    noob_diablo_ 
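A rough sketch of the blank-space removal mentioned above: splitting on any run of whitespace gives one token per name. The function name `split_ocr_tokens` is hypothetical (it is not part of pytesseract), and the sample string is only an excerpt of the output shown above.

```python
def split_ocr_tokens(ocr_text):
    """Split raw OCR output into tokens, dropping all blank space."""
    # str.split() with no arguments splits on any run of whitespace and
    # discards leading/trailing whitespace, so no empty tokens are produced.
    return ocr_text.split()


# Excerpt of the Tesseract output for the first cropped column.
raw = "      darklordeia   achinesen1gg4    noob_diablo_ "
print(split_ocr_tokens(raw))
```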

enter image description here

The Tesseract output is shown below. It contains blank spaces, which I remove while processing the result.

kicked.  nosnoel chikizd death_eag|e_42  chai—. 

enter image description here

3579 10 1 7 148  2962 3 o 7 101  2214 2 2 7 99  2205 1 3 6 78 

enter image description here

8212  7198  6307  5640  4884  15  40  40  6o  80  80 
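The numeric columns above contain letters where digits are expected (for example `6o` and a lone `o`). One possible post-processing step, sketched here under the assumption that a column is known to hold only numbers, is to substitute look-alike characters before converting to `int`. The mapping `DIGIT_LOOKALIKES` and the function `parse_numeric_tokens` are hypothetical names chosen for illustration. This should not be applied to the player-name columns, where `o` and `l` are legitimate letters.

```python
# Hypothetical mapping of characters that OCR may confuse with digits.
DIGIT_LOOKALIKES = str.maketrans({"o": "0", "O": "0", "l": "1", "|": "1"})


def parse_numeric_tokens(ocr_text):
    """Split OCR output on whitespace and convert digit-like tokens to int.

    Tokens that still contain non-digit characters after the substitution
    are returned unchanged as strings, so nothing is silently dropped.
    """
    values = []
    for token in ocr_text.split():
        candidate = token.translate(DIGIT_LOOKALIKES)
        values.append(int(candidate) if candidate.isdecimal() else token)
    return values


raw = "8212  7198  6307  5640  4884  15  40  40  6o  80  80"
print(parse_numeric_tokens(raw))
```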

I am dumping the output of

`result = pytesseract.image_to_string(Image.open("d:/newapproach/b&w"+str(i)+".jpg"), lang="new_language")`

but I do not know how to proceed from here to get a consistent result. Is there any way I can force Tesseract to recognize the text? In the Tesseract trainer, the default recognition scan does not detect it, but once I select the area to be scanned, the text is recognized correctly.

code

My suggestion is to perform OCR on the full image.

I preprocessed the image to get a grayscale image.

    import cv2

    # Load the screenshot and convert it to grayscale before running OCR
    image_obj = cv2.imread('1d4bb.jpg')
    gray = cv2.cvtColor(image_obj, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("gray.png", gray)

I ran Tesseract on the image from the terminal, and the accuracy seems to be over 90% in this case.

    tesseract gray.png out

    3579 10 1 7 148
    3142 9 o 5 10
    2962 3 o 7 101
    2214 2 2 7 99
    2205 1 3 6 78
    score kills assists deaths connection
    8212 15 1 4 4o
    7198 7 3 6 40
    6307 6 1 5 60
    5640 2 3 6 80
    4884 1 1 5

Below are a few suggestions:

  1. Do not use the image_to_string method directly, as it converts the image to BMP and saves it at 72 DPI.
  2. If you want to use image_to_string, override it to save the image at 300 DPI.
  3. You can use the run_tesseract method and then read the output file.
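The suggestions above could be sketched roughly as follows. Because the signature of `run_tesseract` may differ between pytesseract versions, this sketch is an assumption-laden alternative that calls the `tesseract` command-line tool through `subprocess`, the same way as the terminal command shown earlier. The function names, the intermediate file name `ocr_input.png`, and the output base `out` are all hypothetical; Pillow is assumed to be installed and `tesseract` is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang=None):
    """Build the argument list for the tesseract command-line tool."""
    command = ["tesseract", str(image_path), str(output_base)]
    if lang is not None:
        # -l selects the language / traineddata name.
        command += ["-l", lang]
    return command


def ocr_at_300_dpi(source_path, work_path="ocr_input.png",
                   output_base="out", lang=None):
    """Re-save the image with 300 DPI metadata, run tesseract, return the text."""
    # Imported here so build_tesseract_command stays usable without Pillow.
    from PIL import Image

    with Image.open(source_path) as image:
        # Pillow records the dpi value in the PNG metadata; pixels are unchanged.
        image.save(work_path, dpi=(300, 300))

    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(build_tesseract_command(work_path, output_base, lang),
                   check=True)

    # tesseract appends ".txt" to the output base name.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With these assumptions, `ocr_at_300_dpi("gray.png")` would run `tesseract ocr_input.png out` and return the contents of `out.txt`; passing `lang="new_language"` would add `-l new_language` to the command.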

The image on which I ran OCR: enter image description here

Another approach to this problem is to crop the digits and use a deep neural network for prediction.

