Why doesn't the ocr function recognize the numbers?
Adriano
on 16 Jan 2018
Commented: José Luis Sandoval
on 8 Jun 2020
Hi,
I have the image below:
I need to extract the numbers using the ocr function, so I use this code:
capture = imread('Capture.png');
my_image = imresize(capture, 1.4);
ocrResults = ocr(my_image,'CharacterSet','.0123456789');
recognizedText = ocrResults.Words;
However, the ocr function recognizes only some of the numbers. The output is a 37x1 cell array, whereas I should have 39 rows:
{'18.33'}
{'1423' }
{'1' }
{'6.55' }
{'1' }
{'5.65' }
{'12.54'}
{'14.77'}
{'10.33'}
{'13.79'}
{'12.94'}
{'1255' }
{'1' }
{'1.70' }
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'933' }
{'9.00' }
{'7.22' }
{'3.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.01' }
{'4.57' }
{'4.10' }
{'174' }
{'3.39' }
{'3.011'}
{'2.71' }
{'2.33' }
{'2.118'}
Also, many of the numbers are wrong. Can someone please help me? Thanks!
Accepted Answer
Birju Patel
on 17 Jan 2018
Edited: Birju Patel
on 17 Jan 2018
Hi,
A little bit of pre-processing and using ROIs to specify where the words are will help. By default, OCR uses page layout analysis to determine blocks of text. In this case, the image doesn't look like a normal page of text (like a PDF article for example).
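To get a feel for how much the layout analysis matters by itself, a quick experiment is to run ocr on the whole image with different 'TextLayout' settings and compare how many words come back. This is only a rough sketch, assuming the Computer Vision Toolbox is installed. 'Capture.png' is the file name from the question, and the synthetic fallback image is made up purely so the snippet runs without that file. The option is called 'TextLayout' in the release used in this thread; newer releases may expose it under a different name, so check the ocr documentation for your version.

```matlab
% Sketch: compare layout-analysis modes on the same image.
% Assumption: 'Capture.png' is the asker's file; if it is missing, a small
% synthetic image (illustrative only) is generated instead.
if isfile('Capture.png')
    img = imread('Capture.png');
else
    img = insertText(255*ones(120, 300, 'uint8'), [20 30], '18.33', ...
        'FontSize', 40, 'BoxOpacity', 0, 'TextColor', 'black');
end

layouts = {'Auto', 'Block'};          % page layout analysis vs. single text block
counts  = zeros(size(layouts));
for k = 1:numel(layouts)
    res = ocr(img, 'TextLayout', layouts{k}, 'CharacterSet', '.0123456789');
    counts(k) = numel(res.Words);     % number of words found with this setting
    fprintf('%-5s -> %d words\n', layouts{k}, counts(k));
end
```

If neither setting finds all the numbers, that is a hint that the text needs to be localized explicitly.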
To make it easier for OCR, you can first find the location of the words using regionprops and then pass those locations (as bounding boxes) to the ocr function. See the code and results below; they look accurate. You may have to experiment more with the pre-processing to make this robust across a collection of different images, but hopefully this gives you an idea of how to proceed:
capture = imread('Captura.PNG');
% Increase image size by 3x
my_image = imresize(capture, 3);
figure
imshow(my_image)
% Localize words
BW = imbinarize(rgb2gray(my_image));
BW1 = imdilate(BW,strel('disk',6));
s = regionprops(BW1,'BoundingBox');
bboxes = vertcat(s(:).BoundingBox);
% Sort boxes top to bottom (by the y-coordinate of each box's top edge)
[~,ord] = sort(bboxes(:,2));
bboxes = bboxes(ord,:);
% Pre-process image to make letters thicker
BW = imdilate(BW,strel('disk',1));
% Call OCR and pass in location of words. Also, set TextLayout to 'word'
ocrResults = ocr(BW,bboxes,'CharacterSet','.0123456789','TextLayout','word');
words = {ocrResults(:).Text}';
words = deblank(words)
words =
39×1 cell array
{'0' }
{'18.33'}
{'0' }
{'14.23'}
{'0' }
{'16.55'}
{'0' }
{'15.65'}
{'12.64'}
{'14.77'}
{'10.83'}
{'13.79'}
{'12.94'}
{'12.55'}
{'0' }
{'11.70'}
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'9.33' }
{'9.00' }
{'7.22' }
{'8.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.02' }
{'4.57' }
{'4.10' }
{'3.74' }
{'3.39' }
{'3.00' }
{'2.71' }
{'2.33' }
{'2.08' }
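Once the words are in a cell array, it can help to convert them to numbers and flag anything suspicious, such as the readings in the question where the decimal point was lost ('1423' instead of 14.23). The following is a minimal sketch using only base MATLAB. The sample entries are illustrative (two are deliberately malformed), and the threshold maxExpected is a hypothetical plausibility limit that you would pick for your own data.

```matlab
% Sketch: turn OCR strings into numbers and flag entries worth checking by hand.
% Illustrative input: some valid readings, one with a lost decimal point,
% one malformed string, and one empty result.
words = {'0'; '18.33'; '1423'; '14.2.3'; ''};

maxExpected = 100;                     % hypothetical upper bound for valid values

values      = str2double(words);       % NaN where the text is not a valid number
notNumeric  = isnan(values);           % could not be parsed at all
outOfRange  = values > maxExpected;    % parsed, but implausibly large
needsReview = notNumeric | outOfRange;

fprintf('%d of %d entries need review\n', nnz(needsReview), numel(words));
disp(find(needsReview).')              % indices of the flagged entries
```

A range check like this does not catch every misread digit (for example 8.02 read as 3.02), so it is a sanity check rather than a replacement for good pre-processing.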