How to remove horizontal and vertical lines from an image?

Hi, I'm working on OCR for handwritten numbers. To do so, I need to segment each digit of the number (4 to 7 digits) and then feed the digits to a neural network for classification. Everything works except the segmentation, because the lines of the box that contains the number are connected to the digits, as in the image.
So, is there a way to remove the lines?

Accepted Answer

Image Analyst on 21 Aug 2021
Try this for a start. Continue development on your project to generalize it for your other images.
% Demo by Image Analyst
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
%--------------------------------------------------------------------------------------------------------
% READ IN IMAGE
grayImage = imread('numbers.jpeg');
%--------------------------------------------------------------------------------------------------------
% Display the image.
subplot(2, 2, 1);
imshow(grayImage, []);
axis('on', 'image');
title('Original Image', 'FontSize', fontSize, 'Interpreter', 'None');
impixelinfo;
hFig = gcf;
hFig.WindowState = 'maximized'; % May not work in earlier versions of MATLAB.
drawnow;
% Get the dimensions of the image.
% numberOfColorChannels should be = 1 for a gray scale image, and 3 for an RGB color image.
[rows, columns, numberOfColorChannels] = size(grayImage)
if numberOfColorChannels > 1
% It's not really gray scale like we expected - it's color.
% Extract a single channel. Index 2 selects the green channel (red would be index 1).
grayImage = grayImage(:, :, 2);
end
% Crop off white frame.
mask = bwareafilt(grayImage < 255, 1);
props = regionprops(mask, 'BoundingBox');
subplot(2, 2, 2);
grayImage = imcrop(grayImage, props.BoundingBox);
imshow(grayImage);
axis('on', 'image');
title('Gray Scale Image', 'FontSize', fontSize, 'Interpreter', 'None');
impixelinfo;
drawnow;
subplot(2, 2, 3);
imhist(grayImage);
grid on;
xlabel('Gray Level', 'FontSize', fontSize);
ylabel('Pixel Count', 'FontSize', fontSize);
% Interactively threshold the image.
lowThreshold = 0;
highThreshold = 51;
% Ref: https://www.mathworks.com/matlabcentral/fileexchange/29372-thresholding-an-image?s_tid=srchtitle
% [lowThreshold, highThreshold] = threshold(lowThreshold, highThreshold, grayImage)
% Threshold it.
mask = grayImage < highThreshold;
% Find out areas.
props = regionprops(mask, 'Area');
allAreas = sort([props.Area], 'Descend')
subplot(2, 2, 4);
% Keep areas only if they're at least 500 pixels and at most 20000 pixels.
mask = bwareafilt(mask, [500, 20000]);
imshow(mask);
% Label each blob with 8-connectivity, so we can make measurements of it
[labeledImage, numberOfBlobs] = bwlabel(mask, 8);
% Apply a variety of pseudo-colors to the regions.
coloredLabelsImage = label2rgb (labeledImage, 'hsv', 'k', 'shuffle');
% Display the pseudo-colored image.
imshow(coloredLabelsImage);
caption = sprintf('Final Mask');
title(caption, 'FontSize', fontSize);

More Answers (3)

DGM on 21 Aug 2021
Edited: DGM on 21 Aug 2021
I'll throw in my own. This uses hough() and houghpeaks() to rectify the image so that rectangular filters will align with the box edges. The rest is just regular filtering. Parameters of interest are the binarization sensitivity, the filter widths and the areas specified in the bwareaopen() calls.
A more robust solution may make better use of hough() on a restricted angle range for better identification of box edges with less reliance on the filtering to segregate the box from the text. That's just an idea, not a recommendation with any certainty.
inpict = imread('ocrbox.jpg');
inpict = rgb2gray(cropborder(inpict,[NaN NaN NaN NaN])); % cropborder is from MIMT on FEX
inpict = ~imbinarize(inpict,'adaptive','foregroundpolarity','dark','sensitivity',0.15);
% find angle of dominant line, rectify
[H,T,R] = hough(inpict);
P = houghpeaks(H,1,'threshold',ceil(0.3*max(H(:))));
th0 = 90-T(P(1,2));
inpict = imrotate(double(inpict),-th0,'bilinear','crop');
w = 100;
linemask = medfilt2(inpict,[1 w]) + medfilt2(inpict,[w 1]);
linemask = bwareaopen(linemask>0.5,500);
linemask = imdilate(linemask,strel('disk',5));
inpict = bwareaopen(inpict & ~linemask,200);
imshow(inpict)
As the comment states, cropborder() is from MIMT on the File Exchange. That was only necessary because the provided image was a saved figure with unnecessary padding. Since you have the original image, you shouldn't need to crop anything.
Don't save images by saving the figure. Save the actual image instead of what amounts to a screenshot of the image.
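The restricted-angle idea mentioned above can be sketched roughly as follows. This is only a minimal illustration on a synthetic image, not a tested solution for the posted one: the angular tolerance, the number of peaks, and the 'FillGap' and 'MinLength' values are all assumptions that would need tuning.

```matlab
% Synthetic binary test image (illustrative sizes, not from the thread).
bw = false(100, 200);
bw(20, 20:180) = true;  bw(80, 20:180) = true;   % horizontal box edges
bw(20:80, 20)  = true;  bw(20:80, 180) = true;   % vertical box edges
bw(40:60, 90:100) = true;                        % blob standing in for a digit

% Restrict theta to a few degrees around vertical lines (theta near 0)
% and horizontal lines (theta near -90 or +90). hough() needs theta in [-90, 90).
tol = 3;                                         % assumed tolerance in degrees
theta = [-90:0.5:-90+tol, -tol:0.5:tol, 90-tol:0.5:89.5];
[H, T, R] = hough(bw, 'Theta', theta);

% Take the strongest peaks and convert them to line segments.
peaks = houghpeaks(H, 8, 'Threshold', 0.3 * max(H(:)), 'Theta', T);
segments = houghlines(bw, T, R, peaks, 'FillGap', 10, 'MinLength', 40);

% Erase only along the detected segments (a one-pixel-wide path).
cleaned = bw;
for k = 1:numel(segments)
    p1 = segments(k).point1;     % [x y]
    p2 = segments(k).point2;
    n = max(abs(p2 - p1)) + 1;   % number of samples along the segment
    x = round(linspace(p1(1), p2(1), n));
    y = round(linspace(p1(2), p2(2), n));
    cleaned(sub2ind(size(cleaned), y, x)) = false;
end
imshow(cleaned);
```

For real box lines that are several pixels thick, the erased path would probably need to be widened, for example by drawing the segments into a separate mask and dilating it before subtracting.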

Image Analyst on 20 Aug 2021
See
You might also try a morphological opening to erase the lines, though it might affect your numbers as well.
Or you could try hough() or houghlines() to see if you can find the lines and erase only along the lines.
An algorithm that might work on this particular image might fail on some other image.
Good luck.
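A minimal sketch of the morphological opening idea, run on a synthetic image rather than the posted one. The structuring element length of 40 is an assumption; it has to be longer than any straight stroke in the digits and shorter than the box edges.

```matlab
% Synthetic binary test image: a box outline plus a small "digit" blob.
% All sizes here are illustrative assumptions, not values from the thread.
bw = false(100, 200);
bw(20, 20:180) = true;      % top edge of the box
bw(80, 20:180) = true;      % bottom edge
bw(20:80, 20)  = true;      % left edge
bw(20:80, 180) = true;      % right edge
bw(40:60, 90:100) = true;   % blob standing in for a digit

% Opening with a long, thin structuring element keeps only foreground
% runs at least that long in that direction, which isolates the lines.
lineLength = 40;            % assumed; must exceed the longest digit stroke
horizontalLines = imopen(bw, strel('line', lineLength, 0));
verticalLines   = imopen(bw, strel('line', lineLength, 90));
lineMask = horizontalLines | verticalLines;

% Remove the detected lines from the original mask.
digitsOnly = bw & ~lineMask;
imshow(digitsOnly);
```

Where a digit touches or crosses a line, the overlapping stroke pixels are removed together with the line, which is the side effect on the numbers warned about above.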
  1 Comment
mutasam hazzah on 21 Aug 2021
Thanks for the reply.
Actually, I tried the OCR in MATLAB and also hough(), but I couldn't get a result because of the connection between the lines and the numbers.

darova on 21 Aug 2021
Here is an attempt. Then use bwlabel to separate each region and analyze it.
I0 = imread('image.jpeg');
I1 = im2bw(I0,0.2);
I2 = imdilate(I1,ones(2)); %imopen(I1,ones(3));
imshow(I2)
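The bwlabel step could look something like the sketch below. It uses a synthetic mask with three separate blobs in place of real digits, and it assumes the box lines have already been removed so that each digit is its own connected region.

```matlab
% Synthetic cleaned mask with three separate blobs standing in for digits.
mask = false(60, 150);
mask(20:40, 10:25)   = true;
mask(15:45, 60:80)   = true;
mask(25:35, 120:140) = true;

% Label connected regions with 8-connectivity.
[labeled, numBlobs] = bwlabel(mask, 8);

% 'Image' returns each region as a binary image cropped to its bounding box.
props = regionprops(labeled, 'BoundingBox', 'Image');

% Sort left to right so the crops follow the reading order of the digits.
boxes = vertcat(props.BoundingBox);
[~, order] = sort(boxes(:, 1));

% Collect one cropped binary image per digit.
digits = cell(1, numBlobs);
for k = 1:numBlobs
    digits{k} = props(order(k)).Image;
end
```

Each cell of `digits` can then be resized, for example with imresize, to whatever input size the classifier expects.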
