Selecting specific data from a PDF

Nathaniel Porter
Nathaniel Porter on 23 Feb 2022
Answered: Riya on 15 Sep 2023
% I am trying to obtain any values between 48 and 64, together with the
% corresponding values in the column to the right.
% For example, the first line has the value 58 in the third column, and I would
% also like to obtain the 100 next to it.
% I tried extracting the PDF text first, but I am unsure where to go from here.
clear;
pages = 1:18;
str = extractFileText("data-01.pdf", 'Pages', pages);

Answers (1)

Riya
Riya on 15 Sep 2023
Hello Nathaniel Porter,
As I understand it, you want to extract specific values from a PDF file: the values between 48 and 64 in a particular column, together with the corresponding values in the column to their right.
You can do this with the following steps:
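% Split the text into lines
textLines = splitlines(str);
% Initialize an empty two-column result: [value, correspondingValue]
result = zeros(0, 2);
% Iterate over the lines
for i = 1:numel(textLines)
    thisLine = textLines{i};
    % Match two leading numeric columns, then a third column between 48 and 64
    % and the column to its right. Note that [48-64] would be a character
    % class, not a numeric range, so the range is spelled out:
    % 48-49 -> 4[89], 50-59 -> 5\d, 60-64 -> 6[0-4]
    pattern = '\d+\s+\d+\s+(4[89]|5\d|6[0-4])\s+(\d+)';
    match = regexp(thisLine, pattern, 'tokens');
    % If a match is found, extract the values from the first match on the line
    if ~isempty(match)
        value = str2double(match{1}{1});
        correspondingValue = str2double(match{1}{2});
        % Store the values in the result
        result = [result; value, correspondingValue];
    end
end
% Display the result
disp(result);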
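If the columns are plain whitespace-separated numbers, a numeric comparison may be easier to maintain than encoding the range in a regular expression. The following is only a sketch under assumptions: the sample text is hypothetical and stands in for the extracted PDF text (the real layout of data-01.pdf is not shown in the question), and the value of interest is assumed to sit in the third column with its companion in the fourth.

```matlab
% Hypothetical sample text; replace with the output of extractFileText.
sampleStr = join(["1 10 58 100"; "2 11 70 200"; "3 12 48 300"; ...
                  "Header text"; "4 13 64 400"], newline);

textLines = splitlines(sampleStr);
result = zeros(0, 2);   % columns: [value, correspondingValue]

for i = 1:numel(textLines)
    % Read the leading whitespace-separated numbers on the line;
    % lines that do not start with a number yield an empty array
    nums = sscanf(textLines{i}, '%f')';
    % Keep the row only if it has at least four numbers and the
    % third one lies in the inclusive range 48 to 64
    if numel(nums) >= 4 && nums(3) >= 48 && nums(3) <= 64
        result = [result; nums(3:4)];
    end
end

disp(result);
```

The bounds 48 and 64 appear here as ordinary numbers, so changing the range or allowing decimal values does not require rewriting a pattern.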
For more information about regexp, you can refer to its page in the MATLAB documentation.
I hope it helps!
