Spliing text between characters
3 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
Hello everyone, I have a very long list of questions with their answers. but their pattern is the same. I took a photo ocr. This is the pattern question number, question, three dots and and answer. I have hundrerds of question and answer like that. I want to separete it into question and answer for csv. is it possible to do it?
I used extractbetween for question.
extractbetween(text,'.','...')
"640. En selektif etkili antibiyotik...Penisilinler"
but with very long list of text I am stucked with both question and the answer .
can someone help me or tell is it possible.
0 Kommentare
Antworten (2)
Mathieu NOE
am 1 Feb. 2021
hello Ongun
I created a small txt file containing these lines (as example) :
"640. En selektif etkili antibiyotik...Penisilinler"
"641. En selektif etkili antibiyotik...Penisilinler1"
"642. En selektif etkili antibiyotik...Penisilinler2"
"643. En selektif etkili antibiyotik...Penisilinler3"
"644. En selektif etkili antibiyotik...Penisilinler4"
then I tested this code that generated the attached xlsx file
lines = readlines('Document1.txt');
for ci =1:numel(lines)
Qnumber_str = extractBefore(lines(ci),'.'); % extract question number (with double quote)
Qnumber_str = strrep(Qnumber_str, '"', ''); % remove start double quote
Qnumber{ci} = str2num(Qnumber_str); % convert to num
Question{ci} = extractBetween(lines(ci),'.','...'); % extract question
Answer_str = extractAfter(lines(ci),'...'); % extract answer
Answer_str = strrep(Answer_str, '"', ''); % remove end double quote
Answer{ci} = Answer_str;
end
A = [Qnumber' Question' Answer'];
T = array2table(A, 'VariableNames',{'Q number','Question','Answer'})
writetable(T,'test.xlsx')
0 Kommentare
Walter Roberson
am 1 Feb. 2021
S = fileread('Document1.txt');
tokens = regexp(S, '^(?<Q>.+)\.{3}(?<A>).+)$', 'lineanchors', 'names', 'dotexceptnewline');
Questions = vertcat(tokens.Q);
Answers = vertcat(tokens.A);
T = table(Questions, Answers);
writetable(T, 'QandA.csv')
0 Kommentare
Siehe auch
Kategorien
Mehr zu Text Data Preparation finden Sie in Help Center und File Exchange
Produkte
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!