Extracting strings between different characters from a cell array without loop

Hello,
I would like to rewrite the code below to remove the loop. I assume I need to use cellfun, but after several attempts I have not managed to get it working.
In the following cell array, I need to extract the characters between "LS=" and "DS=", between "DS=" and "CS=", between "CS=" and "VS=", and between "VS=" and "]", and save them in their corresponding fields in a table.
This is what my original cell array looks like (I've cut it to 3 rows, but there are a lot more in my dataset):
commentO =
3×1 cell array
{'LS=Students201509DS=Teachers201509CS=ConfigVS=English]' }
{'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]' }
{'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'}
This is the code I have written to extract the strings I need and store them in their corresponding field in a table:
commentO = {'LS=Students201509DS=Teachers201509CS=ConfigVS=English]'; 'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]'; 'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'};
MyDataTable = table;
MyDataTable.LS = cell(size(commentO));
MyDataTable.DS = cell(size(commentO));
MyDataTable.CS = cell(size(commentO));
MyDataTable.VS = cell(size(commentO));
for i = 1:length(commentO)
    tempStr = commentO{i,1};
    MyDataTable.LS{i} = tempStr(strfind(tempStr, 'LS=')+3:(strfind(tempStr, 'DS=')-1));
    MyDataTable.DS{i} = tempStr(strfind(tempStr, 'DS=')+3:(strfind(tempStr, 'CS=')-1));
    MyDataTable.CS{i} = tempStr(strfind(tempStr, 'CS=')+3:(strfind(tempStr, 'VS=')-1));
    MyDataTable.VS{i} = tempStr(strfind(tempStr, 'VS=')+3:(end-1));
end
It takes ages, so I would be very grateful if someone could help me out with the correct cellfun call to do this much more efficiently.
Thank you very much.
Best regards,
Cecile
1 Comment
dpb on 4 Jan 2020
cellfun alone is unlikely to make much difference performance-wise--it's just a loop under the hood.
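
For reference, a cellfun version of the original loop might look like the sketch below. The helper name between is made up for illustration, and the sketch assumes each tag occurs exactly once per string. It still evaluates the anonymous function once per cell, which is why it should not be expected to be much faster than the explicit loop.

```matlab
commentO = {'LS=Students201509DS=Teachers201509CS=ConfigVS=English]'; ...
            'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]'; ...
            'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'};

% Hypothetical helper (not from the thread): text between the
% 3-character tag a and the marker b, assuming each occurs once in s.
between = @(s, a, b) s(strfind(s, a) + 3 : strfind(s, b) - 1);

% Each cellfun call runs the anonymous function once per cell.
LS = cellfun(@(s) between(s, 'LS=', 'DS='), commentO, 'UniformOutput', false);
DS = cellfun(@(s) between(s, 'DS=', 'CS='), commentO, 'UniformOutput', false);
CS = cellfun(@(s) between(s, 'CS=', 'VS='), commentO, 'UniformOutput', false);
VS = cellfun(@(s) between(s, 'VS=', ']'),   commentO, 'UniformOutput', false);

% Column names are taken from the workspace variable names.
MyDataTable = table(LS, DS, CS, VS);
```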

Accepted Answer

Stephen23 on 4 Jan 2020
>> C = {'LS=Students201509DS=Teachers201509CS=ConfigVS=English]'
'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]'
'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'};
>> [M,S] = regexp(C,'[A-Z]S=','match','split');
>> M = strrep(vertcat(M{:}),'=','');
>> S = strrep(vertcat(S{:}),']','');
>> T = cell2table(S(:,2:end),'VariableNames',M(1,:))
T =
    LS                      DS                    CS          VS
    ____________________    __________________    ________    _________
    'Students201509'        'Teachers201509'      'Config'    'English'
    'Preschoolers201801'    'Students201910'      'Abs'       'Italian'
    'Preschoolers201902'    'Assistants201902'    'NA'        'Italian'
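
For readers unfamiliar with the 'split' option, here is a rough sketch of what the regexp call does on a single row. The variable names row, m, s, names and values are chosen for illustration and are not part of the answer. The pattern '[A-Z]S=' matches any capital letter followed by 'S=', so this relies on the field values themselves never containing such a sequence.

```matlab
row = 'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]';

% 'match' returns the tags, 'split' returns the text around them.
[m, s] = regexp(row, '[A-Z]S=', 'match', 'split');

% m is {'LS=', 'DS=', 'CS=', 'VS='}
% s has one more element than m: an empty piece before 'LS=', then
% 'Preschoolers201902', 'Assistants201902', 'NA', 'Italian]'

names  = strrep(m, '=', '');          % {'LS', 'DS', 'CS', 'VS'}
values = strrep(s(2:end), ']', '');   % drop the empty first piece and the ']'
```

With a cell array as input, regexp returns one such m and s per row, which is why the answer stacks them with vertcat and skips the first column of S.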
