Extracting the last substring from strings

Smithy on 2 Sep 2022
Moved: Image Analyst on 2 Sep 2022
Hello everybody,
I would like to extract the last substring, the one just before 'EP', from each string.
I first split each string with the strsplit function and then picked the last piece using a for-loop.
I assume this can be done without a for-loop.
Could you please show me how to write the code without one?
load input.mat
temp = cellfun(@(x) strsplit(x, {' ','EP'}), input, 'UniformOutput', false); % split the string
for i=1:length(temp)
num(i,:) = str2double(temp{i}(end-1)); % fill the last right string in cells
end

Accepted Answer

Dyuman Joshi on 2 Sep 2022
Edited: Dyuman Joshi on 2 Sep 2022
You can use extractBetween, with the index of the last space character as the start position and the common suffix at the end of each string ('EP') as the end boundary.
load input.mat
num=cellfun(@(x) str2double(extractBetween(x,find(x==' ',1,'last')+1,'EP')), input)
num = 30×1
11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000
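To see what each step of that one-liner does, here is a minimal sketch on a single string. The string is copied from the sample data shown elsewhere in this thread; the variable names k and piece are only illustrative.

```matlab
% One sample string from the data in this thread
x = 'A35000D+N14 1260D 11.4EP';

% Index of the last space character; the number starts right after it
k = find(x == ' ', 1, 'last');

% Take the text from position k+1 up to (not including) the text 'EP'
piece = extractBetween(x, k + 1, 'EP');

% str2double accepts char vectors, cell arrays of char vectors and strings
num = str2double(piece);
disp(num)
```

For this string, num should be 11.4, matching the fifth entry of the output above.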
  1 Comment
Smithy on 2 Sep 2022
Wow, thank you very much, it works really well. I really appreciate it.

More Answers (5)

Stephen23 on 2 Sep 2022
Avoid slow CELLFUN, STR2NUM, REGEXP, etc.
One SSCANF call is probably the most efficient approach by far, as well as being very simple:
S = load('input.mat');
C = S.input
C = 30×1 cell array
{'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'}
V = sscanf([C{:}],'%*s %*s%fEP')
V = 30×1
11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000
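The format string is compact, so here is a small annotated sketch of how it is consumed. It uses three strings copied from the cell array displayed above and assumes, as in that data, that every string has exactly two whitespace-delimited fields before the number and ends in 'EP'.

```matlab
% Three sample strings copied from the cell array displayed above
C = {'N88 220D/2 11EP'; 'N88 260D/2 8EP'; 'A35000D+N14 1260D 11.4EP'};

% Concatenate everything into one character vector. No separator is needed
% because each string ends with the literal 'EP' that the format consumes.
txt = [C{:}];

% %*s  read one whitespace-delimited field and discard it (used twice)
% %f   read a floating-point number and keep it
% EP   match the literal characters 'EP'
% SSCANF reapplies the format until the text is exhausted.
V = sscanf(txt, '%*s %*s%fEP');
disp(V)
```

If a string had a different number of fields before the number, the format would no longer line up, so this approach relies on the regular layout of the data.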
  2 Comments
Dyuman Joshi on 2 Sep 2022
Stephen, how do you know that cellfun, str2num, regexp etc are slower functions? And are there any resources where I can read more about this?
Stephen23 on 2 Sep 2022
Edited: Stephen23 on 2 Sep 2022
"how do you know that cellfun, str2num, regexp etc are slower functions?"
  1. many years of reading this and other forums, learning from the combined knowledge of many users.
  2. many years of writing unit tests for my own code (i.e. making modifications and comparing).
  3. reading the documentation, to know what features functions have and how to use them.
  4. knowledge about functions, e.g. STR2NUM calls EVAL inside (so is not optimised by the JIT engine), and CELLFUN by design must call a function handle repeatedly (slower than a loop).
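The claim in point 4 about CELLFUN can be checked on your own machine. The following is a minimal sketch, not a rigorous benchmark: the test data and the helper countWithLoop are hypothetical, and the timings will vary with MATLAB release and hardware.

```matlab
% Hypothetical test data: many copies of one sample string from this thread
C = repmat({'N88 220D/2 11EP'}, 1e4, 1);

% Variant 1: CELLFUN calling an anonymous function once per cell
tCellfun = timeit(@() cellfun(@(x) numel(x), C));

% Variant 2: an explicit loop doing the same work
tLoop = timeit(@() countWithLoop(C));

fprintf('cellfun: %.4g s, loop: %.4g s\n', tCellfun, tLoop);

function n = countWithLoop(C)
% Count the characters in each cell with a preallocated loop
n = zeros(size(C));
for k = 1:numel(C)
    n(k) = numel(C{k});
end
end
```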
Let's compare the answers given so far in this thread:
S = load('input.mat');
C = repmat(S.input,1e3,1); % bigger array -> easier to compare
timeit(@()funAtsushiUeno(C)) % REGEXP, CELLFUN, STR2NUM
ans = 0.5426
timeit(@()funDyumanJoshi(C)) % CELLFUN, EXTRACTBETWEEN, STR2DOUBLE
ans = 0.4321
timeit(@()funChunru(C)) % REGEXP, CELLFUN, STR2NUM
ans = 0.4083
timeit(@()funImageAnalyst(C)) % loop and indexing
ans = 0.6536
timeit(@()funKSSV(C)) % REGEXP, STRREP, STR2DOUBLE
ans = 0.3133
timeit(@()funS23(C)) % SSCANF
ans = 0.0377
So, my function is more than eight times faster than the next fastest function (from KSSV), as well as being the simplest. And I had a fair idea that would be the case, even before writing this test code.
"And are there any resources where I can read more about this?"
If you want to learn how to use MATLAB efficiently, my advice is to read this forum a lot. And when I write "a lot", I don't mean "just a little bit". And not just new threads: there are some really important topics that have been discussed in some old yet canonical threads on this forum.
V1 = funAtsushiUeno(C);
V2 = funDyumanJoshi(C);
V3 = funChunru(C);
V4 = funImageAnalyst(C);
V5 = funKSSV(C);
V6 = funS23(C);
isequal(V1(:), V2(:), V3(:), V4(:), V5(:), V6(:)) % checking the function outputs:
ans = logical
1
function num = funAtsushiUeno(C)
num = regexp(C,'([\d+-e.]+)EP\s*$','tokens');
num = [num{:}];
num = [num{:}];
num = cellfun(@str2num, num);
end
function num = funDyumanJoshi(C)
num=cellfun(@(x) str2double(extractBetween(x,find(x==' ',1,'last')+1,'EP')), C);
end
function y = funChunru(C)
s = regexp(C, '\s([\d\.]*)EP$', 'tokens');
y = cellfun(@(s) str2num(s{1}{1}), s);
end
function numbers = funImageAnalyst(data)
rows = numel(data);
numbers = zeros(rows, 1);
for row = 1 : rows
words = strsplit(data{row});
lastWord = words{end};
numbers(row) = str2double(lastWord(1:end-2));
end
end
function s = funKSSV(C)
expression = '\d+\.?\d*EP';
s = regexp(C,expression,'match') ;
s = strrep([s{:}]','EP','') ;
s = str2double(s);
end
function V = funS23(C)
V = sscanf([C{:}],'%*s %*s%fEP');
end

Chunru on 2 Sep 2022
load(websave("input.mat", "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1114425/input.mat"))
input
input = 30×1 cell array
{'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'} {'N88 220D/2 11EP' } {'N88 220D/2 11EP' } {'N88 260D/2 8EP' } {'N88 260D/2 8EP' } {'A35000D+N14 1260D 11.4EP'} {'A35000D+N14 1260D 11.4EP'}
s = regexp(input, '\s([\d\.]*)EP$', 'tokens');
y = cellfun(@(s) str2num(s{1}{1}), s)
y = 30×1
11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000
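As a possible variation on the code above (a sketch, not part of the original answer), the 'once' option together with str2double removes the need for CELLFUN and STR2NUM. The pattern is the same: \s matches one whitespace character, ([\d\.]*) captures digits and dots, EP is literal, and $ anchors the match to the end of the string.

```matlab
% Two sample strings copied from the cell array displayed above
C = {'N88 220D/2 11EP'; 'A35000D+N14 1260D 11.4EP'};

% With 'once', each cell of tok holds the tokens of the first match only,
% i.e. a 1x1 cell containing the captured character vector
tok = regexp(C, '\s([\d\.]*)EP$', 'tokens', 'once');

% Stack the 1x1 cells into a column cell array of character vectors
tok = vertcat(tok{:});

% str2double converts the whole cell array in one call
y = str2double(tok);
disp(y)
```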
  1 Comment
Smithy on 2 Sep 2022
It works so well. Thank you a lot. I really appreciate it.

Atsushi Ueno on 2 Sep 2022
Moved: Image Analyst on 2 Sep 2022
This is a bit of a challenge because we need to deal with cell arrays with irregular number sizes.
I would use the regexp function.
load input.mat
num = regexp(input,'([\d+-e.]+)EP\s*$','tokens');
num = [num{:}];
num = [num{:}];
num = cellfun(@str2num, num)
num = 1×30
11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000 11.4000 11.4000 11.0000 11.0000 8.0000 8.0000 11.4000 11.4000
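The two identical concatenation lines are needed because REGEXP with 'tokens' returns nested cells when its input is a cell array. A minimal sketch of the unwrapping, using two sample strings from this thread and a simplified pattern ([\d.]+ is an assumption made for readability, not the pattern from the answer):

```matlab
% Two sample strings from the data in this thread
C = {'N88 220D/2 11EP'; 'N88 260D/2 8EP'};

% tok is a 2x1 cell; tok{k} is a 1x1 cell (one match per string),
% and tok{k}{1} is again a 1x1 cell (one token per match)
tok = regexp(C, '([\d.]+)EP\s*$', 'tokens');

% First concatenation: 1x2 cell whose elements are still 1x1 cells
level1 = [tok{:}];

% Second concatenation: 1x2 cell array of character vectors
level2 = [level1{:}];

num = str2double(level2);
disp(num)
```

Because both concatenations are horizontal, the result is a row vector, which is why the output above is 1×30 rather than 30×1.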

Image Analyst on 2 Sep 2022
Here's one way. Simple, straightforward, and intuitive. There may be more compact but cryptic methods though.
s = load('input.mat')
data = s.input; % DON'T call your variables input, which is a built-in function!
rows = numel(data)
numbers = zeros(rows, 1);
for row = 1 : rows
words = strsplit(data{row});
lastWord = words{end};
numbers(row) = str2double(lastWord(1:end-2));
end
  1 Comment
Smithy on 2 Sep 2022
Thank you very much. I really appreciate it.

KSSV on 2 Sep 2022
expression = '\d+\.?\d*EP';
s = regexp(input,expression,'match') ;
s = strrep([s{:}]','EP','') ;
s = str2double(s)
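The expression above matches one or more digits, an optional decimal point, and any further digits, followed by the literal 'EP', which is then removed with strrep. As a possible variation (a sketch, not part of the original answer), a lookahead keeps 'EP' out of the match so that no strrep step is needed:

```matlab
% Three sample strings from the data in this thread
C = {'N88 220D/2 11EP'; 'N88 260D/2 8EP'; 'A35000D+N14 1260D 11.4EP'};

% (?=EP) is a lookahead: the number must be followed by 'EP',
% but 'EP' itself is not included in the matched text.
% With 'once', each cell holds the matched character vector directly.
s = regexp(C, '\d+\.?\d*(?=EP)', 'match', 'once');

v = str2double(s);
disp(v)
```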
  1 Comment
Smithy on 2 Sep 2022
Thank you a lot. It helps me a lot. I really appreciate it.

Version

R2022a
