regexprep incorrect multiple replacement
Paolo
on 5 Jun 2018
Commented: Paolo
on 5 Jun 2018
Let's say we have the following char vector as input:
str = 'abc(1,2,3)';
I would like to replace '1','2' and '3' with different numbers.
Let's say I want to replace the numbers with the following numbers:
rep = {'5';'8';'3'};
My desired output is:
str = 'abc(5,8,3)';
The format for using regexprep is:
regexprep(str,expression,replace)
I have tried to solve the problem in two ways:
- One expression.
expression = '\d';
replace = {'5';'8';'3'};
regexprep(str,expression,replace)
ans = 'abc(3,3,3)'
The output is incorrect, despite the documentation stating:
If replace is a cell array of N character vectors and expression is a single character vector, then regexprep attempts N matches and replacements.
- Multiple expressions.
expression = {'\d';'\d';'\d'};
replace = {'5';'8';'3'};
regexprep(str,expression,replace)
ans = 'abc(3,3,3)'
The output for the second case is incorrect, despite the documentation stating:
If both replace and expression are cell arrays of character vectors, then they must contain the same number of elements. regexprep pairs each replace element with its corresponding element in expression.
In both cases regexprep is replacing all three matches using only the last value from the replace cell array, rather than all three.
What am I missing?
2 Comments
Stephen23
on 5 Jun 2018
Edited: Stephen23
on 5 Jun 2018
"The output is incorrect, despite the documentation stating:..."
"What am I missing?"
The output is correct in both cases. The documentation states that it "...attempts N matches and replacements": so it matches the digits and replaces them with cell one, then it starts afresh and matches the digits and replaces them with cell 2, then it starts afresh and matches the digits and replaces them with cell 3. Which is exactly the output you are getting.
Each time, regexprep starts parsing the string from the beginning again, whereas you assumed that it continues from where the previous replacement finished. To get the behavior you want, you will have to add a dynamic expression of some kind.
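The restart-from-the-beginning behavior described above can be traced with a short loop. This is only a sketch: it reuses the input and the replacement values from the question's rep, and the variables out and steps are introduced here purely for illustration.

```matlab
str = 'abc(1,2,3)';
expression = '\d';
replace = {'5';'8';'3'};

out = str;
steps = cell(numel(replace), 1);   % intermediate strings, kept for inspection
for k = 1:numel(replace)
    % Each pass scans the whole current string from its first character,
    % so it also matches the digits written by the previous pass.
    out = regexprep(out, expression, replace{k});
    steps{k} = out;
end

disp(steps)   % 'abc(5,5,5)', then 'abc(8,8,8)', then 'abc(3,3,3)'
```

Only the last pass survives, which is why the result looks as if just the final replacement had been used.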
Accepted Answer
Walter Roberson
on 5 Jun 2018
regexprep(S, {A, B}, {P, Q})
is the same as
regexprep(regexprep(S, A, P), B, Q)
That is, the first pair is applied to the entire string, and the second pair is applied to the string that results.
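It looks to you as though only the third replacement was applied because each replacement text is itself a digit, so it matches the second and third patterns and gets replaced again.
The 'once' option will not solve the problem, because every pair still starts matching from the beginning of the string, so each one would replace the same first digit.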
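To see why 'once' does not help, the nested form described in this answer can be written out one call at a time. This is a minimal sketch that uses the replacement values from the question's rep; the names s1, s2 and s3 are introduced here only for illustration.

```matlab
str = 'abc(1,2,3)';

% With 'once', each call replaces only the first digit it finds,
% and that is always the first digit of the current string.
s1 = regexprep(str, '\d', '5', 'once');   % 'abc(5,2,3)'
s2 = regexprep(s1,  '\d', '8', 'once');   % 'abc(8,2,3)'
s3 = regexprep(s2,  '\d', '3', 'once');   % 'abc(3,2,3)'

disp(s3)
```

The second and third digits are never touched, so the result is still not the desired 'abc(5,8,3)'.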
3 Comments
Walter Roberson
on 5 Jun 2018
str = 'abc(1,2,3)';
regexprep(str, '\d+(\D+)\d+(\D+)\d+', '5$18$23')
The $1 in the replacement refers to the text captured by the first () group, and $2 to the text captured by the second. The pattern matches one or more digits, captures the run of non-digits that follows, matches another run of digits, captures the run of non-digits that follows that, and then matches a final run of digits. The whole match is replaced with fixed text followed by the first captured run of non-digits, then fixed text followed by the second captured run of non-digits, then more fixed text.
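The pattern above hard-codes three numbers. If the number of matches varies, one alternative that was not proposed in this thread is to split the string at the matches with regexp and interleave the pieces with the replacements. The sketch below assumes there is exactly one replacement per match; the names parts, mixed and out are illustrative only.

```matlab
str = 'abc(1,2,3)';
rep = {'5';'8';'3'};

% Split the string at every run of digits; n matches give n+1 pieces.
parts = regexp(str, '\d+', 'split');
assert(numel(parts) == numel(rep) + 1, 'Need one replacement per match.');

% Stack the pieces over the replacements (padded with '') and read the
% 2-by-(n+1) cell column by column: piece, replacement, piece, ...
mixed = [parts; [rep(:).', {''}]];
out = [mixed{:}];

disp(out)   % abc(5,8,3)
```

Because the string is split only once, a replacement is never rescanned, so replacing a digit with another digit is safe here.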