Efficient way to use regexp and contains and matching

Tiasa Ghosh on 7 Sep 2018
Commented: Tiasa Ghosh on 12 Sep 2018
Hello!
I am using the following code to match the contents of two cell arrays, but it takes far too long to run. Can anybody suggest a better way to code the same thing?
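validVar = {};
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};
for x = 1:numel(list)
    % Pattern: the list entry followed by any two characters.
    expression = sprintf('%s..', list{x});
    for y = 1:numel(str)
        % Accept on a regexp match or on a literal substring match.
        if ~isempty(regexp(str{y}, expression, 'match')) || contains(str{y}, list{x})
            validVar = [validVar; list{x}];
        end
    end
end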
Also, the result gives me validVar with 'fgh' only, but I want 'abc(1)' in the list as well, since it partly matches str{1}. Is there a way to match the entries of list against str so that a list entry is added to validVar even when only part of it matches part of a str entry?
3 Comments
Tiasa Ghosh on 7 Sep 2018
My mistake. I have edited the question with examples and more bugs. I am using regexp as one of the search patterns so that the expression can cover extended parts, and the condition becomes true even if only part of the string matches part of the input.
Greg on 8 Sep 2018
First, your performance is suffering because you're looping over both lists. Both regexp and contains accept a whole cell array of character vectors together with a single pattern, which removes one of the loops.
Second, if you know how to use regexp expertly (this is not a dig - regexp is extremely powerful but even more difficult to master), you could do all of your checking with one expression.
Finally, your requirements are very ill-formulated. What in the world does "... even if a part of list entry matches part of str entry" mean? A part of bac[2] matches a part of cde: the c character. I'm sure this isn't what you had in mind, so you need more explicit rules for validVar.
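Greg's first point, dropping the inner loop, might look roughly like the sketch below. It reuses the data and the pattern from the question; the names hitLiteral and hitRegexp are made up for illustration. Unlike the original double loop, it adds each list entry at most once, even if several str entries match.

```matlab
% Data from the question.
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

validVar = {};
for x = 1:numel(list)
    % contains accepts the whole cell array and returns one logical per
    % entry of str, so no inner loop over str is needed.
    hitLiteral = contains(str, list{x});

    % regexp also accepts a cell array; with 'match' and 'once' it returns
    % a cell array whose cells are empty where nothing matched.
    expression = sprintf('%s..', list{x});   % same pattern as in the question
    hitRegexp  = ~cellfun(@isempty, regexp(str, expression, 'match', 'once'));

    if any(hitLiteral | hitRegexp)
        validVar = [validVar; list(x)]; %#ok<AGROW>
    end
end
```

With the question's data this still returns only 'fgh', the same result the question reports: it changes the speed, not the matching rule.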


Accepted Answer

Guillaume on 10 Sep 2018
Right now, your double loop can be simplified to:
validVar = {};
for x = 1:numel(list)
    % Match either list{x} or list{x} followed by any two characters.
    if ~isempty(cell2mat(regexp(str, [list{x}, '(..)?'], 'once')))
        validVar = [validVar; list{x}];
    end
end
which should be a lot faster.
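One thing to keep in mind with this loop: list{x} is used as a regular expression, so the parentheses in 'abc(1)' act as a group and the pattern looks for the text abc1, not abc(1). If literal matching is what is intended (an assumption, since the thread does not settle it), a loop-free sketch could use either contains or an escaped pattern. The variable names below are made up for illustration.

```matlab
% Data from the question.
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

% Literal substring test: true for each list entry that occurs inside
% at least one entry of str.
isValidLiteral = cellfun(@(p) any(contains(str, p)), list);

% The same test via regexp, after escaping metacharacters such as
% '(' , ')' and '[' so that they are matched literally.
escaped = cellfun(@(p) regexptranslate('escape', p), list, ...
                  'UniformOutput', false);
isValidRegexp = cellfun(@(p) any(~cellfun(@isempty, regexp(str, p, 'once'))), ...
                        escaped);

validVar = list(isValidLiteral);
```

Both tests agree on the question's data and keep only 'fgh', because the literal text abc(1) does not occur in any entry of str.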
However, I don't think that's exactly what you want. I agree with Greg that it's not really clear what exactly you want. We need a very clear rule for which patterns the regex should and should not match.
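To illustrate what such a rule could look like, here is a sketch under one assumed rule that is not from the thread: two entries match when their leading identifiers (the initial run of letters, digits and underscores) are equal. The helper baseOf and the names strBase and listBase are hypothetical.

```matlab
% Data from the question.
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

% ASSUMED rule (not from the thread): compare only the leading identifier
% of each entry, ignoring anything such as '==', '[2]' or '(1)' after it.
baseOf = @(c) regexp(c, '^\w+', 'match', 'once');

strBase  = baseOf(str);    % {'abc'; 'bac'; 'fuh'; 'fgh'}
listBase = baseOf(list);   % {'abc'; 'cde'; 'fgh'}

% Keep the list entries whose leading identifier also leads some str entry.
validVar = list(ismember(listBase, strBase));
```

Under this assumed rule the result is 'abc(1)' and 'fgh', which is the output the question asks for. A different rule would need a different pattern.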
1 Comment
Tiasa Ghosh on 12 Sep 2018
I realised where the question became vague, and it is no longer relevant to me. Anyhow, thank you for your time and your answer. I hope it helps somebody else in the future. :)


More Answers (0)
