
Regular expression help for capturing tokens from a C++ if_else function block

I am trying to convert some C++ code to MATLAB code and need help capturing the conditions of the if/else statements. The sample code is posted below; normally it is much longer and contains many sets of the same repeating piecewise constraints with different functions to evaluate.
T[4][0] = if (x <= 2.0E2) {
t4 = x*-7.43939368315E2;
} else {
if (2.0E4 < x) {
t4 = x*-3.15202357052E2;
} else {
if ((2.0E2 < x) && (x <= 1.0E3)) {
t4 = -x*(x*-3.3581574807E-2+log(x)*4.7023733515E1+(log(x)*2.400503067842E3)/x+1.09445880410878E5/x+1.0/(x*x)*6.1158591370349E4+(x*x)*1.9946376054E-5-(x*x*x)*7.445595608E-9+(x*x*x*x)*1.243670758E-12-1.11582896139E2);
} else {
if ((1.0E3 < x) && (x <= 6.0E3)) {
t4 = -x*(x*-2.3265021934E-3+log(x)*4.8602554649E1+(log(x)*1.5974720059903E4)/x+3.623374787802E4/x+1.0/(x*x)*1.897212861710375E6+(x*x)*1.9151215358E-7-(x*x*x)*1.2237095959E-11+(x*x*x*x)*3.95116007E-16-1.62570932876E2);
} else {
if ((6.0E3 < x) && (x <= 2.0E4)) {
t4 = -x*(x*-1.6249864554E-1+log(x)*2.049900452325E3+(log(x)*6.16116287553448E6)/x-4.067298995421662E7/x+1.0/(x*x)*(5.126459124735035E23/1.40737488355328E14)+(x*x)*4.514672712E-6-(x*x*x)*9.025238189E-11+(x*x*x*x)*8.209541318E-16-1.8977498210781E4);
} else {
t4 = NAN;
}
}
}
}
};
T[5][0] = if (x <= 2.0E2) {
t5 = x*-6.99993596709E2;
} else {
if (6.0E3 < x) {
t5 = x*-2.95957496736E2;
} else {
if ((2.0E2 < x) && (x <= 1.0E3)) {
t5 = x*(x*-5.9732340052E-2+log(x)*1.344708739E1+(log(x)*6.623440520686E3)/x-1.30277758259044E5/x+1.0/(x*x)*1.93975305124744E5+(x*x)*2.359195784E-5-(x*x*x)*7.135910089E-9+(x*x*x*x)*1.0512057259E-12-2.8910827165E2);
} else {
if ((1.0E3 < x) && (x <= 6.0E3)) {
t5 = -x*(x*1.8859601673E-5+log(x)*3.6779467978E1+(log(x)*2.62795681346E2)/x+1.11227336090838E5/x-1.0/(x*x)*7.24982080986207E5+(x*x)*4.872002157E-8-(x*x*x)*9.084382373E-12+(x*x*x*x)*6.625791502E-16-4.3668943967E1);
} else {
t5 = NAN;
}
}
}
}
I have tried the following (and other variations):
tokens = regexp(funcode,'if\s\((.+)\)\s\{','tokens')
but it captures everything from the first 'if (' through to the last ') {'.
I would also eventually like to capture tokens for the expressions 't4 = ...' etc. that go with each condition.
Any help would be greatly appreciated. P.S. MATLAB needs to make matlabFunction() work for piecewise symbolic functions.
  2 Comments
Joseph on 12 Jul 2011
This method cannot be applied because of the simplicity of the identifiers available for the delimiters.

Accepted Answer

Walter Roberson on 12 Jul 2011
Your most immediate problem is that .+ is greedy: it captures as many characters as possible and then backtracks only as far as is necessary to match the rest of the expression. If you use .+? instead, the quantifier is lazy and captures only as many characters as are necessary to match the rest of the expression.
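The difference between the two quantifiers can be seen on a small input. This is only a sketch: the string below is a shortened, hypothetical stand-in for the funcode variable from the question, and it relies on the fact that in this kind of generated code ') {' only appears at the end of an if-condition.

```matlab
% Hypothetical, shortened stand-in for the funcode string in the question.
funcode = sprintf('if (x <= 2.0E2) {\nt4 = 1;\n} else {\nif ((2.0E2 < x) && (x <= 1.0E3)) {\nt4 = 2;\n}\n}');

% Greedy: .+ runs to the end of the text, then backtracks to the LAST ') {'.
greedy = regexp(funcode, 'if\s\((.+)\)\s\{', 'tokens');

% Lazy: .+? grows one character at a time and stops at the FIRST ') {'.
lazy = regexp(funcode, 'if\s\((.+?)\)\s\{', 'tokens');

disp(numel(greedy))   % 1: a single match that swallows both conditions
disp(lazy{1}{1})      % x <= 2.0E2
disp(lazy{2}{1})      % (2.0E2 < x) && (x <= 1.0E3)
```

The lazy version works here only because no condition contains ') {' in its interior; it does not check that the parentheses are balanced.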
However, you have a deeper problem: you really want to stop only when you encounter the balancing ')'. Determining whether a delimiter is balanced is known to be theoretically impossible in pure regular expressions. MATLAB's "regular expressions" are, though, extensions of the standard regular expressions. They have much in common with Perl's "regular expressions", and in Perl it is possible to find the balancing delimiter. It has been a number of years since I looked at the relevant (tricky) Perl code; I think it is possible with the regular expressions that MATLAB provides, but I would not want to try to reinvent the technique, as it is too ugly and hard to debug.
The easiest thing to do might be to use MATLAB's perl() command to call a Perl routine that does the parsing for you, after looking in the Perl FAQ to find the mechanism.
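If calling Perl is not an option, the balancing ')' can also be located outside the regular expression by counting parenthesis depth. This is a different technique from the Perl one mentioned above, shown only as a minimal sketch; the input line and the variable names (depth, openIdx, conds) are made up for illustration.

```matlab
% Hypothetical one-line input; the real text would come from ccode().
funcode = 'if ((2.0E2 < x) && (x <= 1.0E3)) { t4 = log(x)*2.0; }';

% Nesting depth after each character: +1 for '(' and -1 for ')'.
depth = cumsum((funcode == '(') - (funcode == ')'));

% Index of the '(' that opens each if-condition (end of each 'if (' match).
openIdx = regexp(funcode, 'if\s*\(', 'end');

conds = cell(size(openIdx));
for k = 1:numel(openIdx)
    o = openIdx(k);
    % The balancing ')' is the first later position where the depth
    % falls back to the level it had just before the opening '('.
    c = o + find(depth(o+1:end) == depth(o) - 1, 1);
    conds{k} = funcode(o+1:c-1);
end

disp(conds{1})   % (2.0E2 < x) && (x <= 1.0E3)
```

The regular expression is used only to find where each condition starts; the matching of parentheses is done by the depth counter, so nested calls such as log(x) inside a condition would not end the capture early.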
  1 Comment
Joseph on 12 Jul 2011
Haha, that might work, though I have no prior experience in Perl. It is funny, because I have been trying to develop this code to turn arrays of piecewise symbolic functions into MATLAB code, and to do that I have to create strings of function blocks from ccode(). ccode() can generate code from piecewise symbolic functions, for example when defining specific heats or free energies over various temperature ranges, which I need for numeric problems such as constrained minimization. I have some code that works fine for making the MATLAB functions, but it is inefficient because it repeats the test for the correct interval for every expression, when those tests could all be grouped into one.

More Answers (1)

Oleg Komarov on 12 Jul 2011
tokens = regexp(s,'if\ ([\(\)\w\ \.><=&]+)\s+{','tokens')
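The question also asks for the expressions assigned in each branch. A similar token pattern can be sketched for those. It assumes, as in the posted sample, that every assignment target is a 't' followed by digits and that every assignment ends at the first ';'. The string s below is a shortened, hypothetical input.

```matlab
% Hypothetical, shortened stand-in for the generated code.
s = sprintf('if (x <= 2.0E2) {\nt4 = x*-7.4E2;\n} else {\nt4 = NAN;\n}');

% Token 1: the target name (t4, t5, ...).
% Token 2: everything up to, but not including, the terminating ';'.
assign = regexp(s, '(t\d+)\s*=\s*([^;]+);', 'tokens');

disp(assign{1}{1})   % t4
disp(assign{1}{2})   % x*-7.4E2
disp(assign{2}{2})   % NAN
```

The assignments come back in source order, so they still have to be paired with the conditions separately; the final else branch has no condition of its own.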
