Multiple named tokens in a regexp

Joseph on 6 Dec 2013
Edited: Cedric on 28 Dec 2013
Hi all,
I am having a little trouble finding out whether named tokens in a regular expression can be optional.
My goal is to separate a string such as:
'CO(g) [atm] = 1000'
where it has the form
'Parameter [Units] = Value '
However, some parameters don't have units:
'name = carbon'
If I have a regular expression pattern
'(?<Parameter>[^\[=]+)\s?\[(?<Units>[^\]]+)\]\s?=\s?(?<Value>.*)'
this will only work if all three named tokens are present.
Is there a way to modify this so that it captures either Parameter, Units, Value or Parameter, Value? I tried a non-capturing group, '(?:\[?<Units>[^\]]+)\])?', but that doesn't seem to work right.
Basically, can named tokens be captured optionally? If so, how do you construct the regular expression?
UPDATE:
I used:
(?<parameter>[^\[=]+)\s?\[(?<units>[^\]]+)\]?\s?=\s?(?<value>.*)|(?<parameter>[^\[=]+)\s?=\s?(?<value>.*)
So,
'co [k] = 5'
Parameter = 'co '
Units = 'k'
Value = '5'
And,
'co = 5'
Parameter = 'co '
Units = []
Value = '5'
However, the regular expression looks very inelegant because of the redundancy after the '|'. Does anyone have suggestions for making it look better?
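One possible way to avoid the duplicated alternative is to wrap the whole bracketed part, brackets included, in an optional non-capturing group and keep the named token inside it. The attempt quoted earlier fails because `?<Units>` does not directly follow an opening parenthesis, so `\[?` is read as an optional '[' and `<Units>` as literal text. The sketch below is untested across MATLAB releases; the lazy `+?` on Parameter, the `^` anchor, and the variable names `samples` and `parsed` are my own assumptions, not part of the original pattern.

```matlab
% Sketch: optional named token inside a non-capturing group (?: ... )?
% Parameter is matched lazily so trailing whitespace is left to \s*.
pattern = '^(?<Parameter>[^\[=]+?)\s*(?:\[(?<Units>[^\]]+)\])?\s*=\s*(?<Value>.*)';

% The two input forms from the question
samples = {'CO(g) [atm] = 1000', 'name = carbon'};
parsed  = cell(size(samples));

for k = 1:numel(samples)
    % 'names' returns a struct with one field per named token;
    % 'once' returns only the first match
    parsed{k} = regexp(samples{k}, pattern, 'names', 'once');
    disp(parsed{k});
end
```

When the bracketed part is absent, the Units field should come back empty, so `isempty(parsed{2}.Units)` can be used to tell the two forms apart.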
1 Comment
Cedric on 28 Dec 2013
Edited: Cedric on 28 Dec 2013
The following is a bit simpler, but I wouldn't call it "more elegant":
pattern = '(?<parameter>\S+) \[?(?<unit>[^=\]]*)\]?\s*=\s*(?<value>\w+)' ;
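A quick way to check this pattern is to run it on the two inputs from the question's update. This is only a sketch; the variable names `inputs` and `checked` are mine. Two things to keep in mind when reading the pattern: it expects a literal space after the parameter, and `\w+` matches only letters, digits, and underscores, so a value such as '1.5' would be cut off at the decimal point.

```matlab
% Sketch: apply the simpler pattern to both input forms
pattern = '(?<parameter>\S+) \[?(?<unit>[^=\]]*)\]?\s*=\s*(?<value>\w+)';

inputs  = {'co [k] = 5', 'co = 5'};
checked = cell(size(inputs));

for k = 1:numel(inputs)
    % The unit token uses *, so it can match zero characters
    % when no bracketed unit is present
    checked{k} = regexp(inputs{k}, pattern, 'names', 'once');
    disp(checked{k});
end
```

Here the unit token always takes part in the match and is simply empty when there are no brackets, which is why no alternation or optional group is needed.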

Answers (0)
