C# - Number of conditions hit in a Regex containing multiple rules separated by OR condition
I have a regex with OR conditions in it. I want to find the number of conditions satisfied out of the conditions present in the regex, which are separated by '|'.
Example: (.*begin.*)|(.*middle.*)|(.*end.*)
I have the string: "hello begin.hello middle."
As you can see, 2 of the 3 OR conditions in the regex are hit by this string. I want to find the number of conditions hit.
I do not want to split the regex on '|' and apply each part individually. I want to run the entire regex at once.
The order of the submatches need not be begin-->middle-->end in the string I am searching. It is an arbitrary string to which I apply a regex that contains the conditions combined into one regex. I want to know how many of the conditions in the regex got hit.
In short, this isn't possible using standard alternation. Once text has been matched, it can't be matched again. Also, once the expression is satisfied, the engine does not continue searching. If regexes attempted to match every possible permutation, they would be extremely inefficient and no one would use them.
While your question isn't addressed explicitly in the documentation, as far as I can find, it is covered under the topic of backtracking. See MSDN's "Backtracking with Optional Quantifiers or Alternation Constructs".
Essentially, an alternation list (.|.|.) creates an opportunity for backtracking. If the first alternate doesn't match, the second is attempted. Backtracking does not occur, however, unless the first alternate fails, and once a match is made the other alternates are ignored.
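To make that concrete, here is a minimal sketch that runs the regex from the question against the string from the question and reports which groups took part in the match. The class name AlternationDemo and the helper CountSuccessfulGroups are made up for illustration; only Regex.Match and Group.Success come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Hypothetical helper: counts the capture groups (excluding group 0,
    // the whole match) that actually participated in the match.
    public static int CountSuccessfulGroups(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        int hits = 0;
        for (int i = 1; i < match.Groups.Count; i++)
        {
            if (match.Groups[i].Success)
            {
                hits++;
            }
        }
        return hits;
    }

    static void Main()
    {
        string pattern = @"(.*begin.*)|(.*middle.*)|(.*end.*)";
        string input = "hello begin.hello middle.";

        // The first alternate consumes the whole string, so the engine
        // never tries the second or third alternate.
        Console.WriteLine(CountSuccessfulGroups(input, pattern)); // prints 1
    }
}
```

Even though the input contains both "begin" and "middle", only group 1 succeeds, which is why counting groups on a plain alternation cannot answer the question.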
If you want to match multiple expressions, use lookaheads like so:
    using System;
    using System.Text.RegularExpressions;

    string l_pattern =
        @"(?i)" + /* make the regex case-insensitive */
        @"(?=(?<cond1>.*?begin)+)?" +
        @"(?=(?<cond2>.*?middle)+)?" +
        @"(?=(?<cond3>.*?end)+)?";

    string l_input = "oops - put middle first!" + "hello begin.this begin.";

    // Every lookahead is evaluated at the start of the input; the named
    // groups inside them record how often each condition was found.
    var l_match = Regex.Match( l_input, l_pattern );

    Console.WriteLine( "cond1 matched {0} times.", l_match.Groups["cond1"].Captures.Count );
    Console.WriteLine( "cond2 matched {0} times.", l_match.Groups["cond2"].Captures.Count );
    Console.WriteLine( "cond3 matched {0} times.", l_match.Groups["cond3"].Captures.Count );
    Console.ReadKey( true );

This is the output:
cond1 matched 2 times.
cond2 matched 1 times.
cond3 matched 0 times.
Lookaheads do not consume any text; each one functions as a kind of mini-regex within the regex. Essentially, the expression is no different from running three expressions separately. (Take care to note that each lookahead is optional; otherwise the entire expression would fail if any one of the lookaheads failed.)
Also note that when using lookaheads as I have shown, order doesn't matter.
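Since the question asks for the number of conditions hit rather than the number of occurrences, the same lookahead pattern can be reduced to a single count by checking which named groups succeeded. This is only a sketch: the class ConditionCounter and the method CountConditionsHit are hypothetical names, and the group names cond1 to cond3 are hard-coded to match the pattern.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class ConditionCounter
{
    // Lookahead pattern with one optional lookahead per condition.
    private const string Pattern =
        @"(?i)" +
        @"(?=(?<cond1>.*?begin)+)?" +
        @"(?=(?<cond2>.*?middle)+)?" +
        @"(?=(?<cond3>.*?end)+)?";

    // Group names assumed to match the names used in Pattern.
    private static readonly string[] GroupNames = { "cond1", "cond2", "cond3" };

    // Hypothetical helper: runs the regex once and counts the conditions
    // whose named group captured at least once.
    public static int CountConditionsHit(string input)
    {
        Match match = Regex.Match(input, Pattern);
        return GroupNames.Count(name => match.Groups[name].Success);
    }

    static void Main()
    {
        Console.WriteLine(CountConditionsHit("hello begin.hello middle."));
    }
}
```

For the string from the question, this should report that two of the three conditions were hit, regardless of the order in which "begin" and "middle" appear.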
For more on lookaheads, see MSDN's "Zero-Width Positive Lookahead Assertions". The topic is a little too big to address in this answer.
I can't say I'd recommend this approach over others. It can be difficult to maintain if you're not familiar with regexes, and it's not an efficient pattern, but it fits the stated requirements.
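For comparison, here is a rough sketch of the approach the lookahead pattern is equivalent to: running each condition as its own expression and counting the ones that match. It does not meet the "run the entire regex at once" requirement, but it may be easier to maintain. The names SeparateConditions and CountConditionsHit are invented for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SeparateConditions
{
    // Hypothetical helper: applies each condition as a separate,
    // case-insensitive regex and counts how many of them match.
    public static int CountConditionsHit(string input, params string[] conditions)
    {
        return conditions.Count(
            condition => Regex.IsMatch(input, condition, RegexOptions.IgnoreCase));
    }

    static void Main()
    {
        int hits = CountConditionsHit("hello begin.hello middle.", "begin", "middle", "end");
        Console.WriteLine(hits); // prints 2
    }
}
```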