[regex-coach] ".+" and ".+?" with optional parenthesized text
John Clements
johnjc-regex at publicinfo.net
Sun Aug 22 16:37:36 UTC 2004
That is absolutely brilliant, Edi! Thank you so much!
At 14:55 22/08/04, you wrote:
>On Sun, 22 Aug 2004 14:04:47 +0100, John Clements
><johnjc-regex at publicinfo.net> wrote:
>
> > I ran the pattern with "i" checked.
>
>I guess you also had "s" checked because your target string contained
>line breaks.
If it had line breaks they were introduced by the mailer(s) because Regex
Coach didn't show any.
> > What I want it to do is match the string from the beginning through
> > "between", and when there is no instance of "between", I want it to
> > match the entire string.
>
>This regex should work:
>
>^\s*An appeal.+?(Joined )?Cases? ?t ?[-] ?\d{1,3}\/ ?\d{2}(.+?between|.*)
I was just looking over some tutorial material which was talking about what
you enclose in parentheses and what not, and it hadn't dawned on me that it
was relevant to my problem!
Yes, putting the ".+?" inside the parentheses does the trick. And the "|.*"
makes perfect sense. It says directly: "or the rest of the string".
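For anyone reading this in the archive, here is a minimal sketch of that alternation in Python's re module rather than in Regex Coach. The prefix is a simplified stand-in for the full pattern, and both sample strings are made up for illustration; they are not my real data.

```python
import re

# Simplified, hypothetical stand-in for the full pattern: a fixed prefix,
# then EITHER "as little as possible up to 'between'" OR "the rest of the string".
pattern = re.compile(r"^Case T-\d{1,3}/\d{2}(.+?between|.*)", re.IGNORECASE)

# Invented sample strings, one with "between" and one without.
with_between = "Case T-12/04 on appeal between A and B"
without_between = "Case T-12/04 on appeal by A"

m1 = pattern.match(with_between)
m2 = pattern.match(without_between)

print(m1.group(0))  # Case T-12/04 on appeal between
print(m2.group(0))  # Case T-12/04 on appeal by A
```

The first alternative is tried first, so the ".*" branch is only used when no "between" can be found.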
I had settled for a solution that used the "greedy" version of ".+" before
"between", which in the presence of a second instance of the word "between"
would have brought in unwanted text. Now it's just right. I really
appreciate this!
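The difference between the greedy and the non-greedy version can be sketched like this, again in Python and again with an invented string that contains "between" twice:

```python
import re

# Invented sample with two occurrences of "between".
text = "Case T-12/04 between A and B, heard between May and June"

# Greedy: ".+" grabs as much as it can, so the match runs to the LAST "between".
greedy = re.match(r".+between", text)

# Non-greedy: ".+?" grabs as little as it can, so the match stops at the FIRST one.
lazy = re.match(r".+?between", text)

print(greedy.group(0))  # Case T-12/04 between A and B, heard between
print(lazy.group(0))    # Case T-12/04 between
```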
>The behaviour you saw was right. (As a rule of thumb Regex Coach is
>always right as long as it does the same as Perl... :)
Yeah, that's what I thought, too. :)
>You had ".+?(between)?" which meant "match as few characters as
>possible up to ..." where ... was "the string 'between' OR ANYTHING"
>because you made 'between' optional, i.e. your regex was equivalent to
>".+?". So, the regex engine matched the bare minimum: a single character.
>
>Does that help?
Indeed, indeed! Thanks for that explanation, too. I need to see the logic
of something to really absorb it. I had accepted what I saw as a
limitation of the regex engine, but without understanding its logic I hadn't
worked out how to get the refinement I needed.
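To make that logic concrete, here is a small sketch in Python (whose re module should behave like Perl on this point) with a made-up string. Because the group is optional, the lazy ".+?" has no reason to keep going and stops at the minimum it is allowed, which for "+" is one character.

```python
import re

# Invented sample string.
text = "appeal between A and B"

# Optional group: ".+?" takes one character, then "(between)?" happily matches nothing.
optional = re.match(r".+?(between)?", text)
print(repr(optional.group(0)))  # 'a'
print(optional.group(1))        # None

# Alternation inside the group: "between" is now required by the first branch,
# so ".+?" has to expand until it reaches it.
fixed = re.match(r"(.+?between|.*)", text)
print(repr(fixed.group(0)))     # 'appeal between'
```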
All the best, John
John Clements
john.clements at publicinfo.net
+44(0)20 8959-6432
http://www.publicinfo.net
PublicInfo.Net Ltd.
29 Gibbs Green
Edgware, Middlesex
United Kingdom HA8 9RS