[regex-coach] problem with dot '.' inside brackets

era+regex=coach at iki.fi
Thu Apr 27 12:56:49 UTC 2006


On Wed, 26 Apr 2006 14:41:09 -0700, sites at brynmosher.com said:
> I'm actually trying to match html tags with small contents inside like;
> <tag> V </tag>
> with the expression similar to "/(>.{1,4}<[\S\s]*){4}/" and noticed the 
> behaviour when I changed the ".{1,4}" to "[.^<]{1,4}".

This also explains your comment about ^ not working. It only negates a
character class when it comes first, as in [^.<]; anywhere else inside
the brackets it is just a literal ^.
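
A minimal sketch of the difference, using Python's re module purely for
illustration (the sample string is made up; the bracket rules shown here
are the same in Perl-compatible syntax):

```python
import re

sample = "a.b^c<d"  # made-up test string

# Inside brackets, '.' and a non-leading '^' are ordinary literal characters,
# so this class matches only a dot, a caret, or a '<'.
literal_class = re.findall(r"[.^<]", sample)

# A leading '^' negates the class: any character except '.' and '<'.
negated_class = re.findall(r"[^.<]", sample)

print(literal_class)
print(negated_class)
```

Note that the negated class still matches the ^ in the sample, since ^ is
not one of the excluded characters.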

But actually, I'm guessing your real problem is with the greediness of
the * operator, which matches as much as it possibly can, and that this
is why you have artificially constrained the match with {1,4} so that it
only covers a few characters.
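
To show what I mean by greediness, here is a rough sketch in Python (the
HTML snippet is invented for the example, and this is only my guess at
the kind of input involved):

```python
import re

html = "<b> V </b><i> W </i>"  # made-up input with two small tags

# Greedy: '.*' runs from the first '>' all the way to the last '<'.
greedy = re.search(r">.*<", html).group(0)

# Bounding the repeat with {1,4} keeps each match short, but only
# because the contents happen to be at most four characters long.
bounded = re.findall(r">.{1,4}<", html)

print(greedy)
print(bounded)
```

The greedy version swallows both tags in one match, while the bounded
one finds each short content separately.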

What you're actually looking for, then, is "a > followed by anything
except <", i.e. >[^<]*, yielding /(>[^<]*<[^>]*){4}/. In Perl-compatible
regular expressions you could also use the non-greedy *?, but that is a
bit hard to apply here without knowing more about what you are actually
trying to match. (I still don't understand the significance of the final
{4}, for example. Or maybe you meant to write <.{1,4} but applied the
repeat to the wrong scope?)
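
As a sketch of the two alternatives, again in Python and again on an
invented snippet (so treat it as an illustration, not as a fix for your
actual data):

```python
import re

html = "<b> V </b><i> W </i>"  # made-up input with two small tags

# Negated class: after '>', take anything that is not '<', then the '<'.
negated = re.findall(r">[^<]*<", html)

# Non-greedy repeat: '.*?' stops at the first '<' it can reach.
lazy = re.findall(r">.*?<", html)

print(negated)
print(lazy)
```

Both forms give the same matches on this input, including the empty gap
between the closing </b> and the opening <i>, and neither needs an
arbitrary length limit.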

Hope this helps,

/* era */

-- 
If this were a real .signature, it would suck less.  Well, maybe not.



