[cl-ppcre-devel] Matching on very long strings.

Matthew D. Swank akopa.gmane.poster at gmail.com
Mon Sep 29 16:04:09 UTC 2008


On Mon, 29 Sep 2008 13:16:29 +0200
Sébastien Saint-Sevin <seb-cl-mailist at matchix.com> wrote:

> Hi Matthew,
> 
> You are probably not doing the same thing with the "line oriented
> approach" and the "full file in one string" approach.
> 
> With the full file in one string, if you don't take care to stop the
> scan at the end of each line (if you want line-by-line scanning, as
> you suggest by trying that approach as well), I guess you are scanning
> until the end of the full string for each line (which for sure is very
> expensive).
> 
> But that's just a guess as I've only had a very quick look at your
> code :-)
> 
> Cheers,
> Sebastien.

Well, the lexer code is line agnostic; i.e. you could replace 'end
of each line' with any other stopping point.  What it does is advance
the start index as it matches tokens.

One thing I did notice is that I read the file into an adjustable
vector, and that is the string I pass to the scanners.  I suppose ppcre
has to coerce that every time a scanner runs?
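
As a rough sketch of that idea (not the actual lexer code): coerce the
adjustable buffer to a simple string once, then advance the start index
after each match.  TOKENIZE and its argument names are hypothetical, and
the sketch assumes CL-PPCRE is already loaded.  Whether ppcre really
coerces on every call is still only my guess.

```lisp
;; Hypothetical helper, not taken from the lexer discussed in this thread.
;; Assumes the CL-PPCRE system is already loaded.
(defun tokenize (text scanner)
  "Collect successive matches of SCANNER in TEXT as a list of strings."
  (let ((simple (coerce text 'simple-string)) ; coerce once, up front
        (start 0)
        (tokens '()))
    (loop
      (multiple-value-bind (match-start match-end)
          (cl-ppcre:scan scanner simple :start start)
        ;; Stop when nothing matches, or when an empty match at START
        ;; would keep the index from advancing.
        (when (or (null match-start) (= match-end start))
          (return (nreverse tokens)))
        (push (subseq simple match-start match-end) tokens)
        ;; Advance the start index past the token just matched.
        (setf start match-end)))))

;; Usage: fill an adjustable, fill-pointered buffer, then tokenize it.
(let ((buffer (make-array 0 :element-type 'character
                            :adjustable t :fill-pointer 0)))
  (loop for ch across "foo bar baz"
        do (vector-push-extend ch buffer))
  (tokenize buffer (cl-ppcre:create-scanner "\\w+")))
;; => ("foo" "bar" "baz")
```

If the coercion guess is right, doing it once here keeps each call to
SCAN from copying the whole buffer again.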

Matt
-- 
"You do not really understand something unless you can explain it to
your grandmother." -- Albert Einstein.


