[cl-ppcre-devel] Buffered multi-line question

Sébastien Saint-Sevin seb-cl-mailist at matchix.com
Thu Oct 14 16:18:46 UTC 2004


> > How can I then abort the scan quickly, while avoiding funcalling the
> > filter with the rest of the string ? Something like (setf
> > *start-pos* end-of-string-value) ?
>
> No, never change these internal values unless you're looking for
> trouble - see docs. Just return NIL from the filter. (I suppose you're
> talking about the 0.9.0 filters here.)
>
> Something like
>
>   (defvar *max-start-pos* 0)
>
>   (defun my-filter (pos)
>     (and (< pos *max-start-pos*) pos))
>
>   (scan '(:sequence ... (:filter my-filter 0) ...) target)
>
> should assure that there's only a match if the position between the
> first ... and the second ... is below *MAX-START-POS*.
>
> The zero is optional but it'll potentially help the regex engine to
> optimize the scanner depending on the rest of the parse tree.
>

Unfortunately, most of the regexes I'm using are not optimizable.

Going back to my buffer. Let's say I'm looking at ten lines at a time. I
want matches to start only on the first line, and I can do that with filters
(that's great!). But the engine will still keep moving forward through the
string for the nine remaining lines, and it will call my filter at every
position in every line, only to get NIL each time.

So the question is how to force a full abort immediately, instead of calling
the filter so many times.
In fact this applies to any filter that, once it has returned NIL, will
return NIL forever (and that sits at a position in the parse tree where it
can't be shadowed by backtracking!).
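
To make the idea concrete, here is a rough sketch of one conceivable way to
get such a full abort: the filter leaves the scan through a non-local exit
(CATCH/THROW) instead of returning NIL. This is only an assumption on my
side, not a documented CL-PPCRE feature. The names ABORTING-FILTER,
SCAN-FIRST-LINE and the catch tag ABORT-SCAN are made up for illustration,
and the literal "foo" stands in for the real regex. It also assumes that the
scanner tries start positions in increasing order and that the filter sits
where backtracking cannot shadow it.

```lisp
;; Assumes CL-PPCRE is already loaded in the image.

(defvar *max-start-pos* 0
  "Exclusive upper bound for positions at which a match may start.")

(defun aborting-filter (pos)
  ;; Hypothetical filter: succeed without consuming anything while POS
  ;; is still allowed, otherwise unwind out of the whole scan at once.
  (if (< pos *max-start-pos*)
      pos
      (throw 'abort-scan nil)))

(defun scan-first-line (target)
  ;; Only allow matches that start before the first newline of TARGET
  ;; (or anywhere, if TARGET contains no newline).
  (let ((*max-start-pos* (or (position #\Newline target)
                             (length target))))
    ;; CATCH passes on the values of SCAN on a normal return and
    ;; yields NIL as soon as the filter throws.
    (catch 'abort-scan
      (cl-ppcre:scan '(:sequence (:filter aborting-filter 0) "foo")
                     target))))
```

With this, SCAN-FIRST-LINE would return the usual values of SCAN when a match
starts in the first line, and NIL otherwise, without the filter being called
again after the first position it rejects.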

I know it's an optimization problem, but I'm running regexes on big files...

Cheers,
Sebastien.




