[cl-ppcre-devel] Buffered multi-line question

Sébastien Saint-Sevin seb-cl-mailist at matchix.com
Thu Oct 14 21:34:30 UTC 2004


> > ==> Here is the trouble: how to make the match abort when position
> > 17 is reached? From there on, the filter will always return nil, so
> > the last 30 calls are wasted time.
>
> Well, this is Common Lisp...
>
>   CL-USER> (defvar *my-string* "line1 word1 word2
>   line2 word1 word2
>   line3 word1 word2")
>   *MY-STRING*
>   CL-USER> (defvar *my-scanner*
>                    '(:sequence
>                       (:filter my-filter 0)
>                       :word-boundary
>                       (:greedy-repetition 1 nil :word-char-class)
>                       :word-boundary))
>   *MY-SCANNER*
>   CL-USER> (let ((end-of-first-line 17))
>                    (defun my-filter (pos)
>                       (format t "Called at: ~A~%" pos)
>                       (cond ((< pos end-of-first-line)
>                               pos)
>                             (t
>                               (throw 'stop-it nil)))))
>   ; Converted MY-FILTER.
>   MY-FILTER
>   CL-USER> (catch 'stop-it
>              (scan *my-scanner* *my-string*))
>   Called at: 0
>   0
>   5
>   #()
>   #()
>   CL-USER> (setf *my-scanner*
>                   '(:sequence
>                      (:filter my-filter 0)
>                      :word-boundary
>                      "line2"
>                      (:greedy-repetition 1 nil :word-char-class)
>                      :word-boundary))
>   (:SEQUENCE (:FILTER MY-FILTER 0) :WORD-BOUNDARY "line2"
>    (:GREEDY-REPETITION 1 NIL :WORD-CHAR-CLASS) :WORD-BOUNDARY)
>   CL-USER> (catch 'stop-it
>              (scan *my-scanner* *my-string*))
>   Called at: 0
>   Called at: 1
>   Called at: 2
>   Called at: 3
>   Called at: 4
>   Called at: 5
>   Called at: 6
>   Called at: 7
>   Called at: 8
>   Called at: 9
>   Called at: 10
>   Called at: 11
>   Called at: 12
>   Called at: 13
>   Called at: 14
>   Called at: 15
>   Called at: 16
>   Called at: 17
>   NIL
>


Throw and catch, of course.
I'm just not very familiar with this kind of non-local jump.
I should be!
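
For reference, a minimal sketch of the same throw/catch idea in plain
Common Lisp, without CL-PPCRE. TRY-EACH-POSITION is only a hypothetical
stand-in for the scanner's advance loop, and MAKE-LIMITED-FILTER,
SCAN-WITH-LIMIT and *FILTER-CALLS* are names made up for this example.
The point is that the THROW inside the filter unwinds through the
calling loop up to the matching CATCH, so no further positions are tried.

```lisp
(defvar *filter-calls* 0
  "Number of times the filter has been called (for illustration only).")

(defun make-limited-filter (limit)
  "Return a filter that accepts positions below LIMIT and aborts otherwise."
  (lambda (pos)
    (incf *filter-calls*)
    (if (< pos limit)
        pos
        ;; Non-local exit: jump straight to the enclosing CATCH,
        ;; which then returns NIL.
        (throw 'stop-it nil))))

(defun try-each-position (filter string)
  "Hypothetical stand-in for a scanner's advance loop.
Call FILTER at every start position and collect the accepted ones."
  (loop for pos from 0 below (length string)
        when (funcall filter pos)
          collect pos))

(defun scan-with-limit (string limit)
  "Try positions in STRING, giving up entirely once LIMIT is reached."
  (catch 'stop-it
    (try-each-position (make-limited-filter limit) string)))
```

With a 50-character string and a limit of 17, the filter is called for
positions 0 through 17 only, and SCAN-WITH-LIMIT returns NIL. If the
string is shorter than the limit, the THROW never happens and the list
of accepted positions is returned normally.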


> > I think the loop I'm speaking about is created by "insert-advance-fn"
>
> Yes. It's the normal loop that advances through the regular
> expression.
>
> > Last point, I can't access the position where the match actually
> > started (the first of the four values returned by scan), so I have
> > no way to extract the current global match without using a register.
>
> Sure you can:
>
>   CL-USER> (let (match-start)
>              (defun set-match-start (pos)
>                (setq match-start pos))
>              (defun show-match-start (pos)
>                (format t "Match start is ~A, pos is ~A~%"
>                        match-start pos)
>                pos))
>   ; Converted SET-MATCH-START.
>   ; Converted SHOW-MATCH-START.
>   SHOW-MATCH-START
>   CL-USER> (setf *my-scanner* '(:sequence (:filter set-match-start 0)
>                                           "abc"
>                                           (:filter show-match-start 0)
>                                           (:alternation #\x #\y)))
>   (:SEQUENCE (:FILTER SET-MATCH-START 0) "abc"
>    (:FILTER SHOW-MATCH-START 0) (:ALTERNATION #\x #\y))
>   CL-USER> (scan *my-scanner* "abczabcabcx")
>   Match start is 0, pos is 3
>   Match start is 4, pos is 7
>   Match start is 7, pos is 10
>   7
>   11
>   #()
>   #()
>
> Just make sure SET-MATCH-START is at the very beginning of your
> regular expression and not within a group or alternation or somesuch.
>

It just adds a little work when crafting the parse tree, but that's OK.
It seems that filters are really powerful!
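
A rough sketch of how that extra work could be kept small: a helper
that puts the position-recording filter in front of any parse tree.
WITH-MATCH-START, CURRENT-MATCH and *MATCH-START* are hypothetical
names, not part of CL-PPCRE, and a special variable is used here in
place of the closure from the example above. The sketch also assumes
that the scanner accepts a :SEQUENCE nested inside another :SEQUENCE.

```lisp
(defvar *match-start* nil
  "Position at which the current match attempt began.")

(defun set-match-start (pos)
  "Zero-length filter: record POS and return it so the match continues."
  (setf *match-start* pos)
  pos)

(defun with-match-start (parse-tree)
  "Return a new parse tree that calls SET-MATCH-START before PARSE-TREE.
Keeping the filter first and outside any group follows the advice above."
  `(:sequence (:filter set-match-start 0) ,parse-tree))

(defun current-match (string pos)
  "Return the part of STRING from the recorded match start up to POS."
  (subseq string *match-start* pos))
```

For example, (with-match-start '(:sequence "abc" (:alternation #\x #\y)))
builds a tree whose first element is the filter, and a later filter
called at position POS could use CURRENT-MATCH on the target string to
get the text matched so far.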

I've got everything I need for now. I'll try all of that and give you some
feedback in a few days, once it's done.

Finally, I just want to thank you very much, Edi, for all your help & work.

Cheers,
Sebastien.




