[cl-ppcre-devel] Buffered multi-line question

Edi Weitz edi at agharta.de
Thu Oct 14 21:11:14 UTC 2004


On Thu, 14 Oct 2004 20:38:17 +0200, Sébastien Saint-Sevin <seb-cl-mailist at matchix.com> wrote:

> ==> Here is the trouble: how to make the match abort when position
> 17 is reached? From there on, the filter will always return nil, so
> the last 30 calls are wasted time.

Well, this is Common Lisp, so the filter can simply THROW out of the
scan once it has passed the end of the first line:

  CL-USER> (defvar *my-string* "line1 word1 word2
  line2 word1 word2
  line3 word1 word2")
  *MY-STRING*
  CL-USER> (defvar *my-scanner*
                   '(:sequence
                      (:filter my-filter 0)
                      :word-boundary
                      (:greedy-repetition 1 nil :word-char-class)
                      :word-boundary))
  *MY-SCANNER*
  CL-USER> (let ((end-of-first-line 17))
                   (defun my-filter (pos)
                      (format t "Called at: ~A~%" pos)
                      (cond ((< pos end-of-first-line)
                              pos)
                            (t
                              (throw 'stop-it nil)))))
  ; Converted MY-FILTER.
  MY-FILTER
  CL-USER> (catch 'stop-it
             (scan *my-scanner* *my-string*))
  Called at: 0
  0
  5
  #()
  #()
  CL-USER> (setf *my-scanner*
                  '(:sequence
                     (:filter my-filter 0)
                     :word-boundary
                     "line2"
                     (:greedy-repetition 1 nil :word-char-class)
                     :word-boundary))
  (:SEQUENCE (:FILTER MY-FILTER 0) :WORD-BOUNDARY "line2"
   (:GREEDY-REPETITION 1 NIL :WORD-CHAR-CLASS) :WORD-BOUNDARY)
  CL-USER> (catch 'stop-it
             (scan *my-scanner* *my-string*))
  Called at: 0
  Called at: 1
  Called at: 2
  Called at: 3
  Called at: 4
  Called at: 5
  Called at: 6
  Called at: 7
  Called at: 8
  Called at: 9
  Called at: 10
  Called at: 11
  Called at: 12
  Called at: 13
  Called at: 14
  Called at: 15
  Called at: 16
  Called at: 17
  NIL
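
The non-local exit used above can also be seen in isolation, without
CL-PPCRE. The following is only a minimal sketch: PROBE,
SEARCH-WITH-CUTOFF and *CALLS* are hypothetical names made up for
illustration, and the DOTIMES loop merely stands in for the scanner's
loop over start positions.

```lisp
(defvar *calls* 0
  "Hypothetical counter: how often PROBE has been called.")

(defun probe (pos limit)
  "Return POS while it is below LIMIT, otherwise abandon the whole search."
  (incf *calls*)
  (if (< pos limit)
      pos
      ;; THROW unwinds to the dynamically enclosing CATCH with the same
      ;; tag, no matter how many function calls lie in between.
      (throw 'stop-it nil)))

(defun search-with-cutoff (length limit)
  "Call PROBE at positions 0 .. LENGTH-1, stopping early once LIMIT is reached."
  (setf *calls* 0)
  (catch 'stop-it
    ;; Stand-in for the scanner's loop; returns :EXHAUSTED if it was
    ;; never interrupted by a THROW.
    (dotimes (pos length :exhausted)
      (probe pos limit))))
```

With a length of 50 and a limit of 17, PROBE runs 18 times (positions
0 through 17) and the CATCH form returns NIL, just like the second
SCAN call above. If the limit is never reached, the loop runs to
completion.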

> I think the loop I'm speaking about is created by "insert-advance-fn"

Yes. It's the normal loop that advances the start position through the
target string and retries the regular expression at each position.

> Last point, I can't access the position where the match actually
> started (the first of the four values returned by scan), so I have
> no way to extract the current global match without using a register.

Sure you can:

  CL-USER> (let (match-start)
             (defun set-match-start (pos)
               (setq match-start pos))
             (defun show-match-start (pos)
               (format t "Match start is ~A, pos is ~A~%"
                       match-start pos)
               pos))
  ; Converted SET-MATCH-START.
  ; Converted SHOW-MATCH-START.
  SHOW-MATCH-START
  CL-USER> (setf *my-scanner* '(:sequence (:filter set-match-start 0)
                                          "abc"
                                          (:filter show-match-start 0)
                                          (:alternation #\x #\y)))
  (:SEQUENCE (:FILTER SET-MATCH-START 0) "abc" (:FILTER SHOW-MATCH-START 0)
   (:ALTERNATION #\x #\y))
  CL-USER> (scan *my-scanner* "abczabcabcx")
  Match start is 0, pos is 3
  Match start is 4, pos is 7
  Match start is 7, pos is 10
  7
  11
  #()
  #()

Just make sure SET-MATCH-START is at the very beginning of your
regular expression and not within a group or alternation or somesuch.
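
Building on that, here is a hedged sketch of how one might pull out
the text matched so far from inside a filter. It keeps the state in
special variables instead of a closure so that it can be rebound per
call. *TARGET*, *MATCH-START*, *PARTIAL-MATCHES*,
COLLECT-PARTIAL-MATCH and SCAN-COLLECTING are hypothetical names, and
the sketch assumes CL-PPCRE is installed where ASDF can find it.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)   ; assumption: CL-PPCRE is installed

(defvar *target* nil "Hypothetical: the string currently being scanned.")
(defvar *match-start* nil "Hypothetical: start of the current match attempt.")
(defvar *partial-matches* '() "Hypothetical: recorded matches, newest first.")

(defun set-match-start (pos)
  ;; Zero-length filter: remember where this attempt began and return
  ;; POS unchanged so that matching continues from the same place.
  (setf *match-start* pos)
  pos)

(defun collect-partial-match (pos)
  ;; Record start, current position and the text matched so far.
  (push (list *match-start* pos (subseq *target* *match-start* pos))
        *partial-matches*)
  pos)

(defun scan-collecting (target)
  "Scan TARGET; return match start, match end and the recorded partial matches."
  (let ((*target* target)
        (*match-start* nil)
        (*partial-matches* '()))
    (multiple-value-bind (start end)
        (cl-ppcre:scan '(:sequence (:filter set-match-start 0)
                                   "abc"
                                   (:filter collect-partial-match 0)
                                   (:alternation #\x #\y))
                       target)
      (values start end (reverse *partial-matches*)))))
```

Calling (scan-collecting "abczabcabcx") should return 7 and 11 as
above, plus one recorded entry for each of the three places where
"abc" was found.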

Cheers,
Edi.




