[climacs-devel] Re: Regexs in climacs

Lawrence Mitchell wencel at gmail.com
Sun Jan 30 17:59:42 UTC 2005


Lawrence Mitchell wrote:

> after my rather abortive attempts to try and define a
> delete-other-windows the other day, I thought I'd look at maybe
> supporting some kind of regex search in climacs.

Here's a rather trivial buffer-string based idea to demonstrate
that things [cw]ould work :).

(define-named-command com-re-search-backward ()
  (let ((regex (accept 'string
                       :prompt "RE search backward")))
    (re-search-backward regex (buffer (current-window))
                        (point (current-window)))))

(defun re-search-forward (regex buffer mark)
  ;; Search from MARK to the end of BUFFER; on success, move MARK
  ;; to the end of the match.
  (let ((string (coerce (buffer-sequence buffer (offset mark) (size buffer))
                        'string)))
    (multiple-value-bind (m-start m-end r-starts r-ends)
        (cl-ppcre:scan regex string)
      (declare (ignore r-starts r-ends))
      (when m-start
        (incf (offset mark) m-end)))))

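(defun re-search-backward (regex buffer mark)
  ;; Search the text before MARK; on success, move MARK to the start
  ;; of the match.  Note that SCAN reports the leftmost match, which
  ;; is not necessarily the one nearest to MARK.
  (let ((string (coerce (buffer-sequence buffer 0 (offset mark))
                        'string)))
    (multiple-value-bind (m-start m-end r-starts r-ends)
        (cl-ppcre:scan regex string)
      (declare (ignore m-end r-starts r-ends))
      (when m-start
        (setf (offset mark) m-start)))))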

(define-named-command com-re-search-forward ()
  (let ((regex (accept 'string
                       :prompt "RE Search")))
    (re-search-forward regex (buffer (current-window))
                       (point (current-window)))))
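
Since SCAN returns the leftmost match, a backward search that wants the
match nearest to point has to keep looking after the first hit.  A minimal
sketch of one way to do that on a plain string follows; LAST-MATCH-START is
a hypothetical helper name, not part of climacs or cl-ppcre, and the code
assumes cl-ppcre is already loaded.

```lisp
;;; Hypothetical helper, not part of climacs or cl-ppcre.
;;; Assumes the CL-PPCRE system has already been loaded.

(defun last-match-start (regex string)
  "Return the start index of the last match of REGEX in STRING, or NIL."
  (let ((last nil))
    ;; DO-MATCHES walks over successive non-overlapping matches from
    ;; left to right, so the final assignment is the rightmost match.
    (cl-ppcre:do-matches (m-start m-end regex string last)
      (setf last m-start))))
```

RE-SEARCH-BACKWARD could then set the mark's offset to this value instead
of the M-START reported by SCAN.  This still builds the whole prefix
string, so it only addresses correctness, not speed.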

[...]

> Alternately, find a nice way of hooking the cl-ppcre way of doing
> things into climacs' mode of operation.

Well, I've emailed Edi about this, and I enclose what he had to
say about the matter, both regarding Robert's worries of
incompatible licenses, and implementation-wise.

Basically, it would appear that we'd need to change cl-ppcre
slightly for our needs (turning string accesses via SCHAR into
buffer accesses); other than that, it's probably not a big problem.
Obviously, Edi is worried about something like this slowing
cl-ppcre down.  From our point of view, however, I think it would
still be fast enough, as long as we can provide decently fast
random access into a buffer.  I think this would be done using
BUFFER-OBJECT.  Ideally, of course, we'd want to merge this back
into cl-ppcre, so that we can take advantage of any changes that
Edi makes; his *FEATURES* conditionalising sounds like a
reasonable way of doing that.
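
To make the idea concrete, here is a rough sketch of what choosing the
accessor at compile time might look like.  It is not taken from cl-ppcre:
the TARGET-REF macro and the :REGEX-ON-BUFFERS feature are names invented
for this illustration, and ELT stands in for a real buffer accessor such
as BUFFER-OBJECT so that the example runs without climacs.

```lisp
;;; Hypothetical sketch: neither TARGET-REF nor the :REGEX-ON-BUFFERS
;;; feature exists in cl-ppcre or climacs; both names are made up here.

(defmacro target-ref (target index)
  "Fetch element INDEX of TARGET using the accessor picked at read time."
  ;; Generic path: ELT is a stand-in for a buffer accessor.
  #+regex-on-buffers `(elt ,target ,index)
  ;; Default path: fast access, valid for simple strings only.
  #-regex-on-buffers `(schar ,target ,index))

(defun count-char (char target length)
  "Count occurrences of CHAR among the first LENGTH elements of TARGET."
  (loop for i from 0 below length
        count (eql char (target-ref target i))))
```

Pushing :REGEX-ON-BUFFERS onto *FEATURES* before compiling would select
the generic accessor; without it, the string-only path is compiled in, so
the default build pays nothing for the extra flexibility.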

Any comments?

If this seems like a good idea, I'll look at putting together
something next weekend.

Lawrence

Edi's response to my mail follows:

| Hi!

| Lawrence Mitchell <wencel at gmail.com> wrote:

|> I'm looking at trying to use cl-ppcre to add regular expression
|> support to the Climacs editor (<URL:
|> http://common-lisp.net/project/climacs/>).

| Sounds cool... :)

|> A few things spring to mind:
|>
|> o Licensing differences.  Climacs is released under the LGPL, while
|> cl-ppcre is under a BSD-style license.  I don't think this is a
|> problem (as far as I can tell from reading the licenses), but if you
|> know otherwise, I'd be grateful to hear.

| I don't see a problem but IANAL.  It is my understanding that the BSD
| license basically means that you can do with CL-PPCRE whatever you
| want as long as you credit my original work - this is what I intended.
| So you could, e.g., incorporate it into an LGPL project without a
| problem.  Of course, the original CL-PPCRE will still be available
| under the old license.

|> o How to best match up cl-ppcre's matching on strings with climacs'
|> idea of a buffer.
|>
|> A climacs buffer is a sequence of objects (which may or may not be
|> characters, but we'll ignore that for the moment).  Now, I can
|> easily generate a string of the contents of the buffer, and call
|> SCAN (or whatever) on the string.  However, this is going to be slow
|> for large buffers (especially if we find something just after point,
|> we've still constructed the whole buffer-string).
|>
|> The "obvious" solution to this is to use streams instead (probably),
|> so, I wonder if cl-ppcre would be amenable to something like this?

| Well, supporting all of Perl's regex facilities implies that you need
| to have random access to the target - I don't think you can fit
| streams into this picture.  I'm not a CS guy but my understanding is
| that CL-PPCRE is based on an NFA and you can't change that easily.
| You can build a DFA that implements a subset of CL-PPCRE and that
| would work with streams but that wouldn't be CL-PPCRE anymore... :)

| Now, using another kind of structure (like, say, your buffers) that
| isn't a string but is random-access - that wouldn't be /too/ hard.
| It would involve going through three or four files and changing SCHAR to
| something else but basically I don't really see a problem.  However,
| as CL-PPCRE has a reputation for being quite fast I wouldn't want to
| sacrifice this for greater flexibility (buffers instead of strings,
| arbitrary objects instead of characters - you name it).  I think the
| right way to do it would be to offer the ability to build different
| versions of CL-PPCRE based on *FEATURES*, i.e. at compile time you
| decide whether you want a fast regex engine for strings or if you want
| a not-so-fast regex engine for, say, buffers.  Would that be OK for
| you?

[...]

| Cheers,
| Edi.



