[cl-ppcre-devel] Using cl-ppcre in a text editor

Edi Weitz edi at agharta.de
Sat Jan 29 19:47:56 UTC 2005


On Sat, 29 Jan 2005 16:16:50 +0000, Lawrence Mitchell <wencel at gmail.com> wrote:

> I'm looking at trying to use cl-ppcre to add regular expression
> support to the Climacs editor (<URL:
> http://common-lisp.net/project/climacs/>).

Sounds cool... :)

> A few things spring to mind:
> o Licensing differences.  Climacs is released under the LGPL, while
> cl-ppcre is under a BSD-style license.  I don't think this is a
> problem (as far as I can tell from reading the licenses), but if you
> know otherwise, I'd be grateful to hear.

I don't see a problem but IANAL.  It is my understanding that the BSD
license basically means that you can do with CL-PPCRE whatever you
want as long as you credit my original work - this is what I intended.
So you could, e.g., incorporate it into an LGPL project without a
problem.  Of course, the original CL-PPCRE will still be available
under the old license.

> o How to best match up cl-ppcre's matching on strings with climacs'
> idea of a buffer.
> A climacs buffer is a sequence of objects (which may or may not be
> characters, but we'll ignore that for the moment).  Now, I can
> easily generate a string of the contents of the buffer, and call
> SCAN (or whatever) on the string.  However, this is going to be slow
> for large buffers (especially if we find something just after point,
> we've still constructed the whole buffer-string).
> The "obvious" solution to this is to use streams instead (probably),
> so, I wonder if cl-ppcre would be amenable to something like this?

Well, supporting all of Perl's regex facilities implies that you need
to have random access to the target - I don't think you can fit
streams into this picture.  I'm not a CS guy but my understanding is
that CL-PPCRE is based on an NFA and you can't change that easily.
You can build a DFA that implements a subset of CL-PPCRE and that
would work with streams but that wouldn't be CL-PPCRE anymore... :)
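
To illustrate the random-access point, here is a small sketch using a
back-reference, one of the Perl features that forces the matcher to go
back over parts of the target it has already looked at.  The helper
name BACKREF-MATCH-BOUNDS is made up for this example, and loading
CL-PPCRE through ASDF is an assumption about your setup.

```lisp
;; Assumption: CL-PPCRE can be found by ASDF on this system.
(require :asdf)
(asdf:load-system :cl-ppcre)

(defun backref-match-bounds (target)
  "Return (START END) of the first match of (a+)b\\1 in TARGET, or NIL.
The first attempt at position 0 fails after reading past the b, so the
matcher has to restart from an earlier position in TARGET."
  (multiple-value-bind (start end)
      (cl-ppcre:scan "(a+)b\\1" target)
    (and start (list start end))))

;; (backref-match-bounds "aaabaa") => (1 6)
```

With a pure forward-only stream the engine could not return to
position 1 after having consumed the whole target once.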

Now, using other kinds of structures (like, say, your buffers) that
aren't strings but are random-access - that wouldn't be /too/ hard.
It would involve going through three or four files and changing SCHAR
to something else but basically I don't really see a problem.  However,
as CL-PPCRE has a reputation for being quite fast I wouldn't want to
sacrifice this for greater flexibility (buffers instead of strings,
arbitrary objects instead of characters - you name it).  I think the
right way to do it would be to offer the ability to build different
versions of CL-PPCRE based on *FEATURES*, i.e. at compile time you
decide whether you want a fast regex engine for strings or if you want
a not-so-fast regex engine for, say, buffers.  Would that be OK for
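
A minimal sketch of what such a compile-time switch could look like.
The feature keyword :REGEX-GENERIC-TARGET, the generic function
TARGET-ELT and the macro TARGET-CHAR are all invented for illustration;
none of them exist in CL-PPCRE.

```lisp
;; Hypothetical: push :regex-generic-target onto *FEATURES* before
;; compiling to get the flexible (slower) build.
#+regex-generic-target
(progn
  (defgeneric target-elt (target index)
    (:documentation "Random-access read of element INDEX of TARGET."))
  (defmethod target-elt ((target vector) index)
    (aref target index)))

(defmacro target-char (target index)
  "Expand to SCHAR in the default string-only build, or to a generic
function call when :regex-generic-target is present at compile time."
  #+regex-generic-target `(target-elt ,target ,index)
  #-regex-generic-target `(schar ,target ,index))

(defun count-char (char target length)
  "Count occurrences of CHAR in the first LENGTH elements of TARGET."
  (loop for i from 0 below length
        count (char= char (target-char target i))))

;; Default build: (count-char #\a "banana" 6) => 3
```

The point of the macro is that the string-only build pays nothing for
the flexibility: the reader conditional is resolved when the file is
compiled, so the fast version still contains a plain SCHAR call.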

> On another, somewhat unrelated note.  One thing that one would like
> to do is regexp search and replace.  Now, if I know how many groups
> the user is going to input into their regexp before the fact, I can
> use REGISTER-GROUPS-BIND to get at the substring matches via
> variables.  I guess there isn't any way to do this without knowing
> the input beforehand.  So is the idea then just to use SCAN and then
> manually grab the substrings via REG-STARTS and REG-ENDS, or have I
> missed something obvious?

No, I don't see a better way.  If you don't know the regex then you
have to check the return value and see how long the register arrays
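
A rough sketch of that approach, assuming CL-PPCRE can be loaded
through ASDF.  The helper name MATCH-REGISTERS is made up here; only
SCAN and its four return values (match start, match end, and the two
register arrays) come from CL-PPCRE itself.

```lisp
;; Assumption: CL-PPCRE can be found by ASDF on this system.
(require :asdf)
(asdf:load-system :cl-ppcre)

(defun match-registers (regex target &key (start 0))
  "Return the whole match as a string and, as a second value, a list of
register substrings (NIL for registers that did not take part in the
match).  Return NIL if REGEX does not match TARGET at all."
  (multiple-value-bind (match-start match-end reg-starts reg-ends)
      (cl-ppcre:scan regex target :start start)
    (when match-start
      (values (subseq target match-start match-end)
              ;; The arrays are as long as the number of registers in
              ;; REGEX, so no advance knowledge of the regex is needed.
              (loop for s across reg-starts
                    for e across reg-ends
                    collect (and s (subseq target s e)))))))

;; (match-registers "(\\d+)-(\\d+)(x)?" "tel 555-1234")
;; => "555-1234", ("555" "1234" NIL)
```

For search and replace in an editor you would probably keep the start
and end offsets rather than the substrings, so the replacement can be
applied to the buffer directly.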

