From eadmund42 at gmail.com Mon Jun 1 16:54:18 2009
From: eadmund42 at gmail.com (Robert Uhl)
Date: Mon, 01 Jun 2009 10:54:18 -0600
Subject: [cl-ppcre-devel] RPM Spec File
In-Reply-To: (cl-ppcre-devel-owner@common-lisp.net's message of "Mon\, 01 Jun 2009 12\:33\:51 -0400")
References:
Message-ID:

I've created a spec file which can be used to build an RPM for CL-PPCRE;
it uses common-lisp-controller and _should_ work on any clc-supporting
Fedora Lisp.  I'm willing to volunteer to maintain the spec file and
figure out how to submit RPMs to the Fedora Project; is that okay?

--
Robert Uhl

From edi at agharta.de Wed Jun 3 19:23:49 2009
From: edi at agharta.de (Edi Weitz)
Date: Wed, 3 Jun 2009 21:23:49 +0200
Subject: [cl-ppcre-devel] RPM Spec File
In-Reply-To:
References:
Message-ID:

On Mon, Jun 1, 2009 at 6:54 PM, Robert Uhl wrote:
> I've created a spec file which can be used to build an RPM for CL-PPCRE;
> it uses common-lisp-controller and _should_ work on any clc-supporting
> Fedora Lisp.  I'm willing to volunteer to maintain the spec file and
> figure out how to submit RPMs to the Fedora Project; is that okay?

Sounds good to me.  Let me know where the RPM is hosted and I'll add a
pointer to it to the documentation/website.

Thanks,
Edi.

From eadmund42 at gmail.com Fri Jun 19 02:30:52 2009
From: eadmund42 at gmail.com (Robert Uhl)
Date: Thu, 18 Jun 2009 20:30:52 -0600
Subject: [cl-ppcre-devel] RPM Spec File
In-Reply-To: (Edi Weitz's message of "Wed\, 3 Jun 2009 21\:23\:49 +0200")
References:
Message-ID:

I apologise for the delay...I had to wait for my employer to give me
permission to work on this in my free time.
The CL-PPCRE RPM is available at:

  http://yum.octopodial-chrome.com/11/RPMS/noarch/cl-ppcre-2.0.1-1.fc10.noarch.rpm

Anyone who wishes to keep up-to-date on releases as they occur is
invited to install the octopodial-chrome RPM available in the same
location; this file will add the signing key used for these RPMs as well
as a Yum repository so that you'll always have the latest and greatest.
Many other Lisp libraries are available at the same location.

--
Robert Uhl
Exodus will never disconnect a spammer.  By the time the complaints
reach a level adequate to persuade them, the small-arms fire will
prevent their admins from reaching the servers  --clifto, in nanae

From netawater at gmail.com Thu Jun 25 02:31:45 2009
From: netawater at gmail.com (Xiangjun Wu)
Date: Thu, 25 Jun 2009 10:31:45 +0800
Subject: [cl-ppcre-devel] report a bug
Message-ID: <6f8d23640906241931g30c95a85nadf270a9243ebbd3@mail.gmail.com>

In order to fix an issue in montezuma,
http://code.google.com/p/montezuma/issues/detail?id=3, I believe I have
found a bug in cl-ppcre.

CL-USER> (cl-ppcre:scan
          (cl-ppcre:create-scanner
           "(\\w+)*\\@\\w+") "______________________________________"
          :start 0)
;; Evaluation aborted.

It hangs once the number of underscores reaches a critical value.

I speculate that the fact that '\w' includes the underscore accounts for
this bug, and replacing the underscore with another '\w' character shows
the same problem.

CL-USER> (cl-ppcre:scan
          (cl-ppcre:create-scanner
           "(a\\w+)*\\@\\w+") "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
          :start 0)
;; Evaluation aborted.

But if I eliminate the last \w, it is OK.

CL-USER> (cl-ppcre:scan
          (cl-ppcre:create-scanner
           "(_\\w+)*\\@") "_______________________________________"
          :start 0)
NIL

I also checked it in Perl. Maybe Perl is more efficient at regular
expression matching: even when I raise the number of underscores, it is
still OK.
$str = "john._______________________________________ __________________________________";
if ($str =~ m/(_*\w+)*\@\w+/) {
    print "ok\n";
}

Please check it and give your comments.

From edi at agharta.de Thu Jun 25 19:21:53 2009
From: edi at agharta.de (Edi Weitz)
Date: Thu, 25 Jun 2009 21:21:53 +0200
Subject: [cl-ppcre-devel] report a bug
In-Reply-To: <6f8d23640906241931g30c95a85nadf270a9243ebbd3@mail.gmail.com>
References: <6f8d23640906241931g30c95a85nadf270a9243ebbd3@mail.gmail.com>
Message-ID:

Hi,

On Thu, Jun 25, 2009 at 4:31 AM, Xiangjun Wu wrote:
>   "(\\w+)*\\@\\w+"

That's the type of regular expression that typically leads to a
combinatorial explosion in regex engines unless they use specific
"tricks" to deal with this.  Recent versions of Perl are pretty clever
in this regard (they look for "floating" substrings) while CL-PPCRE
isn't, but - frankly - I don't really see the point of this.  I think
this is mainly so that the regex engine looks good in benchmarks.

I definitely wouldn't call this a bug.  The question is - what do you
want to achieve with this regular expression?  Can't you write it in a
simpler way?

Cheers,
Edi.

From sky at viridian-project.de Fri Jun 26 13:10:14 2009
From: sky at viridian-project.de (Leslie P. Polzer)
Date: Fri, 26 Jun 2009 15:10:14 +0200 (CEST)
Subject: [cl-ppcre-devel] report a bug
Message-ID:

On Jun 25, 9:21 pm, Edi Weitz wrote:

> The question is - what do you want to achieve with this regular
> expression?  Can't you write it in a simpler way?

Isn't this pattern pretty useful in general:

  A@B

where A and B are word characters and @ is a specific non-word
character?

How else could we specify it?  [a-zA-Z0-9] doesn't seem acceptable to
me since it relies on the Latin alphabet...
  Leslie

--
http://www.linkedin.com/in/polzer

From hans.huebner at gmail.com Fri Jun 26 14:09:44 2009
From: hans.huebner at gmail.com (Hans Hübner)
Date: Fri, 26 Jun 2009 16:09:44 +0200
Subject: [cl-ppcre-devel] report a bug
In-Reply-To:
References:
Message-ID:

On Fri, Jun 26, 2009 at 15:10, Leslie P. Polzer wrote:

> On Jun 25, 9:21 pm, Edi Weitz wrote:
>
>> The question is - what do you want to achieve with this regular
>> expression?  Can't you write it in a simpler way?
>
> Isn't this pattern pretty useful in general:
>
>  A@B
>
> where A and B are word characters and @ is a specific non-word
> character?

Sure, but the original bug report was about this:

  (\\w+)*\\@\\w+

I can't make any sense of this regular expression, but maybe it is
because I am lacking some skills.  Maybe Wu can explain what he wants
to achieve with it?

-Hans

From netawater at gmail.com Fri Jun 26 14:46:38 2009
From: netawater at gmail.com (Xiangjun Wu)
Date: Fri, 26 Jun 2009 22:46:38 +0800
Subject: [cl-ppcre-devel] report a bug
In-Reply-To:
References:
Message-ID: <6f8d23640906260746r20db5dbaheddaa7524bade17b@mail.gmail.com>

Very sorry, it is a typo, :(

It should be:

(cl-ppcre:scan
 (cl-ppcre:create-scanner
  "(_\\w+)*\\@\\w+") "______________________________________"
 :start 0)

but the other examples show what I meant.

On 6/26/09, Hans Hübner wrote:
> On Fri, Jun 26, 2009 at 15:10, Leslie P. Polzer wrote:
>
>> On Jun 25, 9:21 pm, Edi Weitz wrote:
>>
>>> The question is - what do you want to achieve with this regular
>>> expression?  Can't you write it in a simpler way?
>>
>> Isn't this pattern pretty useful in general:
>>
>>  A@B
>>
>> where A and B are word characters and @ is a specific non-word
>> character?
>
> Sure, but the original bug report was about this:
>
>   (\\w+)*\\@\\w+
>
> I can't make any sense of this regular expression, but maybe it is
> because I am lacking some skills.  Maybe Wu can explain what he wants
> to achieve with it?
>
> -Hans
>
> _______________________________________________
> cl-ppcre-devel site list
> cl-ppcre-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/cl-ppcre-devel
>

From sky at viridian-project.de Fri Jun 26 15:01:39 2009
From: sky at viridian-project.de (Leslie P. Polzer)
Date: Fri, 26 Jun 2009 17:01:39 +0200 (CEST)
Subject: [cl-ppcre-devel] report a bug
In-Reply-To: <6f8d23640906260746r20db5dbaheddaa7524bade17b@mail.gmail.com>
References: <6f8d23640906260746r20db5dbaheddaa7524bade17b@mail.gmail.com>
Message-ID:

Xiangjun Wu wrote:

> Very sorry, it is a typo, :(
>
> It should be:
>
> (cl-ppcre:scan
>  (cl-ppcre:create-scanner
>   "(_\\w+)*\\@\\w+") "______________________________________"
>  :start 0)
>
> but other examples indicate the accurate idea.

Looking at this I'm not sure what this is good for.  Why would we want
to match strings of the form _xxx@xxx in a full-text indexer?

Perhaps it would be best to get rid of the whole messy regex (of which
this is only a small part) and write a new documented one from scratch.

Or use a custom state-based tokenizer.

From ctdean at sokitomi.com Fri Jun 26 18:17:52 2009
From: ctdean at sokitomi.com (Chris Dean)
Date: Fri, 26 Jun 2009 11:17:52 -0700
Subject: [cl-ppcre-devel] report a bug
In-Reply-To: <6f8d23640906260746r20db5dbaheddaa7524bade17b@mail.gmail.com> (Xiangjun Wu's message of "Fri, 26 Jun 2009 22:46:38 +0800")
References: <6f8d23640906260746r20db5dbaheddaa7524bade17b@mail.gmail.com>
Message-ID:

Xiangjun Wu writes:
> (cl-ppcre:scan
>  (cl-ppcre:create-scanner
>   "(_\\w+)*\\@\\w+") "______________________________________"
>  :start 0)
>

Perhaps

  (cl-ppcre:create-scanner "(_[_\\w]+)?@\\w+")

will work for your app?  The problem in the original expression is that
the "+" followed by the "*" can lead to a combinatorial explosion.
If you loosen the requirement that all non-empty matches of the first
subexpression must begin with an "_" you could have:

  (cl-ppcre:create-scanner "[_\\w]*@\\w+")

Cheers,
Chris Dean
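
The first rewrite suggested above can be checked against the original
expression on a few short strings.  What follows is only a sketch under
stated assumptions: CL-PPCRE is loaded through Quicklisp (the thread does
not say how the library is loaded), and the names *original*,
*rewritten*, *samples*, match-span and same-spans-p are hypothetical
helpers made up for this illustration.  The reasoning is that a single
"_\w+" already matches any run of word characters that starts with an
underscore, so repeating it with "*" adds no new matches, only more ways
to split the same run while backtracking.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading CL-PPCRE works too.
(ql:quickload :cl-ppcre :silent t)

;; The corrected expression from the bug report and the first suggested rewrite.
(defparameter *original*  (cl-ppcre:create-scanner "(_\\w+)*@\\w+"))
(defparameter *rewritten* (cl-ppcre:create-scanner "(_[_\\w]+)?@\\w+"))

(defun match-span (scanner string)
  "Return (START END) of the first match of SCANNER in STRING, or NIL."
  (multiple-value-bind (start end) (cl-ppcre:scan scanner string)
    (and start (list start end))))

;; Hypothetical sample inputs, kept short so that the original
;; expression cannot blow up on them.
(defparameter *samples*
  '("_foo_bar@example" "john@example" "no at sign" "____" "a_b@c d"))

(defun same-spans-p ()
  "True if both scanners report the same match span on every sample."
  (every (lambda (s)
           (equal (match-span *original* s)
                  (match-span *rewritten* s)))
         *samples*))
```

Calling (same-spans-p) compares the two scanners on the samples.  It is
best not to add long runs of underscores without an "@" to *samples*,
since that is exactly the input on which the original expression was
reported to hang.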