[Langutils-devel] Period not correctly tokenized?
Ian Eslick
eslick at media.mit.edu
Tue Oct 11 16:01:31 UTC 2011
Periods are handled specially because they show up inside numbers and in abbreviations such as "e.g.", "i.e.", and "etc." If you split out periods naively, you lose numbers as tokens.
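
To illustrate the point, here is a minimal sketch in Python. It is not the langutils code. The abbreviation list, the number pattern, and the function names `tokenize` and `naive_tokenize` are all assumptions made up for this example. It compares a naive split on every period with a split that leaves numbers and known abbreviations intact.

```python
import re

# Hypothetical abbreviation list, for illustration only; not taken from langutils.
ABBREVIATIONS = {"e.g.", "i.e.", "etc."}

# Assumed number shape: digits, optionally followed by ".digits" groups (3.14, 1.2.3).
NUMBER = re.compile(r"\d+(?:\.\d+)*")


def naive_tokenize(text):
    """Treat every period as its own punctuation token."""
    return re.findall(r"[^\s.]+|\.", text)


def tokenize(text):
    """Split off a trailing period unless the word is a number or abbreviation."""
    tokens = []
    for word in text.split():
        if word.lower() in ABBREVIATIONS or NUMBER.fullmatch(word):
            # Keep the word whole: its periods are part of the token.
            tokens.append(word)
        elif len(word) > 1 and word.endswith("."):
            # Likely a sentence-ending period: emit it as a separate token.
            tokens.extend([word[:-1], "."])
        else:
            tokens.append(word)
    return tokens


print(naive_tokenize("The value is 3.14"))
print(tokenize("The value is 3.14 i.e. about pi."))
```

The naive version breaks "3.14" into "3", ".", and "14", while the second version keeps "3.14" and "i.e." whole and still splits the final period off "pi.". A real tokenizer would also need to handle other punctuation and abbreviations that end a sentence, which this sketch ignores.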
On Oct 11, 2011, at 12:33 AM, Jianshi Huang <jianshi.huang at gmail.com> wrote:
> Hey Kevin,
>
> On Fri, Oct 7, 2011 at 2:57 PM, Jianshi Huang <jianshi.huang at gmail.com> wrote:
>> Currently it works for me, but I'm not sure whether it will break
>> something else...
>>
>> There must be a reason for not including #\. in the punctuation type.
>>
>> Anyway, here's the patch for git.
>>
>
> I mixed up your repository with eslick's cl-langutils. LOL
>
> So here's the patch for your langutils.
>
>
> --
> 黄 澗石 (Jianshi Huang)
> http://huangjs.net/
> <0001-Fix-tokenization-for-sentence-ending-periods.patch>
> _______________________________________________
> Langutils-devel mailing list
> Langutils-devel at common-lisp.net
> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/langutils-devel