[armedbear-devel] [ansi-test-devel] Unicode, CHAR-UPCASE/CHAR-DOWNCASE and char-upcase.1/char-upcase.2
Erik Huelsmann
ehuels at gmail.com
Sun Apr 4 12:04:47 UTC 2010
Hi Sam,
On Sun, Apr 4, 2010 at 10:58 AM, Sam Steingold <sds at gnu.org> wrote:
> On 4/3/10, Erik Huelsmann <ehuels at gmail.com> wrote:
>>
>> However, in section 13.1.10, there seems to be an escape hatch:
>> "Documentation of implementation-defined scripts". A script is a
>> subtype of CHARACTER, nothing more nothing less. An
>> implementation-defined script gets to document the effect on
>> CHAR-UPCASE and CHAR-DOWNCASE.
>
> I don't think this gives you a license to discard the round-tripping invariant.
I reread that section, and on second reading I agree: it does not
allow that freedom.
>> there's no need to have the round-tripping requirement apply to most
>> of unicode - as can't be expected, see latin-small-letter-dotless-i
>> for an example.
>
> why not make it its own upper case?
> this is not exactly correct from the unicode pov, but, I think, it is
> better than the alternative.
> this round-tripping requirement is, i think, pretty important in symbol i/o.
I hadn't thought about the reader and printer behaviours regarding
*readtable-case* and *print-case*. However, it would be logical by
analogy that if a string doesn't get recoded in a round-trip, then the
symbol name won't either.
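
To make the dotless-i case concrete, here is a small sketch in Python,
used only because its built-in string methods follow the Unicode case
mappings. It is not ABCL or CLISP code, and the helper name
round_trips is made up for this illustration. It checks whether
upcasing and then downcasing a character gives back the original.

```python
def round_trips(ch: str) -> bool:
    """Return True if downcasing the upcased character yields ch again."""
    return ch.upper().lower() == ch


DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I

# Unicode upcases dotless i to plain "I", which downcases to dotted "i",
# so the round trip does not return the character we started with.
assert DOTLESS_I.upper() == "I"
assert "I".lower() == "i"
assert not round_trips(DOTLESS_I)

# An ordinary ASCII letter survives the round trip.
assert round_trips("a")
```

Under the plain Unicode mappings the invariant fails for this
character, so an implementation has to treat it specially if the
round-tripping requirement is to hold.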
But I now agree that this isn't CLHS compliant. Does CLISP handle the
dotless i by making it non-alphabetic, or by making it a character
without case (i.e. a character which upcases and downcases to itself)?
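
As a rough sketch of the second option (a character without case), the
following Python treats a character as having case only when its
Unicode mapping is a single character that maps straight back. Python
is used as a stand-in for the Unicode tables; the names
has_roundtrip_case, char_upcase and char_downcase are hypothetical and
only mimic the Lisp functions. Whether any Lisp implementation does it
this way is an assumption, not something confirmed here.

```python
def has_roundtrip_case(ch: str) -> bool:
    """True if ch has a one-to-one case partner under Unicode mappings."""
    up, low = ch.upper(), ch.lower()
    if len(up) != 1 or len(low) != 1:
        return False              # multi-character mapping, e.g. sharp s
    if up != ch:                  # ch behaves like a lowercase letter
        return up.lower() == ch
    if low != ch:                 # ch behaves like an uppercase letter
        return low.upper() == ch
    return False                  # no case mapping at all


def char_upcase(ch: str) -> str:
    # Characters without a round-tripping partner map to themselves.
    return ch.upper() if has_roundtrip_case(ch) else ch


def char_downcase(ch: str) -> str:
    return ch.lower() if has_roundtrip_case(ch) else ch


assert char_upcase("\u0131") == "\u0131"   # dotless i is its own upper case
assert char_downcase(char_upcase("a")) == "a"
assert char_upcase(char_downcase("A")) == "A"
```

With this rule the dotless i stays alphabetic in the Unicode sense but
simply has no case, which keeps the round-tripping invariant at the
cost of deviating from the Unicode upper-case mapping.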
I now think that the tests are correct, but that the requirement in
the CLHS is outdated. However, that's something to address in the
implementation itself. I'll discuss it on the ABCL list.
Thanks for your time!
Bye,
Erik.