[armedbear-devel] [ansi-test-devel] Unicode, CHAR-UPCASE/CHAR-DOWNCASE and char-upcase.1/char-upcase.2

Sun Apr 4 12:04:47 UTC 2010

Hi Sam,

On Sun, Apr 4, 2010 at 10:58 AM, Sam Steingold <sds at gnu.org> wrote:
> On 4/3/10, Erik Huelsmann <ehuels at gmail.com> wrote:
>>
>>  However, in section 13.1.10, there seems to be an escape hatch:
>>  "Documentation of implementation-defined scripts". A script is a
>>  subtype of CHARACTER, nothing more nothing less. An
>>  implementation-defined script gets to document the effect on
>>  CHAR-UPCASE and CHAR-DOWNCASE.
>
> I don't think this gives you a license to discard the round-tripping invariant.

I read the section again, and on second reading I agree: it does
not allow that freedom.

>>  there's no need to have the round-tripping requirement apply to most
>>  of unicode - as can't be expected, see latin-small-letter-dotless-i
>>  for an example.
>
> why not make it its own upper case?
> this is not exactly correct from the unicode pov, but, I think, it is
> better than the alternative.
> this round-tripping requirement is, i think, pretty important in symbol i/o.

I hadn't thought about the reader and printer behaviours regarding
*readtable-case* and *print-case*. By analogy, though, it would be
logical that if a string isn't changed by a case round-trip, then a
symbol name isn't either.

But I now agree this isn't CLHS-compliant. Does clisp handle this by
making it non-alphabetic or by making it a character without case
(i.e., a character which upcases and downcases to itself)?
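
To make the dotless-i case concrete, here is a minimal sketch in
Python rather than Lisp, using Python's built-in Unicode case
mappings. The helper names roundtrip_upcase and roundtrip_downcase are
hypothetical, and the "treat it as caseless" policy is only one
possible reading of the suggestion above; it is not a description of
how clisp or ABCL actually behave.

```python
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I


def roundtrip_upcase(ch: str) -> str:
    """Upcase ch only if downcasing the result gives ch back.

    Characters whose Unicode mapping does not round-trip (or expands
    to more than one character) are treated as caseless and returned
    unchanged. This policy is an assumption made for illustration.
    """
    up = ch.upper()
    if len(up) == 1 and up != ch and up.lower() == ch:
        return up
    return ch


def roundtrip_downcase(ch: str) -> str:
    """Mirror image of roundtrip_upcase for the downcase direction."""
    down = ch.lower()
    if len(down) == 1 and down != ch and down.upper() == ch:
        return down
    return ch


# The plain Unicode mappings lose the dotless i on the way back:
# it upcases to "I", which downcases to the ordinary dotted "i".
plain_roundtrip = DOTLESS_I.upper().lower()
print(plain_roundtrip == DOTLESS_I)  # False

# Under the caseless policy, dotless i maps to itself, so the
# round-trip holds trivially.
policy_roundtrip = roundtrip_downcase(roundtrip_upcase(DOTLESS_I))
print(policy_roundtrip == DOTLESS_I)  # True
```

With this policy ordinary letters such as "a" and "A" still map to
each other, while dotless i simply has no case.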

I now think that the tests are correct, but that the requirement in
the CLHS is outdated. However, that's something to address in the
implementation itself. I'll discuss it on the ABCL list.

Thanks for your time!

Bye,

Erik.