[ansi-test-devel] Unicode, CHAR-UPCASE/CHAR-DOWNCASE and char-upcase.1/char-upcase.2

Thu Apr 15 21:18:06 UTC 2010

On Thu, Apr 15, 2010 at 3:15 AM, Raymond Toy <toy.raymond at gmail.com> wrote:
> On 4/3/10 7:00 PM, Erik Huelsmann wrote:
>> Ever since ABCL raised its CHAR-CODE-LIMIT from 256 to #x10000, 2
>> tests started failing: char-upcase.1 and char-upcase.2.
>>
>> These 2 tests iterate through all integers between 0 and
>> CHAR-CODE-LIMIT. While doing so, they test for the property that
>> upcasing and downcasing returns the same character again
>> ("round-tripping"). This property of characters is specified in
>> section 13.1.4.3
>> (http://www.lispworks.com/documentation/lw51/CLHS/Body/13_adc.htm)
>> "Characters with case". In short: characters with case are defined in
>> pairs; additional characters with case have to be defined in pairs
>> too.
>>
> But doesn't 13.1.4.3 also say characters with case are a subset of
> alphabetic characters, and the glossary says alphabetic characters are
> A-Z and a-z or any other implementation-defined character with case or
> other graphic character defined by the implementation to be alphabetic.
> So doesn't this mean the implementation can define the dotless-i
> character as non-alphabetic?  I guess that would also imply that
> alpha-char-p returns NIL for such characters.

Right. You can do that, but then it can't have case anymore, meaning
that CHAR-UPCASE should return the character unchanged. STRING-UPCASE is
defined as applying CHAR-UPCASE to each character, so it should leave
that character unchanged as well.

Neither cmucl(?) nor ABCL does that, if I understand correctly.
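
To make the two properties concrete, here is a minimal sketch of how one
might check them. It is not the source of char-upcase.1 or char-upcase.2;
the function names CASE-ROUND-TRIP-FAILURES and STRING-UPCASE-CONSISTENT-P
are hypothetical and chosen only for illustration. It assumes nothing
beyond standard Common Lisp.

```lisp
;; Hypothetical helper, NOT the ansi-test source.
(defun case-round-trip-failures ()
  "Return the characters whose case conversion does not round-trip."
  (loop for code from 0 below char-code-limit
        for c = (code-char code)
        ;; CODE-CHAR may return NIL for codes that name no character.
        when (and c
                  (or
                   ;; An uppercase character must survive down-then-up.
                   (and (upper-case-p c)
                        (char/= c (char-upcase (char-downcase c))))
                   ;; A lowercase character must survive up-then-down.
                   (and (lower-case-p c)
                        (char/= c (char-downcase (char-upcase c))))
                   ;; A character without case must be left unchanged.
                   (and (not (both-case-p c))
                        (or (char/= c (char-upcase c))
                            (char/= c (char-downcase c))))))
          collect c))

;; Hypothetical helper: STRING-UPCASE should agree with applying
;; CHAR-UPCASE to each character of the string.
(defun string-upcase-consistent-p (string)
  (string= (string-upcase string)
           (map 'string #'char-upcase string)))
```

An implementation that satisfies 13.1.4.3 should return an empty list
from the first function. The second can be called on any string that
contains the characters in question, such as the dotless-i.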

Bye,

Erik.