[Ecls-list] Ecl PARSE-NAMESTRING unable to handle unicode path name

Juan Jose Garcia-Ripoll juanjose.garciaripoll at gmail.com
Sun Jan 13 20:59:46 UTC 2013


On Sun, Jan 13, 2013 at 4:02 PM, Peter Enerccio <enerccio at gmail.com> wrote:

> Hello, I am trying to pass a Unicode string to PARSE-NAMESTRING; however,
> it doesn't work.
>
> (parse-namestring "aaaaaAaaajあ")
>

http://en.wikipedia.org/wiki/Filename#Encoding_interoperability

some limited interoperability issues remain, such as normalization
(equivalence) or the Unicode version in use. [...] On Linux, this means the
name of a file is not enough to open it: in addition to the file's name,
the exact byte representation of the filename stored on disk must also be
known. This can be solved at the application level, with some tricky
normalization calls.
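
To make the normalization point concrete, here is a minimal sketch in Python (used only for illustration, it is not ECL code). The sample name is hypothetical. It shows that two canonically equivalent spellings of the same name encode to different bytes, so a file system that compares raw bytes would treat them as different files:

```python
import unicodedata

# Hypothetical sample name: "é" is either one code point (NFC)
# or "e" followed by a combining acute accent (NFD).
name = "caf\u00e9.txt"

nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

# Encode both forms the way a UTF-8 file system would store them.
nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

# The two byte sequences differ even though the text looks identical.
bytes_differ = nfc_bytes != nfd_bytes

# Normalizing both sides to the same form makes them comparable again.
equal_after_normalizing = unicodedata.normalize("NFC", nfd) == nfc
```

An application that wants to find a file by its visible name would have to normalize both the name it is looking for and the names it reads from the directory before comparing them.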


There are lots of issues that have to be solved. For instance, according to
this, extended character strings would not be enough to identify a file.
Thus, DIRECTORY could never return pathnames with extended characters.

There are also other issues, such as the need for a library that normalizes
pathnames (I contributed this code to cl-unicode, but it is not part of ECL
yet), including incompatible extensions (Apple HFS+ normalizes to a form
that is "nearly" the same as Unicode NFD,
http://en.wikipedia.org/wiki/HFS_Plus).

None of this is insurmountable, but right now it is not at the top of my
priorities, given that you can access _all_ filenames using base strings.
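
The base-string route can be pictured roughly as follows. This is a Python sketch of the idea, not ECL code, and it rests on two assumptions: the file system stores names as UTF-8, and a string whose character codes are all below 256 stands in for a base string. The helper names are hypothetical.

```python
def to_byte_string(name: str) -> str:
    """Encode the name as UTF-8, then map each byte to the character
    with the same code, so every character code is below 256."""
    return name.encode("utf-8").decode("latin-1")


def from_byte_string(byte_string: str) -> str:
    """Inverse of to_byte_string: recover the original Unicode text."""
    return byte_string.encode("latin-1").decode("utf-8")


# The name from the original report: 10 ASCII letters plus one kana.
name = "aaaaaAaaajあ"

packed = to_byte_string(name)

# The kana takes three bytes in UTF-8, so the packed form is 13 characters.
packed_length = len(packed)
round_trip = from_byte_string(packed)
```

The packed form is unreadable to a person, but it carries the exact bytes of the file name, which is what the operating system needs.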

Juanjo

-- 
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com