[asdf-devel] source file encoding

Fri Apr 13 19:36:08 UTC 2012

I agree. Encoding is a per-file property, not a per-system property. If it is defined per system, then maintaining the information also becomes an overhead.

For example, I have source files that are shared by different system definitions. If I now wanted to change the encoding of such a file, I would also have to update every system definition that it is part of. I don't think that's a good idea.

Pascal

On 13 Apr 2012, at 08:44, Douglas Crosher wrote:

> 
> Dealing with non-ASCII encoding in system definition files does not look easy to solve.  It does not seem practical to just extend
> 'find-system to accept the encoding because 'find-system can in turn attempt to load other systems, and there are other entry points.
> 
> The only practical solution seems to be to detect the encoding from the file.   I could write portable code for ASDF to read an
> ASCII header line and look for encoding declarations, and handle a few common headers (Emacs has 'coding', LispWorks seems to use
> 'encoding' or 'external-format').  Auto-detection could handle some of the common codings, but could be a big chunk of code.   The
> quicklisp project may be prepared to patch in headers to system definition files using non-ASCII encodings, and this could be largely
> automated.
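
For illustration, the header-based detection described above could be sketched roughly as follows in portable Common Lisp. This is only a sketch under stated assumptions: the names *encoding-keys*, split-on, read-ascii-header and header-encoding are hypothetical and not part of ASDF; only the Emacs-style "-*- key: value; ... -*-" line syntax is handled; and the three keys are simply the ones named in the message. How a returned name would map onto an implementation's external format is left open.

```lisp
;;; Sketch only: every name defined here is hypothetical, not an ASDF API.

(defparameter *encoding-keys* '("coding" "encoding" "external-format")
  "Header keys named in the message above; compared case-insensitively.")

(defun split-on (char string)
  "Split STRING on CHAR and return the list of substrings."
  (loop for start = 0 then (1+ pos)
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos))

(defun read-ascii-header (pathname &optional (limit 256))
  "Read at most LIMIT bytes of the first line of PATHNAME as ASCII text.
Bytes above 127 become question marks, so no external format is needed."
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (with-output-to-string (out)
      (loop for i below limit
            for byte = (read-byte in nil nil)
            while (and byte (/= byte 10)) ; stop at end of file or newline
            do (write-char (if (< byte 128) (code-char byte) #\?) out)))))

(defun header-encoding (line)
  "Return the lower-cased encoding name declared in LINE, or NIL if none."
  (let* ((start (search "-*-" line))
         (end (and start (search "-*-" line :start2 (+ start 3)))))
    (when end
      ;; Walk the `key: value' fields between the two -*- markers.
      (dolist (field (split-on #\; (subseq line (+ start 3) end)))
        (let ((colon (position #\: field)))
          (when colon
            (let ((key (string-trim " " (subseq field 0 colon)))
                  (value (string-trim " " (subseq field (1+ colon)))))
              (when (and (member key *encoding-keys* :test #'string-equal)
                         (plusp (length value)))
                (return (string-downcase value))))))))))

;; Usage:
;; (header-encoding ";;; -*- Mode: Lisp; coding: utf-8 -*-")  ; => "utf-8"
;; (header-encoding (read-ascii-header "foo.asd"))            ; "foo.asd" is a placeholder
```

Reading the header as raw bytes avoids having to pick an external format before the encoding is known, which is the chicken-and-egg problem the header line is meant to solve.
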
> 
> If infrastructure is added for the system definition files then it would be only a small step to also use this for the lisp source
> files. Perhaps this suggests an alternative path to address the coding issues.
> 
> LispWorks appears to be able to automatically detect file coding, and it would be interesting to know whether the ASDF encoding problems
> are a non-issue for LispWorks users.   If so then this would appear to add more support to making the default :default.
> http://www.lispworks.com/documentation/lw61/LW/html/lw-659.htm#39723
> 
> It seems the issue could be dealt with by the CL implementations adding file external-format detection.
> 
> Regards
> Douglas Crosher
> 
> On 04/12/2012 06:51 PM, Douglas Crosher wrote:
>> 
>> It may be significant that a number of the quicklisp releases use non-ASCII in the system definition files.  Can this be addressed
>> in ASDF alone?   Should an attempt be made to add an encoding argument to 'find-system, and to have quicklisp record the encoding in
>> its release database and use this when calling 'find-system?  If so then perhaps this could be stored as a default encoding for a
>> system.
>> 
>> Looking at non-ASCII usage in quicklisp releases shows that the UTF-8 usage is not that significant.
>> 
>> Releases considered: 716
>> Releases with UTF-8 lisp source files:  86  (12%)
>> Releases with UTF-8 in comments only :  34
>> Releases using UTF-8 in their system definitions: 21
>> Releases for which all the UTF-8 could be recoded to ISO-8859-1:  59
>> Releases with other non-ASCII source files:  21  (3%)
>> Releases with other non-ASCII source files in comments only: 12
>> 
>> Releases using non-ASCII characters from only the ISO-8859-1 set: 59 + 12? = 71? (10%)
>> 
>> Releases using only ASCII in source files: 716 - 86 - 21 = 609 (85%)
>> 
>> Some of the UTF-8 is rather gratuitous and if portability were a concern there would have been suitable ASCII substitutes.  There
>> does not appear to be much respect for portability in some of these releases, so even adding encoding support to ASDF system
>> definition files may not help for some of these releases.
>> 
>> If you accept that library authors will choose their encoding, even for the system definition files, then the only solution seems to
>> be to add an encoding option to 'find-system and suggest this be used to load the system definition.
>> 
>> Regards
>> Douglas Crosher
>> 
>> 
>> On 04/12/2012 12:43 AM, Faré wrote:
>>>>> No. Library authors have *already* largely adopted UTF-8.
>>>>> See previous analysis by Orivej Desh:
>>>>>      "I did a check of quicklisp systems.
>>>>>        There are 263 lisp files in 107 systems which assume non-ASCII,
>>>>>        and only 31 of them in 20 systems assume non-UTF-8"
>>>> 
>>>> I saw those statistics.  I have no idea what "assume non-ASCII" means.
>>>> That there are files that have non-ascii characters in them?  And that
>>>> only 31 files are not in utf-8 already?
>>>> 
>>> Yes, of which only 13 files were actually managed by ASDF as opposed
>>> to examples, one is a MCL-only file that doesn't support UTF-8 anyway,
>>> two have already been fixed, and the rest are only latin1 or such in
>>> comments. Bugs filed for all the other systems (but no response so
>>> far).
>>> 
>>> IOW, I believe we're mostly arguing about a non-issue.
>>> 
>>> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
>>> 
>>> _______________________________________________
>>> asdf-devel mailing list
>>> asdf-devel at common-lisp.net
>>> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/asdf-devel
>>> 

--
Pascal Costanza