[asdf-devel] source file encoding

Douglas Crosher dtc-asdf at scieneer.com
Fri Apr 13 06:44:34 UTC 2012


Dealing with non-ASCII encoding in system definition files does not look easy to solve.  It does not seem practical to just extend
'find-system to accept the encoding because 'find-system can in turn attempt to load other systems, and there are other entry points.

The only practical solution seems to be to detect the encoding from the file.   I could write portable code for ASDF to read an
ASCII header line and look for encoding declarations, and handle a few common headers (emacs has 'coding', LispWorks seems to use
'encoding' or 'external-format').  Auto-detection could handle some of the common codings, but could be a big chunk of code.   The
quicklisp project may be prepared to patch headers into the system definition files that use non-ASCII encodings, and this could be largely
automated.
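
A rough sketch of the header scan described above, written in Python purely for illustration (an ASDF version would be portable Common Lisp).  The function names, the regular expression, and the 512-byte limit are all hypothetical, and the exact header syntax LispWorks accepts is an assumption here; only the 'coding', 'encoding' and 'external-format' keys come from the discussion.

```python
import re

# Hypothetical pattern: matches "coding: utf-8", "encoding: latin-1",
# "external-format: :utf-8" and similar on a header line.
_DECLARATION = re.compile(
    rb"(?:coding|encoding|external-format)\s*[:=]\s*:?([A-Za-z0-9_.-]+)",
    re.IGNORECASE,
)

def declared_encoding_from_line(first_line, default=None):
    """Return the encoding named on an ASCII header line, else default."""
    # Only trust a header that is pure ASCII; anything else cannot be
    # read safely before the encoding is known.
    if any(byte > 0x7F for byte in first_line):
        return default
    match = _DECLARATION.search(first_line)
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()

def declared_encoding(path, default=None, max_bytes=512):
    """Read at most max_bytes of the first line of path and scan it."""
    with open(path, "rb") as stream:
        first_line = stream.readline(max_bytes)
    return declared_encoding_from_line(first_line, default)
```

For example, a first line of `;;; -*- Mode: Lisp; coding: utf-8 -*-` would yield "utf-8", and a file with no declaration would fall back to whatever default the caller supplies.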

If infrastructure is added for the system definition files then it would be only a small step to also use this for the lisp source
files. Perhaps this suggests an alternative path to address the coding issues.

LispWorks appears to be able to automatically detect file coding, and it would be interesting to know whether the ASDF encoding problems
are simply not an issue for LispWorks users.  If so, this would appear to add more support to making the default :default.
http://www.lispworks.com/documentation/lw61/LW/html/lw-659.htm#39723

It seems the issue could be dealt with by the CL implementations adding file external-format detection.
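
To give an idea of what the simplest form of such detection might look like, here is a minimal sketch, again in Python for illustration only.  It is not how LispWorks or any other implementation does it; the function name and the choice of latin-1 as the fallback are assumptions, and real detection of other codings would need considerably more code.

```python
def guess_encoding(data):
    """Guess among ASCII, UTF-8 and a latin-1 fallback for raw file bytes."""
    # A UTF-8 byte order mark is an explicit signal.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    # Pure ASCII decodes the same way under all the common codings.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Non-ASCII text that is valid UTF-8 is very unlikely to be anything else.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback: latin-1 accepts every byte sequence, so this
        # branch never fails, but it may well be the wrong guess.
        return "latin-1"
```

The ordering matters: ASCII is checked first because it is a subset of the others, and UTF-8 second because its multi-byte structure makes accidental matches rare.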

Regards
Douglas Crosher

On 04/12/2012 06:51 PM, Douglas Crosher wrote:
> 
> It may be significant that a number of the quicklisp releases use non-ascii in the system definition files.  Can this be addressed
> in ASDF alone?   Should an attempt be made to add an encoding argument to 'find-system, and to have quicklisp record the encoding in
> its release database and use this when calling 'find-system?  If so then perhaps this could be stored as a default encoding for a
> system.
> 
> Looking at non-ascii usage in quicklisp releases shows that the UTF-8 usage is not that significant.
> 
> Releases considered: 716
> Releases with UTF-8 lisp source files:  86  (12%)
> Releases with UTF-8 in comments only :  34
> Releases using UTF-8 in their system definitions: 21
> Releases for which all the UTF-8 could be recoded to ISO-8859-1:  59
> Releases with other non-ascii source files:  21  (3%)
> Releases with other non-ascii source files in comments only: 12
> 
> Releases using non-ascii characters from only the ISO-8859-1 set: 59 + 12? = 71? (10%)
> 
> Releases using only ASCII in source files: 716 - 86 - 21 = 609 (85%)
> 
> Some of the UTF-8 is rather gratuitous and if portability was a concern there would have been suitable ASCII substitutes.  There
> does not appear to be much respect for portability in some of these releases, so even adding encoding support to ASDF system
> definition files may not help for some of these releases.
> 
> If you accept that library authors will choose their encoding, even for the system definition files, then the only solution seems to
> be to add an encoding option to 'find-system and suggest this be used to load the system definition.
> 
> Regards
> Douglas Crosher
> 
> 
> On 04/12/2012 12:43 AM, Faré wrote:
>>>> No. Library authors have *already* largely adopted UTF-8.
>>>> See previous analysis by Orivej Desh:
>>>>       "I did a check of quicklisp systems.
>>>>         There are 263 lisp files in 107 systems which assume non-ASCII,
>>>>         and only 31 of them in 20 systems assume non-UTF-8"
>>>
>>> I saw those statistics.  I have no idea what "assume non-ASCII" means.
>>> That there are files that have non-ascii characters in them?  And that
>>> only 31 files are not in utf-8 already?
>>>
>> Yes, of which only 13 files were actually managed by ASDF as opposed
>> to examples, one is a MCL-only file that doesn't support UTF-8 anyway,
>> two have already been fixed, and the rest are only latin1 or such in
>> comments. Bugs filed for all the other systems (but no response so
>> far).
>>
>> IOW, I believe we're mostly arguing about a non-issue.
>>
>> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
>>
>> _______________________________________________
>> asdf-devel mailing list
>> asdf-devel at common-lisp.net
>> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/asdf-devel
>>




