[asdf-devel] source file encoding
Pascal Costanza
pc at p-cos.net
Fri Apr 13 19:36:08 UTC 2012
I agree. Encoding is a per-file property, not a per-system property. If it is defined per system, then maintaining that information also becomes an overhead.
For example, I have source files that are shared by several system definitions. If I wanted to change the encoding of such a file, I would also have to update every system definition it is part of. I don't think that's a good idea.
Pascal
On 13 Apr 2012, at 08:44, Douglas Crosher wrote:
>
> Dealing with non-ASCII encoding in system definition files does not look easy to solve. It does not seem practical to just extend
> 'find-system to accept the encoding because 'find-system can in turn attempt to load other systems, and there are other entry points.
>
> The only practical solution seems to be to detect the encoding from the file. I could write portable code for ASDF to read an
> ASCII header line and look for encoding declarations, and handle a few common headers (Emacs has 'coding', LispWorks seems to use
> 'encoding' or 'external-format'). Auto-detection could handle some of the common codings, but could be a big chunk of code. The
> Quicklisp project may be prepared to patch headers into system definition files that use non-ASCII encodings, and this could be largely
> automated.
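To make the header idea concrete, here is a rough sketch of what reading an ASCII header line could look like. It is written in Python purely for illustration and is not ASDF code; the function name `declared_encoding`, the accepted key spellings, and the choice to return `None` when nothing is declared are all assumptions of this sketch, and the exact header syntax each editor or implementation accepts may differ.

```python
import re
from typing import Optional

# Hypothetical helper, not part of ASDF. It looks in one header line for a
# "coding", "encoding" or "external-format" key followed by ":" or "=" and
# an encoding name, e.g. ";;; -*- Mode: Lisp; coding: latin-1 -*-".
_HEADER_RE = re.compile(
    rb"(?:coding|encoding|external-format)\s*[:=]\s*"
    rb"([A-Za-z0-9_.]+(?:-[A-Za-z0-9_.]+)*)",
    re.IGNORECASE,
)

def declared_encoding(first_line: bytes) -> Optional[str]:
    """Return the lower-cased encoding name declared in the line, or None."""
    match = _HEADER_RE.search(first_line)
    if match is None:
        return None  # no declaration found; the caller picks a default
    return match.group(1).decode("ascii").lower()

# The line is scanned as raw bytes, so no encoding has to be known up front.
print(declared_encoding(b";;; -*- Mode: Lisp; coding: latin-1 -*-"))
```

Because the declaration travels with the file, a file shared between several system definitions would only need to be annotated once.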
>
> If infrastructure is added for the system definition files then it would be only a small step to also use this for the lisp source
> files. Perhaps this suggests an alternative path to address the coding issues.
>
> LispWorks appears to be able to automatically detect file coding, and it would be interesting to know whether the ASDF encoding problems
> are a non-issue for LispWorks users. If so then this would appear to add more support to making the default :default.
> http://www.lispworks.com/documentation/lw61/LW/html/lw-659.htm#39723
>
> It seems the issue could be dealt with by the CL implementations adding file external-format detection.
>
> Regards
> Douglas Crosher
>
> On 04/12/2012 06:51 PM, Douglas Crosher wrote:
>>
>> It may be significant that a number of the quicklisp releases use non-ascii in the system definition files. Can this be addressed
>> in ASDF alone? Should an attempt be made to add an encoding argument to 'find-system, and to have quicklisp record the encoding in
>> its release database and use this when calling 'find-system? If so then perhaps this could be stored as a default encoding for a
>> system.
>>
>> Looking at non-ascii usage in quicklisp releases shows that the UTF-8 usage is not that significant.
>>
>> Releases considered: 716
>> Releases with UTF-8 lisp source files: 86 (12%)
>> Releases with UTF-8 in comments only : 34
>> Releases using UTF-8 in their system definitions: 21
>> Releases for which all the UTF-8 could be recoded to ISO-8859-1: 59
>> Releases with other non-ascii source files: 21 (3%)
>> Releases with other non-ascii source files in comments only: 12
>>
>> Releases using non-ascii characters from only the ISO-8859-1 set: 59 + 12? = 71? (10%)
>>
>> Releases using only ASCII in source files: 716 - 86 - 21 = 609 (85%)
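For anyone wanting to reproduce categories like these, the sketch below shows one way a file could be sorted into "ASCII only", "UTF-8 that could be recoded to ISO-8859-1", "other UTF-8" and "other non-ASCII". It is only a guess at a workable method, not the procedure actually used for the numbers above; the function name `classify` and the category labels are invented for this example.

```python
def classify(data: bytes) -> str:
    """Classify raw file contents by the encoding they appear to need."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: Latin-1, a legacy CJK encoding, or something else.
        return "other"
    try:
        # Succeeds only if every character is in the range U+0000..U+00FF.
        text.encode("iso-8859-1")
        return "utf-8-latin1"
    except UnicodeEncodeError:
        return "utf-8"

print(classify("caf\u00e9".encode("utf-8")))
```

Note the heuristic's limit: a byte sequence that happens to be valid UTF-8 is counted as UTF-8 even if the author meant something else, so borderline files would still need a manual look.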
>>
>> Some of the UTF-8 is rather gratuitous, and if portability were a concern there would have been suitable ASCII substitutes. There
>> does not appear to be much respect for portability in some of these releases, so even adding encoding support to ASDF system
>> definition files may not help for some of these releases.
>>
>> If you accept that library authors will choose their encoding, even for the system definition files, then the only solution seems to
>> be to add an encoding option to 'find-system and suggest this be used to load the system definition.
>>
>> Regards
>> Douglas Crosher
>>
>>
>> On 04/12/2012 12:43 AM, Faré wrote:
>>>>> No. Library authors have *already* largely adopted UTF-8.
>>>>> See previous analysis by Orivej Desh:
>>>>> "I did a check of quicklisp systems.
>>>>> There are 263 lisp files in 107 systems which assume non-ASCII,
>>>>> and only 31 of them in 20 systems assume non-UTF-8"
>>>>
>>>> I saw those statistics. I have no idea what "assume non-ASCII" means.
>>>> That there are files that have non-ascii characters in them? And that
>>>> only 31 files are not in utf-8 already?
>>>>
>>> Yes, of which only 13 files were actually managed by ASDF as opposed
>>> to examples, one is a MCL-only file that doesn't support UTF-8 anyway,
>>> two have already been fixed, and the rest are only latin1 or such in
>>> comments. Bugs filed for all the other systems (but no response so
>>> far).
>>>
>>> IOW, I believe we're mostly arguing about a non-issue.
>>>
>>> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
>>>
>>> _______________________________________________
>>> asdf-devel mailing list
>>> asdf-devel at common-lisp.net
>>> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/asdf-devel
>>>
--
Pascal Costanza