[asdf-devel] source file encoding

Douglas Crosher dtc-asdf at scieneer.com
Sun Apr 15 12:11:46 UTC 2012


A draft version adding support for reading the encoding from the file options header is available at:
http://www.scieneer.com/files/asdf-encoding-file-option.lisp

It is biased towards UTF-8, which is used when no other encoding is detected or declared in the file options and the file is
valid UTF-8 containing UTF-8-specific (multi-byte) sequences.  I don't expect too many false positives from the UTF-8 detector.
I would not suggest trying to detect any further encodings.
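
As a rough illustration of the detection rule described above (this is not the code in the draft, which is Common Lisp), here
is a minimal sketch in Python.  The function name looks_like_utf8 is hypothetical, and reading "UTF-8 specific sequences" as
"at least one byte outside the ASCII range" is an assumption.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 and has at least one multi-byte sequence.

    Pure-ASCII input returns False: it is valid UTF-8, but it carries no
    evidence that distinguishes UTF-8 from any other ASCII-compatible encoding.
    """
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    # Every byte of a multi-byte UTF-8 sequence has its high bit set.
    return any(byte >= 0x80 for byte in data)
```

Under this sketch a pure-ASCII file is left to the :default external-format, which is harmless for ASCII-compatible defaults,
and a Latin-1 file with accented characters is rejected because its high bytes rarely form valid UTF-8 sequences.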

For UTF-8 files, no action needs to be taken when upgrading ASDF: ASDF will reliably detect them and load and compile them as
UTF-8 rather than using the :default CL external-format.  Files in other encodings that are not detected will load and compile
using the :default external-format, as is currently the case; library authors can add file options headers so that such files
load and compile reliably across systems with a range of default external-formats.  There would not appear to be any migration
loss or inconvenience for anyone, except where there are incorrect encoding file options that need to be fixed.
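
To make the file options header concrete, here is a minimal sketch in Python of reading an encoding declaration from the first
line of a file.  It is not the code in the draft; the function name declared_encoding is hypothetical, and the Emacs-style
"-*- key: value; ... -*-" syntax and the accepted keys (coding, encoding, external-format) are assumptions taken from the
headers mentioned in the quoted discussion below.

```python
import re

# Option names assumed to declare an encoding (Emacs uses 'coding';
# 'encoding' and 'external-format' are mentioned for LispWorks).
ENCODING_KEYS = ("coding", "encoding", "external-format")


def declared_encoding(first_line: bytes):
    """Return the encoding named in a '-*- ... -*-' header line, or None."""
    # Latin-1 maps every byte to a character, so decoding never fails;
    # the header itself is expected to be plain ASCII.
    text = first_line.decode("latin-1")
    match = re.search(r"-\*-(.*?)-\*-", text)
    if match is None:
        return None
    for option in match.group(1).split(";"):
        key, sep, value = option.partition(":")
        if sep and key.strip().lower() in ENCODING_KEYS:
            return value.strip().lower() or None
    return None
```

With this sketch, a source file whose first line is ";;; -*- Mode: Lisp; coding: latin-1 -*-" yields "latin-1", while a file
without a header yields None and falls through to UTF-8 detection or the :default external-format.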

For 8 bit CL implementations, the encoding detection and file options reading could probably just be disabled, or perhaps it could
remain and issue warnings for clearly incompatible encodings.

This may offer a solution to the problem of defining the system definition file encoding, is convenient for UTF-8 users, and
provides a reliable mechanism for writing portable libraries in other encodings.

An encoding file option could also be handy for other tools, such as editors, web servers, tools for recoding lisp source files,
etc.  I think it warrants some consideration.

You do a lot for ASDF and deserve thanks.

Regards
Douglas Crosher

On 04/15/2012 11:00 AM, Faré wrote:
> On Fri, Apr 13, 2012 at 02:44, Douglas Crosher <dtc-asdf at scieneer.com> wrote:
>> The only practical solution seems to be to detect the encoding from the file.
>> I could write portable code for ASDF to read an ASCII header line
>> and look for encoding declarations, and handle a few common headers
>> (emacs has 'coding', LispWorks seems to use 'encoding' or 'external-format').
>> Auto-detection could handle some of the common codings,
>> but could be a big chunk of code.
>> The quicklisp project may be prepared to patch in headers
>> to system definition file using non-ASCII encodings,
>> and this could be largely automated.
>>
> Yes, this is a valid approach, though it is somewhat heavy in coding
> and will grow ASDF by a few hundred more lines of code.
> Don't forget to support the way Emacs detects encoding, etc.
> It is certainly more than I am willing to code,
> and making the semantics of loading more complex than I am comfortable with.
> Before you code it yourself,
> I'd like to hear from other users here what they think.
> 
> An additional small thing I don't like about the approach is that
> you have to open a file twice, once to detect encoding,
> the other time to load or compile-file it, which is not atomic
> and can be slightly nasty (if e.g. the file is actually a URL or
> mounted on a weird filesystem or whatnot).
> But that's secondary.
>
> Also, I'm not sure how big the market for such support is.
> There again, I'd like to hear from potential users.
> 
> 
>> If infrastructure is added for the system definition files
>> then it would be only a small step to also use this for the lisp source files.
> Indeed.
> 
> Alternatively, this could be an :automatic mode added to asdf-encodings,
> rather than a part of ASDF itself, at which point it would be available
> to source files, but not system files.
> 
>> LispWorks appears to be able to automatically detect file coding, and
>> it would be interesting to know whether the ASDF encoding problems
>> are not an issue for LispWorks users.  If so, this would appear to add
>> more support to making the default :default.
>> http://www.lispworks.com/documentation/lw61/LW/html/lw-659.htm#39723
>>
> If you want your code to be portable, you can't rely on users using LispWorks.
> Deterministic well-defined semantics require that the meaning of your code
> should not depend on magic that may or may not happen.
> 
> 
> PS: This long discussion on a relatively minor topic reminds me of
> Parkinson's Law of Triviality. What color should the bikeshed be painted?
>
> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
> Classical Liberalism: the only truly subversive ideology.
> 