[asdf-devel] source file encoding
avodonosov at yandex.ru
Sat Jan 29 20:42:00 UTC 2011
29.01.2011, 19:36, "Faré" <fahree at gmail.com>:
> Dear Anton,
> Sorry, I see no trace of *compile-file-external-format*
> it seems to rely
> on some local patch to ASDF that was never merged upstream.
You are right! Now I remember: when I worked on that project several years ago,
I just opened asdf.lisp, found the compile-file call, introduced
*compile-file-external-format* there, and then passed the encoding via this variable.
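For readers who want to picture that local patch, here is a minimal sketch of the idea. Only the variable name comes from this message; the wrapper function and its name are hypothetical, and the real change was made directly at the compile-file call inside asdf.lisp.

```lisp
;; Hypothetical reconstruction of the local patch described above.
;; Only the variable name is taken from the message.
(defvar *compile-file-external-format* :default
  "External format handed to COMPILE-FILE. :DEFAULT leaves the choice
to the implementation; other values are implementation-specific.")

(defun compile-file-with-encoding (source &rest args)
  "Call COMPILE-FILE on SOURCE, adding the external format taken from
*COMPILE-FILE-EXTERNAL-FORMAT*. Remaining ARGS are passed through."
  (apply #'compile-file source
         :external-format *compile-file-external-format*
         args))

;; Usage sketch: bind the variable around the build of one system.
;; (let ((*compile-file-external-format* :utf-8))
;;   (compile-file-with-encoding "package.lisp"))
```

The :external-format keyword is a standard argument of compile-file, but the set of accepted values differs between implementations.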
I am not undertaking the patch now, because the project I am working on will only be
run on my development machine and my server, and I can use an easy workaround;
e.g. most Lisps accept a default encoding as a command-line argument.
My first message was to make sure I am not overlooking a standard way of
specifying the encoding.
Anyway, thank you for the info, it's interesting to know.
Also, a few notes that may be useful later, when someone implements this:
In 99.9% of cases it is enough to specify the encoding for the whole system,
not for separate files. Only in some extraordinary case would the system author
choose to store source files in different encodings.
> Also it might or might not be a good idea to store the external-format
> in a slot of cl-source-file, and to have a proper :initform in it with
> a valid default value to be used when upgrading ASDF.
How are the slots populated from the defsystem expression?
E.g. if I have
(:file "package" :enc :utf-8)
will :enc :utf-8 be passed as initargs to (make-instance 'cl-source-file)?
That is, are such attributes passed to the component instantiation as initargs?
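To make the question concrete, here is a sketch of what "passed as initargs" would mean. The class my-source-file and the function make-component-from-spec are hypothetical stand-ins, not ASDF's cl-source-file or its parser; whether ASDF really forwards unknown keys this way is exactly the open question.

```lisp
;; Stand-in class for illustration only; NOT ASDF's cl-source-file.
(defclass my-source-file ()
  ((name :initarg :name :reader source-name)
   ;; The :initform supplies a default when the option is absent.
   (enc  :initarg :enc :initform :default :reader source-encoding)))

(defun make-component-from-spec (spec)
  "SPEC looks like (:file \"package\" :enc :utf-8).
Everything after the name is forwarded to MAKE-INSTANCE as initargs."
  (destructuring-bind (type name &rest options) spec
    (declare (ignore type))
    (apply #'make-instance 'my-source-file :name name options)))

;; Usage:
;; (source-encoding (make-component-from-spec '(:file "package" :enc :utf-8)))
```

With this design an unknown option signals an error from make-instance, which is one way typos in a system definition would get caught.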
> The problem for you will be to reasonably support the 11 existing
> implementations or so.
Actually, not a big problem. We will just create a mapping from the encoding
specifications allowed in .asd files to the encoding specification of the underlying compiler.
(defun enc (enc)
  ;; map the names allowed in .asd files to the native encoding designator
  (case enc
    ((:utf8 :utf-8) #+:clisp 'charset:utf-8 #+:sbcl :utf8 #+ccl :utf-8 ....)
    ((:cp1251 :cp-1251) #+:clisp 'charset:cp1251 #+:sbcl :cp1251 #+ccl :cp-1251) ...))
Would you accept a patch that supports only the 7-10 most important encodings (all the
Unicode encodings plus several of the most frequent single-byte encodings)?
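A slightly fuller sketch of such a mapping, covering only the two encodings named above. The alias table, the function names, and the fallback branch for other implementations are assumptions made for illustration; the per-implementation spellings for CLISP, SBCL and CCL are the ones given in this message and should be checked against each implementation.

```lisp
(defparameter *encoding-aliases*
  '((:utf-8  :utf8 :utf-8)
    (:cp1251 :cp1251 :cp-1251))
  "Each entry is a canonical name followed by the spellings accepted
in .asd files. Hypothetical table for illustration.")

(defun canonical-encoding (name)
  "Return the canonical keyword for NAME, or signal an error."
  (or (car (find name *encoding-aliases* :key #'cdr :test #'member))
      (error "Unsupported encoding: ~S" name)))

(defun native-external-format (name)
  "Translate NAME into the spelling used by the running implementation.
The last branch of each clause is an assumed fallback."
  (ecase (canonical-encoding name)
    (:utf-8  #+clisp 'charset:utf-8  #+sbcl :utf8   #+ccl :utf-8
             #-(or clisp sbcl ccl) :utf-8)
    (:cp1251 #+clisp 'charset:cp1251 #+sbcl :cp1251 #+ccl :cp-1251
             #-(or clisp sbcl ccl) :cp1251)))
```

Splitting alias resolution from the per-implementation table keeps the list of accepted names in one place, so adding an encoding means adding one alias entry and one ecase clause.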
29.01.2011, 20:15, "Cyrus Harmon" <ch-lisp at bobobeach.com>:
> asdf:*load-external-format* perhaps?
Does the problem with national characters in .asd files really exist? Do you use non-ASCII
characters in .asd files?
asdf:*load-external-format* would be more flexible than a hard-coded encoding, but it
still doesn't solve the problem you mentioned: handling several .asd files
with different encodings.
If we start making improvements, IMHO enforcing UTF-8 is a good start and should be
enough (option 4 listed by Faré).
If more is needed, a complete solution allowing a per-.asd encoding specification is better.
We need to choose a good notation that allows a reasonably simple implementation.
It might be either an Emacs-style comment on the first line
;;; -*- coding: utf-8; -*-
Or a special Lisp form:
But interpreting that form would require switching the encoding of the Lisp reader's stream,
which I believe will be problematic on some Lisps. It would therefore require feeding
the reader from our own input stream implementation, such as flexi-streams. Even
then it would not be good enough: only ASDF would create that special
stream for .asd files, so when you evaluate the file from the REPL/SLIME, the meaning
of that expression is unclear.
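For the first-line comment variant, the stream-switching problem might be avoided by reading the first line as raw bytes and only then opening the file with the detected encoding. The following is a rough sketch under the assumption that the first line is plain ASCII; both function names are hypothetical, and the returned keyword would still need translating to an implementation-specific external format.

```lisp
(defun coding-cookie (line)
  "Return the value following `coding:' in LINE as a keyword, or NIL."
  (let ((start (search "coding:" line)))
    (when start
      (let* ((from (position-if-not (lambda (c) (char= c #\Space))
                                    line
                                    :start (+ start (length "coding:"))))
             (to (and from
                      (or (position-if (lambda (c) (find c " ;"))
                                       line :start from)
                          (length line)))))
        (when (and from (< from to))
          (intern (string-upcase (subseq line from to)) :keyword))))))

(defun first-line-as-ascii (path)
  "Read bytes of PATH up to the first newline and turn each byte into
a character via CODE-CHAR. Assumes the first line is ASCII."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (with-output-to-string (out)
      (loop for byte = (read-byte in nil nil)
            while (and byte (/= byte 10))
            do (write-char (code-char byte) out)))))

;; Usage sketch:
;; (coding-cookie (first-line-as-ascii "mysystem.asd"))
```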
Another alternative is a naming convention for .asd files: mysystem.utf-8.asd.
It's simple to implement and, after some thought, seems better than the
two suggestions above.
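A sketch of how the naming convention could be decoded. The function name is hypothetical, and the code assumes the pathname's name component is "mysystem.utf-8" with type "asd", which is how common implementations split such a file name but is not guaranteed by the standard.

```lisp
(defun encoding-from-asd-name (path)
  "Return the encoding embedded in an .asd file name as a keyword,
e.g. :UTF-8 for mysystem.utf-8.asd, or NIL when no suffix is present."
  (let* ((name (pathname-name path))
         ;; Look for the last dot inside the name component.
         (dot (and (stringp name) (position #\. name :from-end t))))
    (when (and dot (< (1+ dot) (length name)))
      (intern (string-upcase (subseq name (1+ dot))) :keyword))))

;; Usage sketch:
;; (encoding-from-asd-name
;;   (make-pathname :name "mysystem.utf-8" :type "asd"))
```

One caveat with this design: a system whose own name contains a dot would be misread as carrying an encoding suffix, so the result would have to be validated against the list of known encodings.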
But again, we should decide if the problem really exists and avoid solving problems
that we don't have. I personally never use national characters in .asd files.