[asdf-devel] source file encoding

Anton Vodonosov avodonosov at yandex.ru
Sat Jan 29 20:42:00 UTC 2011

29.01.2011, 19:36, "Faré" <fahree at gmail.com>:
> Dear Anton,
> Sorry, I see no trace of *compile-file-external-format* 
> ...
> it seems to rely
> on some local patch to ASDF that was never merged upstream.

You are right! Now I remember: when I worked on that project several years ago,
I simply opened asdf.lisp, found the compile-file call, introduced
*compile-file-external-format* there, and passed the encoding via this variable.
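
A minimal sketch of what such a local patch amounts to, assuming only the
standard :external-format argument of compile-file. The function name
compile-with-encoding is hypothetical and merely stands in for the place in
asdf.lisp where compile-file is called; it is not ASDF's actual code.

```lisp
;; Special variable carrying the encoding down to the COMPILE-FILE call.
;; :DEFAULT leaves the choice to the implementation.
(defvar *compile-file-external-format* :default
  "External format passed to COMPILE-FILE.")

;; Hypothetical stand-in for the spot in asdf.lisp that calls COMPILE-FILE.
(defun compile-with-encoding (path)
  "Compile PATH, reading it with *COMPILE-FILE-EXTERNAL-FORMAT*."
  (compile-file path :external-format *compile-file-external-format*))
```

A caller would then bind the variable around the build, for example
(let ((*compile-file-external-format* :utf-8)) (compile-with-encoding "package.lisp")).
Which values are accepted as an external format differs between
implementations, so :utf-8 here is an assumption rather than a portable name.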

I am not going to write the patch now, because the project I am working on will only
run on my development machine and my server, and I can use an easy workaround;
e.g. most Lisps accept a default encoding as a command-line argument.

My first letter was to make sure I am not overlooking a standard way of
specifying the encoding.

Anyway, thank you for the info, it's interesting to know.

Also, a few notes that may be useful later, when someone eventually implements
the patch.

In 99.9% of cases it is enough to specify the encoding for the whole system,
not for individual files. Only in some extraordinary case would the system author
choose to store source files in different encodings.

> Also it might or might not be a good idea to store the external-format
> in a slot of cl-source-file, and to have a proper :initform in it with
> a valid default value to be used when upgrading ASDF.

How are the slots populated from the defsystem expression?

E.g. if I have 

   (:file "package" :enc :utf-8)

will the :enc :utf-8 be passed as initargs to (make-instance 'cl-source-file)?

Or for 

(defsystem :mysystem
  :version "0.1.0"
  :serial t
  :enc :utf-8

Are these attributes passed to the component instantiation as initargs?
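
To make the question concrete, here is a sketch of the mechanism being asked
about, written against a hypothetical class. my-source-file, parse-file-form
and the :enc initarg are illustrative names only; whether ASDF itself handles
extra options this way is exactly what is being asked.

```lisp
;; Hypothetical stand-in for cl-source-file; NOT ASDF's actual class.
(defclass my-source-file ()
  ((name :initarg :name :reader source-name)
   ;; The :initform supplies a default when the form gives no :enc.
   (enc  :initarg :enc  :initform :utf-8 :reader source-enc)))

;; Hypothetical parser: everything after the name is handed to
;; MAKE-INSTANCE unchanged, so :enc must be a declared initarg.
(defun parse-file-form (form)
  "Turn a form like (:file \"package\" :enc :utf-8) into an instance."
  (destructuring-bind (kind name &rest initargs) form
    (declare (ignore kind))
    (apply #'make-instance 'my-source-file :name name initargs)))
```

With this arrangement, (parse-file-form '(:file "package" :enc :cp1251))
yields an object whose enc slot holds :cp1251, and a form without :enc falls
back to the :initform.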

> The problem for you will be to reasonably support 11 implementations
> existing implementations or so.

Actually, it is not a big problem. We would just create a mapping from the encoding
specifications allowed in .asd files to the encoding specifications of the underlying compiler:


(defun enc (enc)
  (case enc
    ((:utf8 :utf-8)     #+clisp 'charset:utf-8  #+sbcl :utf8   #+ccl :utf-8 ....)
    ((:cp1251 :cp-1251) #+clisp 'charset:cp1251 #+sbcl :cp1251 #+ccl :cp-1251)
    ...))

Would you accept a patch that supports only the 7-10 most important encodings (all the
Unicode encodings plus several of the most common single-byte encodings)?

29.01.2011, 20:15, "Cyrus Harmon" <ch-lisp at bobobeach.com>:
> asdf:*load-external-format* perhaps?

Does the problem with national characters in .asd files really exist? Do you use non-ASCII
characters in .asd files?

asdf:*load-external-format* would be more flexible than a hard-coded encoding, but it 
still doesn't solve the problem you mentioned: handling several .asd files 
with different encodings.

If we start making improvements, IMHO enforcing UTF-8 is a good start and should be
enough (option 4 listed by Faré).

If more is needed, a complete solution allowing a per-.asd encoding specification is better.
We need to choose a good notation that allows a reasonably simple implementation.

It might be either an Emacs comment on the first line:
 ;;; -*- coding: utf-8; -*-

Or a special Lisp form:
   (asdf:asd-file-encoding :utf-8)

But interpreting that form would require switching the encoding of the Lisp reader's
stream, which I believe will be problematic on some Lisps. It would therefore require
feeding the reader from our own custom input stream implementation, like flexi-streams.
And even that would not be good enough, because only ASDF would create that special
stream for .asd files; when you evaluate the form from the REPL/SLIME, its meaning
is unclear.
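
For comparison, the Emacs-comment option needs no reader tricks, since the
first line can be inspected as text before the file is opened with the right
encoding. A rough sketch follows; coding-from-first-line is a hypothetical
helper, it only looks for a literal "coding:" key, and it assumes the first
line has already been read as a string.

```lisp
;; Hypothetical helper: extract the value of "coding:" from one line.
(defun coding-from-first-line (line)
  "Return the coding named in LINE as a keyword, or NIL if there is none."
  (let ((key (search "coding:" line :test #'char-equal)))
    (when key
      (let* (;; Skip spaces after the colon.
             (start (position-if (lambda (c) (char/= c #\Space)) line
                                 :start (+ key (length "coding:"))))
             ;; The value ends at the next semicolon or space, if any.
             (end (and start
                       (or (position-if (lambda (c) (find c "; ")) line
                                        :start start)
                           (length line)))))
        (when (and start (< start end))
          (intern (string-upcase (subseq line start end)) :keyword))))))
```

The resulting keyword would still have to be translated to whatever the
underlying implementation accepts as an external format.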

Another alternative is a naming convention for .asd files: mysystem.utf-8.asd.
It's simple to implement and, after some thinking, it seems better than the
two suggestions above.
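
As an illustration of how little code the naming convention needs, here is a
sketch that pulls the encoding out of a file name. The function
asd-name-encoding, the variable *known-encodings* and its contents are
hypothetical; the list only exists so that a dotted system name such as
my.system.asd is not mistaken for an encoding.

```lisp
;; Hypothetical list of encoding names recognised in file names.
(defparameter *known-encodings* '(:utf-8 :latin-1 :cp1251))

;; Hypothetical helper working on a plain file-name string.
(defun asd-name-encoding (file-name &optional (default :utf-8))
  "Return the encoding embedded in FILE-NAME (e.g. \"mysystem.utf-8.asd\"),
or DEFAULT when the name carries no known encoding."
  (let* ((suffix ".asd")
         (end (- (length file-name) (length suffix))))
    (if (and (plusp end)
             (string-equal suffix file-name :start2 end))
        ;; Look for the last dot before the .asd suffix.
        (let* ((dot (position #\. file-name :end end :from-end t))
               (enc (and dot
                         (intern (string-upcase
                                  (subseq file-name (1+ dot) end))
                                 :keyword))))
          (if (member enc *known-encodings*) enc default))
        default)))
```

For example, (asd-name-encoding "mysystem.utf-8.asd") gives :UTF-8, while a
plain mysystem.asd falls back to the default.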

But again, we should decide if the problem really exists and avoid solving problems
that we don't have. I personally never use national characters in .asd files.

Best regards,
- Anton
