[asdf-devel] source file encoding

Faré fahree at gmail.com
Wed Mar 21 01:36:33 UTC 2012


On Tue, Mar 20, 2012 at 20:04, Orivej Desh <orivej at gmx.fr> wrote:
> Now, the topic of supporting specifying source encoding is a year away.
> Should I have not replied to it and rather started a new one?
>
I think you did well to reply.

>> You need to send me a patch to ASDF that modifies
>> (defmethod perform ((operation compile-op) (c cl-source-file))
>>    ...)
>> and
>> (defmethod perform ((operation load-source-op) (c cl-source-file))
>>    ...)
>> to do something about external-format.
>
> I propose the attached file.
>
Thanks.

I know that Stelian Ionescu was also working on it,
so I'm giving him an opportunity to chime in before I merge that.

Also, I agree with Stelian that it's better to standardize
on one default encoding for all files to be loaded by ASDF.
If we do, then there's a chance that things will work
without user configuration. If we don't, we're pushing configuration
onto the user, and guaranteeing misery for newbies,
and hard-to-debug situations even for seasoned users.
These days, UTF-8 looks like the obvious encoding to standardize on.
On implementations that don't support UTF-8, we should fall back to
some 8-bit-clean encoding that will at least accept UTF-8 encoded comments
and has a chance of doing the right thing with strings and symbols.

Therefore, we'd use something like this:

(defparameter *utf-8-external-format*
  #+sbcl :utf-8
  ...
  #-(or sbcl ...) :default
  "external-format argument to pass to CL:OPEN so that it accepts UTF-8
encoded source code")
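
To make the idea concrete, here is a minimal sketch of how such a parameter
could be passed along when compiling or loading a source file. The list of
implementations is only an illustrative subset (it does not fill in the
elided entries above), and the helper names COMPILE-AND-LOAD-AS-UTF-8 and
LOAD-SOURCE-AS-UTF-8 are hypothetical, not part of ASDF. It relies only on
the standard :EXTERNAL-FORMAT argument of CL:COMPILE-FILE and CL:LOAD.

```lisp
;; Illustrative subset only; other implementations fall through to :DEFAULT.
(defparameter *utf-8-external-format*
  #+(or sbcl ccl) :utf-8
  #+clisp charset:utf-8
  #-(or sbcl ccl clisp) :default
  "External format assumed to make the implementation read UTF-8 source.")

;; Hypothetical helper: roughly what a compile-op could do for one file.
(defun compile-and-load-as-utf-8 (source)
  "Compile SOURCE as UTF-8, load the resulting fasl, and return its pathname."
  (let ((fasl (compile-file source
                            :external-format *utf-8-external-format*)))
    (unless fasl
      (error "Compilation of ~A failed" source))
    (load fasl)
    fasl))

;; Hypothetical helper: roughly what a load-source-op could do for one file.
(defun load-source-as-utf-8 (source)
  "Load SOURCE directly, reading it as UTF-8."
  (load source :external-format *utf-8-external-format*))
```

With a single default like this, a file containing non-ASCII comments or
strings is read the same way regardless of the user's locale settings.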

>> Also it might or might not be a good idea to store the external-format
>> in a slot of cl-source-file, and to have a proper :initform in it with
>> a valid default value to be used when upgrading ASDF.
>
> It stores encoding in a property of the component, the component being a
> system or a source file.  This allows for both per system and per source
> file component encoding, the latter taking precedence, without
> additional effort.  In my implementation default :initform would not
> have helped because #'component-encoding switches between per component
> system encoding and per component encoding based on the former being
> specified or not.  Hence the default (:default) is embedded in
> #'component-encoding.
>
I think you're doing the right thing, except that
(1) we should probably use "external-format" instead of "encoding",
 since that's what the CLHS calls it, and that
(2) the default should be *utf-8-external-format*.
Then there's the whole horror of CR/LF that I'm trying to not think about.
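
The lookup discussed here (per-file setting first, then the enclosing
system, then a global default) can be sketched as follows. This is not the
attached patch: TOY-COMPONENT is a hypothetical stand-in for ASDF's
component class, the :EXTERNAL-FORMAT property name follows suggestion (1),
and the DEFVAR is only a stand-in so that the sketch is self-contained.

```lisp
;; Stand-in default; assumes the implementation accepts :UTF-8.
(defvar *utf-8-external-format* :utf-8)

;; Hypothetical stand-in for an ASDF component (system or source file).
(defstruct toy-component
  name
  parent             ; enclosing system or module, or NIL at the top
  (properties '()))  ; plist of per-component properties

(defun component-external-format (component)
  "Return the external format of COMPONENT: its own setting if present,
else the nearest ancestor's, else *UTF-8-EXTERNAL-FORMAT*."
  (loop for c = component then (toy-component-parent c)
        while c
        do (let ((fmt (getf (toy-component-properties c) :external-format)))
             (when fmt (return fmt)))
        finally (return *utf-8-external-format*)))

;; Usage: file B overrides its system; file A inherits from it.
(defparameter *toy-system*
  (make-toy-component :name "sys"
                      :properties '(:external-format :latin-1)))
(defparameter *toy-file-a*
  (make-toy-component :name "a" :parent *toy-system*))
(defparameter *toy-file-b*
  (make-toy-component :name "b" :parent *toy-system*
                      :properties '(:external-format :utf-16)))
```

Because the default lives in the lookup function rather than in a slot
:initform, a component with no setting still inherits from its system.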

>> The problem for you will be to reasonably support 11 existing
>> implementations or so.
>
> Since making single specification portable requires comparing all
> external formats of all supported implementations, I think it is
> reasonable to leave it to the author of a system definition to research
> by which name his preferred encoding is accessible in different
> implementations he wants to support, and to specify appropriate
> read-time conditionals.
>
I think it's OK to require authors who want non-default settings
to do their research on how to do it on each and every platform
they want to support (or depend on a library that does it for them).
But I think it's a mistake to fail to provide a sensible default,
which in effect forces EVERYONE to do the research
or face crazy error situations for some of their users.
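
For comparison, this is roughly what the per-author research looks like
when someone wants a non-default encoding such as Latin-1. The
implementation-specific names below are given as I understand them and
should be checked against each implementation's manual; the variable and
function names are hypothetical.

```lisp
;; Hypothetical author-side setting: one encoding, spelled the way each
;; listed implementation is assumed to name it.
(defparameter *my-system-external-format*
  #+sbcl :latin-1
  #+ccl :iso-8859-1
  #+clisp charset:iso-8859-1
  #-(or sbcl ccl clisp) :default
  "External format for this system's Latin-1 encoded source files.")

;; Hypothetical helper showing the value in use with CL:OPEN.
(defun read-first-line (file)
  "Return the first line of FILE, decoded with *MY-SYSTEM-EXTERNAL-FORMAT*."
  (with-open-file (in file :external-format *my-system-external-format*)
    (read-line in nil nil)))
```

Every implementation missing from the conditionals silently falls back to
:DEFAULT, which is exactly the situation a sensible shared default avoids.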

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
The Constitution may not be perfect, but it's a lot better than what we've got!



