[asdf-devel] source file encoding

Faré fahree at gmail.com
Mon Mar 26 04:21:09 UTC 2012

On Sun, Mar 25, 2012 at 22:47, Orivej Desh <c at orivej.org> wrote:
>> Is there some reason why we must put the external-format into the
>> property list instead of just giving it a slot in the component class
>> definition?
> It was my ignorance.
I thought it was to allow a backwards-compatible syntax of
  (:file "foo" :properties (:encoding :latin1))
If instead we want to encourage people to use
  (:file "foo" :encoding :latin1)
and force users to upgrade ASDF,
then it makes no sense using properties instead of a slot.
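
To make the difference between the two spellings concrete, here is a minimal sketch of a reader that accepts either one. The helper name component-form-encoding and its :utf-8 fallback are assumptions for illustration only; nothing like it is claimed to exist in ASDF.

```lisp
;; Hypothetical helper, not part of ASDF: extract the encoding from a
;; component form written in either of the two styles discussed above.
(defun component-form-encoding (form &optional (default :utf-8))
  (destructuring-bind (type name &rest options) form
    (declare (ignore type name))
    (or (getf options :encoding)                     ; (:file "foo" :encoding :latin1)
        (getf (getf options :properties) :encoding)  ; (:file "foo" :properties (:encoding :latin1))
        default)))                                   ; nothing specified

(component-form-encoding '(:file "foo" :encoding :latin1))
(component-form-encoding '(:file "foo" :properties (:encoding :latin1)))
(component-form-encoding '(:file "foo"))
```

With a slot, only the first spelling would be needed and the nested property lookup goes away.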

Are we set on having this new encoding specification
require ASDF 2.21 to work? If so, we should ask people
not to start using it in libraries
until a few months from now, when ASDF 2.21 is more widely available
(i.e. has made it to Quicklisp, SBCL, etc.).

Note that in the end, I prefer :encoding if we're going to add
an implicit translation layer between that and the actual :external-format
option of CL:LOAD, so the user understands there's a difference.
If we're NOT going to add a translation layer, and instead
require users to write #.(foo:encoding-to-external-format :latin1),
then we might as well call it :external-format.
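
As a rough sketch of what such a translation layer could look like: a function from a portable keyword to whatever the implementation's LOAD accepts. The function name, the set of supported keywords, and the choice to signal an error on anything else are all assumptions made for this example. It relies on CLISP exposing its encodings as variables in the CHARSET package, and assumes the other implementations accept the :utf-8 keyword directly.

```lisp
;; Hypothetical translation function; not ASDF's actual definition.
(defun encoding-external-format (encoding)
  "Map a portable encoding keyword to an implementation external format."
  (ecase encoding
    ;; :default is the standard default for the :external-format argument.
    (:default :default)
    ;; CLISP wants an encoding object; elsewhere the keyword is assumed to do.
    (:utf-8 #+clisp charset:utf-8
            #-clisp :utf-8)))
```

The user writes :encoding :utf-8 everywhere, and only this one function knows about implementation differences.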

>> Also, what sort of an entity are the external format values?  Is it
>> always a keyword symbol?  Can we say that it should always be a keyword
>> and that we will massage it to something else, if necessary, for the
>> benefit of the implementation when reading a file?
> Maybe yes.  Consider that e.g. some implementations accept more options
> (mostly to control line terminators) — CLISP as instances of
> ext:encoding, LispWorks as lists like '(:latin-1 :eol-style :lf); but
> then CLISP explicitly says that line terminators don't matter during
> input.
Oh yeah, I had tried to blank out on line terminators.
Hopefully, they won't matter much indeed, since
ASDF only cares about input encoding, and
line terminators are an output option.
That's one more reason to call our thing :encoding instead of :external-format.

>> In that case we could have an accessor that will do the
>> implementation-specific massaging for us (e.g., we could store :utf-8,
>> but on clisp we would present charset:utf-8 when reading...).  That
>> seems somehow tidier to me, rather than changing the value behind the
>> programmer's back as we do here.  OTOH, we do quietly change symbols to
>> strings, so maybe I'm just talking through my hat.
> I'd appreciate it if you explained in more detail what happens when and
> how.  Is it like in the attached patch, but with logic moved from the
> setf'er to the accessor?
I suppose we'll have something like this:

(defun trivial-encoding-to-external-format-hook (encoding)
  (declare (ignore encoding))
  :default) ; assumed body: always use the implementation's default
(defvar *encoding-to-external-format-hook*
  'trivial-encoding-to-external-format-hook)

  ... (load ... :external-format (funcall
*encoding-to-external-format-hook* encoding) ...) ...

Then you'll have to :defsystem-depends-on (:asdf-encodings) or some such
to be able to use different encodings.
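
Here is a self-contained sketch of how the hook and an extension could fit together. Every name in it (*my-encoding-hook*, load-with-encoding, the replacement lambda) is made up for illustration, and the extension shown is only a stand-in for what something like asdf-encodings might install.

```lisp
;; Illustrative names only; these are not ASDF's actual definitions.
(defvar *my-encoding-hook*
  (lambda (encoding) (declare (ignore encoding)) :default)
  "Function of one argument, an encoding keyword, returning an external format.")

(defun load-with-encoding (pathname encoding)
  "LOAD PATHNAME, translating ENCODING through the current hook."
  (load pathname :external-format (funcall *my-encoding-hook* encoding)))

;; An extension would replace the hook with something smarter, e.g.:
(defun install-smarter-hook ()
  (setf *my-encoding-hook*
        (lambda (encoding)
          (case encoding
            (:utf-8 :utf-8)      ; assumed to be understood by the implementation
            (otherwise :default)))))
```

Because the core only ever calls the hook, it stays small, and anyone who needs more encodings pays for them by depending on the extension.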

> If one goes beyond ASCII, saves not in UTF-8 (as expected on MS Windows,
> but even on Linux LispWorks Personal IDE tried to save a file in
> Latin-1), manages local projects with ASDF and upgrades ASDF, he will be
> affected.
Ouch. I'd say that in this case, LW personal has obsolete default settings.
I still think that ASDF should assume utf-8 by default.

I've committed something along those lines as 2.20.3.
Only minimal testing for no obvious breakage (make test, using sbcl).

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
Time and money spent in helping men to do more for themselves is far better
than mere giving. — Henry Ford
