[asdf-devel] source file encoding

Faré fahree at gmail.com
Sat Mar 31 17:09:50 UTC 2012

On Sat, Mar 31, 2012 at 10:38, Orivej Desh <c at orivej.org> wrote:
> Douglas' main point may be transformed as follows, which is a legitimate
> question: if the task is to extend the supported character set to UTF-8,
> is it not solved by accepting an :encoding option and defining a default
> #'encoding-external-format that understands (nothing but) :utf-8?
Yes, that's what we have now with 2.20.x.

> Given that, should the default be UTF-8 rather than :default?  Answering
> `yes' might cause more or less trouble to some people, answering `no'
> will provide for a gradual transition.  I think we should ask Zach Beane
> about issues with unspecified and discerned external formats.
Source code that uses more than the ASCII character set
wasn't portably supported previously, but in practice,
utf-8 worked everywhere and was backhandedly enforced by
the many people using SBCL with utf-8 and sending reports to authors
so that they would make their packages compatible.
This change therefore only formalizes a de facto standard,
and allows for extension and customization
where no such thing was previously possible.

In the future, maybe we should distinguish between :default
that is :utf-8 where supported and falls back where not supported,
and :utf-8 that means "I really really want utf-8",
e.g. for lambda-reader? I think that's better solved
by using :utf-8 in all cases and #-asdf-unicode (error ...)
in the source code when it's not available.
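
A minimal sketch of that last option, as it might look in a .asd file.
The system name, the component name, and the error message are made up
for illustration; only the :encoding option and the #-asdf-unicode
guard come from this discussion.

```lisp
;;; hypothetical-system.asd -- illustrative only; all names are invented.

;; Fail early on setups where ASDF does not advertise Unicode support,
;; instead of silently reading the sources in some other encoding.
#-asdf-unicode
(error "hypothetical-system has UTF-8 sources and needs :asdf-unicode.")

(asdf:defsystem :hypothetical-system
  ;; Declare the encoding of the source files explicitly.
  :encoding :utf-8
  :components ((:file "lambda-reader")))
```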

> Another issue which somewhat bothers me: is such kind of a hook right?
> It seems to be inherently unmanaged (just like *macroexpand-hook*),
> i.e. setting it in a system affects future loaded systems, unless it is
> set lexically in around-compile.  But then, it might as well be another
> ASDF option (say, either a package designator which exports
> #'encoding-external-format, or a list of a package and a keyworded
> symbol designating desired function).
Good suggestion: I've refactored the external-format extraction
to happen inside the around-compile hook.
But yes, the hook is intended as a global hook to be used once,
by a global asdf extension called asdf-encodings, to be written.
The reason to make it an extension rather than put it all in asdf
is that I expect external-format support for all encodings
on all implementations to be long and painful to write;
I'd rather that be done outside of ASDF,
because it's a lot of code I'd rather not put in ASDF,
the development cycles are different, and
it shouldn't matter for the vast majority of us
who'll use the default settings (i.e. UTF-8).

> (By the way, I wouldn't call a
> hooked function a hook, so that #'default-encoding-external-format-hook
> would be #'default-encoding-external-format.)
Good suggestion. Renamed in 2.20.7.

> The last issue relates to the strictness of the
> default-encoding-external-format.  Probably it's all right, but then
> wouldn't it be good to define a permissive alternative which behaves
> like in 2.20.2?
I'm not sure who's to gain what with that. If you're writing a .asd,
you know what charset your code is using. If it's UTF-8 indeed,
why would you want to reduce the number of cases
in which your charset is correctly recognized?
And if it's not UTF-8, you're probably having trouble with "bug" reports
from all those SBCL + utf-8 users around today.
Or maybe you don't have end-users, and want to force your local encoding;
if such cases exist around the world,
we might need a solution quicker than expected;
and so you've convinced me to add support for an explicit :default
as a valid encoding for backwards-compatibility purposes only.
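
To make the strictness concrete, here is a rough sketch of a strict
mapping that accepts :utf-8 plus an explicit :default and rejects
everything else. The function name is hypothetical and this is not
ASDF's actual code; it also assumes the implementation accepts :utf-8
as an external-format designator, which is not guaranteed everywhere.

```lisp
;; Hypothetical helper, not part of ASDF: maps an encoding keyword
;; to an external format, strictly.
(defun sketch-encoding-external-format (encoding)
  "Return an external format for ENCODING, or signal an error."
  (ecase encoding
    ;; Assumption: :utf-8 is a valid external format on this implementation.
    (:utf-8 :utf-8)
    ;; Explicit opt-in to the implementation's own default,
    ;; kept for backwards compatibility.
    (:default :default)))
```

Any other keyword falls through ECASE and signals an error, so an
unsupported encoding is reported rather than guessed at.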

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
Apparently a government can prevent itself and its successors indefinitely
from doing bad things, just by writing a note to itself that says
"don't do bad things." — Mencius Moldbug on constitutions
