[asdf-devel] source file encoding

Douglas Crosher dtc-asdf at scieneer.com
Mon Apr 9 15:37:13 UTC 2012


On 04/09/2012 10:37 AM, Faré wrote:
> On Sun, Apr 8, 2012 at 15:28, Nikodemus Siivola
> <nikodemus at random-state.net> wrote:
>> On 8 April 2012 17:36, Faré <fahree at gmail.com> wrote:
>>
>>> I think requiring a few marginal hackers doing weird things
>>> to specify :encoding :default is a small price to pay for everyone to be able to specify
>>
>> I disagree. Consider this:
>>
>> X has a system that used to be in, say, LATIN-9. He uses latin-9 at
>> home, and everything works fine. His users either use it as well, or
>> at least another single-byte encoding.
>>
>> ASDF is updated, and X's user reports breakage. Everything works fine
>> for X, because he didn't update ASDF yet. So he updates ASDF, and X
>> updates his system to specify :LATIN-9 (or :DEFAULT, or whatever).
>>
>> Now another of his users reports breakage, because /they/ didn't
>> update ASDF yet -- and their ASDF doesn't support :ENCODING, so things
>> break. They update ASDF, which in turn breaks another :LATIN-N system
>> they were using.
>>
>> The potential cost is non-trivial, and I really don't pretend to know
>> e.g. how many Japanese hackers use non-UTF encodings in their source.
>>
>> IMO encouraging people to add :encoding :utf-8 is much saner.
>>
> I agree that transition costs must be considered.
> 
> Let's examine the two scenarios,
> where the default is :default vs where the default is :utf-8.
> 
> In both cases, crucial points follow:
> (a) currently :encoding is NOT supported by ASDF.
> (b) therefore, whenever anyone modifies his defsystem to use :encoding,
>  his system will NOT be backward compatible anymore.
> (c) we want to make most code as backward-compatible as possible.
> (d) the application programmer controls what version of ASDF is installed,
>  the library developer doesn't.
> 
> If the default is :utf-8 (my recommendation), then
> * A few programmers of non-UTF-8 applications may hit a snag upgrading ASDF;
> * then they can either use asdf-encodings or use :encoding :default.
> * Their code is then not compatible with older ASDFs anymore, but
> * as application programmers, they fully control which ASDF they use, and
> * even if they need to support old CL implementations,
>  ASDF still supports them (the exception being GCL, that looks quite dead).
> * Meanwhile, library authors can already start migrating to UTF-8,
>  and everyone who upgrades ASDF can reliably enjoy now
>  the benefits of non-ASCII, while preserving backwards-compatibility.

Won't library authors need to wait until their user base has upgraded ASDF
before they can start migrating to UTF-8?  The external-format support helps
write portable libraries using non-ASCII characters and is only available
after an upgrade.

I do see a concern that if developers are required to change their definitions
to add :encoding :default, then they will also be forced to make sure their user
base has upgraded now.  Further, if their users do upgrade ASDF first, then it
breaks again - there is no migration path for them.
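
As a rough sketch of the change in question (the system and file names are
hypothetical, and this assumes an ASDF recent enough to accept :encoding),
a legacy single-byte system would have to be edited along these lines:

```lisp
;; Hypothetical legacy system whose sources are not UTF-8.
;; :encoding :default asks ASDF to keep using the implementation's
;; default external format, i.e. the behaviour before :encoding existed.
;; Per point (b) quoted above, an ASDF without :encoding support is not
;; expected to accept this definition.
(asdf:defsystem "legacy-latin9-app"
  :encoding :default
  :components ((:file "package")
               (:file "main" :depends-on ("package"))))
```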

> If the default is :default (your recommendation), then
> * library developers can't ensure their code use a predictable encoding;
> * this makes any attempt to actually use non-ASCII characters unreliable.
> * Sure, they might be tempted to use :encoding :utf-8, but then
>  their libraries will be gratuitously incompatible
>  with anyone who hasn't upgraded his ASDF, which is a pain to users.

Perhaps the difference is that portable UTF-8 source is new source and requires
an upgrade of ASDF anyway, whereas making the default :utf-8 forces :encoding :default
on current users and affects legacy code that is already written without a migration
path.

> * thus, library developers can do nothing but wait for EVERYONE
>  to be using a recent ASDF before they can do anything.

Wouldn't this be the reality for portable libraries no matter which default is chosen?

> * Therefore, no one will enjoy any benefit of :encoding for a year,
>  and when we do, it will cause massive backward incompatibility.

I don't see the 'massive backward incompatibility', so perhaps I do not understand
your perspective.  I see that future projects using UTF-8 source would need to declare
this in the system definition, but this would not seem to qualify.

Choosing :default would seem to cause the least backward incompatibility as this
is the current behaviour, and offers a migration path to get ASDF upgrades in place.

> Admittedly, in either case, library developers
> could use such conditional reading as in
>   #+asdf-unicode #+asdf-unicode :encoding :utf-8
> or
>   #+asdf-unicode :encoding #+asdf-unicode :latin1
> to make their libraries safer in a backwards-compatible way.

It would be great if some suggestions like this could be offered to ease the transition.
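
For instance, a guarded definition might look like the following sketch.  It
assumes that an ASDF supporting :encoding pushes :asdf-unicode onto *features*,
as the quoted example implies; the system and file names are hypothetical.

```lisp
;; Each #+asdf-unicode guards exactly one following form, so the keyword
;; and its value are both dropped by the reader when the feature is
;; absent, and an older ASDF never sees the :encoding option.
(asdf:defsystem "my-utf8-library"
  #+asdf-unicode :encoding #+asdf-unicode :utf-8
  :components ((:file "package")
               (:file "strings" :depends-on ("package"))))
```

On an older ASDF the files are then read with whatever external format the
implementation defaults to, so this only avoids the definition error, not the
decoding problem.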

> In both cases, library developers are encouraged to use UTF-8,
> which most people already do, if only because that tends to be the default these days
> for users of SBCL and they send bug reports to library authors.

> A default of :default allows a few odd non-UTF8 application developers
> to continue using unportable hacks for a few months, while
> forcing everyone else to wait to do anything and bringing massive
> backwards-incompatibility of libraries in the end.

What 'massive backwards-incompatibility' would be caused by making :default the default?

Most portable libraries are ASCII, and there would be some benefit in libraries
needing UTF-8 support to declare this in the system definition.

> A default of :utf-8 forces these few odd non-UTF8 application developers
> to do a documented step before they continue doing what they are doing,
> at their own upgrade pace (they control when to upgrade ASDF);
> they can then replace a lot of non-portable hacks with a portable :encoding.
> Meanwhile, everyone starts enjoying reliable non-ASCII today.

There may be a concern that their users would have to upgrade ASDF now.

How can everyone enjoy reliable non-ASCII today, without the user base having upgraded
ASDF?

> 
> I agree there's no solution that makes everyone happy.
> I believe that a default of :utf-8 doesn't actually make anyone
> terribly unhappy, and empowers everyone to make the changes they need,
> without requiring anyone to wait for other people to make changes
> (except indeed for a few stray libraries).
> And so my plans are unchanged for now
> (but of course, please keep sending the complaints
> if you think it's wrongheaded; it's still time to not do it).
> 
> NB: I'd especially like the opinion of people who actually
> develop non-ASCII and non-UTF8 libraries or applications.
> 
> PS: I just made asdf-encodings much less dumb.
> I added good support for: sbcl, clozure, clisp, ecl, cmu
> I added dubious support for: abcl, allegro, scl, lispworks
> I think these will remain 8-bit only: cormanlisp, gcl, genera, rmcl, xcl.
> Precious little testing so far,
> except that it doesn't break everything on SBCL.
> Help welcome to test and expand it.
> 	ssh://common-lisp.net/project/asdf/git/asdf-encodings.git
> 	git://common-lisp.net/projects/asdf/asdf-encodings.git
>         http://common-lisp.net/gitweb?p=projects/asdf/asdf-encodings.git
> 
> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
> Every major horror of history was committed in the name
> of an altruistic motive. — Ayn Rand
> 




