[asdf-devel] source file encoding
Douglas Crosher
dtc-asdf at scieneer.com
Mon Apr 9 15:37:13 UTC 2012
On 04/09/2012 10:37 AM, Faré wrote:
> On Sun, Apr 8, 2012 at 15:28, Nikodemus Siivola
> <nikodemus at random-state.net> wrote:
>> On 8 April 2012 17:36, Faré <fahree at gmail.com> wrote:
>>
>>> I think requiring a few marginal hackers doing weird things
>>> to specify :encoding :default is a small price to pay for everyone to be able to specify
>>
>> I disagree. Consider this:
>>
>> X has a system that used to be in, say, LATIN-9. He uses latin-9 at
>> home, and everything works fine. His users either use it as well, or
>> at least another single-byte encoding.
>>
>> ASDF is updated, and X's user reports breakage. Everything works fine
>> for X, because he didn't update ASDF yet. So he updates ASDF, and X
>> updates his system to specify :LATIN-9 (or :DEFAULT, or whatever).
>>
>> Now another of his users reports breakage, because /they/ didn't
>> update ASDF yet -- and their ASDF doesn't support :ENCODING, so things
>> break. They update ASDF, which in turn breaks another :LATIN-N system
>> they were using.
>>
>> The potential cost is non-trivial, and I really don't pretend to know
>> e.g. how many Japanese hackers use non-UTF encodings in their source.
>>
>> IMO encouraging people to add :encoding :utf-8 is much saner.
>>
> I agree that transition costs must be considered.
>
> Let's examine the two scenarios,
> where the default is :default vs where the default is :utf-8.
>
> In both cases, crucial points follow:
> (a) currently :encoding is NOT supported by ASDF.
> (b) therefore, whenever anyone modifies his defsystem to use :encoding,
> his system will NOT be backward compatible anymore.
> (c) we want to make most code as backward-compatible as possible.
> (d) the application programmer controls what version of ASDF is installed,
> the library developer doesn't.
>
> If the default is :utf-8 (my recommendation), then
> * A few programmers of non-UTF-8 applications may hit a snag upgrading ASDF;
> * then they can either use asdf-encodings or use :encoding :default.
> * Their code is then not compatible with older ASDFs anymore, but
> * as application programmers, they fully control which ASDF they use, and
> * even if they need to support old CL implementations,
> ASDF still supports them (the exception being GCL, that looks quite dead).
> * Meanwhile, library authors can already start migrating to UTF-8,
> and everyone who upgrades ASDF can reliably enjoy now
> the benefits of non-ASCII, while preserving backwards-compatibility.
Won't library authors need to wait until their user base has upgraded ASDF
before they can start migrating to UTF-8? The external-format support helps
write portable libraries using non-ASCII characters and is only available
after an upgrade.

I do see a concern that if developers are required to change their definitions
to add :encoding :default, then they will also be forced to make sure their user
base has upgraded now. Further, if their users do upgrade ASDF, then it breaks
again - there is no migration path for them.
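For concreteness, the change being asked of a legacy system would look roughly
like the sketch below. The system and file names are hypothetical; only the
:encoding :default option comes from the discussion above, and an ASDF without
:encoding support would not accept this definition (point (b) above).

```lisp
;; Hypothetical legacy system whose sources use a single-byte encoding.
;; All names here are made up for illustration.
(asdf:defsystem #:legacy-latin-app
  :description "Illustrative only."
  ;; Keep the pre-upgrade behaviour: read the sources with the
  ;; implementation's default external format.
  :encoding :default
  :components ((:file "package")
               (:file "messages" :depends-on ("package"))))
```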
> If the default is :default (your recommendation), then
> * library developers can't ensure their code uses a predictable encoding;
> * this makes any attempt to actually use non-ASCII characters unreliable.
> * Sure, they might be tempted to use :encoding :utf-8, but then
> their libraries will be gratuitously incompatible
> with anyone who hasn't upgraded his ASDF, which is a pain to users.
Perhaps the difference is that portable UTF-8 source is new source and requires
an upgrade of ASDF anyway, whereas making the default :utf-8 forces :encoding :default
on current users and affects legacy code that is already written without a migration
path.
> * thus, library developers can do nothing but wait for EVERYONE
> to be using a recent ASDF before they can do anything.
Wouldn't this be the reality for portable libraries no matter which default is chosen?
> * Therefore, no one will enjoy any benefit of :encoding for a year,
> and when we do, it will cause massive backward incompatibility.
I don't see the 'massive backward incompatibility', so perhaps I do not understand
your perspective. I see that future projects using UTF-8 source would need to declare
this in the system definition, but this would not seem to qualify.
Choosing :default would seem to cause the least backward incompatibility as this
is the current behaviour, and offers a migration path to get ASDF upgrades in place.
> Admittedly, in either case, library developers
> could use such conditional reading as in
> #+asdf-unicode #+asdf-unicode :encoding :utf-8
> or
> #+asdf-unicode :encoding #+asdf-unicode :latin1
> to make their libraries safer in a backwards-compatible way.
It would be great if some suggestions like this could be offered to ease the transition.
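As a starting point, the conditional reading quoted above might sit in a system
definition as in the sketch below. This assumes the intended reader conditional
is #+asdf-unicode on both the keyword and its value, and that an ASDF with
:encoding support puts :asdf-unicode on *features*, as the quoted suggestion
implies. The system and file names are hypothetical.

```lisp
;; Hypothetical library with UTF-8 source files; names are made up.
(asdf:defsystem #:example-utf8-library
  :description "Illustrative only."
  ;; Both the keyword and its value are guarded, so a reader without
  ;; the :asdf-unicode feature skips the pair and sees a plain
  ;; definition with no :encoding option at all.
  #+asdf-unicode :encoding #+asdf-unicode :utf-8
  :components ((:file "package")
               (:file "strings" :depends-on ("package"))))
```

On an ASDF that lacks the feature, the files would still be read with whatever
external format the implementation defaults to, so this only avoids the
definition itself being rejected.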
> In both cases, library developers are encouraged to use UTF-8,
> which most people already do, if only because that tends to be the default these days
> for users of SBCL and they send bug reports to library authors.
> A default of :default allows a few odd non-UTF8 application developers
> to continue using unportable hacks for a few months, while
> forcing everyone else to wait to do anything and bringing massive
> backwards-incompatibility of libraries in the end.
What 'massive backwards-incompatibility' would be caused by making :default the default?
Most portable libraries are ASCII, and there would be some benefit in libraries
needing UTF-8 support to declare this in the system definition.
> A default of :utf-8 forces these few odd non-UTF8 application developers
> to do a documented step before they continue doing what they are doing,
> at their own upgrade pace (they control when to upgrade ASDF);
> they can then replace a lot of non-portable hacks with a portable :encoding.
> Meanwhile, everyone starts enjoying reliable non-ASCII today.
There may be a concern that their users would have to upgrade ASDF now.
How can everyone enjoy reliable non-ASCII today, without the user base having upgraded
ASDF?
>
> I agree there's no solution that makes everyone happy.
> I believe that a default of :utf-8 doesn't actually make anyone
> terribly unhappy, and empowers everyone to make the changes they need,
> without requiring anyone to wait for other people to make changes
> (except indeed for a few stray libraries).
> And so my plans are unchanged for now
> (but of course, please keep sending the complaints
> if you think it's wrongheaded; it's still time to not do it).
>
> NB: I'd especially like the opinion of people who actually
> develop non-ASCII and non-UTF8 libraries or applications.
>
> PS: I just made asdf-encodings much less dumb.
> I added good support for: sbcl, clozure, clisp, ecl, cmu
> I added dubious support for: abcl, allegro, scl, lispworks
> I think these will remain 8-bit only: cormanlisp, gcl, genera, rmcl, xcl.
> Precious little testing so far,
> except that it doesn't break everything on SBCL.
> Help welcome to test and expand it.
> ssh://common-lisp.net/project/asdf/git/asdf-encodings.git
> git://common-lisp.net/projects/asdf/asdf-encodings.git
> http://common-lisp.net/gitweb?p=projects/asdf/asdf-encodings.git
>
> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
> Every major horror of history was committed in the name
> of an altruistic motive. — Ayn Rand
>