[asdf-devel] source file encoding

Douglas Crosher dtc-asdf at scieneer.com
Mon Apr 9 23:53:45 UTC 2012


Let me give an example so we can all test your idea:

1. Library developer upgrades ASDF, and starts adding UTF-8 characters.
Let's say the developer assumes a default of UTF-8 and so does not
add a declaration, which I think is your suggestion.  The library is
intended to be portable.

2. Users download the library, but have not yet upgraded ASDF.  They start
up an arbitrary CL implementation, which does not default to UTF-8.  The code
may fail to compile, or may have incorrect characters.

I hope this can be accepted and that it is clear that library developers
will need to wait until the user base has upgraded before adding UTF-8 to
portable libraries.

This is why we need the external-format support in ASDF - to make this reliable.
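
To make the scenario concrete, here is a rough sketch of what a declared
encoding could look like in a system definition, using the #+asdf-unicode
guard from the quoted message below. The system name and file names are
made up for illustration, and I am assuming the keyword is spelled
:encoding as discussed in this thread.

```lisp
;; Hypothetical system definition: "my-lib", "package" and "main" are
;; placeholder names, not taken from any real library.
(asdf:defsystem #:my-lib
  ;; An ASDF that advertises the :asdf-unicode feature reads both guarded
  ;; forms below. An older ASDF skips them, so the definition still loads
  ;; there instead of failing on an unknown keyword.
  #+asdf-unicode :encoding #+asdf-unicode :utf-8
  :components ((:file "package")
               (:file "main" :depends-on ("package"))))
```

On an older ASDF the guard only keeps the definition loadable; the files
are still read with whatever external format the implementation defaults
to, so step 2 above can still go wrong there.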

Regards
Douglas Crosher

On 04/10/2012 08:05 AM, Faré wrote:
> On Mon, Apr 9, 2012 at 11:37, Douglas Crosher <dtc-asdf at scieneer.com> wrote:
>> Won't library authors need to wait until their user base has upgraded ASDF
>> before they can start migrating to UTF-8?
>>
> No. Library authors have *already* largely adopted UTF-8.
> See previous analysis by Orivej Desh:
> 	"I did a check of quicklisp systems.
>         There are 263 lisp files in 107 systems which assume non-ASCII,
>         and only 31 of them in 20 systems assume non-UTF-8"
> That's out of 700 libraries in Quicklisp.
> Only 9 have been found to be an actual problem, and two are fixed already.
> 	https://github.com/orivej/asdf-encodings/wiki/Tracking-non-UTF-8-lisp-files-in-Quicklisp
> 
> The only issue is to make the results *reliable*
> for these systems that depend on UTF-8.
> 
>> I do see a concern that if developers are required to change their definitions
>> to add :encoding :default then they will be forced to also make sure their user
>> base has upgraded now.  Further if their users do upgrade ASDF then it breaks
>> again - there is no migration path for them.
>>
> Yes. No one in their right mind would use :encoding :default for a library.
> Each author knows what encoding he uses, say :latin1, :koi8-r, :mac-roman
> or :euc-jp, and would specify just that, not :default.
> 
> I was thinking of :default
> 1- because I hadn't written asdf-encodings yet, and
>  needed *some* way to support those setups
> 2- for full backwards compatibility:
>  "if it's not backwards, it's not compatible"
> 
>> Perhaps the difference is that portable UTF-8 source is new source and requires
>> an upgrade of ASDF anyway, whereas making the default :utf-8 forces :encoding :default
>> on current users and affects legacy code that is already written without a migration
>> path.
>>
> UTF-8 is not just for new source. It doesn't require an upgrade of ASDF.
> There is plenty of UTF-8 source already, though mostly for comments
> (but not only for comments: see e.g. λ-reader).
> All modern implementations support UTF-8, though not always as the default.
> Let's just make it a reliable default so we can WORM (write once run
> everywhere).
> And the migration path is clear:
> 	recode l1..u8 foo.lisp
> 
>>> * thus, library developers can do nothing but wait for EVERYONE
>>>  to be using a recent ASDF before they can do anything.
>>
>> Wouldn't this be the reality for portable libraries no matter which default is chosen?
>>
> Whatever the default encoding is,
> libraries can't use :encoding until all their users use a recent ASDF.
> But if :utf-8 becomes the default and they use it,
> they can already enjoy the benefits of deterministic encoding,
> and tell users who have encoding issues "just upgrade your ASDF".
> 
>>> * Therefore, no one will enjoy any benefit of :encoding for a year,
>>>  and when we do, it will cause massive backward incompatibility.
>>
>> I don't appreciate the 'massive backward incompatibility' so perhaps do not understand
>> your perspective?  I see that future projects using UTF-8 source would need to declare
>> this in the system definition, but this would not seem to qualify.
>>
> If the default is :default and you want to enjoy reliable utf-8,
> then you'll need to specify :encoding :utf-8, at which point
> your library ceases to be compatible with users who haven't upgraded ASDF.
> I call that massive backward incompatibility.
> 
> If the default is :utf-8 and your library has a latin1 character,
> you use recode, and your new code still works on old ASDFs as well as new ones.
> That's massive backward compatibility.
> 
>> Choosing :default would seem to cause the least backward incompatibility as this
>> is the current behaviour, and offers a migration path to get ASDF upgrades in place.
>>
> It's compatible for now, but setting us up for massive incompatibility later.
> 
> 
>>> Admittedly, in either case, library developers
>>> could use such conditional reading as in
>>>   #+asdf-unicode #+asdf-unicode :encoding :utf-8
>>> or
>>>   #+asdf-unicode :encoding #+asdf-unicode :latin1
>>> to make their libraries safer in a backwards-compatible way.
>>
>> It would be great if some suggestions like this could be offered to ease the transition.
>>
> I inserted this suggestion in the ASDF documentation.
> I can't retroactively modify old ASDF installations
> to point people at precisely the paragraph they need to consult in the docs
> when they upgrade and things break for them,
> but I trust that Google will help them.
> 
>> Most portable libraries are ASCII, and there would be some benefit in libraries
>> needing UTF-8 support to declare this in the system definition.
>>
> ASCII libraries will work everywhere anyway whatever we do about the default.
> That is, until some maniac writes a Lisp using EBCDIC;
> and still making UTF-8 the default will ensure he can still
> just download source from the net and use it
> without having to transcode it for his implementation.
> Of course, a lot of code that assumes ASCII or ASCII-like continuity
> of letter ranges will fail, but that's a given if he uses EBCDIC.
> 
>> There may be a concern that their users would have to upgrade ASDF now.
>>
> No. Making :utf-8 the default means no one needs to upgrade ASDF now,
> but a few people may have to upgrade a few libraries when they upgrade ASDF.
> 
> Making :default the default and forcing people to use :encoding :utf-8
> to enjoy any reliability means people who use libraries that want to
> be reliable will be forced to upgrade ASDF.
> 
>> How can everyone enjoy reliable non-ASCII today,
>> without the user base having upgraded ASDF?
>>
> Mostly, they can set up their system defaults to UTF-8
> and enjoy most Lisp code already on most implementations.
> When they stray from this default setup I want to formalize,
> nothing works reliably today.
> 
> —♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
> Merely having an open mind is nothing; the object of opening the mind,
> as of opening the mouth, is to shut it again on something solid.
> 	— G.K. Chesterton
> 




