[parenscript-devel] Big compiler refactoring done.

Red Daly reddaly at stanford.edu
Mon Aug 13 23:29:13 UTC 2007


Vladimir Sedach wrote:
> Hello everybody,
>
> I just finished pushing the compiler refactoring patch that I
> mentioned earlier. The big change is that the intermediate
> ParenScript representation is now just s-expressions instead of CLOS
> objects, which provides several benefits. The intermediate code is
> much easier to manipulate and to write code walkers for, which should
> make things like optimization passes easier to write. It is also easily
> inspectable and serializable, which enables easier debugging and unit
> testing (I plan to add unit tests for the compiler and the printer now
> that they have been decoupled).

I do not see the benefits of avoiding an object representation of the 
syntax tree in favor of a flat SEXP representation.  What sort of code 
walkers do you mean?  Can we just expose the macro-expanded Parenscript, 
rather than the `internal representation'?  Optimization passes often 
require additional information about nodes of the syntax tree.  With 
classes/structs, this information can be added as additional slots.  
Doesn't a transition from objects to SEXPs hinder the ability to attach 
information to nodes?  SLIME makes inspecting CLOS objects 
straightforward, and a print function can be defined to display syntax 
nodes in an informative fashion.
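
To make the trade-off concrete, here is a minimal sketch of the two styles. Every name in it (JS-NODE, ANNOTATE, ANNOTATION, COLLECT-OPS, and the shape of the :IF node) is hypothetical and is not taken from the Parenscript source. It only illustrates that a class gains metadata through an extra slot, while a list-based node needs something like a side table, and that walking the list form takes very little code.

```lisp
;; Hypothetical sketch; none of these names come from Parenscript itself.

;; Object style: extra analysis data is just another slot.
(defclass js-node ()
  ((op        :initarg :op        :reader node-op)
   (args      :initarg :args      :reader node-args)
   (type-info :initarg :type-info :initform nil :accessor node-type-info)))

;; List style: the node is plain data, so annotations live elsewhere,
;; here in an EQ hash table keyed on the node's first cons cell.
(defvar *annotations* (make-hash-table :test 'eq))

(defun annotate (node key value)
  "Record VALUE under KEY for NODE and return NODE."
  (setf (getf (gethash node *annotations*) key) value)
  node)

(defun annotation (node key)
  "Look up the annotation stored under KEY for NODE, or NIL."
  (getf (gethash node *annotations*) key))

;; A tiny code walker over the list form: collect every operator,
;; in depth-first order. Atoms contribute nothing.
(defun collect-ops (form)
  (if (consp form)
      (cons (first form)
            (mapcan #'collect-ops (rest form)))
      '()))
```

With a node built as (list :if (list :< 'x 1) (list :return 'x) (list :return 0)), COLLECT-OPS returns (:IF :< :RETURN :RETURN). The side table works, but the annotation is lost as soon as a pass copies or rebuilds the list, which is the kind of friction the question above is about.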

SBCL uses structs for its internal representation, and some developers 
would like more OO functionality:

"[aside by WHN: The data representation of IR1 [(Intermediate 
Representation 1)] was set up before OO design was commonplace and 
before CLOS was part of the standard, and it shows it. On the other 
hand, a lot of things in it show good taste and anticipate OO design. On 
the third hand, a lot of things are done by mutating data structures 
which could be done much more cleanly by other methods, often simply by 
initializing something completely at constructor time. So the system 
could really benefit from some refactoring to take advantage of more 
modern design ideas (OO, invariants..) and the existence of CLOS. 
However, since we can't use CLOS to implement the target compiler until 
we restructure the system so that CLOS is built by the cross-compiler 
[...] most of those refactorings, even the obvious ones, can't be done 
as of sbcl-0.pre7.x.]" [1]

We, of course, do not have to worry about cross-compilation and 
bootstrapping and have full CLOS at our disposal.  I do not see much 
reason to avoid it for something like a syntax tree, which has an 
obvious class/object representation.

On the other hand, SBCL has to deal with the full Common Lisp 
specification, a multi-stage compilation/optimization pipeline, and 
multiple architecture and OS targets.  For our (currently) very simple 
compiler, a SEXP representation is clearly sufficient to accomplish 
Parenscript's modest goals.  Optimization and semantic analysis phases, 
my next concern, are free to build up a different representation from 
the primitive list format.
> The other big change that accompanies
> this is that the decision whether to produce expression or statement
> code has been pushed from the printing code to the actual compiling
> code, which makes the aforementioned code walkers even more practical.
> This has also had the effect of simplifying the printing interface
> considerably (from three functions down to one).
>
>   
Great.  The decision whether to compile to a statement or an expression 
is best made earlier in compilation.
> The other changes I made are removing the last of the namespace
> functions, and the compilation environments. In the place of the
> namespace code is a mechanism that associates Lisp packages with a
> string prefix: any symbol in that package is printed with that prefix.
> I think that unless there are requests for further namespace
> functionality, that's really all that is necessary for most use cases
> (avoiding name clashes). The compilation environment and toplevel code
> I have removed because I feel that the eval-when functionality can be
> provided in a better way by just using Lisp code, and also to simplify
> the compiler interface (it's now down to one function from three) and
> implementation (which, although much simpler now, I think still has
> some room for improvement).
>   
I'll check out these changes and see what I think.  Ideally we should 
still be able to layer a package system on top of the simple 
package-to-prefix mapping.
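
For reference, the mechanism as described could be as small as the following sketch. The names (*PACKAGE-PREFIXES*, SET-PACKAGE-PREFIX, PREFIXED-NAME, and the MY-LIB package) are assumptions made for illustration; I have not checked them against the actual patch, and the real interface may well differ.

```lisp
;; Hypothetical sketch; the real Parenscript interface may differ.
(defvar *package-prefixes* (make-hash-table :test 'eq)
  "Maps a package object to the string prepended to its symbols.")

(defun set-package-prefix (package prefix)
  "Associate PREFIX with PACKAGE (a package designator)."
  (setf (gethash (find-package package) *package-prefixes*) prefix))

(defun prefixed-name (symbol)
  "Return SYMBOL's lowercased name, prefixed if its home package has a prefix."
  (let ((prefix (gethash (symbol-package symbol) *package-prefixes*)))
    (concatenate 'string
                 (or prefix "")
                 (string-downcase (symbol-name symbol)))))

;; Usage: symbols in MY-LIB print with "mylib_", others are unchanged.
(defpackage :my-lib (:use))
(set-package-prefix :my-lib "mylib_")
(prefixed-name (intern "FOO" :my-lib))   ; => "mylib_foo"
```

A fuller package system could presumably be layered on top by deciding, per package, which prefix to register, without touching the printer.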
> On a more (or less?) controversial note, I have renamed all the
> foo-script-blah functions to be foopsblah (except for
> compile-script-form, which is now compile-parenscript-form). I've come
> to a pretty definite conclusion that it's not only clearer but also
> saves considerable typing.
That's all right with me.
> The next course of action I'm planning is to rewrite the way printing
> is done to simplify things, and correct the currently wonky
> indentation (a part of the decision for how to indent blocks was done
> by the compiler code previously, and of course this is currently
> gone). Once the rewrite is done, I'm planning to update the
> documentation, fix the deprecated interface, and then make a release.
>   
How are you planning to change the way printing is done?  As far as I 
remember, DWIM-JOIN is one of the slower parts of the compiler, though I 
have not profiled in a while.
> Happy hacking,
> Vladimir
>   
Thanks,
Red

[1] http://sbcl-internals.cliki.net/IR1


