[parenscript-devel] Big compiler refactoring done.
Red Daly
reddaly at stanford.edu
Mon Aug 13 23:29:13 UTC 2007
Vladimir Sedach wrote:
> Hello everybody,
>
> I just finished pushing the compiler refactoring patch that I
> mentioned earlier. The big change is that now the intermediate
> ParenScript representation is just s-expressions instead of CLOS
> objects, which provides several benefits: the intermediate code is now
> much easier to manipulate and to write code walkers for, which should
> make things like optimization passes easier to write, it is easily
> inspectable and serializable, which enables easier debugging and unit
> testing (I plan to add unit tests for the compiler and the printer now
> that they have been decoupled).
I do not see the benefits of avoiding an object representation of the
syntax tree in favor of a flat SEXP representation. What sort of code
walkers do you mean? Can we just expose the macro-expanded Parenscript,
rather than the `internal representation'? Optimization passes often
require additional information about nodes of the syntax tree. With
classes/structs, this information can be added as additional slots.
Doesn't a transition from objects to SEXPs hinder the ability to attach
information to nodes? SLIME makes inspecting CLOS objects
straightforward, and a print function can be defined to display syntax
nodes in an informative fashion.
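A minimal sketch of the concern, with hypothetical names throughout (JS-NODE, ANNOTATE, and the side table are illustrations, not Parenscript code). With a class, a pass can keep its facts in a slot; with a bare list, it needs something like an EQ hash table keyed on the cons cell:

```lisp
;; Hypothetical node class; not Parenscript's actual internals.
(defclass js-node ()
  ((form :initarg :form :reader node-form)
   ;; Analysis passes can stash results here without touching the form.
   (annotations :initform '() :accessor node-annotations)))

(defun annotate (node key value)
  "Record VALUE under KEY on NODE and return NODE."
  (setf (getf (node-annotations node) key) value)
  node)

(defun annotation (node key)
  "Look up the annotation stored under KEY, or NIL if absent."
  (getf (node-annotations node) key))

;; With a bare list there is no slot to extend, so a pass has to keep
;; its facts in a side table keyed on the identity of the cons cell.
(defvar *sexp-annotations* (make-hash-table :test #'eq))

(defun annotate-sexp (form key value)
  "Record VALUE under KEY for the list FORM and return FORM."
  (setf (getf (gethash form *sexp-annotations*) key) value)
  form)

(defun sexp-annotation (form key)
  "Look up the annotation stored under KEY for FORM, or NIL if absent."
  (getf (gethash form *sexp-annotations*) key))
```

The side table works, but the annotation is tied to object identity: a copied or rebuilt list loses whatever was recorded for the original.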
SBCL uses structs as an internal representation and some developers
would like more OO functionality:
"[aside by WHN: The data representation of IR1 [(Intermediate
Representation 1)] was set up before OO design was commonplace and
before CLOS was part of the standard, and it shows it. On the other
hand, a lot of things in it show good taste and anticipate OO design. On
the third hand, a lot of things are done by mutating data structures
which could be done much more cleanly by other methods, often simply by
initializing something completely at constructor time. So the system
could really benefit from some refactoring to take advantage of more
modern design ideas (OO, invariants..) and the existence of CLOS.
However, since we can't use CLOS to implement the target compiler until
we restructure the system so that CLOS is built by the cross-compiler
[...] most of those refactorings, even the obvious ones, can't be done
as of sbcl-0.pre7.x.]" [1]
We, of course, do not have to worry about cross-compilation and
bootstrapping and have full CLOS at our disposal. I do not see much
reason to avoid it for something like a syntax tree, which has an
obvious class/object representation.
On the other hand, SBCL has to deal with the full Common Lisp
specification, a multi-stage compilation/optimization pipeline, and
multiple architecture and OS targets. For our (currently) very simple
compiler, a SEXP representation is clearly sufficient to accomplish
Parenscript's modest goals. Optimization and semantic analysis phases,
my next concern, are free to build up a different representation from
the primitive list format.
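For a sense of what a pass over the primitive list format could look like, here is a minimal sketch of constant folding. The function name and the tree shape (operator in the first position, arguments after it) are assumptions for illustration, not Parenscript's actual intermediate form, and the walker naively treats every list as an operator call:

```lisp
;; Hypothetical pass over a list-based tree; not Parenscript code.
(defun fold-constants (form)
  "Replace (+ n1 n2 ...) with its sum when every argument is a number."
  (if (atom form)
      form
      ;; Walk the arguments first so nested sums are folded bottom-up.
      (let ((walked (cons (first form)
                          (mapcar #'fold-constants (rest form)))))
        (if (and (eq (first walked) '+)
                 (every #'numberp (rest walked)))
            (reduce #'+ (rest walked))
            walked))))

(fold-constants '(* x (+ 1 2 (+ 3 4))))  ; => (* X 10)
```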
> The other big change that accompanies
> this is that the decision whether to produce expression or statement
> code has been pushed from the printing code to the actual compiling
> code, which makes the aforementioned code walkers even more practical.
> This has also had the effect of simplifying the printing interface
> considerably (from three functions down to one).
>
>
Great. The decision to compile to statement/expression is best made
earlier in compilation.
> The other changes I made are removing the last of the namespace
> functions, and the compilation environments. In the place of the
> namespace code is a mechanism that associates Lisp packages with a
> string prefix: any symbol in that package is printed with that prefix.
> I think that unless there are requests for further namespace
> functionality, that's really all that is necessary for most use cases
> (avoiding name clashes). The compilation environment and toplevel code
> I have removed because I feel that the eval-when functionality can be
> provided in a better way by just using Lisp code, and also to simplify
> the compiler interface (it's now down to one function from three) and
> implementation (which, although much simpler now, I think still has
> some room for improvement).
>
I'll check out these changes and see what I think. Ideally we should
still be able to layer a package system on top of the simple
package-to-prefix mapping.
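As a rough picture of the mechanism described above, a minimal sketch follows. The names (*PACKAGE-PREFIXES*, SET-PACKAGE-PREFIX, PREFIXED-NAME, and the MY-LIB package) are hypothetical and the real interface may differ; the sketch also only lowercases the symbol name rather than applying any real identifier translation:

```lisp
;; Hypothetical helper names; the actual Parenscript interface may differ.
(defvar *package-prefixes* (make-hash-table :test #'eq)
  "Maps package objects to the string prepended to their symbols.")

(defun set-package-prefix (package prefix)
  "Associate PREFIX with PACKAGE, given as a package or a designator."
  (setf (gethash (find-package package) *package-prefixes*) prefix))

(defun prefixed-name (symbol)
  "Return SYMBOL's lowercased name, with its package's prefix if one is set."
  (let ((prefix (gethash (symbol-package symbol) *package-prefixes*)))
    (concatenate 'string
                 (or prefix "")
                 (string-downcase (symbol-name symbol)))))

(defpackage :my-lib (:use))
(set-package-prefix :my-lib "mylib_")
(prefixed-name (intern "RENDER" :my-lib))  ; => "mylib_render"
```

A fuller package system could then be layered on top by deciding which prefix to register for each package, leaving the printer itself unchanged.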
> On a more (or less?) controversial note, I have renamed all the
> foo-script-blah functions to be foopsblah (except for
> compile-script-form, which is now compile-parenscript-form). I've come
> to a pretty definite conclusion that it's not only clearer but also
> saves considerable typing.
That's fine with me.
> The next course of action I'm planning is to rewrite the way printing
> is done to simplify things, and correct the currently wonky
> indentation (a part of the decision for how to indent blocks was done
> by the compiler code previously, and of course this is currently
> gone). Once the rewrite is done, I'm planning to update the
> documentation, fix the deprecated interface, and then make a release.
>
How are you planning to change the way printing is done? As far as I
remember, DWIM-JOIN is one of the slower parts of the compiler, though I
have not profiled in a while.
> Happy hacking,
> Vladimir
>
Thanks,
Red
[1] http://sbcl-internals.cliki.net/IR1