[Ecls-list] Status of CVS

Mon May 12 08:53:41 UTC 2008

Tag: (CVS 2008-05-12 10:20)

This week brings two major changes, one related to performance and one
related to safety. Neither is related to ASDF-ECL, which I am
still working on.

- When calling compiled functions, ECL has to create a vector of
arguments to be passed to the next function. ECL now shares the same
region of memory among all function calls, which leads not only to
saving memory but also to faster and leaner code.

- When SAFETY >= 1, ECL inserts automatic CHECK-TYPE forms for all
arguments of a function that have been declared with a nontrivial
type.

Let me explain the rationale of these changes, which should be pretty
harmless while bringing visible differences with respect to previous
behavior.

First of all, a few words about how ECL portably calls compiled
functions. We collect all arguments into a C vector of lisp objects
and then call one of three functions (APPLY, APPLY_fixed and
APPLY_closure), which dispatch to the appropriate C function call
depending on the number of arguments. Formerly, ECL would simply
create the vector of arguments in the lisp stack, call the function
and deallocate the vector, which is not very efficient. Now ECL
checks whether the arguments were already pushed onto the lisp stack,
in which case it does nothing, or whether they arrive some other way
(as in cl_funcall), in which case it can safely reuse an existing
vector (the one where VALUES are stored) to build the function call.
This vector is shared by _all_ function calls, so no memory is
allocated or deallocated. Note that this optimization is platform
independent and works even without --enable-asmapply.

Next, the safety settings. ECL is introducing some new optimizations
in addition to the ones it already had. Typically, these
optimizations are activated when the arguments of a lisp form match
the appropriate types. For instance, a call to FIRST with an argument
of type CONS can be inlined as a very simple and efficient C form,
ECL_CONS_CAR, that dereferences a pointer without checking the type
of the argument. But should it? Until now it was, following the *CL
(KCL, GCL, EcoLisp, etc) tradition. But it turns out that there is
code out there with incorrect type declarations, for instance the
code in src/clos/pprint.lsp.

Is this code really incorrect? It depends. It has functions like

(defun pprint-indent (relative-to n &optional stream)
  (declare (type (member :block :current) relative-to)
           (type real n)
           (type (or stream (member t nil)) stream)
           (values null))
  ...)

The body of the code of this function, stolen from the SBCL tree,
assumes that the arguments have the declared types. However, this is
an ANSI lisp function that might be called with wrong arguments (for
instance in Paul Dietz's tests) leading to unpredictable results if
the compiler assumes the type declarations are correct.

So why is the code written this way? The reason is that SBCL (and
probably CMUCL) automatically inserts type checks at the beginning of
such a function. These checks ensure that the arguments have the
declared types and otherwise enter the debugger. This is a
nonstandard feature, but there is some code lying around that relies
on it, in particular the CLX library shipped with SBCL, which some
people here have tried to build.
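
As a rough illustration of what such automatic checks amount to, here
is a hand-written sketch. CHECKED-INDENT and CALL-SIGNALS-TYPE-ERROR-P
are hypothetical names made up for this example; they are not part of
ECL or SBCL, and the code a compiler actually generates may well
differ.

```lisp
;; Hypothetical example function, not taken from pprint.lsp.
;; The CHECK-TYPE forms are written by hand here; they stand in for
;; the checks a compiler would derive from TYPE declarations.
(defun checked-indent (relative-to n)
  (check-type relative-to (member :block :current))
  (check-type n real)
  ;; From here on the body may rely on the argument types.
  (list relative-to n))

;; Hypothetical helper: returns T if calling THUNK signals a
;; TYPE-ERROR, and NIL if the call returns normally.
(defun call-signals-type-error-p (thunk)
  (handler-case (progn (funcall thunk) nil)
    (type-error () t)))
```

With these definitions, a call such as (checked-indent :bogus 2)
signals a TYPE-ERROR before the body runs, instead of continuing with
a value the body was never written to handle.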

So, from now on, ECL will follow this nonstandard behavior and also
generate safety checks if SAFETY >= 1. The optimization policy has
changed accordingly. The new settings are listed at
http://ecls.sourceforge.net/new-manual/ch02.html#ansi.declarations.optimize
(New Manual -> Standards -> Evaluation .. -> OPTIMIZE). The rationale
is as follows:

- If SAFETY >= 1 we assume that the user's type declarations are correct.
This includes declarations in functions, in slot definitions, etc.

- If SAFETY = 0, we not only believe type declarations, but do not
check the arguments of function calls at all. Hence CAR will assume
the arguments are always lists, COS will assume the arguments are
numbers, etc.
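
For reference, the safety level is selected with the standard OPTIMIZE
declaration, either globally or per function. The sketch below uses
two hypothetical functions, SUM-CHECKED and SUM-UNCHECKED, invented
for this example. Whether argument checks are actually emitted depends
on the implementation and version, so the comments describe the policy
outlined above rather than guaranteed behavior.

```lisp
;; Global default for code compiled after this form.
(declaim (optimize (safety 1)))

;; Compiled under SAFETY 1: under the policy described above, the
;; declared argument types are expected to be checked on entry.
(defun sum-checked (a b)
  (declare (type fixnum a b))
  (+ a b))

;; Compiled under SAFETY 0: the declarations are simply believed, so
;; calling this with non-fixnum arguments is undefined behavior.
(defun sum-unchecked (a b)
  (declare (type fixnum a b)
           (optimize (safety 0) (speed 3)))
  (+ a b))
```

Both functions return the same result for valid arguments; they only
differ in what happens when the declarations are violated.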

SAFETY=1 is thus designed to produce relatively safe code which can
be compiled to fast and efficient forms. But note that it is not 100%
safe. One situation where trusting declarations might lead to wrong
results is a function called with arguments that do not match its
declared types, but since we generate CHECK-TYPE forms to enforce
those declarations, this is ruled out. What is not ruled out is the
case where you change the structure of an object (e.g. by a new
DEFSTRUCT form with different slots or different slot types), or
where a variable is assigned a value of a type other than the one it
was declared to contain (for instance, you call a function which is
expected to return a value of a given type but returns a different
one). For enforced safety use SAFETY >= 2.

I would like to hear your thoughts about this new behavior and/or
whether you would change the defaults.

Juanjo

-- 
Facultad de Fisicas, Universidad Complutense,
Ciudad Universitaria s/n Madrid 28040 (Spain)
http://juanjose.garciaripoll.googlepages.com