[Ecls-list] Some improvements

Matthew Mondor mm_lists at pulsar-zone.net
Fri Jul 3 17:12:08 UTC 2009


On Fri, 3 Jul 2009 15:03:29 +0200
Juan Jose Garcia-Ripoll <juanjose.garciaripoll at googlemail.com> wrote:

> - Function calls are very expensive. Currently ecl_aset still has an
> additional function call that involves the unchecked version, and a
> function call to create integers/floats from the array data.

The -fomit-frame-pointer optimization flag may help slightly on x86, but
it also makes debugging harder...  Traditionally, macros were used a
lot to avoid gratuitous function calls for small operations, but they
also make programs harder to debug (and care must be taken, as a macro
may evaluate its arguments more than once, duplicating side effects).
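
To illustrate the macro pitfall mentioned above, here is a minimal
sketch.  SQUARE_MACRO, next_value and square_fn are hypothetical names
made up for this example, not anything from ECL:

```c
/* Hypothetical macro (not from ECL): its argument appears twice in the
 * expansion, so any side effect in the argument happens twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Helper with a visible side effect: counts how often it is called. */
static int calls = 0;
static int next_value(void) { return ++calls; }

/* The same operation as a function: the argument is evaluated once. */
static int square_fn(int x) { return x * x; }

/* Returns how many times next_value() ran when passed to the macro. */
static int count_macro_evaluations(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
static int count_function_evaluations(void)
{
    calls = 0;
    (void)square_fn(next_value());
    return calls;
}
```

The macro version runs next_value() twice and the function version
once, which is the kind of surprise that makes macro-heavy code awkward
to debug.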

Lately, inline functions have become more popular (and are standard in
C99); the compiler may, at its discretion, still generate a real
function where necessary.  An inlined function may still make the code
harder to debug, but inlining can easily be disabled via CFLAGS.  Of
course, this still doesn't help much if the function lives in a shared
library.  Some critical functions, however, could be defined in the
standard API header files, and those would have a chance of getting
inlined.  A side effect of inlining may also be a larger text footprint
and cache thrashing on cache-starved processors, as with loop unrolling.
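
As a rough sketch of the header-defined approach, assuming a made-up
int_vec type and accessor (these names are illustrative only and are
not ECL's API), a small accessor could be declared `static inline` in a
header so that every caller has a chance of getting it inlined:

```c
#include <stddef.h>

/* Hypothetical container type, for illustration only. */
typedef struct {
    size_t len;
    const int *data;
} int_vec;

/* Would live in a public header.  `static inline` gives each
 * translation unit its own copy, which the compiler is free to inline
 * or to emit as a real function. */
static inline int int_vec_get(const int_vec *v, size_t i, int fallback)
{
    return (i < v->len) ? v->data[i] : fallback;
}

/* A caller in a hot loop: the bounds-checked accessor can be inlined
 * here instead of costing a call per element. */
static int int_vec_sum(const int_vec *v)
{
    int sum = 0;
    for (size_t i = 0; i < v->len; i++)
        sum += int_vec_get(v, i, 0);
    return sum;
}
```

With GCC, adding -fno-inline to CFLAGS should turn the accessor back
into an ordinary call for debugging builds.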

SBCL has particularly good automatic inlining, but that's also because of
its monolithic/static nature where it has knowledge of its whole
world...

> - PIC code is also horribly slow. This kills us sometimes when we use
> constants, or when a function has a call to an error function. In this
> case the function will call a library function to find out the data
> segment address even if the data is _never_ used.

PIC efficiency depends on architecture and on memory model (e.g. it was
pretty nice with m68k and small memory model but might be very poor on
x86).  Is PIC important other than for a few architectures though?  Or
is it important because of ECL's design?

I could be mistaken but I have the impression that with ELF (supporting
relocation) and MMU, mapping shared libraries in the process space on
x86 doesn't really require PIC code on modern unix-like OS.

In any case, does -minline-plt help any?

> - GCC refuses to optimize tail calls to neighboring functions with the
> same # of arguments. I do not know why. Perhaps because of the
> debugging flags.

Was this with -foptimize-sibling-calls still?  But yes it'd also have
to be tested without debugging flags and with -O3
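
A small test case for this could look like the sketch below: two
neighboring functions with the same number of arguments that call each
other in tail position.  is_even and is_odd are hypothetical names
picked for the example.  Whether GCC actually emits a jump instead of a
call depends on the version and flags, so it would have to be checked
in the assembly output (gcc -O2 -S, for instance):

```c
/* Forward declaration so the two functions can call each other. */
unsigned is_odd(unsigned n);

/* The call to is_odd() is the last thing done here and its result is
 * returned unchanged, so it is a candidate for a sibling call. */
unsigned is_even(unsigned n)
{
    return (n == 0) ? 1u : is_odd(n - 1);
}

/* Same signature as is_even(), and again a call in tail position. */
unsigned is_odd(unsigned n)
{
    return (n == 0) ? 0u : is_even(n - 1);
}
```

If the optimization is applied, the generated code for each function
should contain a jmp to the other one rather than a call followed by a
ret.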

> Some of these things can be circumvented. For instance ecl_aref() has
> now decreased a 25% execution time, based on some trivial benchmarks,
> just from removing a branch jump and reorganizing the error signalling
> code.

Nice
-- 
Matt



