[Ecls-list] AREF/SVREF/CHAR/SCHAR/ELT and signals

Fri Mar 18 18:07:39 UTC 2011

Matthew Mondor <mm_lists at pulsar-zone.net>
writes:

> Hello,
>
> When looking at the HyperSpec on those functions, it seems undefined
> what happens when a supplied index is not a "valid array index", that
> is, an integer from 0 to below the array size.  The exception is ELT,
> which should signal a condition of type TYPE-ERROR.
>
> In practice however, it seems that implementations attempt to signal an
> error condition for invalid index access.
>
> In SBCL, AREF, SVREF, CHAR, SCHAR will signal a uniform condition of
> type SB-INT:INVALID-ARRAY-INDEX-ERROR:
> Index 4 out of bounds for <insert array type here>,
> should be nonnegative and <<insert size here>.
>
> ELT behaves differently, though:
> The index 4 is too large. [SB-KERNEL:INDEX-TOO-LARGE-ERROR]
>
>
> For ECL, the condition type and error message vary, and SVREF even
> behaves differently from the others:
>
> CHAR: 4 is an illegal index to "123". [SIMPLE-ERROR]
>
> SCHAR: 4 is an illegal index to "123". [SIMPLE-ERROR]
>
> AREF: In an anonymous function, the index into the object 4. takes a value #(1 2 3) out of the range (INTEGER 0 2). [SIMPLE-TYPE-ERROR]
>
> SVREF: (this differs in interpreted and compiled mode)
> Interpreted mode: In function SVREF, the index into the object 4. takes a value #(1 2 3) out of the range (INTEGER 0 2). [SIMPLE-TYPE-ERROR]
> Compiled mode: Returns the supplied vector instead of signaling an error.
>
> ELT: 4 is not a valid index into the object #(1 2 3). [SIMPLE-TYPE-ERROR]
>
>
> Surely a more consistent error signal could be used among these in
> the future; 

Indeed.

> but my main concern is about SVREF; is its special
> behaviour intentional?  

Definitely!

The way they're specified explicitly allows implementations to optimize
these function calls to a greater or lesser degree, down to the level
of C, if their customers want that.

And similarly, the C standard doesn't specify what happens when you
reference a slot in a vector (called an 'array' in C) beyond (or before)
its bounds.

It just happens that customers of C compilers tend to expect their
compiler not to check for out-of-bounds errors.

In the case of Common Lisp, there's a declaration that allows the user
to tell the compiler what he'd like:

  (declaim (optimize (safety 0))) ; for no runtime out-of-bounds error checking
  (declaim (optimize (safety 3))) ; for runtime out-of-bounds error checking

The meaning of levels 1 and 2 is left up to the implementation.
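
As a minimal sketch of how the safety quality can be scoped to a single
function rather than proclaimed globally: SAFE-ELT and PROBE below are
hypothetical names used only for illustration, and ELT is chosen because
it is the one accessor discussed here for which the HyperSpec names a
condition type.  What AREF, SVREF, CHAR and SCHAR do with a bad index
remains implementation-dependent.

```lisp
;; SAFE-ELT and PROBE are hypothetical names, not part of ECL or SBCL.
(defun safe-elt (sequence index)
  "Call ELT from code compiled at the highest safety level."
  (declare (optimize (safety 3)))
  (elt sequence index))

(defun probe (thunk)
  "Call THUNK; return its value, or a keyword or type name describing the error."
  (handler-case (funcall thunk)
    (type-error () :type-error)
    (error (c) (type-of c))))

;; An invalid index: ELT should signal a TYPE-ERROR in safe code.
(probe (lambda () (safe-elt #(1 2 3) 4)))

;; A valid index simply returns the element.
(probe (lambda () (safe-elt #(1 2 3) 1)))
```

Replacing ELT with SVREF in SAFE-ELT is a quick way to see what a given
implementation does at each safety level.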

> In the resulting C code I was happy to see that
> better optimization was possible than with AREF: inline C array access
> and lack of bound checking (for AREF, a function call is used for every
> access).  On the other hand, it could probably lead to unexpected
> behaviour in some buggy code to return an array instead of an integer?

Some other functions, when (safety 0) is specified, return an invalid
object.  This might be better than either the original vector or an
element of the vector.  However, SVREF is specified in such a way as to
be highly optimizable, so it is acceptable to make it faster and let it
return just what it has in the registers, even if it's the vector.  
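
For code that must not depend on what SVREF does with a bad index, one
option is to check the bounds explicitly before the access.  This is
only a sketch: CHECKED-SVREF is a hypothetical wrapper, not something
ECL provides, and signalling TYPE-ERROR here is a choice made to mirror
ELT.

```lisp
;; CHECKED-SVREF is a hypothetical wrapper, shown for illustration only.
(defun checked-svref (vector index)
  "Like SVREF, but always signal a TYPE-ERROR for an invalid index."
  (check-type vector simple-vector)
  (unless (and (integerp index)
               (< -1 index (length vector)))
    (error 'type-error
           :datum index
           ;; Valid indices are 0 up to, but excluding, the length.
           :expected-type `(integer 0 (,(length vector)))))
  (svref vector index))

;; Usage: the explicit test runs whatever the SAFETY setting is.
(handler-case (checked-svref (vector 1 2 3) 4)
  (type-error () :out-of-bounds))
```

The explicit test costs a comparison per access, which is exactly the
overhead SVREF is allowed to skip, so it only makes sense where the
index is not already known to be valid.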

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.