[cl-prevalence-devel] long-chains-bug

Sven Van Caekenberghe scaekenberghe at common-lisp.net
Wed Dec 8 12:03:55 UTC 2004


On 27 Nov 2004, at 09:18, Sven Van Caekenberghe wrote:

> James,
>
> Thanks for the feedback. Here is a first reaction, without looking at 
> your code.
>
> On 27 Nov 2004, at 03:58, James Wright wrote:
>
>> The first bug is pretty minor: An attempt to serialize a
>> standard-object will fail if any of its slots are unbound.
>
> This is probably fixed in the CVS version (among other things) - have 
> a look at the mailing list(s).
>
>> The second bug is somewhat mystifying.  Attempting to serialize an
>> object that points to another object that points ... to another object
>> that points to nothing will fail with a stack overflow if the length
>> of the "chain" is any longer than about 1300.  I'm using Lispworks
>> with a stack size of 64000.
>
> I never tried chains that long. But yes there might be a problem here: 
> the serializer tracks every object it encounters, and is itself a 
> recursive process. Although 1300 doesn't seem that large a number. I 
> am using LW myself, and sometimes you have to expand the stack a 
> couple of times for very recursive code to succeed.
>
> I will have a look at your example later on.

It became a lot later - sorry about that.

First of all, thanks James for the excellent test code, I'll reproduce 
your code with my modifications here:

(in-package :cl-user)
(asdf:operate 'asdf:load-op :cl-prevalence)

(defclass bar ()
   ((slot-1 :accessor slot-1)))

(defun unbound-slots-bug (&optional (fname 
#P"/tmp/unbound-slots-bug-1.xml"))
   (let ((obj (make-instance 'bar)))
     ; Try to serialize an object that has an unbound slot
     (with-open-file (out fname :direction :output :if-exists :supersede)
       (s-serialization:serialize-xml obj out))))

(defun long-chains-bug (&optional (depth 2000) (fname 
#P"/tmp/long-chains-bug.xml"))
   (let* ((obj (make-instance 'bar))
          (cur obj))
     ; Make a long chain of objects
     (dotimes (n depth)
       (let ((new (make-instance 'bar)))
         (setf (slot-1 cur) new)
         (setf cur new)))
     ; Attempt to serialize the chain
     (with-open-file (out fname :direction :output :if-exists :supersede)
       (s-serialization:serialize-xml obj out))))

(defun read-long-chain (&optional (fname #P"/tmp/long-chains-bug.xml"))
   ; Just read the chain back in
   (with-open-file (in fname :direction :input)
     (s-serialization:deserialize-xml in)))

(defun goto-depth (&optional (depth 2000))
   (let ((oddp (oddp depth))
         (evenp (evenp depth))
         (plusp (plusp depth))
         (minusp (minusp depth))
         (zerop (zerop depth)))
     (if (not (or oddp evenp)) (invoke-debugger))
     (when zerop (error 'forced))
     (unless (or zerop minusp)
       (assert plusp)
       (goto-depth (1- depth))
       t)))

(defun goto-depth (&optional (depth 2000))
   (unless (zerop depth)
     (goto-depth (1- depth))
     t))

As I said before the unbound slots problem is fixed already (unbound 
slots are skipped).

The long-chain problem does indeed occur, but is solvable by expanding 
the stack (or starting with a larger stack).

By the way, deserializing is even worse than serializing by a factor of 
4 (there are 2 XML tags per level, and each function involved in the 
recursion has a larger stack frame).

Both serializing and deserializing are inherently recursive problems 
that are handled as such, the deeper the object structures, the more 
stack they will need to execute. I do not immediately see a solution 
(like using a more iterative algorithm). I know that in garbage 
collection there are solutions to similar problems, but there the 
solution is using the datastructure itself destructively as a kind of 
stack, hardly a solution in our case.

What is a bit strange with LispWorks is the (apparent) amount of memory 
used for the stack.

The last goto-depth version above (forced to be non-tail-recursive) 
breaks the default listener stack limit with a depth of 17275 (on LWM). 
The other goto-depth version tries to make the stack frame larger by 
using a lot of (useless) local variables. That version breaks the stack 
with a depth of 789 (about the same depth where the long-chains-bug 
breaks). So adding 5 local variables increases the frame size by about 
20, which could be explained by saying that each local variable takes 
32-bit and all sizes are in bytes. It still feels like a lot to me...

Any suggestions, ideas ?

Maybe we should test these limits on other platforms ?

Sven




More information about the Cl-prevalence-devel mailing list