[Ecls-list] Fwd: ECL failure with Pierre Mai's Deflate library

Juan Jose Garcia-Ripoll juanjose.garciaripoll at googlemail.com
Tue Oct 5 21:32:22 UTC 2010


---------- Forwarded message ----------
From: Pierre R. Mai <pmai at pmsf.de>
Date: 2010/10/5
Subject: Re: [Ecls-list] ECL failure with Pierre Mai's Deflate library
To: Zach Beane <xach at xach.com>, Juan Jose Garcia-Ripoll <
juanjose.garciaripoll at googlemail.com>



Am 05.10.2010 um 12:01 schrieb Pierre R. Mai:

>
> Am 04.10.2010 um 21:59 schrieb Zach Beane:
>
>> Juan Jose Garcia-Ripoll <juanjose.garciaripoll at googlemail.com> writes:
>>
>>> Zach, thanks a lot for a self-contained test. Could you please provide
us with
>>> some more information about the platform where you tested ECL? I mean at
>>> least configuration flags, chipset, 32/64 bits, operating system and
value of
>>> *features*
>>
>> Sure, sorry for omitting that important info!
>>
>> I am running ECL from git://ecls.git.sourceforge.net/gitroot/ecls/ecl
>> with the last commit date of "Tue Sep 28 12:14:30 2010 +0200". I am
>> building it on a 64-bit Core 2 Duo Linux (Ubuntu 10.04) system. I am not
>> using any configuration flags. Here is the *FEATURES* list:
>>
>>   (:LINUX :FORMATTER :LONG-LONG :UINT64-T :UINT32-T :UINT16-T
>>    :RELATIVE-PACKAGE-NAMES :DFFI :CLOS-STREAMS :CMU-FORMAT :UNIX
>>    :ECL-PDE :DLOPEN :CLOS :BOEHM-GC :ANSI-CL :COMMON-LISP
>>    :IEEE-FLOATING-POINT :PREFIXED-API :FFI :X86_64 :COMMON :ECL)
>>
>> Thanks,
>> Zach
>
> Hi everyone,
>
> I've had a chance to look at what seems to be happening and found at least
one source of trouble:
>
> The first error occurs due to the code trying to copy a literal block
where it should huffman decode a block, and this seems to happen due to a
problem with bit-stream-read-bits returning 0 where it should return 1 (as
called from decode-block):
>
> (defun bit-stream-read-bits (stream bits)
>  (declare (type bit-stream stream) (type (unsigned-byte 8) bits))
>  "Read single or multiple bits from the given bit-stream."
>  (loop while (< (bit-stream-bit-count stream) bits)
>        do
>     ;; Fill bits
>     (setf (bit-stream-bits stream)
>           (logior (bit-stream-bits stream)
>                   (the (unsigned-byte 29)
>                     (ash (bit-stream-get-byte stream)
>                          (bit-stream-bit-count stream))))
>           (bit-stream-bit-count stream) (+ (bit-stream-bit-count stream)
8)))
>  ;; Return properly masked bits
>  (if (= (bit-stream-bit-count stream) bits)
>      (prog1 (bit-stream-bits stream)
>        (setf (bit-stream-bits stream) 0
>              (bit-stream-bit-count stream) 0))      (prog1 (ldb (byte bits
0) (bit-stream-bits stream))
>        (setf (bit-stream-bits stream) (ash (bit-stream-bits stream) (-
bits))
>              (bit-stream-bit-count stream) (- (bit-stream-bit-count
stream) bit
> s)))))
>
> If I remove the (type (unsigned-byte 8) bits) declaration, the function
suddenly seems to perform as intended.  Given that the declaration is
correct, i.e. bits is always between 0 and 255 (actually it is much
smaller), the removal of the declaration should have no effect, but it does,
which might be indicative of a bug somewhere...
>
> Once this problem is dealt with the code errors out due to huffman decode
corruption, which seems indicative of further problems, but I have not yet
been able to look at that in more detail.

I have now looked at that problem, too, and it seems there are two further
sources of error in that area.  The first is that make-huffman-decode-tree
does not seem to work properly, because its changes of the length-count and
next-code arrays seem to not have any effect (or mutate data outside the
proper position):

(defun make-huffman-decode-tree (code-lengths)
 "Construct a huffman decode-tree for the canonical huffman code with
the code lengths of each symbol given in the input array."
 (let* ((max-length (reduce #'max code-lengths :initial-value 0))
        (next-code (make-array (1+ max-length) :element-type 'fixnum
                               :initial-element 0))
        (code-symbols (make-array (length code-lengths) :element-type
'fixnum
                                  :initial-element 0))
        (length-count (make-array (1+ max-length) :element-type 'fixnum
                                  :initial-element 0)))
   ;; Count length occurences and calculate offsets of smallest codes
   (loop for index from 1 to max-length
         for code = 0 then (+ code (aref length-count (1- index)))
         do
      (setf (aref next-code index) code)
         initially
      ;; Count length occurences
      (loop for length across code-lengths
            do
         (incf (aref length-count length))
            finally
         (setf (aref length-count 0) 0)))
   ;; Construct code symbols mapping
   (loop for length across code-lengths
         for index upfrom 0
         unless (zerop length)
           do
        (setf (aref code-symbols (aref next-code length)) index)
        (incf (aref next-code length)))
   ;; Return result
   (make-decode-tree :length-count length-count :code-symbols
code-symbols)))

The moment I move the looping constructs to their own function, the code
seems to work fine, however:

(defun helper (length-count max-length next-code code-lengths code-symbols)
   ;; Count length occurences and calculate offsets of smallest codes
   (loop for index from 1 to max-length
         for code = 0 then (+ code (aref length-count (1- index)))
         do
      (setf (aref next-code index) code)
         initially
      ;; Count length occurences
      (loop for length across code-lengths
            do
         (incf (aref length-count length))
            finally
         (setf (aref length-count 0) 0)))
   ;; Construct code symbols mapping
   (loop for length across code-lengths
         for index upfrom 0
         unless (zerop length)
           do
        (setf (aref code-symbols (aref next-code length)) index)
        (incf (aref next-code length))))

(defun make-huffman-decode-tree (code-lengths)
 "Construct a huffman decode-tree for the canonical huffman code with
the code lengths of each symbol given in the input array."
 (let* ((max-length (reduce #'max code-lengths :initial-value 0))
        (next-code (make-array (1+ max-length) :element-type 'fixnum
                               :initial-element 0))
        (code-symbols (make-array (length code-lengths) :element-type
'fixnum
                                  :initial-element 0))
        (length-count (make-array (1+ max-length) :element-type 'fixnum
                                  :initial-element 0)))
   (helper length-count max-length next-code code-lengths code-symbols)
   ;; Return result
   (make-decode-tree :length-count length-count :code-symbols
code-symbols)))

This seems very strange, and maybe is indicative of an error in
type-propagation?

Once that problem is resolved, there remain problems with the use of of-type
declarations in read-huffman-code, i.e. the of-type declarations for
length-count and code-symbols have to be removed, otherwise the accesses to
those arrays seem to also be off/wrong (which also seems indicative that the
compiler somehow gets confused on array accesses to fixnum arrays, maybe due
to the discrepancy between declared and upgraded element type fixnum vs.
SI:INTEGER64):

(defun read-huffman-code (bit-stream decode-tree)
 (declare (type bit-stream bit-stream) (type decode-tree decode-tree)
          (optimize (speed 3) (safety 0) (space 0) (debug 0)))
 "Read the next huffman code word from the given bit-stream and
return its decoded symbol, for the huffman code given by decode-tree."
 (loop with length-count of-type (simple-array fixnum (*))
         = (decode-tree-length-count decode-tree)
       with code-symbols of-type (simple-array fixnum (*))
         = (decode-tree-code-symbols decode-tree)
       for code of-type fixnum = (bit-stream-read-bits bit-stream 1)
         then (+ (* code 2) (bit-stream-read-bits bit-stream 1))
       for index of-type fixnum = 0 then (+ index count)
       for first of-type fixnum = 0 then (* (+ first count) 2)
       for length of-type fixnum upfrom 1 below (length length-count)
       for count = (aref length-count length)
       thereis (when (< code (the fixnum (+ first count)))
                 (aref code-symbols (+ index (- code first))))
       finally
    (error 'deflate-decompression-error
           :format-control
           "Corrupted Data detected during decompression: ~
            Incorrect huffman code (~X) in huffman decode!"
           :format-arguments (list code))))

has to be turned into

(defun read-huffman-code (bit-stream decode-tree)
 (declare (type bit-stream bit-stream) (type decode-tree decode-tree)
          (optimize (speed 3) (safety 0) (space 0) (debug 0)))
 "Read the next huffman code word from the given bit-stream and
return its decoded symbol, for the huffman code given by decode-tree."
 (loop with length-count
         = (decode-tree-length-count decode-tree)
       with code-symbols
         = (decode-tree-code-symbols decode-tree)
       for code of-type fixnum = (bit-stream-read-bits bit-stream 1)
         then (+ (* code 2) (bit-stream-read-bits bit-stream 1))
       for index of-type fixnum = 0 then (+ index count)
       for first of-type fixnum = 0 then (* (+ first count) 2)
       for length of-type fixnum upfrom 1 below (length length-count)
       for count = (aref length-count length)
       thereis (when (< code (the fixnum (+ first count)))
                 (aref code-symbols (+ index (- code first))))
       finally
    (error 'deflate-decompression-error
           :format-control
           "Corrupted Data detected during decompression: ~
            Incorrect huffman code (~X) in huffman decode!"
           :format-arguments (list code))))

Attached you can find a patch file detailing the necessary changes to make
it work on ECL.  Given that the declarations and code in question seems to
be perfectly conforming, I'd be very interested in the source of these
problems (note that the code might still have other problems on ECL, which
have not been tickled by the current test case).

Regs, Pierre.

PS: Feel free to forward this mail to the ecl list, since I have no posting
rights for the list.





--
Pierre R. Mai                                        <pmai at pmsf.de>
PMSF IT Consulting Pierre R. Mai                http://www.pmsf.de/
Blumenstr. 4                                          Goethestr. 32
85417 Marzling                                     87724 Ottobeuren
Tel. +49(0)8161/935 35 12                      +49(0)8332/936 69 13
Fax  +49(0)8161/935 35 08                      +49(0)8332/936 69 03
VAT ID / USt-ID Nr: DE 212838159                            Germany





-- 
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.common-lisp.net/pipermail/ecl-devel/attachments/20101005/acc64454/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ecl-fixes.patch
Type: application/octet-stream
Size: 3182 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/ecl-devel/attachments/20101005/acc64454/attachment.obj>


More information about the ecl-devel mailing list