[elephant-devel] Re: Severely poor performance in some obvious cases

Robert L. Read read at robertlread.net
Thu Nov 29 15:06:17 UTC 2007


On Thu, 2007-11-29 at 16:16 +0200, Alex Mizrahi wrote:
>  AP> Am I missing something really basic here?
> 
> actually it's quite strange situation that you have *many* employees with 
> same name but you want just one (random one). i cannot imagine why one needs 
> this in real world..
> 
> or you're saying that all have different names, but it still does consing? 
> this could be a bug then..
> 

With respect to consing, it is important to point out that our
serializer conses heavily (for the postmodern and CL-SQL backends).  This
is because I used base64 to transform the byte-streams into character
strings.
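
To give a rough sense of the size cost alone, here is a sketch of the
standard padded base64 arithmetic (every 3 input bytes become 4 output
characters).  This is arithmetic only, not Elephant's serializer, and
the function names are hypothetical:

```lisp
;; Size arithmetic for standard padded base64; not Elephant code.
;; BASE64-LENGTH and BASE64-EXPANSION are hypothetical helper names.

(defun base64-length (n-bytes)
  "Number of characters padded base64 needs to encode N-BYTES bytes."
  (* 4 (ceiling n-bytes 3)))

(defun base64-expansion (n-bytes)
  "Encoded characters per input byte, as a float."
  (float (/ (base64-length n-bytes) n-bytes)))

;; Example: how many characters one million bytes turn into.
(base64-length (* 1000 1000))
```

The encoded form is always at least a third larger than the byte
sequence, before counting however many bytes the Lisp implementation
and the database use per character.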

Most relational databases (including Postgres) provide a way of storing
byte sequences directly.  However, this is not standardized and not
portable.  In fact, I spoke to Kevin Rosenberg, the author of CL-SQL,
and CL-SQL does not have a good way to do it.

However, since postmodern is Postgres-specific, it could avoid this
step by using a back-end specific serializer.  I suspect this would
have a huge impact on performance, both by decreasing consing (minor)
and by decreasing the amount of disk I/O that has to be done (major).

(BDB doesn't have this problem, because it natively uses byte-sequences,
not character-sequences.)

Please see the code below, which demonstrates that pushing 1 million
bytes through the serializer (without even going to the database)
creates about 8.7 million bytes of garbage in 0.433 seconds.  (This is
on a new, fast, 2 gigabyte 64-bit machine, against postmodern.)

(asdf:operate 'asdf:load-op :elephant)
(asdf:operate 'asdf:load-op :ele-clsql)
(asdf:operate 'asdf:load-op :postmodern)

(asdf:operate 'asdf:load-op :elephant-tests)
(in-package "ELEPHANT-TESTS")

(setq *default-spec* *testpm-spec*)

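(setq teststring "supercalifragiliciousexpialidocious")
(setq testint 42)

;; Total number of characters to push through the serializer.
(setq totalserializationload (* 1000 1000))

;; Number of round trips needed to reach that total.
(setq n (ceiling (/ totalserializationload (length teststring))))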

(open-store *default-spec*)

(time
 (dotimes (x n)
   (in-out-value teststring)))

(close-store)

*****
Results in:
Evaluation took:
  0.433 seconds of real time
  0.172974 seconds of user run time
  0.058991 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  8,731,728 bytes consed.
NIL
ELE-TESTS> 
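
Putting those numbers in per-call terms, as a back-of-the-envelope
sketch: the names below are hypothetical, the string length is counted
from the test string above, and it assumes that all of the consing
reported by TIME comes from the loop:

```lisp
;; Back-of-the-envelope arithmetic on the TIME output above.
;; All names here are hypothetical and exist only for this illustration.

(defparameter *string-length* 35)          ; (length teststring)
(defparameter *total-load* (* 1000 1000))  ; characters pushed through
(defparameter *bytes-consed* 8731728)      ; "bytes consed" from TIME

;; Same computation as N in the test code.
(defparameter *iterations* (ceiling *total-load* *string-length*))

(defun consed-per-call ()
  "Average bytes consed by one IN-OUT-VALUE round trip."
  (float (/ *bytes-consed* *iterations*)))

(defun consed-per-character ()
  "Average bytes consed per character of the test string."
  (float (/ *bytes-consed* (* *iterations* *string-length*))))
```

That works out to a little over 300 bytes of garbage per round trip of
a 35-character string.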

I personally think making a back-end specific serializer to avoid the
base64 encoding would make a significant performance difference.  This
is not much of an issue for me personally, since I keep everything
cached in memory anyway.


-- 
Robert L. Read, PhD
http://konsenti.com



