[Ecls-list] Quadratic behavior of sequence functions when applied to lists

Andy Hefner ahefner at gmail.com
Tue Mar 16 20:01:32 UTC 2010


While making some performance measurements, I observed
[REMOVE,DELETE]-DUPLICATES to take cubic time. This is a bit
ridiculous when the common cases can easily be done in linear time, so
I investigated further. It turns out many of the sequence functions
(namely, those generated by the DEFSEQ macro) have quadratic behavior
when applied to lists due to using ELT inside a loop on array indices.
I measured the runtime of POSITION, COUNT, and FIND with lists as
arguments, and found them to take time quadratic in the size of the
list.

Rather than reworking all the sequence functions, I added a quick hack
to DEFSEQ where, for lists, it records the index of the last element
accessed, and the sublist whose car is that element. So long as the
index increases monotonically, it can walk down the remaining list to
arrive at the new position and guarantee constant-time access. This
makes POSITION and friends linear, at least when :from-end is false.
Requires a modification to NSUBSTITUTE as it modifies the sequence
element. I've attached this patch. I don't claim it's the best
solution, but it gets the job done.

It also fixes a problem in REMOVE-DUPLICATES and DELETE-DUPLICATES.
They contain what appears to be a fast path for the simplest case
involving a list argument and no start or end keywords, but among the
test to activate it was "(null start)", while the start keyword
defaults to 0, so this path is unlikely to be executed. As I read the
Hyperspec, nil isn't a legal value for :start anyway. I changed the
test to (zerop start). There's also a minor typo in an error message
elsewhere fixed in the patch.

I've run the test suite and nothing new seems to break. ECL builds
okay, and I've tried my own software out, which seems to work. I
haven't measured the performance impact of my change when operating on
vectors.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ecl-defseq.patch
Type: text/x-patch
Size: 3858 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/ecl-devel/attachments/20100317/30d91964/attachment.bin>


More information about the ecl-devel mailing list