[commonqt-devel] recent commonqt changes

Sat Dec 25 17:31:56 UTC 2010

Hi,

Quoting Ivan Shvedunov (ivan4th at gmail.com):
>   Dear colleagues,
>   I've pushed the changes I've accumulated over the last week
> to my github repository ( https://github.com/ivan4th/commonqt/ ).
> Please review them and let me know whether I should push some
> of them to the main repo.

Thanks.  I have gone ahead and pushed f5c8f374 to the main repo.  This
includes all except the caching changes.  (Please speak up if that
partial push leaves anything broken.)

Here are my thoughts on the caching:

The are two code paths to consider: The fast path, which hits the cache,
and the slow path, which needs to go through the full dispatch.  But the
latter currently includes situations that do not use the cache at all,
even though they should, or that are likely to exceed the cache size.

Possible areas of improvement include:

 a. Make the fast path even faster.  This is what I had been working on
    (before I took a pause with CommonQt development).

    There is certainly still room for improvement; depending on the
    number of arguments perhaps somewhere between a 3x or 10x speedup
    on SBCL.  Clozure is perhaps 5x slower than SBCL, which we might
    hopefully improve on.

 b. Make the slow path faster.  In principle a worthy goal:

    Although only some code will truly benefit from this, the potential
    speedup for affected code is tremendous, because the slow path is
    currently extremely slow.  I don't have numbers, but back when
    CommonQt always went through the slow path, it was basically
    unusable on Clozure.

 c. Fix the cases that do not benefit from the cache yet at all:
    There many such cases at the moment.

Your marshalling changes fall under "making the slow path faster".  I
appreciate this goal in principle, but I am somewhat reluctant to push
those changes just yet, for the following reasons:

  1. Benchmarks.  Let's write benchmarks for the slow path before trying
     to improve it.  (My existing micro-benchmarks are only for the fast
     path.)

  2. Thread safety and use of hash tables.  CommonQt is meant to be
     thread-safe for ordinary operations like method invocations.  I
     know that many Lisps support thread-safe hash tables, but I think
     it would be nice if we could build caching without having to depend
     on somewhat unportable features for these data structures.

     The existing caching code reimplements hash tables and takes care
     to atomically replace single elements of a vector for this reason.

     (Why is thread safety needed at all?  Qt only has one GUI thread,
     but additional Lisp threads appear as Qt threads and are useful for
     all non-GUI code, e.g. Qt I/O multiplexing.  BTW, note that
     startup-related code that initializes smoke modules is an
     exception: Initialization is not thread safe at this point.  Users
     are expected to do this only once.)

  3. Duplicate caching: Method lookup would have a cache, which calls
     marshalling functions on cache misses, which would also have a
     (different) cache.  Can't we have one cache that covers both?
     Perhaps this is fundamentally the same argument as 4.:

  4. Finally, a thought that is not an objection, but merely a question
     of priorities:

     Most code should hit the fast path anyway, and I consider it a bug
     that %METHOD-INVOCATION-CALLBACK doesn't do so.

So I'd like to work on c. in preference to b. for the moment.
The analysis/plan is basically this:

  - The current cache is a "call site"-dependent cache.  I used this
    approach because it is the easiest.  It's not a bad idea in
    principle, because it needs "hash table" entries (in the generic
    sense, not the cl:hash-table sense) only for those argument
    signatures actually encountered for a specific call site.

  - It has a big downside, however: It performs badly for functions that
    use different methods and argument signatures every time.  In
    general, if you don't do
      (defun yourfun () (#_yourcall))
    but rather
      (defun yourfun () (helperfunction))
      (defun helperfunction () (#_yourcall))
    the cache moves from yourfun to helperfunction.  If there are other
    callers to helperfunction, they suddenly share the same cache.

The solution is move away from call-site-specific caches, and towards
class-specific and method(-name)-specific caches.  Basically we would
make a pre-allocated lisp array of cache data structures (probably at
smoke module loading time).

One simple model for the run-time dispatch would be to take the class id
as a reference into that array, and compute a hash key out of method
name and signature.

Another similar approach (perhaps closer to the current code) is to
index on the method name first, and then hash on class and argument
signature.

(But note that we need to cache the step that computes the method id, so
we cannot use that as a hash key.)

The second approach is very similar to how CLOS is usually implemented,
I think.  Our problem is very close to CLOS in any case, and rather
unlike vtable dispatch in normal OO languages.  Hopefully our code can
be much simpler than CLOS though; e.g. because we can work out in
advance how large cache table need to be for the usual smoke modules, so
that we don't need to enlarge and re-hash things.

[...]
> 5. UNMARSHALLER now caches unmarshalled closures.
> This seems to improve performance a bit in
> %METHOD-INVOCATION-CALLBACK as it doesn't cache

Side note: The specific code in %METHOD-INVOCATION-CALLBACK is obviously
distinct for ordinary method dispatch, but I think it could use the same
global cache tables.

> anything itself. Also, I've got rid of redundant NONCONST-NAME
> function I've introduced previously. I did some tests
> on a QListView code with custom model with SBCL
> statistical profiler and seems like all this indeed helps
> (will later distill that test code into automatic benchmark).

Thank you, a benchmark would be most helpful.

d.