[commonqt-devel] recent commonqt changes
David Lichteblau
david at lichteblau.com
Sat Dec 25 17:31:56 UTC 2010
Hi,
Quoting Ivan Shvedunov (ivan4th at gmail.com):
> Dear colleagues,
> I've pushed the changes I've accumulated over the last week
> to my github repository ( https://github.com/ivan4th/commonqt/ ).
> Please review them and let me know whether I should push some
> of them to the main repo.
Thanks. I have gone ahead and pushed f5c8f374 to the main repo. This
includes all except the caching changes. (Please speak up if that
partial push leaves anything broken.)
Here are my thoughts on the caching:
The are two code paths to consider: The fast path, which hits the cache,
and the slow path, which needs to go through the full dispatch. But the
latter currently includes situations that do not use the cache at all,
even though they should, or that are likely to exceed the cache size.
Possible areas of improvement include:
a. Make the fast path even faster. This is what I had been working on
(before I took a pause with CommonQt development).
There is certainly still room for improvement; depending on the
number of arguments perhaps somewhere between a 3x or 10x speedup
on SBCL. Clozure is perhaps 5x slower than SBCL, which we might
hopefully improve on.
b. Make the slow path faster. In principle a worthy goal:
Although only some code will truly benefit from this, the potential
speedup for affected code is tremendous, because the slow path is
currently extremely slow. I don't have numbers, but back when
CommonQt always went through the slow path, it was basically
unusable on Clozure.
c. Fix the cases that do not benefit from the cache yet at all:
There many such cases at the moment.
Your marshalling changes fall under "making the slow path faster". I
appreciate this goal in principle, but I am somewhat reluctant to push
those changes just yet, for the following reasons:
1. Benchmarks. Let's write benchmarks for the slow path before trying
to improve it. (My existing micro-benchmarks are only for the fast
path.)
2. Thread safety and use of hash tables. CommonQt is meant to be
thread-safe for ordinary operations like method invocations. I
know that many Lisps support thread-safe hash tables, but I think
it would be nice if we could build caching without having to depend
on somewhat unportable features for these data structures.
The existing caching code reimplements hash tables and takes care
to atomically replace single elements of a vector for this reason.
(Why is thread safety needed at all? Qt only has one GUI thread,
but additional Lisp threads appear as Qt threads and are useful for
all non-GUI code, e.g. Qt I/O multiplexing. BTW, note that
startup-related code that initializes smoke modules is an
exception: Initialization is not thread safe at this point. Users
are expected to do this only once.)
3. Duplicate caching: Method lookup would have a cache, which calls
marshalling functions on cache misses, which would also have a
(different) cache. Can't we have one cache that covers both?
Perhaps this is fundamentally the same argument as 4.:
4. Finally, a thought that is not an objection, but merely a question
of priorities:
Most code should hit the fast path anyway, and I consider it a bug
that %METHOD-INVOCATION-CALLBACK doesn't do so.
So I'd like to work on c. in preference to b. for the moment.
The analysis/plan is basically this:
- The current cache is a "call site"-dependent cache. I used this
approach because it is the easiest. It's not a bad idea in
principle, because it needs "hash table" entries (in the generic
sense, not the cl:hash-table sense) only for those argument
signatures actually encountered for a specific call site.
- It has a big downside, however: It performs badly for functions that
use different methods and argument signatures every time. In
general, if you don't do
(defun yourfun () (#_yourcall))
but rather
(defun yourfun () (helperfunction))
(defun helperfunction () (#_yourcall))
the cache moves from yourfun to helperfunction. If there are other
callers to helperfunction, they suddenly share the same cache.
The solution is move away from call-site-specific caches, and towards
class-specific and method(-name)-specific caches. Basically we would
make a pre-allocated lisp array of cache data structures (probably at
smoke module loading time).
One simple model for the run-time dispatch would be to take the class id
as a reference into that array, and compute a hash key out of method
name and signature.
Another similar approach (perhaps closer to the current code) is to
index on the method name first, and then hash on class and argument
signature.
(But note that we need to cache the step that computes the method id, so
we cannot use that as a hash key.)
The second approach is very similar to how CLOS is usually implemented,
I think. Our problem is very close to CLOS in any case, and rather
unlike vtable dispatch in normal OO languages. Hopefully our code can
be much simpler than CLOS though; e.g. because we can work out in
advance how large cache table need to be for the usual smoke modules, so
that we don't need to enlarge and re-hash things.
[...]
> 5. UNMARSHALLER now caches unmarshalled closures.
> This seems to improve performance a bit in
> %METHOD-INVOCATION-CALLBACK as it doesn't cache
Side note: The specific code in %METHOD-INVOCATION-CALLBACK is obviously
distinct for ordinary method dispatch, but I think it could use the same
global cache tables.
> anything itself. Also, I've got rid of redundant NONCONST-NAME
> function I've introduced previously. I did some tests
> on a QListView code with custom model with SBCL
> statistical profiler and seems like all this indeed helps
> (will later distill that test code into automatic benchmark).
Thank you, a benchmark would be most helpful.
d.
More information about the commonqt-devel
mailing list