[armedbear-devel] some questions about r12503
Mark Evenson
evenson at panix.com
Mon Feb 22 20:20:42 UTC 2010
On 2/22/10 6:21 PM, Ville Voutilainen wrote:
> The separate pathname matcher for jars looks odd to me. I'd expect the
> listing to give whatever it gives, and the
> matching condition in directory.lisp to filter it. Is jar filtering so
> different that it requires a different matcher?
LIST-DIRECTORY lists the jar directory contents including directories,
while MATCH-WILD-JAR-PATHNAME simply uses PATHNAME-MATCH-P to determine
what to return. Since jar entries that are directories always have a
trailing "/", which is not true of pathnames on the filesystem (#p"/tmp"
could be a file or a directory), the two are not always equivalent.
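To make the trailing "/" point concrete, here is a minimal sketch using only the plain JDK (this is not ABCL code; the class name JarEntryKind and the entry names are made up for illustration). java.util.zip.ZipEntry decides "directory or not" purely from the entry name, while java.io.File has to ask the filesystem:

```java
import java.io.File;
import java.util.zip.ZipEntry;

// Hypothetical illustration class, not part of ABCL.
public class JarEntryKind {
    // A zip/jar entry counts as a directory exactly when its name ends in "/".
    static boolean jarEntryIsDirectory(String entryName) {
        return new ZipEntry(entryName).isDirectory();
    }

    // A filesystem path carries no such marker; the filesystem must be consulted.
    static boolean fileIsDirectory(String path) {
        return new File(path).isDirectory();
    }

    public static void main(String[] args) {
        System.out.println(jarEntryIsDirectory("lib/"));
        System.out.println(jarEntryIsDirectory("lib/foo.lisp"));
        System.out.println(fileIsDirectory(System.getProperty("java.io.tmpdir")));
    }
}
```

The name alone settles the question for a jar entry, which is why a name-based matcher and a filesystem lister can disagree about the same-looking path.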
The jar pathname part of LIST-DIRECTORY is currently unused. I
implemented it first, tried to patch the Lisp in "directory.lisp" to use
it, but ran into problems that weren't understandable. I stepped back,
and noticed that the algorithm for wildcard matching for filesystems was
fundamentally different from jar files (see next comment), implemented
that algorithm as MATCH-WILD-JAR-PATHNAME, saw that it worked well
enough, and went with that for a commit.
Overall, I do suspect that the way I implemented jar pathnames is not
totally optimal, but in the last six weeks I have not been able to
improve on the basic design of using a list for DEVICE. Often there are
points in reworking 'Pathname.java' where I felt "Why am I doing this
same sort of code again? Surely this is a sign of a fundamental problem in
abstraction." Sometimes I found a better way, sometimes not, but I was
never able to come up with a better basic assumption (to use DEVICE as a
list of pathnames for the jar file, DIRECTORY as the relative path
within that jar). I have come to the conclusion that implementing jar
pathnames the way I did pushes a lot of complexity to the associated
primitives in Pathname.java, but ultimately makes things quite a bit easier for
the user of this abstraction. As evidence for this, I would argue that
my approach *has* dramatically simplified the code in 'Load.java' (and
'Lisp.java' and 'AutoloadedFunctionProxy.java'). A weak point is that
code fails if it assumes that the DEVICE field is always a string, or that
(truename (pathname-directory (truename pathname))) always yields a
pathname whenever (truename pathname) succeeds. Since a lot of PATHNAME
behavior in ANSI is implementation dependent, we are still an ANSI CL,
but we have very different usage of the DEVICE pathname component than
is commonly assumed.
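As a rough illustration of the "DEVICE is a list, DIRECTORY is relative to the jar" idea, the sketch below splits a jar-style URL string on the standard "!/" separator into a list of jar locations and an entry path. Everything here (the class JarPathSketch, its fields, and the handling of nested "jar:jar:" prefixes) is hypothetical and is not how Pathname.java is actually written:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not ABCL's Pathname.
public class JarPathSketch {
    final List<String> device;  // jar locations, outermost first
    final String entryPath;     // path relative to the innermost jar ("" if none)

    JarPathSketch(List<String> device, String entryPath) {
        this.device = device;
        this.entryPath = entryPath;
    }

    static JarPathSketch parse(String s) {
        String rest = s;
        // Strip one or more leading "jar:" scheme prefixes.
        while (rest.startsWith("jar:")) {
            rest = rest.substring(4);
        }
        // Limit -1 keeps a trailing empty part, so "x.jar!/" yields entry "".
        String[] parts = rest.split("!/", -1);
        if (parts.length < 2) {
            throw new IllegalArgumentException("no \"!/\" separator in: " + s);
        }
        // All parts but the last name jars; the last is the entry path.
        List<String> jars =
            new ArrayList<>(Arrays.asList(parts).subList(0, parts.length - 1));
        return new JarPathSketch(jars, parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        JarPathSketch p = parse("jar:file:/tmp/foo.jar!/lib/bar.lisp");
        System.out.println(p.device + " " + p.entryPath);
    }
}
```

Code that expects the device to be a single string would break on the list-valued field here, which is the same weak point described above.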
An alternative might have been to subclass PATHNAME as PATHNAME-JAR, but
when I analyzed that approach it seemed to involve a lot more (if
(pathname-jar-p pathname) option1 option2) than I wanted. If all the
system code taking a PATHNAME as an argument were to be defined with
generic functions this would be considerably more attractive (and
easier). But the dirty secret of CLOS is that it's a bolt-on via
macros, which all CL implementations that I have studied bootstrap after
the base system is in place. CLOS isn't even present in ABCL when the
user gets to "CL-USER>", right?
> Same question applies to the wildcard matching, jar listing seems to
> do the wildcard matching in java,
> rather than in lisp? That's also different from the way directory
> listings are handled.
DIRECTORY involves wildcards for any non-trivial use (its non-wildcard
use actually doesn't even distinguish a directory from a file!). The
algorithm for wildcard DIRECTORY is fundamentally different for the
filesystem than for a jar, as follows. For a filesystem, you have to
branch at each wildcard in the pathname. For a jar file, you are simply
running down the list of all entries in the jar file contents. One
could probably implement the second (jar pathname directory listing) in
terms of the first, but it wouldn't make much sense and wouldn't be
necessary. I couldn't do it easily, having run into problems with my
LIST-DIRECTORY implementation, although I did give it about an hour's
effort.
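The "run down the list of all entries" shape of the jar case can be sketched as below. This is only an illustration under simplifying assumptions: the class JarScan and its methods are hypothetical, and the glob rule ("*" matches any run of characters other than "/") is a stand-in, not the semantics of PATHNAME-MATCH-P or MATCH-WILD-JAR-PATHNAME:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of ABCL.
public class JarScan {
    // Translate a simplified glob into a regex: "*" matches any run of
    // characters other than "/"; every other character is literal.
    static Pattern globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                sb.append("[^/]*");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // One linear pass over the flat entry list; no branching per wildcard.
    static List<String> matchEntries(List<String> entryNames, String glob) {
        Pattern p = globToRegex(glob);
        List<String> hits = new ArrayList<>();
        for (String name : entryNames) {
            if (p.matcher(name).matches()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
            "a/", "a/x.lisp", "a/y.abcl", "a/sub/w.lisp", "b/z.lisp");
        System.out.println(matchEntries(entries, "a/*.lisp"));
    }
}
```

A filesystem walk, by contrast, would have to list each directory level and recurse wherever a wildcard component appears, because there is no flat list of all paths to filter.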
> The list-directory primitive sorely needs to be split into two
> functions (listJar and listDirectory), it's getting long-winded.
> That's not a high-priority issue, but we need to mind function length,
> it's a huge readability issue.
I am an "if the function doesn't fit into one 80x25 Emacs buffer it
should be split" kinda guy, but the ABCL codebase violates that maxim
at so many points (q.v. compiler-pass2.lisp) that I don't try to
religiously follow that principle here. I'd be happy to do such
splitting, but would have thought that you of all people would have
jumped on my back about the penalty for a further push to the stack. My
rule of thumb is that for code refactoring like you have done with the
string function where the codepath is used more than once, such
splitting is worth it. But for functions like LIST-DIRECTORY, we should
keep it all in one method call for efficiency. For what it's worth, I
*did* try to figure out how to factor the common code between
LIST-DIRECTORY and PATHNAME-MATCH-P out into something separate, but
Pathname.wildcardMatches() was the only thing that looked plausible to
my brain.
Hopefully I understood your questions: push back if I haven't!
yers in cons,
Mark
--
"A screaming comes across the sky. It has happened before, but there
is nothing to compare to it now."