[cl-unification-devel] Questions and patches involving cl-unification's MATCH* macros
Marco Antoniotti
marcoxa at cs.nyu.edu
Mon Jan 18 06:53:09 UTC 2010
Hello there.
thanks a lot for all the work. I have read through the messages and I
agree with your analysis with a few comments.
I think all the changes you propose are worth including.
Let me comment on a few of these.
On Jan 16, 2010, at 06:43 , Pixel // pinterface wrote:
> Greetings! I was looking into improving the efficiency of MATCH-
> CASE and
> while working on that I've ended up with a few questions regarding
> some
> implementation decisions, and drafted up some patches both to help
> you see
> my thoughts and for consideration of inclusion if you think they're
> in the
> right direction.
>
> Patches are numbered in the order created, and generally depend on
> lower-numbered patches (not always intentionally--it's hard to avoid
> inter-patch dependencies when making multiple changes to a small
> area). I
> thought I made the patches in a reasonable order while I was making
> them,
> but I find myself referring to them in a different sequence than
> what I made
> them in, so apologies if that causes any confusion.
>
> I've also put up a darcs mirror (in darcs2 format) of the CVS
> repository
> plus the attached patches, in case you find that more convenient.
> You can
> find it at
> http://repo.kepibu.org/cl-unification/
>
>
>
> ** Duplicate code
>
> There's a fair bit of code duplication between MATCH, MATCHF, and
> MATCHING in the form of GENERATE-VAR-BINDINGS and the production of
> the let forms that surround FORMS in those macros. (Nevermind that
> MATCH and MATCHF are identical except in what they do with template.)
Yep. Most of the code in the MATCH-BLOCK file is cut'n'paste.
>
> Patches 0 and 2 help somewhat.
>
> 0) Extract template handling of MATCH[ING] into %TEMPLATE-FOR-MATCH
> Should be fairly self-explanatory; not much of a win in lines.
>
> 2) Extract the bits that wrap forms with bindings for template
> variables
> This patch extracts the assorted flet GENERATE-VAR-BINDINGS into a
> single common function. It's slightly less straightforward than
> that,
> however, because it also extracts the (let ...) forms which actually
> used those bindings.
>
> This results in a decent win for code size, but in some cases it
> swaps the order of execution of %TEMPLATE-FOR-MATCH and
> COLLECT-TEMPLATE-VARS. I'm pretty sure this doesn't have any
> noticable effect, but thorough testing is probably wise.
This is good. However, I would set up a bunch of regression tests
before moving on and change things. I usually use the test suite from
Franz, but I'd be willing to change to another test suite.
>
> ** (unify* ...) and (ignore-errors (unify ...))
>
> The MATCHING macro produces the form (ignore-errors (unify ...)),
> presumably
> because it existed before the convenience function UNIFY*. But I'm
> curious
> about UNIFY* and/or the ignore-errors form in MATCHING. Is ignore-
> errors
> the proper form in either of these places, or was it just easier to
> type
> than
> (handler-case (unify ...)
> (unification-failure () nil))
> ?
Most implementations expand IGNORE-ERRORS in something like that. But
I agree that (as you point out later on) that UNIFY* should be written
as
(defmethod unify* (a b &optional (env (make-empty-environment)))
(handler-case (unify a b env)
(unification-failure () nil))
In this way other conditions are passed on.
>
> Looking at some of the assorted UNIFY methods, I see UNIFICATION-
> FAILUREs
> aren't the only possible condition which might be thrown during
> unification
> (e.g., LAMBDA-LIST-PARSING-ERROR is also a possibility). However,
> while
> MATCHING treats any error as a failure to unify, MATCH, MATCHF,
> MATCH-CASE,
> and MATCHF-CASE all limit themselves to only UNIFICATION-FAILUREs.
> This
> inconsistency leads me to wonder which is correct.
>
> I assume only UNIFICATION-FAILUREs should be treated as such in
> patch 5
> because all ERRORs seems a bit heavy-handed in a language with such an
> excellent condition system, but can see the argument that any error /
> is/ a
> failure to unify.
>
> 1) Use (unify* ...) rather than (ignore-errors (unify ...))
> Same thing, so might as well use the convenience function.
>
> Not particularly exciting. And this bit is changed again in patch 5
> to ignore only UNIFICATION-FAILUREs.
Yep. See above.
> 5) Make MATCHING agree with MATCH[F][-CASE] about the conditions of
> failure
> Rather than skipping to the next clause on any error,
> UNIFICATION-FAILUREs--and /only/ UNIFICATION-FAILUREs--skip to the
> next
> clause.
>
> Under the assumption that UNIFY* is implemented as intended, this
> creates
> a new utility function UNIFY** (terrible name, I know!) which is
> like
> UNIFY* but ignores /only/ UNIFICATION-FAILUREs rather than all
> ERRORs.
>
> This of course results in a library-user-visible change to how
> MATCHING
> operates, which may or may not be considered acceptable.
Yep.
>
>
> ** (matching (otherwise ...))
>
> 3) Fix (matching (otherwise ...))
> (matching (otherwise ...)) expands into (cond (otherwise ...)),
> which
> generates an unbound-variable error when executed, because COND does
> not special-case OTHERWISE as CASE does.
>
> Pretty self-explanatory and not very interesting.
Good catch.
> ** &body vs. &rest
>
> 4) Use &body instead of &rest for (arguably) prettier auto-indentation
> I prefer the smaller indentation provided with using &body instead
> of &rest. Not a big deal either way, but I was in there so I
> figured I'd throw it in the mix. Fortunately, if you disagree,
> this patch shouldn't affect any of the others. :)
I agree. It is better self documentation as well.
> ** Multiple #:UNIFICATION-ENVs in MATCHING
>
> 6) Only use one variable to store the unification environment in
> MATCHING
> Because of the way MATCHING expands, and what UNIFY** returns, each
> (setf #:env (unify** ...))
> call will do one of two things: it will set #:env to NIL or it
> will set
> #:env to an ENVIRONMENT structure.
>
> If #:env is set to NIL--the same value it entered the (setf)
> with!--the
> COND will continue on to the next clause.
>
> If #:env is set to an ENVIRONMENT structure, none of the remaining
> (setf)
> clauses will be evaluated.
>
> Thus, because the variable will only ever be set to a non-nil
> value once,
> this is perfectly safe.
I think this is ok as well.
>
> ** MATCH-CASE
>
> MATCH-CASE has the surprising behavior of
> (ignore-errors
> (match-case ("foo")
> ("foo" (error 'unification-failure))
> (t :default)))
> => :default
> Hardly CASE-like behavior!
>
> I think it sensible to redefine MATCH-CASE in terms of MATCHING,
> rather than
> MATCH. E.g., so the above would expand into something like
> (ignore-errors
> (let ((#:object-var "foo"))
> (matching (:errorp nil :default-substitution nil :matching-
> named nil)
> (("foo" #:object-var) (error 'unification-failure))
> (t :default))))
> which greatly simplifies the definition of MATCH-CASE and makes it
> much more
> CASE-like in behavior.
>
> 7) Redefine MATCH-CASE in terms of MATCHING
> This both greatly simplifies the MATCH-CASE macro as well as its
> expansion.
>
> HOWEVER, this version is *NOT* 100% compatible with the previous
> version. Specifically, UNIFICATION-FAILUREs signalled from within
> clause-forms will /not/ cause the next unification clause to be
> attempted, but will instead propogate outward as the -case name
> suggests they should.
>
> That is,
> (ignore-errors
> (match-case ("foo")
> ("foo" (error 'unification-failure ...))
> (t :default)))
> => :default ;; before patch
> => nil, #<unification-failure> ;; after patch
>
> Patch 7 suggests that MATCHING, MATCH-CASE, MATCHF-CASE, and the
> possible
> future MATCH-COND and MATCHF-COND are all fairly trivial variations
> on each
> other. I'll be happy to explore that possibility in future patches
> if it's
> of interest, but I'm already asking you to look at too many patches
> at once,
> so I'll refrain for now. :)
I agree with this as well. The behavior you suggest is the one that's
needed.
> ** MATCH
>
> Though documented, MATCH's behavior when its body forms signal
> UNIFICATION-FAILURE seems at odds with expectations, for pretty much
> the
> same reason the behavior seems nonsensical in MATCH-CASE. Was it a
> deliberate design decision, or a side-effect of how MATCH was
> implemented?
Never attribute to malice what can be attributed to incompetence. Of
course it is a side-effect of the implementation :) I knew I needed a
test suite :)
>
> If a side effect of implementation, would it make sense to redefine
> MATCH to
> have an expansion more like
> (match ("foo" "foo" :errorp <e?> :error-value <ev>) <body>) =>
> (m-v-b (#:env #:error) (unify** "foo" "foo")
> (cond
> ((and #:error <e?>) (error #:error))
> (#:error <ev>)
> (t (let* (#| template vars |#)
> <body>))))
> so UNIFICATION-FAILUREs inside <body> can be independent of errorp?
>
> (No patch for this one yet.)
>
>
> ** :MATCH-NAMED, :MATCHING-NAMED, :MATCH-CASE-NAMED
>
> It seems a bit redundant to have <macro-name>-named arguments for
> block
> names rather than simply having a :named argument as in, say, LOOP.
> Any
> particular reason it wasn't done that way?
Having the :NAMED option would be preferable, with NIL or the macro
name as fall-back.
> (No patch for this one, either.)
>
>
> ** Apologies
>
> Sorry about the massive brain dump. I /may/ have gotten carried
> away. :)
Absolutely not. All of this is most welcome. I am just a little
swamped (what an euphemism!) and maybe not all that reactive. But
let's keep the discussion going.
all the best
Marco
More information about the cl-unification-devel
mailing list