From hans.huebner at gmail.com Mon Apr 7 11:41:31 2008
From: hans.huebner at gmail.com (Hans Huebner)
Date: Mon, 7 Apr 2008 13:41:31 +0200
Subject: [cl-xmlspam-devel] Some comments on cl-xmlspam
In-Reply-To:
References:
Message-ID:

Hi,

I have finally found the need to parse XML again and gave cl-xmlspam a
try. Here are a few comments:

- I could not get explicit matching of names with a namespace prefix
to work. I'm sorry that I can't even give a clear description of what
the problem is, except that I tried "foo:bar" and :foo.bar both with
and without namespaces declared to match foo, but never found xmlspam
to match my elements. Not specifying the namespace prefix works.
Bottom line: Debugging is rather difficult

- OPTIONAL-ATTRIBUTE is not documented

- The lack of backtracking is a serious restriction

- ATTRIBUTE (and ELEMENT, for what it is worth) should either check
that the argument is a string or keyword constant and bail out on
plain symbols or allow for the specification of variable element or
attribute names.

- There should be a way to access the KLACKS source so that one can
use KLACKS calls to read attribute values (or to process some of the
XML using KLACKS)

- The macro expansions quickly get rather large which will be a
problem for some compilers (SBCL tends to become very slow and
allocate huge amounts of memory with large macro expansions). This is
not an issue that I have, but it is something that will probably
happen rather soon.

Take all of this as constructive comments. I might even fix the
problems myself if you (rog) don't plan to invest any more time in
xmlspam.

Thanks!
Hans

From rogpeppe at gmail.com Fri Apr 11 18:59:37 2008
From: rogpeppe at gmail.com (roger peppe)
Date: Fri, 11 Apr 2008 20:59:37 +0200
Subject: [cl-xmlspam-devel] Some comments on cl-xmlspam
In-Reply-To:
References:
Message-ID:

hi,

thanks for taking a look at it, and making some constructive remarks;
it's much appreciated.

> - I could not get explicit matching of names with a namespace prefix
> to work. I'm sorry that I can't even give a clear description of what
> the problem is, except that I tried "foo:bar" and :foo.bar both with
> and without namespaces declared to match foo, but never found xmlspam
> to match my elements. Not specifying the namespace prefix works.
> Bottom line: Debugging is rather difficult

the prefix matching should work, but you need to do things within a
"with-namespace" form, and that form should be *inside* the
with-xspam-source form (i'd like to make it interchangeable, but it's
not easy given that i'm using lexical scope; the answer, i think, is
probably to combine the with-namespace and with-xspam-source macros
(so with-xspam-source takes an optional namespace argument), but that
requires a bit of juggling which i haven't made time for yet.).

note that the namespace prefixes will not be those that you see in the
XML file, but your own aliases defined using the with-namespace
primitive. i don't know if this is completely Right, but it does feel
right (as URIs are definitive, and two semantically identical
documents can use different namespace prefixes).

the examples should work. if they don't work for you, please let me
know!

re: debugging. what do you think might help?

> - OPTIONAL-ATTRIBUTE is not documented

i'll do that.

> - The lack of backtracking is a serious restriction

yes, i realise this, but unfortunately it's a necessary one if the
"strictly streaming" aspect of xspam is to be maintained, because any
element that we might need to backtrack to must be stored, and any
single element may contain an arbitrary amount of data.

one alternative would be to assume a seekable file and record a
byte-offset and tag context (and i have implemented something similar
in the past), but cxml doesn't make this easy.

i'd like to see examples where this restriction really makes an
impact.

obviously it'd be nice to have full RELAX-NG pattern matching, but my
hope is that cl-xmlspam hits a bit of a sweet spot without too much
complexity.

> - ATTRIBUTE (and ELEMENT, for what it is worth) should either check
> that the argument is a string or keyword constant and bail out on
> plain symbols or allow for the specification of variable element or
> attribute names.

yeah, i'll do this.

> - There should be a way to access the KLACKS source so that one can
> use KLACKS calls to read attribute values (or to process some of the
> XML using KLACKS)

see XSPAM-SOURCE

oops actually i see i missed that out from the documentation. i'll do
that (not immediately as i'm on holiday in provence)

briefly, xspam-source is a function that gives the current klacks
source. if you really want backtracking, you can just use this to
gather up a DOM element and use xpath on it (i'm presuming there's
something around that'll do this job).

it might be nice to have a non-streaming equivalent of cl-xmlspam that
would do backtracking.

> - The macro expansions quickly get rather large which will be a
> problem for some compilers (SBCL tends to become very slow and
> allocate huge amounts of memory with large macro expansions). This is
> not an issue that I have, but it is something that will probably
> happen rather soon.

interesting issue. when is large large? i could easily make runtime
size smaller and use higher-order functions a little more. it's a
stylistic issue that doesn't often raise its head in non-lisp
languages!

cheers,
rog.
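
The point made in the reply about namespace prefixes (that the URI is definitive and the prefix is only a local alias) can be sketched outside of Lisp. The following is not cl-xmlspam code. It is a small Python illustration using the standard library's xml.etree.ElementTree, and the namespace URI, the two sample documents, and the helper names (split_tag, qualified_names, expand) are all invented for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and documents, invented for this sketch.
NS_URI = "http://example.com/ns/foo"

# Same content, different prefixes ("foo" versus "x") bound to the same URI.
DOC_A = '<foo:bar xmlns:foo="http://example.com/ns/foo"><foo:baz>1</foo:baz></foo:bar>'
DOC_B = '<x:bar xmlns:x="http://example.com/ns/foo"><x:baz>1</x:baz></x:bar>'


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag


def qualified_names(xml_text):
    """Stream through a document and list each element as (uri, local name)."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    return [split_tag(elem.tag)
            for _event, elem in ET.iterparse(source, events=("start",))]


def expand(aliased_name, aliases):
    """Resolve 'alias:local' using the caller's own alias table, not the file's prefixes."""
    alias, _, local = aliased_name.partition(":")
    return aliases[alias], local


if __name__ == "__main__":
    my_aliases = {"f": NS_URI}  # the matcher's own alias, unrelated to "foo" or "x"
    wanted = expand("f:bar", my_aliases)
    for doc in (DOC_A, DOC_B):
        print(wanted in qualified_names(doc))
```

Both documents match the pattern "f:bar" even though neither uses the prefix "f", because the comparison is done on the URI and local name. This also suggests why matching on the prefix as it appears in the XML file would not find anything when the matcher resolves names through its own alias table.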