[closure-devel] How to disregard namespaces

Andrei Stebakov lispercat at gmail.com
Fri Mar 4 18:54:00 UTC 2011


Can it be resolved at the level of (defun rename-attribute (attribute prefix uri) ...)?
So instead of signaling
    ((zerop (length uri))
     (stp-error "attribute with prefix but no URI"))
it would check some global variable like *ignore-namespaces* and just continue?
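
Roughly what I have in mind, as a standalone sketch. Both *ignore-namespaces* and check-attribute-namespace are hypothetical names made up for illustration; neither exists in cxml-stp, and the real rename-attribute does more than this:

```lisp
;; Hypothetical sketch only: these names are NOT part of cxml-stp.
(defvar *ignore-namespaces* nil
  "When true, a prefixed attribute without a namespace URI is accepted.")

(defun check-attribute-namespace (prefix uri)
  "Return T if the PREFIX/URI pair is acceptable, otherwise signal an error."
  (cond
    ;; No prefix at all: nothing to check.
    ((zerop (length prefix)) t)
    ;; Prefix with a non-empty URI: the normal, valid case.
    ((plusp (length uri)) t)
    ;; Prefix without URI: tolerated only when the switch is on.
    (*ignore-namespaces* t)
    (t (error "attribute with prefix but no URI"))))

;; Binding the variable dynamically keeps the strict default elsewhere.
(let ((*ignore-namespaces* t))
  (check-attribute-namespace "somens" ""))
```

With the default binding of NIL the last call would signal the error, so strict callers would not be affected.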

On Fri, Mar 4, 2011 at 1:15 PM, David Lichteblau <david at lichteblau.com> wrote:
> Quoting Andrei Stebakov (lispercat at gmail.com):
>> Say I need to parse html that I got from some external source and for
>> some reason there are namespaces in the text:
>>
>> (chtml:parse "<a href='someurl.com' somens:url='someurl.com'>text</a>"
>>  (stp:make-builder))
>>
>> The parser will choke on the somens prefix since it's not mapped to any URI:
>>   0: (CXML-STP:STP-ERROR "attribute with prefix but no URI")[:EXTERNAL]
>>   1: (CXML-STP:RENAME-ATTRIBUTE #<error printing object>)
>>   2: (CXML-STP:MAKE-ATTRIBUTE "someurl.com" "somens:url" "")
>>   3: ((SB-PCL::FAST-METHOD SAX:START-ELEMENT (CXML-STP-IMPL::BUILDER T
>> T T T)) ..)
>
> Indeed, something needs to be done to fix this, since chtml purports to
> fix bogus html without erroring out.
>
> At the moment, chtml liberally accepts these attributes for its own
> internal PT representation, but then accidentally turns PT attributes
> into HAX events (and then SAX events) without further validation.
>
> I think it might be easiest to continue allowing them in PT, but to
> change PT serialization to fix them before constructing hax attribute
> objects.
>
> Here is a simple patch that just discards the attribute (changing its
> name would be another option).  Note that the patch isn't good enough to
> commit at this point, because it introduces a dependency from chtml
> to cxml.
>
> --- a/src/parse/html-parser.lisp
> +++ b/src/parse/html-parser.lisp
> @@ -98,16 +98,20 @@
>  ;;;                (merge-pathnames (or pathname (pathname input))))))
>        (parse-xstream xstream handler)))))
>
> +(defun good-attribute-name-p (name)
> +  (and (cxml::valid-name-p name)
> +       (not (or (string-equal name "xmlns")
> +               (position #\: name)))))
> +
>  (defun serialize-pt-attributes (plist recode)
>   (loop
>      for (name value) on plist by #'cddr
> -     unless
> -       ;; better don't emit as HAX what would be bogus as SAX anyway
> -       (string-equal name "xmlns")
> +     for n = #+rune-is-character (coerce (symbol-name name) 'rod)
> +            #-rune-is-character (symbol-name name)
> +     ;; don't emit as HAX what would be bogus as SAX anyway
> +     if (good-attribute-name-p n)
>      collect
> -     (let* ((n #+rune-is-character (coerce (symbol-name name) 'rod)
> -              #-rune-is-character (symbol-name name))
> -           (v (etypecase value
> +     (let ((v (etypecase value
>                 (symbol (coerce (string-downcase (symbol-name value)) 'rod))
>                 (rod (funcall recode value))
>                 (string (coerce value 'rod)))))
>
>
>> Is there a way to specify some global variable to turn off namespace
>> processing?
>> I saw a *namespace-processing* variable in some other package but it
>> doesn't seem to be relevant in this case.
>
> You could use DOM instead of STP, I suppose.  DOM doesn't do these sorts
> of checks IIRC.
>
> (Personally I strongly prefer STP over DOM, but one reason for that
> preference is that STP is stricter, which is nice when actually working
> with XML.)
>
>
> d.
>
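
For reference, the filtering step from the quoted patch can be sketched without the cxml dependency. This is only an approximation under stated assumptions: plausible-attribute-name-p and filter-attribute-plist are hypothetical names, the check is deliberately weaker than cxml::valid-name-p (it only rejects empty names, "xmlns", and names containing a colon), and it works on plain strings rather than rods:

```lisp
;; Hypothetical sketch only: not the chtml implementation.
(defun plausible-attribute-name-p (name)
  "True if NAME is non-empty, is not xmlns, and contains no colon.
Much weaker than a full XML name check."
  (and (plusp (length name))
       (not (string-equal name "xmlns"))
       (not (find #\: name))))

(defun filter-attribute-plist (plist)
  "Collect (name . value) pairs from PLIST, dropping attributes whose
names would be bogus as SAX attribute names."
  (loop for (name value) on plist by #'cddr
        for n = (string name)
        when (plausible-attribute-name-p n)
          collect (cons n value)))

;; The prefixed attribute and the xmlns attribute are discarded;
;; only the HREF entry is kept.
(filter-attribute-plist
 '(:href "someurl.com" :|somens:url| "someurl.com" :xmlns "x"))
```

Discarding is the simpler choice here; renaming the attribute, as mentioned above, would need a rule for what the new name should be.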



