[cxml-devel] CDATA doesn't preserve whitespace
David Lichteblau
david at lichteblau.com
Sat Sep 16 07:25:55 UTC 2006
Quoting Sunil Mishra (smishra at sfmishras.com):
> CL-USER(116): (dom:map-document (cxml:make-namespace-normalizer
> (cxml:make-octet-stream-sink *standard-output*)) *)
Note that make-octet-stream-sink defaults to canonical mode for
historical reasons.
> <svg xmlns="http://www.w3.org/2000/svg">
> <script
> type="text/css">
> </script>
> </svg>
> #<MULTIVALENT stream socket connected from localhost/3813 to
> localhost/3817 @ #x205003d2>
Sorry, I don't see a bug. The serializer in canonical mode outputs
character references for the newlines here, but it doesn't output a
CDATA section either in the first place, so that's fine.
If you want to see a CDATA section, use non-canonical mode:
cl-user(43): (dom:map-document
              (cxml:make-octet-stream-sink *standard-output* :canonical nil)
              (cxml:parse-file "~/graph.xml" (cxml-dom:make-dom-builder)))
<?xml version="1.0" encoding="UTF-8"?>
<svg>
<script type="text/css">
<![CDATA[
]]>
</script>
</svg>
> ``Within a CDATA section, only the CDEnd string is recognized as markup,
> so that left angle brackets and ampersands may occur in their literal
> form; they need not (and cannot) be escaped using "&lt;" and "&amp;".
> CDATA sections cannot nest.''
>
> Can cxml please correctly follow this requirement?
It follows this requirement while parsing.
Only in serialization is there one little "problem" (unrelated to your
question):
A document constructed in memory might include a CDATA section with
characters not representable in a CDATA section. That is a user error,
and CXML should signal an error when told to serialize such a document
in non-canonical mode; right now I believe it does not signal that error
and outputs the user data as-is, resulting in output that isn't
well-formed. (But I'm taking patches. :-))
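
A minimal sketch of the kind of check meant here, assuming the only
restriction tested is that CDATA content must not contain the CDEnd
string "]]>". The names check-cdata-content and unrepresentable-cdata
are hypothetical and not part of CXML; a real patch would also have to
deal with characters that cannot be written in the output encoding.

```lisp
;; Hypothetical helper, not part of CXML. It checks only the CDEnd
;; restriction: CDATA content must not contain the string "]]>".
(define-condition unrepresentable-cdata (error)
  ((text :initarg :text :reader unrepresentable-cdata-text))
  (:report (lambda (condition stream)
             (format stream "CDATA content contains \"]]>\": ~S"
                     (unrepresentable-cdata-text condition)))))

(defun check-cdata-content (text)
  "Return TEXT if it can be written as one CDATA section, else signal an error."
  (when (search "]]>" text)
    (error 'unrepresentable-cdata :text text))
  text)
```

A serializer could call such a function on the data of each CDATA
section node before writing it, so that the user error is reported
instead of producing output that isn't well-formed.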
d.