From bhyde at pobox.com  Mon Jul 15 20:27:21 2013
From: bhyde at pobox.com (Ben Hyde)
Date: Mon, 15 Jul 2013 16:27:21 -0400
Subject: [closure-devel] forever stirring the tag soup
Message-ID: <81D6E401-0D62-44D7-8EB5-6CD5CB1E5231@pobox.com>

Parsing a github project page takes forever.

> (setf sgml::*parse-warn-level* 5)
> (let ((page (drakma:http-request "https://github.com/rss-sync/corpus")))
    (handler-case
        (bt:with-timeout (10)
          (chtml:parse page (make-instance 'hax:default-handler)))
      (condition (c) c)))
#

Multiple div's appearing in a THead element are the root cause.

> (setf sgml::*parse-warn-level* 0)
0
> (chtml:parse "
a
b
" (make-instance 'hax:default-handler))
;; Parser warning: Line 1, column 26 : **** [-] Saw in thead -- nuked .
;; Parser warning: Line 1, column 31 : **** [H] Saw in thead -- ??? patched ( ) -> ( )
;; Parser warning: Line 1, column 31 : **** [-] Saw in thead -- nuked .
;; Parser warning: Line 1, column 32 : **** [H] Saw in thead -- ??? patched ( ) -> ( )
;; Parser warning: Line 1, column 32 : **** [-] Saw in thead -- nuked .
;; Parser warning: Line 1, column 38 : **** [H] Saw in thead -- ??? patched ( ) -> ( )
;; Parser warning: Line 1, column 38 : **** [H] Saw in thead -- ??? patched ( ) -> ( )
;; Parser warning: Line 1, column 38 : **** [H] Saw in thead -- ??? patched ( ) -> ( )
?

So far, I'm not clever enough to fix this.

 - ben

ps. Thanks for the awesome library.