From ctlaux at gmail.com Sun Jun 8 13:47:51 2014
From: ctlaux at gmail.com (Christopher Laux)
Date: Sun, 8 Jun 2014 15:47:51 +0200
Subject: [Closure-devel] Bug in html parsing
Message-ID:

Hi,

I think I've found a bug in closure. If I execute

(chtml:parse "

test1

test2

" (chtml:make-lhtml-builder))

> (:HTML NIL (:HEAD NIL)
(:BODY NIL (:SMALL NIL) (:P NIL (:SMALL NIL "test1")) (:P NIL "test2")))

I get that incorrect parse tree. The example is taken from a real website and the same happens inside the entire page. That happens both with the current git version and the quicklisp version (which might just be the same one).

Any help?

Chris

From bhyde at pobox.com Sun Jun 8 15:16:46 2014
From: bhyde at pobox.com (Ben Hyde)
Date: Sun, 8 Jun 2014 11:16:46 -0400
Subject: [Closure-devel] Bug in html parsing
In-Reply-To:
References:
Message-ID:

Ah, the joys of HTML - see, for example:
http://stackoverflow.com/questions/9852312/list-of-html5-elements-that-can-be-nested-inside-p-element

On Jun 8, 2014, at 9:47 AM, Christopher Laux wrote:

> Hi,
>
> I think I've found a bug in closure. If I execute
>
> (chtml:parse "
>
> test1
>
> test2
>
> " (chtml:make-lhtml-builder))
>
> > (:HTML NIL (:HEAD NIL)
> (:BODY NIL (:SMALL NIL) (:P NIL (:SMALL NIL "test1")) (:P NIL "test2")))
>
> I get that incorrect parse tree. The example is taken from a real website and the same happens inside the entire page. That happens both with the current git version and the quicklisp version (which might just be the same one).
>
> Any help?
>
> Chris
>
> _______________________________________________
> Closure-devel mailing list
> Closure-devel at common-lisp.net
> http://common-lisp.net/cgi-bin/mailman/listinfo/closure-devel

From ctlaux at gmail.com Sun Jun 8 15:26:01 2014
From: ctlaux at gmail.com (Christopher Laux)
Date: Sun, 8 Jun 2014 17:26:01 +0200
Subject: [Closure-devel] Bug in html parsing
In-Reply-To:
References:
Message-ID:

If that is not valid HTML, should it not raise an error instead of returning an incorrect parse tree? Is there any way of making it parse the input the way it is written anyway?

Chris

On 08.06.2014 at 17:16, "Ben Hyde" wrote:

> Ah, the joys of HTML - see, for example:
> http://stackoverflow.com/questions/9852312/list-of-html5-elements-that-can-be-nested-inside-p-element
>
> On Jun 8, 2014, at 9:47 AM, Christopher Laux wrote:
>
> Hi,
>
> I think I've found a bug in closure. If I execute
>
> (chtml:parse "
>
> test1
>
> test2
>
> "
> (chtml:make-lhtml-builder))
>
> > (:HTML NIL (:HEAD NIL)
> (:BODY NIL (:SMALL NIL) (:P NIL (:SMALL NIL "test1")) (:P NIL "test2")))
>
> I get that incorrect parse tree. The example is taken from a real website
> and the same happens inside the entire page. That happens both with the
> current git version and the quicklisp version (which might just be the same
> one).
>
> Any help?
>
> Chris

From michal at lisp.pl Wed Jun 4 10:34:46 2014
From: michal at lisp.pl (Michał Psota)
Date: Wed, 04 Jun 2014 10:34:46 -0000
Subject: [Closure-devel] Bug in closure-html, when use with *print-case* set to :downcase
Message-ID:

Hello,

I've noticed that in 2 places strings are not correctly interned when you load closure-html with *print-case* set to :downcase. To make it work, I've changed (intern ...) to (intern (string-upcase ...)). Please find patch attached.

Best regards,
Michał
Psota
-------------- next part --------------
diff -rupN /tmp/closure-html-2010-09-20/src/parse/sgml-parse.lisp /tmp/closure-html/src/parse/sgml-parse.lisp
--- /tmp/closure-html-2010-09-20/src/parse/sgml-parse.lisp  2010-09-20 00:07:37.000000000 +0200
+++ /tmp/closure-html/src/parse/sgml-parse.lisp  2014-05-20 12:50:01.817148769 +0200
@@ -860,7 +860,7 @@
 
 (let ((kw-pkg (find-package :keyword)))
   (defun kintern (x)
-    (intern x kw-pkg)))
+    (intern (string-upcase x) kw-pkg)))
 
 (defun canon-value (input dtd tag slot value)
   (let* ((attlist (find-element-attlist dtd tag))
diff -rupN /tmp/closure-html-2010-09-20/src/util/clex.lisp /tmp/closure-html/src/util/clex.lisp
--- /tmp/closure-html-2010-09-20/src/util/clex.lisp  2010-09-20 00:07:37.000000000 +0200
+++ /tmp/closure-html/src/util/clex.lisp  2014-05-20 12:53:08.437157801 +0200
@@ -365,7 +365,7 @@
              (mapcar #'(lambda (x y) (setf (cdr x) y)) starts
                      (ndfsa->dfsa (mapcar #'cdr starts))))
            ;; (print (number-states starts))
-           `(DEFUN ,(intern (format nil "MAKE-~A-LEXER" name)) (INPUT)
+           `(DEFUN ,(intern (format nil "~:@(MAKE-~A-LEXER~)" name)) (INPUT)
               (LET* ((STARTS ,(loadable-states-form starts))
                      (SUB-STATE 'INITIAL)
                      (STATE NIL)
@@ -538,7 +538,7 @@
                            (mungle-transitions (state-transitions state))))
                      (mapcar #'cdr starts))
              (format T "~&~D states." n))
-           `(DEFUN ,(intern (format nil "MAKE-~A-LEXER" name)) (INPUT)
+           `(DEFUN ,(intern (format nil "~:@(MAKE-~A-LEXER~)" name)) (INPUT)
               (LET* ((STARTS ,(loadable-states-form starts))
                      (SUB-STATE 'INITIAL)
                      (STATE NIL)

From look.wangluke at gmail.com Mon Jun 30 07:34:42 2014
From: look.wangluke at gmail.com (lookwong)
Date: Mon, 30 Jun 2014 07:34:42 -0000
Subject: [Closure-devel] a bug in read-pcdata (source file: sgml-parse.lisp)
Message-ID:

In function read-pcdata:

(t
 (setf (aref scratch sp) ch) ;recode character read
 (setf sp (the fixnum (+ sp 1)))
 (cond ((= sp se) ;end of scratch pad reached?
        (enlarge-scratch-pad input)
        (setf scratch (a-stream-scratch input)
              se (length scratch))))))))

The scratch pad should be enlarged first, before the character is stored; otherwise the store can cause an index-out-of-bounds error:

(t
 (cond ((= sp se) ;end of scratch pad reached?
        (enlarge-scratch-pad input)
        (setf scratch (a-stream-scratch input)
              se (length scratch))))
 (setf (aref scratch sp) ch) ;recode character read
 (setf sp (the fixnum (+ sp 1)))))))
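
For illustration, here is a minimal, self-contained sketch of the check-before-write ordering proposed above. It does not use closure-html's code: the function name collect-chars, the doubling growth policy and the use of adjust-array are assumptions made for the example, standing in for enlarge-scratch-pad and a-stream-scratch. The point it shows is that testing (= sp se) before the store keeps (aref scratch sp) in bounds even when the pad is already full on entry.

```lisp
;; Hypothetical stand-in for the parser's scratch pad handling;
;; not closure-html's actual implementation.
(defun collect-chars (chars &optional (initial-size 4))
  "Copy the list CHARS into a growable scratch vector and return it as a string."
  (let* ((scratch (make-array initial-size :element-type 'character))
         (sp 0)                        ; index of the next free slot
         (se (length scratch)))        ; current capacity of the pad
    (dolist (ch chars)
      ;; Enlarge first, so the store below is always within bounds,
      ;; even if the pad was already full (or empty-sized) on entry.
      (when (= sp se)
        (setf scratch (adjust-array scratch (max 1 (* 2 se)))
              se (length scratch)))
      (setf (aref scratch sp) ch)      ; record the character
      (incf sp))
    (subseq scratch 0 sp)))
```

For example, (collect-chars (coerce "hello world" 'list) 4) has to grow the pad twice and returns the full string, and an initial size of 0 also works because the check runs before the first store.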