[cl-typesetting-devel] HTML rendering
Klaus Weidner
klaus at atsec.com
Sun Apr 18 10:04:08 UTC 2004
Hello,
after an all-day hacking session, I now have a proof of concept for XHTML
rendering via cl-typesetting working, see attachment. It's still missing
many tags, but that's mostly a matter of patience, I think the hard work
is largely done.
The following article had inspired me to do this:
http://www-106.ibm.com/developerworks/library/x-xslfo2app/
It covers using XSL-FO stylesheet transforms on XHTML documents. There
you can also find the input document "everything.html" I used, along with
a FOP-rendered PDF version for comparison.
Instead of XSL-FO, I'm using Lisp-based tree transform from XML (read
from a file via the XMLS library) to cl-typesetting commands.
I think this is an important step towards having cl-typesetting behave
more like a normal program with input and output files rather than
exclusively as a library. And all it took was an "eval" in the right
place :-)
I've extended cl-typesetting a bit, to support cross-referencing (needed
for the table of contents and intra-document page references), left/right
page header alternation (containing chapter titles) and some other minor
stuff.
The only change to the core code was adding a :fresh-page token. It works
like :eop, but does not start a new page if the current one is already
empty. (This makes automated generation easier.) All the other extensions
were in client code.
I've attached the :fresh-page patch, and sample code to demonstrate the
new features. The latter is still a bit messy and undocumented, it's a
work in progress.
The XHTML transformer is still too unfinished for distribution, but please
let me know if you're interested in that and I'll work on cleaning it up.
But I'm surprised at how well it works already, considering that most of
my time had been spent figuring out how things work.
Some miscellaneous things I noticed:
- there are a couple of rather grating spelling errors in the source
code, which I'd suggest fixing before the code is in more widespread
use (later changes would be harder) - I can make a patch for those but
wanted to check first if that's okay.
trimable => trimmable
splitable => splittable
ponctuation => punctuation
Also, I'm not sure what a "trie" is (used by hyphenation). And I think
that keyword arguments should not use a "-p" suffix.
- This would also be a good opportunity to document the style parameters
supported by the public interface, since grepping through the source is
suboptimal.
- I don't know how leaders (the dots joining table-of-content entries to
the page numbers) should be implemented. Note that the dots should line
up even if the TOC entries have different lengths, so it's not a simple
matter of sticking periods in boxes.
- PDF hyperlinks aren't handled yet by the typesetting code - how about a
link-box class which acts as a clickable area for the bounding box of
the content?
- The W3 FO specification is a surprisingly good fit to the
cl-typesetting internals, and since XML is isomorphic to s-exprs, I
think it would make sense to stay close to that model where
appropriate. That would make it much easier to reuse existing work,
i.e. for DocBook or OpenOffice rendering.
Comments and feedback would be much appreciated.
-Klaus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: output.pdf
Type: application/pdf
Size: 14871 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/cl-typesetting-devel/attachments/20040418/acc18b11/attachment.pdf>
-------------- next part --------------
diff -urN -b orig/cl-typesetting/layout.lisp cl-typesetting/layout.lisp
--- orig/cl-typesetting/layout.lisp Mon Mar 22 03:09:13 2004
+++ cl-typesetting/layout.lisp Sun Apr 18 01:45:08 2004
@@ -221,6 +221,15 @@
(when (eq *h-align* :justified)
(push (make-hfill-glue) line-boxes))
(next-line line-boxes))
+ ((eq box :fresh-page)
+ (unless (or (null text-lines)
+ (and (null (rest text-lines))
+ (every (lambda (box) (or (style-p box)
+ (typep box 'h-spacing)
+ (typep box 'v-spacing)))
+ (boxes (first text-lines)))))
+ (next-line line-boxes)
+ (return-lines 0)))
((eq box :eop)
(next-line line-boxes)
(return-lines 0))
diff -urN -b orig/cl-typesetting/typo.lisp cl-typesetting/typo.lisp
--- orig/cl-typesetting/typo.lisp Wed Mar 10 13:12:49 2004
+++ cl-typesetting/typo.lisp Sun Apr 18 01:36:16 2004
@@ -231,6 +231,9 @@
(defun new-line ()
(add-box :eol))
+(defun fresh-page ()
+ (add-box :fresh-page))
+
(defun new-page ()
(add-box :eop))
@@ -242,6 +245,9 @@
(defmethod insert-stuff ((obj (eql :eol)))
'(new-line))
+
+(defmethod insert-stuff ((obj (eql :fresh-page)))
+ '(fresh-page))
(defmethod insert-stuff ((obj (eql :eop)))
'(new-page))
-------------- next part --------------
(in-package typeset)
;; reference handling
(defvar *ref-table* (make-hash-table :test #'equal))
(defvar *ref-counter* 0)
(defvar *bad-reference* nil)
(defclass ref-mark ()
((id :accessor ref-id :initform nil :initarg :id)
(value :accessor ref-mark-value :initform nil :initarg :value)
(page :accessor ref-mark-page :initform nil)
(x :accessor ref-x :initform nil)
(y :accessor ref-y :initform nil)))
(defmethod stroke ((mark ref-mark) x y)
(setf (ref-mark-page mark) (length (pages pdf:*document*))
(ref-x mark) x
(ref-y mark) y))
(defmacro ref-get (id)
`(gethash ,id *ref-table*))
(defun make-ref-mark (id &optional value)
(let ((mark (or (ref-get id)
(make-instance 'ref-mark
:id id))))
(setf (ref-get id) mark)
(setf (ref-mark-value mark) value)
(add-box mark)))
(defun ref-page (id)
(let* ((ref (ref-get id))
(page (if ref (ref-mark-page ref))))
(cond (page page)
(t (push id *bad-reference*)
999))))
(defun put-ref-page (id)
(put-string (format nil "~D" (ref-page id))))
(defgeneric ref-value (ref))
(defmethod ref-value ((ref ref-mark))
(if ref (ref-mark-value ref)))
(defmethod ref-value ((id t))
(let ((ref (ref-get id)))
(if ref (ref-mark-value ref))))
(defun put-ref-value (id)
(put-string (ref-value id)))
(defun this-page-number ()
(length (pages pdf:*document*)))
(defun make-ref-page-mark (reftype value)
(make-ref-mark (cons reftype (incf *ref-counter*)) value))
(defun get-latest-ref-to (reftype for-page)
(let ((refs nil))
;; Find all references of a type, store unsorted
;; (ordinal page ref) lists in "refs".
(maphash (lambda (key ref)
(if (and (consp key)
(equal reftype (car key)))
(push (list (cdr key)
(or (ref-mark-page ref) most-positive-fixnum)
ref)
refs)))
*ref-table*)
;; Now walk through the reverse sorted references,
;; and get the last matching one on or before the
;; current page.
(third (find-if (lambda (page)
(<= page for-page))
(sort refs #'> :key #'car)
:key #'cadr))))
(defun current-ref-value (reftype)
(ref-value (get-latest-ref-to reftype (this-page-number))))
;; higher-level layout
(defun safe-eval (expr)
(let ((*package* (find-package "TYPESET"))
(*read-eval* nil))
(eval expr)))
;; Note that the tree argument to render-document is a dead list of
;; symbols and strings. This is a prerequisite for being to handle
;; documents that are completely generated at runtime.
(defun render-document (tree &key
(file #P"/tmp/stuff.pdf")
(twosided t)
(paper-size :letter))
"Render the document specified by tree, which is a s-exp containing
recursive typesetting commands. It gets eval'ed here to typeset it."
(do ((*ref-table* (make-hash-table :test #'equal))
(*ref-counter* 0)
(*bad-reference* nil)
(pass 0 (1+ pass)))
((or (> pass 1)
(and (> pass 0)
(not *bad-reference*)))
*bad-reference*)
(setq *bad-reference* nil)
(format t "Pass ~d~%" pass)
(with-document ()
(let ((margins '(72 72 72 50))
(header (lambda (pdf:*page*)
(if (current-ref-value :header-enabled)
(let ((inside (or (current-ref-value :title) "Untitled Document"))
(outside (current-ref-value :chapter)))
(if (and twosided (evenp (this-page-number)))
(compile-text (:font "Times-Roman" :font-size 10)
(hbox (:align :center :adjustable-p t)
(put-string outside)
:hfill
(with-style (:font "Times-Italic")
(put-string inside))))
(compile-text (:font "Times-Roman" :font-size 10)
(hbox (:align :center :adjustable-p t)
(with-style (:font "Times-Italic")
(put-string inside))
:hfill
(put-string outside))))))))
(footer (lambda (pdf:*page*)
(if (current-ref-value :footer-enabled)
(let ((inside (or (current-ref-value :version) ""))
(outside (format nil "Page ~d of ~d"
(this-page-number)
(ref-page "DocumentEnd"))))
(if (and twosided (evenp (this-page-number)))
(compile-text (:font "Times-Roman" :font-size 10)
(hbox (:align :center :adjustable-p t)
(put-string outside)
:hfill
(put-string inside)))
(compile-text (:font "Times-Roman" :font-size 10)
(hbox (:align :center :adjustable-p t)
(put-string inside)
:hfill
(put-string outside)))))))))
(draw-pages (safe-eval (list 'compile-text () tree))
:margins margins :header header :footer footer
:size paper-size)
(when pdf:*page* (finalize-page pdf:*page*))
(pdf:write-document file)))))
;; Example follows.
(defun document-test ()
(render-document
'(with-style ()
(make-ref-page-mark :title "Titled Document")
(make-ref-page-mark :version "Version 1.x")
(make-ref-page-mark :header-enabled nil)
(make-ref-page-mark :footer-enabled nil)
:vfill
(paragraph (:font "Helvetica-Bold" :font-size 24 :h-align :center :bottom-margin 20)
"This is the Document Title")
(paragraph (:font "Helvetica-Bold" :font-size 16 :h-align :center)
"A. N. Author")
:vfill
:eop
(make-ref-page-mark :header-enabled t)
(make-ref-page-mark :footer-enabled t)
(make-ref-mark '(:chapter . 0) "Table of Contents")
(with-style (:font "Helvetica" :font-size 10)
(hbox (:align :center :adjustable-p t)
(put-ref-value '(:chapter . 1))
:hfill
(put-ref-page '(:chapter . 1)))
(hbox (:align :center :adjustable-p t)
(put-ref-value '(:chapter . 2))
:hfill
(put-ref-page '(:chapter . 2))))
:eop
(make-ref-mark '(:chapter . 1) "Introduction")
(paragraph (:font "Times-Roman" :font-size 10 :h-align :left :bottom-margin 7)
"Test with "
(with-style (:font "Times-Bold")
"bold")
" and "
(with-style (:font "Times-Italic")
"italic")
" text.")
(paragraph (:font "Times-Roman" :font-size 10 :h-align :left :bottom-margin 7)
(make-ref-mark "link-from")
"See also stuff on page "
(put-ref-page "stuff")
".")
:eop
(make-ref-mark '(:chapter . 2) "Interesting Stuff")
(paragraph (:font "Courier" :font-size 10 :bottom-margin 7)
(make-ref-mark "stuff")
"Some" :eol "more" :eol "Text." )
(paragraph (:font "Times-Roman" :font-size 10 :h-align :left :bottom-margin 7)
"This is linked to from page "
(put-ref-page "link-from")
".")
(make-ref-mark "DocumentEnd"))))
More information about the cl-typesetting-devel
mailing list