reading s-expressions from a file

Alexandre Rademaker arademaker at gmail.com
Wed Jun 26 19:15:46 UTC 2019


Thank you all! Reporting my results so far, suggestions are welcome! ;-)

My goal is to read KIF files such as 
https://github.com/ontologyportal/sumo/blob/master/Merge.kif


Alessio's suggestion is probably something to explore, but I will need some time to read the documentation of https://github.com/robert-strandh/Eclector.

The most direct solution seems to be the suggestion from Luís Oliveira, but I didn’t understand how to use https://github.com/cl-stream/cl-stream/blob/master/line-tracking-stream.lisp. The link from Ala'a Mohammad (http://www.chiark.greenend.org.uk/doc/sbcl-doc/html/sbcl.html#Gray-Streams-examples) helped me. But if I understood it right, cl-stream implements only the equivalent of CL's read-char in its stream-read function. Moreover, I haven’t yet figured out how to initialise the ’state’ of the class.

CL-USER> (with-open-file (input "/Users/ar/workspace/sumo/Merge.kif")
           (let ((counted-stream (make-instance 'cl-stream::line-tracking-input-stream
                                                :stream input :input-line 0)))
             (dotimes (x 20)
               (print (multiple-value-list (cl-stream:stream-read counted-stream))))))

(#\; NIL) 
(#\; NIL) 
(#\  NIL) 
(#\= NIL) 
(#\= NIL) 
(#\= NIL) 
...

Regarding Steve's message:


> On 26 Jun 2019, at 13:49, Steve Haflich <shaflich at gmail.com> wrote:
> 
> The first thing to realize is that "you" cannot do this, but you might be able to write CL-conformant code that does it for you.  But the READ function won't do it by itself.  It's not part of READ's contract, which silently eats Newlines without counting them for you.

Yep, that is what I discovered reading http://clhs.lisp.se/Body/02_b.htm.

> You might consider some readtable magic, changing Newline to a non-non-breaking macro char that returns nothing but maintains a count somewhere, except such hackery would miss line breaks inside strings, #|comments|#, probably also ;regular comments, and escaped newlines inside symbol names (UGH).

I tried this approach, without success. Maybe I didn’t understand how to "change Newline to a non-non-breaking macro char". On the other hand, as Steve said, it looks like the line breaks in commented lines are ignored. This is what I tried:

(defparameter *line* 0)

(defun linebreak-reader (stream char)
  (declare (ignore char))
  (incf *line*)
  (read-preserving-whitespace stream t nil t))

CL-USER> (let ((*readtable* (copy-readtable)))
  (setf *line* 0)
  (set-macro-character #\Newline #'linebreak-reader)
  (with-open-file (in "~/workspace/sumo/Merge.kif")
    (dotimes (x 10) 
      (format t "~a ~s ~%" *line* (read-preserving-whitespace in nil nil)))))

0 (INSTANCE INSTANCE BINARYPREDICATE) 
20 (DOMAIN INSTANCE 1 ENTITY) 
21 (DOMAIN INSTANCE 2 SETORCLASS) 
...
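
A minimal sketch of how the readtable idea could be made to count, assuming the macro function returns no values instead of calling the reader recursively. The names *newline-count*, count-newline and read-forms-with-newline-counts are hypothetical, and the input is a string so the behaviour is easy to check. The count recorded with each form is the number of Newlines the reader had seen when that form was finished, and, as Steve warned, Newlines swallowed by ;comments and strings are not counted.

```lisp
(defparameter *newline-count* 0
  "Number of Newline characters dispatched to COUNT-NEWLINE so far.")

(defun count-newline (stream char)
  "Reader macro function for Newline: count it and produce no object."
  (declare (ignore stream char))
  (incf *newline-count*)
  ;; Returning zero values tells the reader to carry on as if nothing was read.
  (values))

(defun read-forms-with-newline-counts (string)
  "Read every form in STRING and return a list of (COUNT . FORM) pairs."
  (let ((*readtable* (copy-readtable nil))   ; fresh copy of the standard readtable
        (*newline-count* 0)
        (eof (list nil)))                    ; unique end-of-file marker
    ;; Terminating macro character (the default), so a Newline still ends a token.
    (set-macro-character #\Newline #'count-newline)
    (with-input-from-string (in string)
      (loop for form = (read in nil eof)
            until (eq form eof)
            collect (cons *newline-count* form)))))
```

Because the ; reader macro consumes the rest of its line including the Newline, this cannot give reliable line numbers for a file like Merge.kif, which has many comment lines.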


> Rather than writing your own entire READ function, if the source is a regular static file that can be reopened multiple times, write a simple alternative MY-READ function that calls FILE-POSITION before calling READ and returns both as multiple values.  Converting a character FILE-POSITION into a line number, when and if necessary, can be accomplished by reopening the file and reading characters until the desired FILE-POSITION counting line breaks along the way.  Taking care to use the same external-format avoids the difficulty of multi-byte characters and the possibility that FILE-POSITION does not necessarily increment by 1 for each char.  It is only guaranteed to increase monotonically.

I still have to fix some details, like dealing with the first expression and improving performance with a smarter strategy for consuming the list of linebreak positions. But here is a first version.

(defstruct expression position line form)

(defun my-read (stream)
  (cons
   (file-position stream)
   (read stream nil nil)))

(defun pos-line (position linebreaks)
  (position-if-not (lambda (n) (> position n)) linebreaks))

(defun linebreaks-in-file (file)
  (with-open-file (kb file)
    (loop for c = (read-char kb nil 'the-end)
          while (characterp c)
          when (equal c #\Newline)
          collect (file-position kb))))

(defun read-kif (file)
  (let ((res nil)
        (linebreaks (linebreaks-in-file file)))
    (with-open-file (kb file)
      (do ((st (my-read kb)
               (my-read kb)))
          ((null (cdr st)) (reverse res))
        (push (make-expression :position (car st)
                               :line (pos-line (car st) linebreaks)
                               :form (cdr st))
              res)))))
  
(read-kif "~/workspace/sumo/Merge.kif")
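
One possible refinement, as a sketch only. It swaps FILE-POSITION for string indices: the text is assumed to be already in memory as a string, READ-FROM-STRING reports where each form ends, and whitespace and ;comments are skipped by hand so that the recorded line is the one where the form starts. The names skip-blanks-and-comments and read-forms-with-lines are hypothetical, lines are numbered from 1, and #|...|# comments are not handled (I am assuming KIF files do not use them).

```lisp
(defun skip-blanks-and-comments (text start)
  "Return the index of the first character at or after START that is neither
whitespace nor inside a ;comment, or NIL if only blanks and comments remain."
  (loop
    (let ((i (position-if-not
              (lambda (c) (member c '(#\Space #\Tab #\Newline #\Return)))
              text :start start)))
      (cond ((null i) (return nil))
            ((char= (char text i) #\;)
             ;; Skip to the character after the end of the comment line.
             (let ((eol (position #\Newline text :start i)))
               (if eol
                   (setf start (1+ eol))
                   (return nil))))
            (t (return i))))))

(defun read-forms-with-lines (text)
  "Return a list of (LINE . FORM) pairs, where LINE is the 1-based line
on which FORM starts in TEXT."
  (let ((line 1)       ; line number of the most recent form start
        (counted 0)    ; Newlines before this index are already in LINE
        (pos 0)        ; where the next read begins
        (result '()))
    (loop
      (let ((start (skip-blanks-and-comments text pos)))
        (when (null start)
          (return (nreverse result)))
        ;; Count only the Newlines between the previous form start and this one.
        (incf line (count #\Newline text :start counted :end start))
        (setf counted start)
        (multiple-value-bind (form next)
            (read-from-string text t nil :start start)
          (push (cons line form) result)
          (setf pos next))))))
```

Since each Newline is counted once, the whole pass is linear in the size of the text, and a form that reads as NIL no longer stops the loop. Working on character indices of a string also sidesteps the question of how FILE-POSITION advances on multi-byte characters.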

--
Alexandre Rademaker
http://arademaker.github.io


> On Wed, Jun 26, 2019 at 9:19 AM Alexandre Rademaker <arademaker at gmail.com> wrote:
> 
> The “read” function makes it really easy to read a bunch of s-expressions from a file, but how can I keep track of the line number where the expressions were in the file?
> 
> Any idea?
> 
> Alexandre 
> Sent from my iPhone



