[climacs-devel] more context-sensitive parsing

Robert Strandh strandh at labri.fr
Wed Apr 6 06:17:23 UTC 2005


Hello, 

Christophe Rhodes writes:
 > 
 > In prolog, there are so-called operator directives, of the basic form:
 > 
 >   :- op(100,xfy,=@=).
 > 
 > such that this defines a right-associative operator named =@=.
 > 
 > In order to parse the rest of the file properly, it would be good, on
 > parsing this directive, to alter the parser to recognize this kind of
 > construct.  

That is one thing the Earley parsing framework is good for. 

 > I have hooks in prolog-syntax.lisp for this -- the
 > FIND-DEFINED-OPERATOR function -- but given that in principle we could
 > be visiting several prolog files at once, defined operators should
 > really be buffer-local.  So the logical place to stash information
 > about such a user-defined operator would be the syntax, I think, 

Or you could make a copy of the grammar for each buffer and add new
rules to the grammar.  Though that is a bit tricky, because if you
delete that line, you do want the rule to be removed as well.  I
haven't given it enough thought, but it seems like the right way to
handle this and similar on-the-fly grammar modifications is to have
each parse state contain a reference to the grammar to be used for
further parsing, and to make state transitions allow for a modified
version of the grammar to be passed on to the next state.  Such a
mechanism could also be used for switching grammars for situations
such as SQL embedded in C or PHP embedded in HTML. 
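
To make the idea concrete, here is a minimal sketch in Python rather
than in Climacs' own Lisp.  It is not the Climacs API: Grammar,
ParseState, step and parse are hypothetical names, the "grammar" is
reduced to a table of operators, and the directive is assumed to use
the ISO argument order op(Priority, Type, Name).  Each state holds a
reference to an immutable grammar, and a transition may hand a
modified copy to the next state.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical names throughout; this is an illustration, not Climacs code.
# Matches a directive such as ":- op(100, xfy, =@=)." (ISO argument order).
OP_DIRECTIVE = re.compile(r":-\s*op\(\s*(\d+)\s*,\s*(\w+)\s*,\s*(\S+?)\s*\)\s*\.")


@dataclass(frozen=True)
class Grammar:
    """Immutable operator table standing in for a full grammar."""
    operators: Tuple[Tuple[str, int, str], ...] = ()

    def with_operator(self, name: str, priority: int, kind: str) -> "Grammar":
        # Return a new grammar; the old one stays valid for earlier states.
        return Grammar(self.operators + ((name, priority, kind),))

    def knows(self, name: str) -> bool:
        return any(op[0] == name for op in self.operators)


@dataclass(frozen=True)
class ParseState:
    """State reached after `line_no` lines, with the grammar to use next."""
    line_no: int
    grammar: Grammar


def step(state: ParseState, line: str) -> ParseState:
    """Consume one line; pass on a modified grammar if it is a directive."""
    grammar = state.grammar
    match = OP_DIRECTIVE.match(line.strip())
    if match:
        priority, kind, name = match.groups()
        grammar = grammar.with_operator(name, int(priority), kind)
    return ParseState(state.line_no + 1, grammar)


def parse(lines: List[str], start: ParseState = ParseState(0, Grammar())) -> List[ParseState]:
    """Return the start state followed by the state after each line."""
    states = [start]
    for line in lines:
        states.append(step(states[-1], line))
    return states
```

Because no state ever mutates a grammar, deleting the directive line
and reparsing from the state just before it makes the added rule
disappear without any explicit removal, and each buffer gets its own
operators simply by having its own chain of states.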

 > but I
 > don't have access to the syntax from a grammar rule (or do I?)

No, not at the moment.  You don't even have access to the grammar, I
think.  But that could be fixed. 

 > Any ideas, suggestions?  The other possibility, I suppose, is to make
 > the user do this, maybe through a Climacs command Define Operator.
 > (This probably isn't the most urgent of issues, but I do have a large
 > amount of prolog code with many operator definitions, and no other
 > real code to test my parser against).

Well, if it is not urgent, I would suggest you (and others) start
thinking about the more general idea outlined above.  

The more general problem of switching grammars in the middle of a
buffer has one incompatibility with the incremental lexer that is used
for HTML syntax (and that will be used for CL syntax as well), namely
that the lexer might have to change as well when the grammar changes. 
-- 
Robert Strandh

---------------------------------------------------------------------
Greenspun's Tenth Rule of Programming: any sufficiently complicated C
or Fortran program contains an ad hoc informally-specified bug-ridden
slow implementation of half of Common Lisp.
---------------------------------------------------------------------


