[editor-hints-devel] My thoughts about markup and docstrings

Sat Dec 27 10:23:53 UTC 2008

Hi,

Quoting Tobias C. Rittweiler (tcr at freebits.de):
> My stand on docstrings is that docstrings are totally inappropriate for
> documentation purposes. In my view, docstrings should be used to
> summarize what a function does in one, or two sentences for the reader
> of the code, not the user of the code. 
> 
> The rationale behind that thinking is that elaborate docstrings would
> take up too much screen space, and would separate a) the body of a
> function from its lambda-list, and b) the function definition
> from other function definitions.
> 
> What I want is a DEFINE-DOCUMENTATION macro to define elaborate
> documentation for a function, macro, &c. And I want Slime to bring me to
> a) the rendered documentation of a symbol-at-point, and b) the source of
> that documentation. (Cf. item Source Locations in [TODO].)

that happens to be relatively close to what I'm working on.

While I would personally want to argue for an approach that would
-allow- programmers to use docstrings (this is where we might disagree),
my current project parse-docstrings already supports a macro that
enables this documentation to be stored elsewhere instead, so it doesn't
-force- programmers to write ordinary docstrings anymore.

(This macro is currently called ANNOTATE-DOCUMENTATION, not
DEFINE-DOCUMENTATION.)

Let me explain why I arrived at this solution:

My older documentation extractor called atdoc had special markup syntax
for docstrings that I invented to allow special sections of a docstring
for:
  - explicit descriptions of the arguments
  - explicit descriptions of the return values
  - special cross-references to explain slot accessors, conditions
    signalled, so-called constructor functions, etc.

After releasing atdoc, many people told me that they liked my output,
but wouldn't want to litter their docstrings with all this markup.

Following that advice, I decided to pull all these special annotations
out of the docstring, and into the macro.  So instead of writing this:

(defun foo (a b)                   ;example of the OLD atdoc syntax
  "@arg[a]{This is the important argument of type @class{bar}}
   @arg[b]{This is the @class{baz} being frobbed in some special way}
   @return{random junk of type @class{quuux}}
   @condition{big-mistake-error}

   @short{This function frobs A and B to create random junk.}

   Detailed description here.

   @see{the-other-function}"
  (frob a b))

You could now write:

(defun foo (a b)
  "This function frobs A and B to create random junk.

   Detailed description here."
  (frob a b))

(annotate-documentation (foo function)
  (:argument a "This is the important argument of type BAR")
  (:argument b "This is the BAZ being frobbed in some special way")
  (:return-value "random junk of type QUUUX")
  (:condition big-mistake-error)
  (:see-function the-other-function))

Or (and here our ideas might converge) you can also override the actual
docstring completely like this:

(defun foo (a b)
  "This isn't part of generated documentation, because we override it
   below."
  (frob a b))

(annotate-documentation (foo function)
  (:argument a "This is the important argument of type BAR")
  (:argument b "This is the BAZ being frobbed in some special way")
  (:return-value "random junk of type QUUUX")
  (:condition big-mistake-error)
  (:see-function the-other-function)
  "This function frobs A and B to create random junk.

   Detailed description here.")

When writing the macro, I was concerned that users would end up making
their ASDF system depend on my project just in order to use the
macro.  So I decided to make the macro as small as possible, allowing
users to copy&paste it into their package instead.  This way, users
would only have to depend on my project `parse-docstrings' when parsing
and generating documentation, and not just to load their source code.

Doing so requires some guarantees of API compatibility, so the
macro had better not change all the time.  In the end, the macro is
seven lines long, including a bad hack for SETF functions.  It comes
with its own documentation as an example.

This is the file that users would copy&paste:
http://repo.or.cz/w/parse-docstrings.git?a=blob;f=annotate-documentation.lisp

And here is the code parsing it:
http://repo.or.cz/w/parse-docstrings.git?a=blob;f=annotation-plist.lisp
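
For readers who don't want to follow the links, here is a minimal sketch
of how such a small, copy&paste-able macro could work.  It is not the
code in annotate-documentation.lisp: the names MY-ANNOTATE-DOCUMENTATION
and ANNOTATIONS-OF and the plist indicator :DOCUMENTATION-ANNOTATIONS
are made up for this illustration, and the SETF-function hack is left
out.  The only point is that the macro can store its unevaluated body on
the symbol's property list without calling into any library:

```lisp
;; Hypothetical sketch, NOT the real ANNOTATE-DOCUMENTATION.
;; Stores the unevaluated annotation forms on the symbol's plist,
;; keyed by documentation type, so nothing else is needed at load time.
(defmacro my-annotate-documentation ((name doc-type) &body annotations)
  `(setf (getf (get ',name :documentation-annotations) ',doc-type)
         ',annotations))

;; Hypothetical reader that a documentation tool could call later.
(defun annotations-of (name doc-type)
  (getf (get name :documentation-annotations) doc-type))

(my-annotate-documentation (foo function)
  (:argument a "This is the important argument of type BAR")
  (:return-value "random junk of type QUUUX")
  "This function frobs A and B to create random junk.")
```

Because the body is only quoted and stored, loading a source file that
uses such a macro would not require the documentation tool to be
present; the tool would read the stored forms later.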

In my example above, you might have noticed that I didn't just switch
from a docstring to the annotation macro, I also changed the markup
syntax: The example using the macro is written using SBCL's docstring
format rather than atdoc's syntax.  That's why it's BAR rather than
@class{bar}.

Now, that's not actually a required change, because parse-docstrings
leaves it completely up to users which syntax they want to use in the
strings.

parse-docstrings would ship with two syntax plugins out-of-the-box, one
for SBCL-style docstrings and one for the @-syntax known from atdoc.
But users could write their own parser plugins and parse texinfo,
markdown, clixdoc syntax, or whatever they like.

The parser to use is chosen with an approach somewhat similar to how
hyperdoc finds URLs: We look for an internal symbol in the package being
documented, and if it exists and is FBOUND, we call it to find the
parser function.  (If not, the default is currently SBCL-style syntax.)
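
As a rough illustration of that lookup (not the actual parse-docstrings
code), the following sketch assumes the hook symbol is named
%DOCSTRING-PARSER; that name, FIND-DOCSTRING-PARSER, the PARSER-DEMO
package, and both stand-in parser functions are invented for this
example:

```lisp
;; Hypothetical sketch: the symbol name "%DOCSTRING-PARSER" and both
;; parser functions below are assumptions, not parse-docstrings API.
(defun sbcl-style-parser (string)
  "Stand-in for the default parser."
  (list :sbcl-style string))

(defun atdoc-style-parser (string)
  "Stand-in for an alternative parser."
  (list :atdoc-style string))

(defun find-docstring-parser (package)
  "Return the parser function to use for docstrings in PACKAGE."
  (multiple-value-bind (symbol status)
      (find-symbol "%DOCSTRING-PARSER" package)
    (if (and status (fboundp symbol))
        (funcall symbol)          ; the package names its own parser
        #'sbcl-style-parser)))    ; otherwise fall back to the default

;; A package opts in by defining the hook function under that name.
(defpackage :parser-demo (:use :cl))
(setf (fdefinition (intern "%DOCSTRING-PARSER" :parser-demo))
      (lambda () #'atdoc-style-parser))
```

With these definitions, (find-docstring-parser :parser-demo) returns the
atdoc-style stand-in, while a package without the hook symbol gets the
SBCL-style default.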

Personally, I think this is the most important part:
parse-docstrings offers -choice-.  It doesn't force a particular
solution on users, so it avoids falling into the trap of a niche
approach that users dismiss as "not invented here".  Instead, it is as
configurable as possible, allowing at least parts of it to be reused.

  - docstring and/or annotation macro?  the user gets to choose.
    (I now like the annotation macro much better than my old atdoc
    stuff, so I would recommend use of the annotation macro to users.)

  - markup syntax?  the user gets to choose or implement his own.

  - output format?  Not addressed in parse-docstrings at all.  There
    will be an SBCL-style texinfo-docstrings on top of it, an
    xml-docstrings project writing a big XML file for further
    processing, a replacement for atdoc providing XSL stylesheets, etc.

    (I hope that there would also be a SLIME-based solution.)

In a way, thanks to the documented CLOS API for markup, we don't even
enforce the idea that documentation has to be "generated".  For
example, Michael Weber blogged about a "reverse docstrings" idea where
docstrings would be pulled out of the texinfo documentation instead.

Parse-docstrings' main API function is DOCUMENTATION*, which is like
DOCUMENTATION, but
  - returns CLOS objects instead of a string
  - uses the annotation macro instead of (just) the docstring

I could see another addition:
  - if a package declares that it wants to pull documentation from
    somewhere else entirely, it could provide a function to do so.  That
    way, users like Michael could present CLOS objects that don't come
    from the source code at all.
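
To make the DOCUMENTATION* idea a bit more concrete, here is a small
sketch of a function that returns an object instead of a string and
lets stored annotations override the docstring.  Everything in it is an
assumption for illustration: the class FUNCTION-DOC-SKETCH, its readers,
the function DOCUMENTATION-SKETCH, and the plist indicator
:ANNOTATIONS-SKETCH are invented, and the real DOCUMENTATION* may work
quite differently.

```lisp
;; Hypothetical sketch; class, readers, and plist indicator are
;; invented here and are not the parse-docstrings API.
(defclass function-doc-sketch ()
  ((name        :initarg :name        :reader doc-name)
   (text        :initarg :text        :reader doc-text)
   (annotations :initarg :annotations :reader doc-annotations)))

(defun documentation-sketch (name)
  "Like (DOCUMENTATION NAME 'FUNCTION), but return an object."
  (let* ((annotations (get name :annotations-sketch))
         ;; A string among the annotations overrides the docstring.
         (override (find-if #'stringp annotations :from-end t)))
    (make-instance 'function-doc-sketch
                   :name name
                   :text (or override (documentation name 'function))
                   :annotations (remove-if #'stringp annotations))))

(defun foo (a b)
  "This function frobs A and B to create random junk."
  (list a b))

;; Annotations stored by hand here; a macro would normally do this.
(setf (get 'foo :annotations-sketch)
      '((:argument a "This is the important argument of type BAR")
        (:return-value "random junk of type QUUUX")))
```

A package-supplied function, as suggested above, could replace the
plist lookup and build the same kind of object from any other source.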

d.