Source Text Tools
I have started on a set of modules that could be extended into a number
of utilities working on the textual representation of modules. The
modules in question are a scanner/parser combo that will convert a
module's source code into an internal data structure. No textual
information is lost in the process. The data structure is an abstract
syntax tree (AST) whose leaves are additionally part of a linear
symbol list. The AST represents the hierarchical structure of the
module, while the symbol list keeps track of comment and pragma
blocks.
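
To make the shape of this data structure concrete, here is a minimal
sketch in Python (not Oberon-2, and not the actual modules). Everything
in it is an assumption made for illustration: the names Symbol, Node,
scan, build_tree and unparse are hypothetical, the scanner does not
handle nested comments or pragmas, keywords are not told apart from
identifiers, and the "parser" only builds a flat tree. It shows just the
two properties described above: the symbol list reproduces the source
text exactly, and the tree leaves are the very same objects that sit in
the list.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Symbol:
    kind: str  # "comment", "space", "ident" or "other"
    text: str  # exact source text, so nothing is lost


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)
    symbol: Optional[Symbol] = None  # set for leaves only


# Assumption: comments are not nested and pragmas are ignored.
TOKEN = re.compile(
    r"""
      (?P<comment>\(\*.*?\*\))
    | (?P<space>\s+)
    | (?P<ident>[A-Za-z][A-Za-z0-9]*)
    | (?P<other>.)
    """,
    re.VERBOSE | re.DOTALL,
)


def scan(source: str) -> List[Symbol]:
    """Split the source into a linear symbol list covering every character."""
    return [Symbol(m.lastgroup, m.group()) for m in TOKEN.finditer(source)]


def build_tree(symbols: List[Symbol]) -> Node:
    """Toy stand-in for a parser: one flat node whose leaves share the
    Symbol objects of the list; comments and whitespace stay list-only."""
    leaves = [
        Node(s.kind, symbol=s)
        for s in symbols
        if s.kind not in ("comment", "space")
    ]
    return Node("module", children=leaves)


def unparse(symbols: List[Symbol]) -> str:
    """Reproduce the original text from the symbol list."""
    return "".join(s.text for s in symbols)


SOURCE = "MODULE M; (* demo *)\nVAR x: INTEGER;\nEND M.\n"
SYMBOLS = scan(SOURCE)
TREE = build_tree(SYMBOLS)
```

A pretty printer would walk TREE for the structure and consult SYMBOLS
for the comments that sit between the leaves.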
The really nice thing here is that this kind of data structure is very
useful for a large number of purposes. Each of the following tools
can be built upon its information:
o a pretty printer: takes a module's source code and transforms it to
conform with certain rules of indentation, comment formatting, etc.
o an interface browser: works like the symbol file browser oob, but
uses the module text as input; specially marked comments are
included in the interface description
o a documentation extractor: an extension of the interface browser;
instead of emitting plain ASCII text it provides nicely formatted
output that can be fed to Texinfo or an HTML browser
o hyper-linked source code: a whole set of modules is converted to
HTML format, with every using occurrence of an identifier turned into
a hyper-link to its original definition
o a cross referencer: lists on request all uses of a variable, type,
constant, etc. in a given set of modules; together with some fancy
Emacs Lisp code one could tell the editor to "visit all writes to
record field R.f" and step through all such occurrences like one
would step through an error list
o a definition finder: given a name and a position in a module, find
the definition associated with this name [Note: Something like this
is already implemented in the Emacs mode, but it's a little bit slow
and sometimes it gets things wrong.]
o possibly an editor could use a specialized subset of the
scanner/parser to implement syntactic highlighting
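
As a rough illustration of how the definition finder and the cross
referencer relate, here is a small Python sketch. It assumes that the
parser has already produced a list of identifier occurrences, each
tagged with its enclosing scopes; the Occurrence record and both
function names are hypothetical, and details such as declare-before-use
rules, imports and record fields are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Occurrence:
    name: str
    line: int
    scope: Tuple[str, ...]  # enclosing scopes, outermost first
    is_definition: bool


def find_definition(occs: List[Occurrence], use: Occurrence) -> Optional[Occurrence]:
    """Look for a matching definition in the innermost scope first,
    then in each enclosing scope in turn."""
    for depth in range(len(use.scope), -1, -1):
        prefix = use.scope[:depth]
        for o in occs:
            if o.is_definition and o.name == use.name and o.scope == prefix:
                return o
    return None


def cross_reference(occs: List[Occurrence]) -> Dict[Occurrence, List[Occurrence]]:
    """Map every definition to the list of its using occurrences."""
    index: Dict[Occurrence, List[Occurrence]] = {}
    for o in occs:
        if not o.is_definition:
            definition = find_definition(occs, o)
            if definition is not None:
                index.setdefault(definition, []).append(o)
    return index


# Module M declares a global x; procedure P declares a local x.
GLOBAL_X = Occurrence("x", 2, ("M",), True)
LOCAL_X = Occurrence("x", 5, ("M", "P"), True)
USE_IN_P = Occurrence("x", 6, ("M", "P"), False)
USE_IN_BODY = Occurrence("x", 9, ("M",), False)
OCCS = [GLOBAL_X, LOCAL_X, USE_IN_P, USE_IN_BODY]
XREF = cross_reference(OCCS)
```

The cross referencer is just the definition finder run over every using
occurrence, which is why the two tools fit together so naturally.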
Obviously I'm in no position to implement half a dozen tools. Instead
I will finish the common base of those tools and continue in the
direction of cross referencer/definition finder with the intention of
integrating these functions into the Emacs mode (ditching an awful lot
of ugly elisp code in the process). I will leave the rest of the
tools to volunteers ;-)
A tool to extract a commented interface definition from a module is
especially sorely missed. Someone should implement such a beast. In
the meantime I want to start a discussion of what "doc comments"
should look like. I have some points I would like to see addressed:
o To which declaration is a doc comment attached? For procedures this
is simple, the comment is placed after the header but in front of the
first local declaration. How is this handled for the other kinds of
declarations?
o Are free floating comments, i.e., comments not attached to a
declaration, permitted and how are they dealt with?
o Should there be special commands available to refer to other
declarations from within a comment? E.g. a procedure doc comment
could refer to its parameter `foo' as @oparam{foo}, and to a type
`Bar' as @otype{Bar}. Reference tags of this kind allow consistency
checks (does the referenced declaration exist? is it of the correct
kind?) and make it possible to insert hyper-links when translating to
HTML.
o Should there be additional formatting commands? The simplest
command would be an empty line to signal a new paragraph; beyond that,
@samp{...} could mark example code, and commands like @table /
@enumerate / @itemize could implement tables and lists. This way more
flexible targets like Texinfo or HTML can do better formatting. Of
course the set of additional formatting
commands should be small but powerful. I'm sure Eric will have a
suggestion for a suitable set of commands.
o Should formatting for a given display width be implemented? Doc
comments have to be reformatted to a convenient line width. Having
some comments formatted for a width of 132 while others are formatted
for 72 just won't do.
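
To give the discussion something concrete to argue about, here is a
hedged Python sketch of two of the points above: checking reference
tags against known declarations, and refilling a comment to a chosen
width. The tag syntax is only the @oparam{...} / @otype{...} example
from this message, not an agreed format; the function names and the
"declared" table (name to kind) are hypothetical, and blank lines are
assumed to separate paragraphs.

```python
import re
import textwrap
from typing import Dict, List

# Only the two example tags from the text are recognised here.
TAG = re.compile(r"@(oparam|otype)\{([^{}]*)\}")
KIND_OF_TAG = {"oparam": "param", "otype": "type"}


def check_references(comment: str, declared: Dict[str, str]) -> List[str]:
    """Report tags that name a missing declaration or one of the wrong kind."""
    problems = []
    for m in TAG.finditer(comment):
        tag, name = m.group(1), m.group(2)
        actual = declared.get(name)
        if actual is None:
            problems.append(f"@{tag}{{{name}}}: no such declaration")
        elif actual != KIND_OF_TAG[tag]:
            problems.append(
                f"@{tag}{{{name}}}: is a {actual}, not a {KIND_OF_TAG[tag]}"
            )
    return problems


def reflow(comment: str, width: int) -> str:
    """Refill each paragraph to the given width; an empty line ends a paragraph."""
    paragraphs = re.split(r"\n\s*\n", comment.strip())
    return "\n\n".join(
        textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs
    )


DECLARED = {"foo": "param", "Bar": "type"}
COMMENT = (
    "Copies @oparam{foo} into a @otype{Bar}.\n"
    "\n"
    "See @otype{foo} and @oparam{baz}."
)
PROBLEMS = check_references(COMMENT, DECLARED)
NARROW = reflow(COMMENT, 20)
```

Because the stored comment is refilled on output, the width the author
happened to use in the source no longer matters to the reader.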
I'm sure there are other open questions. I want to see them discussed
on this mailing list. In the end a specification for a definition
extracting tool should emerge. Then a volunteer should implement this
specification...
-- mva