Re: Source Text Tools



> Michael van Acken <mvacken@t-online.de> wrote:
> > I started on a set of modules which could be extended into a number of
> > utilities working on the textual representation of modules.  
> 
> Will these be used exclusively for source code manipulation?  Or could these
> also be extended for programmatic uses such as "object serialization" or
> reflection (sorta like Java Bean info)?

I don't have the slightest idea what you are talking about.  Therefore
my answer is a careful "no".  Keep in mind that I'm only offering a
set of base modules for text manipulation; I won't implement all these
tools.

[interface extractor / doc comments]
> > o To which declaration is a doc comment attached?  For procedures this
> > is simple, the comment is placed after the header but in front of the
> > first local declaration.  How is this handled for the other kinds of
> > declarations? 
> 
> To keep things consistent, I'd think all doc comments should be handled this
> way: the comment is placed after a declaration.

Possible definition: A doc comment refers to the nearest declared name
that is placed textually before it.  The parameters (or the receiver)
of a formal parameter list and a module's import list don't count as
declarations in this context.  The module header is treated as a
declaration.  That is, any doc comment before the first proper
declaration of a module (or in front of the module header) is attached
to the module itself.
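
To make the rule concrete, here is a minimal sketch.  It is written in
Python rather than Oberon-2 purely for brevity, and everything in it
(`Decl', `attach_doc_comments', the ("decl", name) / ("doc", text) item
format) is a hypothetical illustration, not part of the proposed
modules.  It assumes a scanner has already reduced the module to a list
of declared names and doc comments in textual order, with parameters,
receivers, and the import list filtered out.

```python
from dataclasses import dataclass, field


@dataclass
class Decl:
    """A declared name together with the doc comments attached to it."""
    name: str
    docs: list = field(default_factory=list)


def attach_doc_comments(module_name, items):
    """Attach each doc comment to the nearest declaration placed before it.

    `items` is a list of ("decl", name) or ("doc", text) pairs in textual
    order.  Assumption: parameters, receivers, and imports have already
    been filtered out, since they do not count as declarations here.
    """
    module = Decl(module_name)
    decls = [module]
    # The module header is treated as a declaration, so comments before
    # the first proper declaration end up attached to the module itself.
    current = module
    for kind, value in items:
        if kind == "decl":
            current = Decl(value)
            decls.append(current)
        elif kind == "doc":
            current.docs.append(value)
    return decls
```

Because the module is the initial "current" declaration, no doc comment
can be left without an owner.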

> > o Are free floating comments, i.e., comments not attached to a
> > declaration, permitted and how are they dealt with?
> 
> I can see where free floating comments would be desirable.  To make these
> automatic (rather than having to mark the end of a previous attached comment
> -- like using `@end deftp' in texinfo), would the tool have to look for 
> `END ProcName;', and so forth? 

I don't understand this.  But with my proposed rule above _all_ doc
comments are attached to a declaration, which solves the problem in a
way.

> > o Should there be special commands available to refer to other
> > declarations from within a comment?  E.g. a procedure doc comment
> > could refer to its parameter `foo' as @oparam{foo}, and to a type
> > `Bar' as @otype{Bar}.  Reference tags of this kind allow consistency
> > checks (does the referenced declaration exist? is it of the correct
> > kind?) and permit inserting hyper-links when translating to HTML.
> 
> Hmm... I'm not sure about this one.  I think I'd have to see an example of
> how you think these kinds of commands should be used.  Are these specialized
> @ref commands?

Here are three examples of declaration descriptions:

1. Description is plain ASCII formatted
PROCEDURE (list: List.List) Append* (node: List.Node);
(**
Adds node `node' to the end of list `list'.  If `node' is of type
`List.NodeUp', then its field `up' is changed to refer to `list'.
pre: `node' is not part of any list.  
*)

This kind of formatting cannot be converted adequately to more
powerful typesetting formats like HTML or texinfo.


2. Description is formatted for texinfo
PROCEDURE (list: List.List) Append* (node: List.Node);
(**
Adds node @var{node} to the end of list @var{list}.  If
@var{node} is of type @code{List.NodeUp}, then its field @code{up} is
changed to refer to @var{list}.

@emph{pre}: @var{node} is not part of any list.  
*)

This could be transferred as is into a texinfo document, and conversion
into HTML is also straightforward.  Note that the comment does not
contain any Oberon-2 specific meta information.


3. Description is formatted with additional @oxxx commands
PROCEDURE (list: List.List) Append* (node: List.Node);
(**
Adds node @opar{node} to the end of list @opar{list}.  If
@opar{node} is of type @otype{List.NodeUp}, then its field
@ofield{List.NodeUp.up} is changed to refer to @opar{list}.

@emph{pre}: @opar{node} is not part of any list.  
*)

Instead of text formatting commands this version uses commands that
describe the Oberon-2 entity they are referring to.  Conversion to
texinfo would replace @opar with @var, @otype with @code, etc.
Additionally all @oxxx references can be checked for consistency,
i.e., it can be verified that the referenced object exists and is of
the required type.  And when translating to HTML all @oxxx commands
can be turned into hyper links; all types and procedures mentioned in
a description would be just a mouse click away.
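
A rough sketch of the conversion and the consistency check, again in
Python for brevity and not an actual implementation.  The symbol table
format (qualified name mapped to the command that may refer to it), the
function name `to_texinfo', and the choice to render @ofield as @code
with only the field's short name are all assumptions made for this
illustration.

```python
import re

# Mapping from Oberon-2 reference commands to texinfo commands.
# @opar -> @var and @otype -> @code follow the text above; the entry
# for @ofield is an assumption.
TEXINFO = {"opar": "var", "otype": "code", "ofield": "code"}

REF = re.compile(r"@(o[a-z]+)\{([^{}]*)\}")


def to_texinfo(comment, symbols):
    """Return (converted_text, errors).

    `symbols` is a hypothetical symbol table mapping a (qualified) name
    to the command that may reference it, e.g. {"List.NodeUp": "otype"}.
    """
    errors = []

    def replace(match):
        cmd, name = match.group(1), match.group(2)
        if cmd not in TEXINFO:
            errors.append(f"unknown command @{cmd}")
            return match.group(0)
        kind = symbols.get(name)
        if kind is None:
            errors.append(f"{name}: no such declaration")
        elif kind != cmd:
            errors.append(f"{name}: is {kind}, not {cmd}")
        # Show only the field name itself, as in the texinfo example.
        shown = name.rsplit(".", 1)[-1] if cmd == "ofield" else name
        return f"@{TEXINFO[cmd]}{{{shown}}}"

    return REF.sub(replace, comment), errors
```

An HTML back end could use the same match callback, emitting a link to
the declaration instead of a texinfo command.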

> The texinfo commands that I use the most and find the most useful (besides
> the structuring commands like @node, @chapter, @menu, etc.) are mainly for
> formatting and cross referencing.  

Since doc comments are basically just a description associated with a
declaration, there is no need for any higher level structuring commands
for nodes or chapters.  Likewise all cross reference commands are
useless since (in texinfo) they point to nodes.

I expect that automatically extracted documentation based solely on
doc comments will be used mostly as a quick reference, or only until
a module's interface is stable.  For more ambitious documentation
with chapters and sections, introductory remarks, exhaustive examples,
cross references, indices, etc. one would take the doc comment base
and rework it manually to fit the needs.  This process would _not_ be
done automatically.  Some comments could be dropped in without change,
others would need a rewrite, but the complete framework around them
would have to be written from scratch.

> Here is a list and short description of each: 
> 
> Font and style commands:
> 
>    @cite   -- name of a book (with no cross reference link available)
>    @code   -- syntactic tokens
>    @dfn    -- introductory or defining use of a technical term
>    @emph   -- emphasis; produces *italics* in printout
>    @kbd    -- input to be typed by users
>    @minus  -- generate a minus sign (or `---' for an em-dash)
>    @samp   -- literal example or sequence of characters
>    @strong -- stronger emphasis than @emph; produces *bold* in printout
>    @var    -- metasyntactic variables (e.g., formal procedure parameters)

Implementation of the whole batch is rather simple.  If you implement
one you have implemented 90% of the rest.  I would suggest replacing
@var and (parts of) @code with dedicated commands @ovar, @otype,
@opar, etc.
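
To illustrate why one command gives you most of the rest: a converter
can be driven by a table, so adding a command means adding one entry.
The sketch below (Python, for brevity) turns the brace-style commands
into HTML.  The tag choices in `HTML_TAGS' and the function name
`style_to_html' are my assumptions, and @minus is left out because it
is not a brace-style font command.

```python
import html
import re

# Assumed mapping from texinfo font/style commands to HTML elements.
HTML_TAGS = {
    "cite": "cite", "code": "code", "dfn": "dfn", "emph": "em",
    "kbd": "kbd", "samp": "samp", "strong": "strong", "var": "var",
}

CMD = re.compile(r"@([a-z]+)\{([^{}]*)\}")


def style_to_html(text):
    """Convert @cmd{...} style commands to HTML, innermost first."""

    def replace(match):
        tag = HTML_TAGS.get(match.group(1))
        if tag is None:
            return match.group(0)  # leave unknown commands untouched
        return f"<{tag}>{match.group(2)}</{tag}>"

    # Escape &, < and > before any tags are generated.
    text = html.escape(text, quote=False)
    previous = None
    while previous != text:  # repeat until nested commands are resolved
        previous = text
        text = CMD.sub(replace, text)
    return text
```

For example, `style_to_html("Adds @var{node}")' yields
`Adds <var>node</var>'.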
 
> Lists and tables:
> 
>    @enumerate -- enumerated lists, using numbers or letters
>    @itemize   -- itemized lists with and without bullets
>    @table     -- two-column tables with highlighting

In my opinion we should do all three constructs.  Dropping them would
be too restrictive.

> Paragraph formatting:
> 
>    @example   -- example that is not part of the running text (fixed font)
>       (and @smallexample, which uses a smaller font in `smallbook' format)
>    @format    -- example in the current font that does not narrow the margins
>    @quotation -- excerpt from another (real or hypothetical) printed work
>    @noindent  -- prevents paragraph indentation

IMO we can restrict ourselves to @example and maybe @format for our
purposes here.

> Cross reference:
> 
>    @url   -- indicate a uniform resource locator (URL)
>    @uref  -- reference to a uniform resource locator (URL)
>    @ref   -- cross reference
> 
>    @pxref -- variation on @ref
>    @xref  -- variation on @ref

I think we can skip these.

> Any other suggestions?  Can any of these be omitted?  Can some of these be
> done automatically?

Judging from my own experience with texinfo the above list is
complete.  I don't see any easy mechanism for automatic formatting.
E.g. one could try to identify the names of Oberon-2 entities in
descriptions and assign formats to them based on this information.
But this can't be done with 100% success.  Therefore I would prefer
explicit markers.

Some people might complain that writing formatted text with the above
commands is cumbersome.  This is certainly true.  But doc comments
only describe the public interface of a module.  Typically this
interface is quite small.  Persons complaining that writing an
interface's documentation is too complicated should be eyed with
suspicion anyway  ;-)  

> > o Should formatting for a given display width be implemented?  Doc
> > comments have to be reformatted for a convenient line width.  Having
> > some comments formatted for a width of 132, while others prefer 72,
> > just won't do.
> 
> How do texinfo, texi2html, and the other programs do this?  I think I'd
> prefer the various tools allowing display width to be set by the user,
> rather than having an @ command embedded in the doc comment itself.

Texinfo delegates things like line breaking to TeX (which is doing a
wonderful job here) and makeinfo (which keeps things simple by not
introducing hyphenation or page breaks, just like HTML browsers).
Formatting a paragraph for a particular display width is simple,
provided that paragraphs are readily identifiable and fancy stuff like
"ASCII art" is marked properly.
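
A minimal sketch of such a formatter, in Python for brevity.  It
assumes that paragraphs are separated by blank lines and that anything
to be kept verbatim is enclosed in @example ... @end example, each on a
line of its own; the function name `reformat' and the default width are
hypothetical.

```python
import textwrap


def reformat(comment, width=72):
    """Refill the paragraphs of a doc comment to the given width.

    Lines between @example and @end example are copied unchanged, so
    marked "ASCII art" survives the reformatting.
    """
    out, paragraph = [], []
    verbatim = False

    def flush():
        if paragraph:
            out.append(textwrap.fill(" ".join(paragraph), width=width))
            paragraph.clear()

    for line in comment.splitlines():
        stripped = line.strip()
        if verbatim:
            out.append(line)
            if stripped == "@end example":
                verbatim = False
        elif stripped == "@example":
            flush()
            out.append(line)
            verbatim = True
        elif not stripped:
            flush()  # a blank line ends the current paragraph
            out.append("")
        else:
            paragraph.append(stripped)
    flush()
    return "\n".join(out)
```

The width is an argument of the tool, not something embedded in the
comment, so each user can pick the line width he prefers.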

-- mva