[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

The TODO List



[A copy of this text is in ooc/doc/TODO_List.  I think I'll place it
 on the WWW server, too.  --mva] 


This is a list of (possible) additions to the OOC project.  It's
divided into the sections Libraries, Tools, and Optimizers.  Only the
optimizers need an intimate knowledge of the compiler's entrails.
Parts that someone has signed up to are marked with a `working on it'
tag.  There aren't that many yet, volunteers are definitly welcome!

There are sveral ways to contribute to the OOC project.  The easiest
way is to add project to this list.  A larger contribution is to
implement one of the project from this list.  The ultimate
contribution, of course, is to implement a back-end for for a target
architecture of your choice.

Feel free to drop me a mail if you have questions, want to add to this
list, or sign up for a project.

  -- Michael van Acken <acken@informatik.uni-kl.de>




Libraries
------------------------------------------------------------------------
First the stuff in the core lib, i.e. functions that the compiler will
need:

Time
A simple module providing objects and functions on them to handle time
and date stamps.  Includes conversion from/to string, compare, and
simple arithmetic functions.  

Command Line Arguments
Has to support argv-style arguments and (possibly) GUI based
arguments.  Argument syntax should follow the rules of the underlying
OS.

Locators/Channels/Readers/Writers 
[working on it: Sander van der Wal, Michael van Acken]
An interface to arbitrary forms of byte stream input/output.  The most
prominent Locator is a file name, the most prominent channel is a
file.  Additional channels can be added (see the list of channel
implementations below).

--------
Additional stuff (not part of the core):

Channel Implementations
- compressed channel (GNU zip)
- encrypted channel
- socket channel (TCP stream)
- telnet channel (building on socket channel)
- ftp channel (building on socket channel)

Graphical User Interface

[I'm sure you can add some of your pet libraries here.  Don't
 hesitate!] 




Tools
------------------------------------------------------------------------

Cross referencer
A tool to answer queries like: 
- which modules import module M?
- where is variable M.x used (read, changed)?
- where is field T.f used (read, changed)?
- where is procedure M.P called (assigned to a variable, passed to a
  parameter)? 
- which types extend the record M.T?
- where is procedure T.p redefined?
- given a file, a position in it, and a name, find the the declaration
  for this name at this position
- find the declaration of the type-bound procedure T.p 
A implementation of such a tool can be split into several parts: The
first collects information about modules by parsing them and storing
the gathered information.  The second uses this data to answer queries
like the ones in the above list.  The third part could generate
documents like cross reference lists or hyper-linked versions of
source modules (e.g. by turning every identifier into a link to its
declaration).  The final part is a GUI for convenient use of this
tool(s).

Definition extractor [working on it(...?): johnm]
A program that, similar to the symbol file browser, generates an
interface definition for a given module.  Unlike the browser, this
tool uses the module's source code to extract exported declarations
and specially marked comments.  The result is a commented description
of the module's interface.

Pretty printer
Takes a (syntactically valid) module of arbitrary ugliness and
transforms it into a nicely formatted version.  The output will use
your personal indentation style, use a fixed text width (e.g. 79
chars), obsolete symbols like a `;' before an END will be removed, END
symbols are assigned comments that refer to the kind of block they
belong to (e.g. `END (* for *)').  For printing certain text elements
can be hightlighted (e.g. keywords are set as bold text, comments in
italics).
Maybe the most difficult part here is to get the comments `right'.
That is, to move them around resp. to format them in a reasonable way.




Optimizers
------------------------------------------------------------------------
(most of this section is from [Bra95])

Loop Optimization
Includes Stength Reduction, Reassociation, and Loop Invariant Code
Motion.  Strength reduction is an optimization that replaces an
expensive operation by a less-expensive one, usually a multiplication
by an addition within a loop.  Reassociation is a related
transformation exploiting commutative, associative, and distributive
laws to rewrite expressions, so that strength reduction becomes more
profitable.  Loop invariant code motion moves computations that are
constant over all iterations of the loop to a place before the loop,
and is performed as a by-product of the other transformations.
References: [MMZ94], [Bra95] page 86

Lazy Evaluation resp. Partial Dead Code Elimination
It is often encountered that a computation dominates paths on which it
is not used.  By delaying the execution until its need is known (lazy
execution), the execution time on some paths will be reduced.
References: [Bra95] page 94, [KnRS94]

Typeflow analysis
Eliminate runtime tests on the dynamic type of variables, and turn
calls to type-bound procedures into static procedure calls.  The first
part is accomplished by by a typeflow analysis that determines bounds
on the types that objects may posses at runtime.  The second part is a
global optimization that combines this with the knowledge about all
record types and type-bound procedures in the program to replace
lookups of type-bound procedure's address (using type tags and type
descriptors) by a single procedure's address.
References: [CoGo94], [Fer96]

Valueflow(?) analysis
Eliminate runtime tests like index and overflow checks.  This is done
by collecting information about the range of numbers a value may
take.  Sources of this information can be induction variable analysis
(part of loop optimization), operations like MOD and ABS that return
results within given bounds, the data type that holds the value
(e.g. SHORTINT), guards testing for a given range, and previously
executed bound checks.

Redundancy Elimination and Partial Redundancy Elimination
See [Bra95], page 92.

Array Dependence Analysis
See [Bra95], page 94.

Locality-Improving Transformations
See [Bra95], page 94.

Procedure Inlining
Replace the call to a procedure by the body of the procedure.  
see also: Execution Profiles
References: [Bra95], page 74

Loop unrolling
By duplicating the code of the loop body it is possible to reduce the
number of backward jumps, increase the usage of multiple execution
units, and increase pipeline thoughput.
see also: Execution Profiles

Loop Fusion
Knowledge that two loops are executed under the same control
conditions can be used to combine those loops into a single one,
saving the loop control overhead.
References: [Bra95] page 83

Guard Fusion
Knowledge that a region A is executed whenever region B is executed
and vice versa can be used to combine these regions into a single one,
saving branch instructions.


Execution Profiles (not an optimization, but a related topic)
Prior knowledge about the most frequently executed program parts can
increase the effectiveness of optimizations like procedure inlining
and loop unrolling by avoiding these transformations at rarely
executed places.  Profiles are created by building executables with
special profiling support.  Running such executables generates an
execution profile (for the given input) that can contain information
like
- the number of calls to a procedure
- how many times a specific call instruction was executed
- how much execution time was spent in the various procedures
- how many times a loop was entered and how many iterations did on
  average
Now the open questions:
Which information should a profiling run gather?  How can a back-end
generate code for profiling?  In which way is the gathered data
stored?  How can it be evaluated?
References: [Fer96], gprof (the GNU profiler)




#	$Id: Literature,v 1.2 1996/03/22 12:54:36 oberon1 Exp $	

[Bra95] 
  Marc Michael Brandis, 
  Optimizing Compilers for Structured Programming Languages
  Diss. ETH Zuerich Nr. 11024, 1995
  ftp://ftp.inf.ethz.ch/pub/publications/dissertations/th11024.ps.gz

This is the paper this compiler is built upon.  Describes static 
single-assignment form (SSA) and presents guarded single-assignment 
form (GSA) as a means to incorporate data flow and control flow into
a single represenation.  Contains algorithms to build GSA from
structured source programs, optimizations algorithms on the GSA data
structure, and code generation for the PPC604.


[BrMoe95] 
  M. Brandis and H. Moessenboeck
  Single-Pass Generation of Static Single Assigment Form for Structured 
    Languages
  ACM Transactions on Programming Languages and Systems, 1994


[CFRWZ91] 
  R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and F. Zadeck
  Efficiently Computing Static Single Assignment Form and the Control 
    Dependence Graph
  ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, 
    October 1991


[CoGo94]
  Diane Corney, John Gough
  Type Test Elimination using Typeflow Analysis
  ftp://...


[Fer96]
  Maria F. Fernandez
  A Retargatable, Optimizing Linker
  Diss. Princeton University
  ftp://...


[KnRS94]
  J. Knoop, O.Ruehting, and B. Steffen
  Partial Dead Code Elimination
  In Proceedings of the ACM SIGPLAN '94 Conference on Programming
  Language Design and Implementation.  Orlando, Florida.  June 1994. 


[MMZ94]
  P. Markstein, V. Markstein, and K. Zadek
  Strength Reduction
  In Optimization in Compilers, ed. F. Allen, B. Rosen, and K. Zadek.
  To be published by ACM Press.


[MoWi95]
  H. Moessenboeck, N. Wirth
  The Programming Language Oberon-2
  Institut fuer Computersysteme, ETH Zurich, March 1995
  ftp://ftp.inf.ethz.ch/pub/Oberon/Docu/Oberon2.Report.ps.gz
  ftp://ftp.inf.ethz.ch/pub/Oberon/Docu/Oberon2.ChangeList.ps.gz