[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

ADT List (2)



To keep traffic low, I collect the mails that appear on the mailing
list over the day until 16:00 local time (currently GMT+2).  Then I
will post a reply at approximately 18:00 (also local time).

PF> >Naming of entities in the module is open for discussion.
PF> 
PF>    I suggest sticking to established "Oberon conventions" as documented eg.
PF> in Moessenboeck's Oberon-2 book. No need to deviate from my point of view.

We have made it the OOC standard as well.  This is laid down somewhere
at the beginning of the reference manual.  I am not making a conscious
effort to adhere to this standard, though.  Feel free to point out any
misnomers to me.

Here is the relevant quote from the OOC RM:

--begin quote--
Also, in order to provide consistency, the OOC library attempts to
follow these naming conventions:

================================================================
Names for              Start with              Examples
----------------------------------------------------------------
Constants, variables   Lower-case noun         version, wordSize
                       Lower-case adjective    full
 
Types                  Upper-case noun         File, TextFrame
 
Procedures             Upper-case verb         WriteString
 
Functions              Upper-case noun         Position
                       Upper-case adjective    Empty, Equal
 
Modules                Upper-case noun         Files, TextFrames
----------------------------------------------------------------
--end quote--

As a side note: module aliases should start with upper-case
characters, too.  _Please_ follow this standard in all library
modules.


PF> >Because other ADTs may need further procedure arguments, `Init' and
PF> >`New' are implemented as normal procedures.  If they would be a
PF> >type-bound procedures, we would be forced to have parameterless `Init'
PF> >and `New' for all out ADTs.
PF> 
PF>    Not true. There is an alternative for simulating "well-behaved"
PF> constructors in Oberon-2 but it does come at a price: the same price as
PF> message passing as used for the GUI parts of the Oberon System.
PF> 
PF>    MODULE StdTypes
PF> 
PF>      TYPE
PF>        Initializer* = RECORD (* some generic data, I don't know *)
PF> 
PF>      PROCEDURE (self: Object) Initialize (init: Initializer);
PF>      BEGIN (* Abstract? *)
PF>      END Initialize;
PF> 
PF>    MODULE Hashtables
PF> 
PF>      IMPORT ST := StdTypes;
PF> 
PF>      TYPE
PF>        Hashtable* = POINTER TO HashtableDesc;
PF>        HashtableDesc* = RECORD (ST.Object) (* or maybe a Container base
PF> type? *)
PF>          data: POINTER TO ARRAY OF Object;
PF>          (* other stuff here *)
PF>        END;
PF> 
PF>        Initializer* = RECORD (ST.Initializer) size*: INTEGER; END;
PF> 
PF>      PROCEDURE (self: Hashtable) Initialize (init: ST.Initializer);
PF>      BEGIN
PF>        WITH init: Initializer DO
PF>          NEW (self.data, NextPrime (init.size));
PF>          (* ... *)
PF>        END;
PF>      END Initialize;
PF> 
PF> I used this approach quite successfully in my old "Nice Class Library"
PF> project which sadly got shelved. If I had the sources here I would gladly
PF> contribute them, but they sit on my Amiga back in Germany. :-(

What would we gain with this approach?  Obviously, it is very
inconvenient for the user to provide this kind of initialization
infrastructure.

PF> >Most operations are destructive.  E.g., `Reverse' and `Filter' act
PF> >upon the list itself, instead of creating a modified copy.  If
PF> >desired, the user can explicitly call `Copy'.
PF> 
PF>    Good policy! Keep the overhead low by making these things explicit.
PF> However the "generalist" in me tends to also see the good things of
PF> creating new structures all the time, eg. the "filtering" would become
PF> easier to code if we would create a new list.

Actually, I am changing the procedure `Filter' to be not destructive.  
I have the nagging feeling that it is more useful this way.  

PF>    However, I don't think that "Reverse" or "Filter" should even be part of
PF> the container class itself. They should rather be provided externally, most
PF> likely using some kind of "Iterator" or "Carrier-Rider" design pattern.

Actually, I am not designing a Container class here, but a List.  See
my ADT characteristics at the beginning of the module source below.  A
list has a number of properties.  Of course, these properties can be
used to make the use of a list more efficient.  That is the reason
why, e.g., the fields `prev' and `next' are visible to the user.

We will also agree, that the most common operations on list should be
provided by the ADT implementation itself.  For convenience and
efficiency.  That is, we don't need a minimal set of procedures, but a
minimal _convenient_ set.

PF>    My "Nice Class Library" (NCL from now on) project seperated Iterators
PF> into a class hierarchy on its own. Container classes exported a method
PF> 
PF>    PROCEDURE (self: Container) NewIterator (): Iterator;
PF> 
PF> that returned an Iterator appropriate for the container class in question.
PF> In cases where it made sense to provide more specialized iterators, a
PF> special container class could also have a special iterator like this:
PF> 
PF>    PROCEDURE (self: Tree) NewTreeIterator (): TreeIterator;
PF> 
PF> where TreeIterator has methods like SetInOrder(), SetPreOrder() and
PF> SetPostOrder(), which would in turn influence the behaviour of Next() which
PF> advances an Iterator the the next element in the collection. You could also
PF> make these paramters to the NewTreeIterator() method of course, or design
PF> an even fancier setup using NewIterator() with a variant of the Initializer
PF> in form of a factory method. Yeah, making it more and more complicated is
PF> easy. :-)

You should provide some information how these iterators of yours are
actually used.  I would also like to see a discussion of the pros and
cons of your approach, compared to the current simplistic one.  I
assume, you have gained some experience with your iterators.  While
you are at it, you can also compare them with the iterators presented
in Moessenboeck's "OOP in O2".

There is no need to disallow simple navigation using the `prev',
`next' fields.  A more sophisticated iterator or rider like mechanism
can be provided as well, but this has to be legitimised by the overall
design of the library.

  > >Should we allow to store the NIL object in a list?
PF>    Since NIL means "no object" most of the time, I would say no.
TT> No. That will complicate handling e.g. when sorting iterating etc..
TT> The user always has the possibility to add non-initialized or specially
TT> flaged objects instead of NIL.

We all agree here.  Great.

PF> >How many specializations should be provide in the module interface?
PF> >E.g., `Append' and `Prepend' are simple specializations of
PF> >`InsertAfter' and `InsertBefore'.  We have `Remove'.  Do we need
PF> >specialized procedures to remove the first/last element?
PF> 
PF>    I would vote for minimal procedure sets. 

A minimal procedure set for our current List would contain Init,
Insert, Remove, and First.  All other procedures can be built in terms
of this basic set, I believe.  Now, do you suggest we restrict the ADT
List to these methods?  Please qualify our "minimal procedure set"
further. 

PF> Other abstractions (eg. a
PF> stack) can use the list class as implementation, but should provide a new,
PF> distinct interface. 

If you extend List, the new type's interface is a superset of the one
of List.  If Stack _uses_ List, then the interfaces can be distinct.

PF> Keep interfaces small! Also certain abstractions, eg.
PF> Stack, should not have iterators or iterator-like constructs: It doesn't
PF> fit!

Correct.  The operations provided by a ADT should be based on its
"characteristics" sheet.

PF> >Currently, ADT List is a direct descendant of the type `Object'.
PF> >Should we define a intermediate (possibly abstract) type
PF> >Container/Collection/Bag?  And what properties should this
PF> >intermediate type have?  [If anyone answers to this with "let's
PF> >introduce xyz" WITHOUT giving a detailed outline and a module
PF> >interface, I am seriously annoyed.]
PF> 
PF>    Oops, sorry to annoy you. :-) The problem is that an abstract class
PF> container can only provide "Empty(): BOOLEAN" or maybe "Elements():
PF> INTEGER" but no "real" methods. 

Abstract classes are not new to the OOC library (see Channel).  The
questions are: should we introduce abstract (or concrete) intermediate
classes, for what purpose, and how should they look like.

PF> Collection could then add "NewIterator" or
PF> whatever, but eg. Stacks would then be derived from "Container" directly,
PF> or possibly a new class "Dispenser" in between (see Meyers class library
PF> for Eiffel).

Actually, I don't want to look at Meyers class library for Eiffel ;-) 
If you want to make a point, I would appreciate it if you state it
explicitly.  [My apologies if this statement seems sarcastic; I may
have volunteered as "godfather" for this project, but this does not
mean I want to evaluate or second-guess any suggestion made on this
mailing list.] 

PF> >Persistence.  Should we integrate generic load/store mechanisms into
PF> >our library?  I think there are three different approaches to this
PF> >problem in the Oberon community: the one presented in Moessenboeck's
PF> >"OOP in O2", the one used by Gadgets, and the version described in
PF> >Templ's diss "Meta-programming in Oberon".  I cannot comment on the
PF> >pros and cons of the approaches, because I do not know Gadgets at all.
PF> 
PF>    If you do, please consider it to the fullest extent possible: You
PF> actually have to be able to linearize arbitrary data structures. Anything
PF> less is not general enough, and it is hard to provide support for it later.
PF> I implemented arbitray linearization for NCL, so maybe you might want to
PF> wait another month until I am back in Germany and can dig out and publish
PF> those old sources.

Of course I want to load/store cyclic graphs.  At the moment I favor
the approach taken by "OOP in O2", because a) I don't know the other
approaches, b) I won't understand them unless someone explains them to
me, and c) Moessenboecks approach is simple enough for me to
understand ;-)

PF> >Another thing: I wondered why we should not provide two
PF> >implementations of certain ADTs, e.g. a "single root" list for
PF> >flexibility and a "low-overhead" list (as presented by MKG at the very
PF> >beginning of this thread) for efficiency.
PF> 
PF>    Interesting! I did the same in NCL, but for different reasons. There was
PF> actually a "Simple Class Library" (SCL) that included very basic data
PF> structures like an efficient list and an efficient hashtable. These were
PF> then used to implement certain NCL services, eg. the hashtable was used as
PF> part of the linearization algorithm.

I assume that you second this proposal.

PF> >MODULE List;
PF> >
PF> >IMPORT
PF> >  StdTypes;
PF> >
PF> >TYPE
PF> >  Node* = POINTER TO NodeDesc;
PF> >  NodeDesc = RECORD
PF> >    next-, prev-: Node;
PF> >    (* The read-only fields `next' and `prev' are provided for efficient
PF> >       navigation.  A value of NIL designates the end of the list. *)
PF> >    obj-: StdTypes.Object;
PF> >    (* Data of the list element.  *)
PF> >  END;
PF> 
PF>    I would no export these. They should only be accessible through the
PF> iterator or rider or whatever the "accessor" is called (Enumeration in
PF> Java).

See my comment above about ADT characteristics and properties.

PF> >PROCEDURE Destroy* (VAR l: List);
PF> >(* Removes all elements from list `l' and destroys the list object itself.
PF> >   The variable parameter `l' is set to NIL.
PF> >   Note: The user is not required to call this procedure.  Resources like
PF> >   unused lists are automatically freed by the garbage collector.  This
PF> >   procedure is provided for convenience, both for the user and the garbage
PF> >   collector.  *)
PF> >  END Destroy;
PF> 
PF>   Reminds me of NCL again: If OOC supports "CLOSE" (or some equivalent
PF> library function) I could actually contribute code to simulate "real"
PF> destructors.

Obviously we don't have CLOSE.  And I wonder what purpose a
simulated(!)  "real"(why "?) destructor has in our current context.

PF> 
PF> >PROCEDURE (l: List) Copy* (): List;
PF> >(* Creates a copy of the given list `l'.  *)
PF> >  END Copy;
PF> 
PF>    Deep of shallow? How could you do a deep copy without Object supporting
PF> a "Clone()" method?

Well, looks like I am not precise enough.  The answer is: as shallow
as possible.  The list structure is duplicated, but no duplicates of
the list data are created.

--------

TT> > this vacancy, this means I am immediately accepted for this job ;-)
TT> 
TT> Congratulations! Where to send the flowers :-)?

Who wants flowers?  Send money! 

TT> > Naming of entities in the module is open for discussion.
TT> 
TT> As metnioned below it is likely to one will several modules that do
TT> implement same datatypes different. All modules should have the type of
TT> the ADT in its name (e.g. 'List'). The 'special feature' of the
TT> specific implementation should be mentioned. These suffixes or prefixes
TT> should be standarizised as much as possible. E.g. 'Dyn' for a dynamic
TT> implementation, 'A' or 'Static' for an array implementation. So you
TT> will have an abstract interface module List.Mod, the classical dynamic
TT> implementation 'DynList' and the more esoteric 'ArrayList' or better
TT> 'StaticList'. Names or open to discussion, the given examples should
TT> only give a hint. Perhaps one should uses Suffixes? 'ListArray.Mod'?
TT> This would give you all implementations beneeth when doing a 'ls'.

This is certainly worth consideration.  Of course, this can only be
decided once the general form of the whole library is less hazy.

TT> > Because other ADTs may need further procedure arguments, `Init' and
TT> > `New' are implemented as normal procedures.  If they would be a
TT> > type-bound procedures, we would be forced to have parameterless `Init'
TT> > and `New' for all out ADTs.
TT> 
TT> Correct. I've done that the same in my implementation. However one can
TT> think of a parameterless Init method that will initialize the datatype
TT> to some kind of default state? Hmm...

I considered that, too.  But this is a duplication of procedures, most
of all because the method need to do a super call, and the procedure
as well.  Since I don't see any benefits with an additional Init
method, I discarded the idea.

TT> > How many specializations should be provide in the module interface?
TT> > E.g., `Append' and `Prepend' are simple specializations of
TT> > `InsertAfter' and `InsertBefore'.  We have `Remove'.  Do we need
TT> > specialized procedures to remove the first/last element?
TT> 
TT> Peter is right when he states, that interfaces should be as small as
TT> possible. On the other hand specialized methods give you much more room
TT> for internal optimisation. It gives you also a feel of comfort. Also
TT> the datatype can always maped spezialized methods to non-specialized.
TT> Will the compiler optimize such things like the following?
TT> 
TT> PROCEDURE (bla : Bla) Append(blub : Blub);
TT> 
TT> BEGIN
TT>   bla.InsertAfter(bla.last,blub);
TT> END;

Yes and no.  The compiler _can_ optimize this if
a) you compile the whole program (and not a module at a time), and
b) you optimize the dynamic call of the method into a static one, and
c) you do procedure inlining.
None of the three is implemented yet :-)

TT> > Currently, ADT List is a direct descendant of the type `Object'.
TT> > Should we define a intermediate (possibly abstract) type
TT> > Container/Collection/Bag?  And what properties should this
TT> > intermediate type have?  [If anyone answers to this with "let's
TT> > introduce xyz" WITHOUT giving a detailed outline and a module
TT> > interface, I am seriously annoyed.]
TT> 
TT> I implemented this aproach in the published sources analog to some
TT> implemention as part of the AmigaOberon compiler. Implementing some
TT> type Collection gives you the possibility to switch within your sources
TT> between differnet container types (List, Tree, Hashtable). However
TT> there are problems: The container defines onyl simple acces methods
TT> like Add and Remove. Exchangablility is only given, if you do reduce
TT> yourself to calling non-specialized methods. Most baseclass methods
TT> like "Add" need parameter. The baseclass will define them of type
TT> Object. Derived classes (e.g. Trees) need a specialized object with
TT> f.e. does implement comparison methods. This results in heavy casting
TT> when implementing such methods in derived classes. You can study this
TT> in my published sources.

For now, we can ignore any ADTs that need a half order between
objects.  

Also, your statements confuse me.  Is it worthwhile to introduce
intermediate classes?  Should they be abstract or concrete?  How
should these classes look like? 

TT> > Persistence.  Should we integrate generic load/store mechanisms into
TT> 
TT> Persisstance should be supported at least in a general way. 

Two votes for persistence, none against.

TT> > Another thing: I wondered why we should not provide two
TT> 
TT> Definitely. There should be e.g an abstract interface for
TT> implementations of the datatype List. This interface should implement
TT> basic methods common for all lists. 

This is not what I meant.  Please check MKGs postings again,
especially his first posting with code for a module "LinkedList" (or
so).

TT> Different Lists (dynamic-, array
TT> implementation) should derive from List and implement the needed
TT> methods.

Based on the current List outline, how would you define the common
base class of all lists?  Or would you suggest to create the base
class by simply defining ListDesc to the empty record, and replace
the Element data types with generic navigation methods?  Mh, maybe you
should provide a module interface!

TT> This results into the need
TT> for Iterators. Note that iterators also have other advantages since
TT> they allow us to define generic operations like deep-copying and
TT> sorting on every datatype.
TT> 
TT> > TYPE
TT> >   Object* = POINTER TO ObjectDesc;
TT> >   ObjectDesc* = RECORD
TT> >   END;
TT> 
TT> Type object is a difficult thing. It must be small, since they may be
TT> many instances around but on the other hand any features it misses will
TT> need double the code somewhere else to implement (persistance...). As a
TT> result it should define a number of methods but at least as possible
TT> members.
TT> 
TT> VisualObero (which defines Object, too) implements the following
TT> things:
TT> 
TT> Methods:
TT>   Init, Load, Store, Free (abstract datatypes should always call free
TT> when possible, thus we can f.e. add file handles and that like to an
TT> container). 

If we do persistence, we will need to implement it at a very low level
of the library.  

The current Destroy does "shallow" freeing, it does not touch the
stored objects at all.  This is "safer", because an object may appear
in more than one data structure.  Besides the naming, should we
provide a "deep" Free() as well, say Free() and FreeAll()?  If so, how
are pointer cycles handled?  Could you draw up the necessary specs for
type-bound procedures Free() and FreeAll() in type Object()?
Preferably the specs should use the style I am using in the List
module, otherwise I would need to do tedious reformatting.

TT> > PROCEDURE (l: List) IsEmpty* (): BOOLEAN;
TT> >   END IsEmpty;
TT> 
TT> We should specify if methods should have suffixes like "is", "has" and
TT> that like where it makes sense or if they should be generally left out.

I believe this is covered by our naming standards.  Therefore I am
changing IsEmpty to Empty.

TT> > PROCEDURE (l: List) Length* (): LONGINT;
TT> >   END Length;
TT> 
TT> What about Size? Is Length possibly to specific? What about methods
TT> "Clear" and Count" as name alternatives.

The name is kind of natural for lists.

--------

This should conclude the first round of proposal/counter proposal.  I
was hoping for feedback from more people.  We need to address many
topics, which makes the mails a bit unwieldy.  I don't think we can
split the topics at this early stage, though.

The most important topics are
 - iterators (Peter)
 - abstract base classes (Tim)
 - persistence (oh well, that's me)

There is an early ETH report dealing with various Oberon related
topics.  Among other things, it contains a chapter on the
linearization of arbitrary graphs for load/store.  It can't find my
copy of this report (my guess is it's a few hundred km away), but I
need to know how they assign indices to objects.  Please, could
someone mail me a summary of this?

Final remark: If someone throws an idea at me without providing enough
sample code to get an impression of how it works, and without an
evaluation of the benefits and disadvantages, then I can do nothing
except throwing the idea back to the original author, asking for
further information.  This may seem strange, but please believe me,
this is no attempt at creative sarcasm from my side.

-- mva

------------------------------------------------------------------------
(*	$Id: List.Mod,v 1.2 1998/06/02 15:45:59 acken Exp $	*)
MODULE List;

(*
    $Log: List.Mod,v $
    Revision 1.2  1998/06/02 15:45:59  acken
    Added ADT characteristics section.

    Can't insert object NIL into the list.

    New method Nth().

    Renamed method Filter() to Select(); the method is not destructive,
    like the other iterators.

    Renamed InsertAfter to InsertNext, and InsertBefore to InsertPrev.

    Renamed IsEmpty() to Empty(), to comply with OOC naming standards.

    Be more precise with Copy().

*)

(* characteristics of the ADT List:

  - a list holds a finite number of objects; the number of objects
    is its length
  
  - for every list element a predecessor and successor exists; the
    first element of the list has the special predecessor NIL, the
    last element the successor NIL

  - the predecessor relation is transitive, likewise the successor 
    relation

  - there is no natural ordering relation between data objects; that is,
    objects cannot be compared, and the list cannot be sorted

  - there is no direct access of list elements by indices, as known
    from arrays; that is, arbitrary elements cannot be addressed in O(1) 
    time like array elements

  - each list element can be assigned a position number: pos(head(list))=0,
    pos(succ(elem))=pos(elem)+1

*)

IMPORT
  StdTypes;
  
TYPE
  Node* = POINTER TO NodeDesc;
  NodeDesc = RECORD
    next-, prev-: Node;
    (* The read-only fields `next' and `prev' are provided for efficient 
       navigation.  A value of NIL designates the end of the list. *)
    obj-: StdTypes.Object;
    (* Data of the list element.  *)
  END;
  
  List* = POINTER TO ListDesc;
  ListDesc* = RECORD
    (StdTypes.ObjectDesc)
    head, tail: Node;
  END;


(* Constructors and Deconstructors
   ======================================================================== *)
   
PROCEDURE Init* (l: List);
(* Initializes the instance of `List' to the empty list.  *)
  BEGIN
    l. head := NIL;
    l. tail := NIL
  END Init;

PROCEDURE New*(): List;
(* Creates an instance of `List', and initializes it to the empty list.  *)
  VAR
    l: List;
  BEGIN
    NEW (l);
    Init (l);
    RETURN l
  END New;

PROCEDURE Destroy* (VAR l: List);
(* Removes all elements from list `l' and destroys the list object itself.
   The variable parameter `l' is set to NIL.
   Note: The user is not required to call this procedure.  Resources like 
   unused lists are automatically freed by the garbage collector.  This
   procedure is provided for convenience, both for the user and the garbage
   collector.  *)
  END Destroy;

PROCEDURE (l: List) Copy* (): List;
(* Creates a copy of the given list `l'.  That is, a new list is created, 
   with new instances of `List.Element', that contains the same data objects, 
   in the same sequence, as the orginal list.  Note that no copies of objects 
   are created in the process.  *)
  END Copy;
  

(* Node Insertion
   ======================================================================== *)

PROCEDURE (l: List) Append* (obj: StdTypes.Object);
(* pre: obj # NIL *)
  END Append;
  
PROCEDURE (l: List) Prepend* (obj: StdTypes.Object);
(* pre: obj # NIL *)
  END Prepend;

PROCEDURE (l: List) InsertNext* (node: Node; obj: StdTypes.Object);
(* pre: obj # NIL *)
  END InsertNext;
  
PROCEDURE (l: List) InsertPrev* (node: Node; obj: StdTypes.Object);
(* pre: obj # NIL *)
  END InsertPrev;
  
PROCEDURE (l: List) AppendList* (src: List);
(* Afterwards, `src' is reduced to the empty list.  *)
  END AppendList;


(* Node Removal
   ======================================================================== *)
   
PROCEDURE (l: List) Remove* (node: Node);
(* Note: Operations on a removed node are undefined.  In practice, the fields 
   of a removed node are set to NIL, to limit the scope of potential unwanted
   side effects, and to simplify debugging.  *)
  END Remove;
  
PROCEDURE (l: List) Replace* (node: Node; obj: StdTypes.Object);
(* pre: obj # NIL *)
  END Replace;
  

(* List Permutations
   ======================================================================== *)
   
PROCEDURE (l: List) Reverse*;
  END Reverse;

PROCEDURE (l: List) MoveAfter* (dest, node: Node);
  END MoveAfter;
  
PROCEDURE (l: List) MoveBefore* (dest, node: Node);
  END MoveBefore;
  

(* Node Retrieval
   ======================================================================== *)

PROCEDURE (l: List) First* (): Node;
  END First;
  
PROCEDURE (l: List) Last* (): Node;
  END Last;
  
PROCEDURE (l: List) Get* (obj: StdTypes.Object): Node;
  END Get;
  
PROCEDURE (l: List) Nth* (pos: LONGINT): Node;
  END Nth;


(* Iterators
   ======================================================================== *)

PROCEDURE (l: List) Apply* (op: StdTypes.Operator);
(* Successively applies operator `op' to the elements of `l'.  *)
  END Apply;

PROCEDURE (l: List) Map* (fct: StdTypes.Function): List;
(* Successively applies the function `fct' to the elements of `l', and collects
   the results into a new list.  The list `l' is not destroyed.  *)
  END Map;

PROCEDURE (l: List) Select* (pred: StdTypes.Predicate);
(* Successively applies the predicate `fct' to the elements of `l'.  All nodes,
   for which `pred' evaluates to TRUE, are collected into a new list.  The
   original list `l' is not modified.  *)
  END Select;


(* Miscellaneous
   ======================================================================== *)

PROCEDURE (l: List) Empty* (): BOOLEAN;
  END Empty;

PROCEDURE (l: List) Length* (): LONGINT;
  END Length;

END List.