[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
ADT Lib (4)
Thanks to all of you for your responses. With them I can turn the
container/iterator business into something more detailed.
First some general stuff:
MG> I'm interested in an ADT library -- I'm just going to let you guys
MG> work out the details. Ever hear the expression: "Too many cooks spoil
MG> the broth". I'll be happy with whatever you all come up with. If
MG> anything comes up that I have a strong opinion about, I'll definitely
MG> let you known.
Thank you for your vote of confidence. Your confidence may be
misplaced, though, for three reasons:
First: In the current discussion my postings draw from my own
experience (of course), but I have no experience whatsoever with class
libraries like Smalltalk, C++ STL, or Java API. I need to draw onto
other people's experience to create a solid base for the library.
Second: The direction of the basic design of the ADT lib primarily
follows my own priorities. If you have other priorities, you need to
push them on the mailing list. If you want to have them implemented,
you need to convince me that the idea is good, and then you need to
provide help with integrating it into the library outline.
Third: I am trying to coordinate the design of an ADT lib, I am _not_
writing one. I will not implement the stuff. Others will have to
provide the Oberon-2 code to fill the modules once the general outline
has been established. I fear, that people are not willing to devote
time to implement parts of the library they consider not useful, or
whose design they consider to be flawed. In this case, we may end up
with a library outline that will never turn into a library, which
would turn this whole discussion into an exercise of futility.
For this reasons I prefer a somewhat broader discussion on the mailing
list.
TT> > Obviously I am starting with a concrete implementation of a particular
TT> > List. I am not claiming any general properties, nor am I propagating
TT> > a framework of classes. This needs to be derived from the agreement
TT> > we reach on the mailing list.
TT>
TT> We should discuss another, "orthogonal" datatype then, too, to be sure
TT> we found a solution suitable for all/many datatypes. I would suggest a
TT> tree or a hashtable as next datatype.
Good suggestion. If someone presents a data type outline that can
server as a base for discussion.
TT> > I prefer the name `Rider' over `Iterator', because it is firmly
TT> > established in the Oberon community, and extends also more readily to
TT> > other areas. We already have riders in the form of the Reader and
TT>
TT> Yes. Wirth also used the Name Rider for hist Array-Iterators used in
TT> his hardware-near subset of Oberon-2 (forgot the name) AFAIK. Howver
TT> while beeing common to the Oberon-community their name leaves out all
TT> other users which are conditioned to "Iterators". Are we introvertive
TT> or extrovertive ;-)?
The term is widely used. With Sather the language provides a "real"
iterator construct (in comparison to the "fake" iterator we are using
here). When comparing our "iterator" with the iterator concept of
Sather, the term `rider' is very appropriate.
TT> [rider outline]
TT> We should not only define one class but a class hierachie.
and
BH> MvA> In particular, no assumptions are made over the particular order in
BH> MvA> which data elements are visited. The only guarantee is this: as long
BH> MvA> as the container is not modified between rider init and the rider
BH> MvA> reaching the end, all elements are enumerated.
BH>
BH> This makes sense for the base Iterator class - there's no reason to presume
BH> that a particular order exists when you don't even know what you're looking
BH> at. This doesn't mean that this lack of a defined order need be inherited
BH> by the decendents of this class. Optimization is not the only reason to
BH> derive new iterator classes; you can also define them to iterate in a
BH> particular direction or using a particular method.
BH>
BH> For example:
BH> MODULE Trees; (* Binary tree *)
BH> IMPORT StdTypes, Iterators;
BH> TYPE
BH> Tree* = POINTER TO TreeDesc;
BH> TreeDesc* = RECORD (* Insert implementation here... *) END;
BH> InOrderIterator* = POINTER TO InOrderIteratorDesc;
BH> InOrderIteratorDesc* = RECORD (IteratorDesc) T: Tree END;
BH> ...
BH> PROCEDURE (T: Tree) NewInOrderIterator: InOrderIterator;
BH> ...
BH> (* Ditto for PreOrder, etc. *)
BH> END Trees.
BH>
BH> The iterator defined for hash tables (or sets) can keep the lack of defined
BH> order of the base Iterator class. A doubly-linked list could add a Previous
BH> method to its iterator. An array-based list could add a SetPos method.
This all is true. Peter mentioned that very same thing in one of his
first mails. When I wrote "class of iterator", I meant an iterator
class with different semantics. A tree iterator of type "in order" is
just an implementation of the general iterator class, not a new class
of iterators with new semantics. I wrote down the specs for one class
of iterators in the attached Container module (attached to this mail).
Do we need a additional general iterator concept with different
semantics? E.g. one Rider2, with the additional property that it can
move backwards?
For every implementation of a container there are uncountable ways to
enumerate its elements: forward, backward, sideways, random, etc. If
we distinguish between classes of enumerations, we better chose a
criterion whose semantics are firmly established in the library.
BH> TT> Sure? I'm not sure, but I think that itertaors can be applied to trees
BH> TT> without problem if you reduce the functionality of the iterator to be a
BH> TT> serial one. I must think about that, I admit. Tree are long ago :-)
BH> TT> A serial iterator lets you just go once from the start to the end, you
BH> TT> cannot skip nor you can go back.
BH>
BH> MvA> Yes. Check your trees again. And I want to remind you, that both
BH> MvA> of you stated, that a module interface should not contain any things
BH> MvA> unnatural to the type. The sequential iterator/rider we are
BH> MvA> discussing here is unnatural to, say, a binary tree, and to include
BH> MvA> it into the tree's module interface is very questionable.
BH>
BH> Actually, if you are using a generic iterator interface with specific
BH> versions related to your specific container, there is no reason that
BH> you would not be able to define one for a binary tree. In fact, you
BH> should define several kinds. Each kind would traverse the tree in a
BH> different direction, such as preorder, inorder, postorder.
True. But the cost for a simple iteration using this mechanism may be
high. Take e.g. the most simple binary tree with links to two
descendants, but no link towards the root. For such a tree the
iterator/rider has to keep a stack of visited nodes. In the worst
case this stack may grow to a length of O(n), where n is the number of
nodes in the tree.
[rider discussion deleted...]
BH> I have a question about the Write method mentioned earlier. Does it
BH> replace the current element or insert a new element at the current
BH> location? I think that insertion semantics would be better implemented
BH> by an Insert method, but even that would only be appropriate to attach
BH> to an iterator if the item's position within the container means
BH> something outside of the context of the iterator's trip through the
BH> container, _and_ insertions can be made at any position. This won't
BH> always be the case.
I have the same reservations about rider base insertion/removal.
Therefore Container.Rider (see below) only provides the means to
replace the value of an existing element with a new value. I don't
see a way to use a rider to do (e.g.) insertion in the _general_
case -- in the special case of a double linked list it is not a
problem, of course.
BH> Even if Write has replacement semantics, how would that affect
BH> containers that have an inherent reordering semantics to their
BH> elements (SortedList, Balanced- Tree)? Does the iterator work on a
BH> snapshot of the state of the container as of the time that it was
BH> created (like Interbase) or can Write potentially send the iterator
BH> into an endless loop? Is the "next position" that Write sends you to
BH> based on the old value of the previous position or the new value?
Yes. :-)
This is the first argument for an distinction between "reader" and
"writer" riders I have seen on this mailing list. I agree, that
anything "sorted" (but also other ADTs) cannot even have a replacing
rider. If we do not split riders into two classes, we could have the
`WriteX' procedures return an error code like "operation not
permitted".
BH> Would all this just depend on the particular Container/Iterator definition?
Possibly. Some can, others cannot.
BH> Am I just making things difficult? :)
No. You make perfect sense. Thank you for valuable comments.
BH> Next, persistence... (Insert maniacal laugh here)
Now, how should I understand this particular line? Anyway, I am still
waiting for any opinions on this particular topic.
TT> > And you have not answered my question: Why do we need a single-linked
TT> > list in addition to a double-linked one?
TT>
TT> A single list needs less ressources, which is always good if one can
TT> work with the reduced interface. Also a single list - in the special
TT> form of the LISP-like recursive list with no spcial head element - has
TT> a different interface than a "normal" double list. It is in fact
TT> a (slightly) different ADT.
TT>
TT> Gting, Datenstrukturen und Algorithmen implements defined two ADTs
TT>
TT> list1:
TT> algebra list1
TT> sorts list, elem
TT> opts empty : -> list
TT> first : list -> elem
TT> rest : list -> list
TT> append : list x elem ->list
TT> concat : list x list -> list
TT> isEmpty: list -> bool
Could you write a module outline for this data type, once you are back
from your holiday? With `List.Mod' and `Container.Mod' we have enough
of a framework to make this worthwhile.
IR> 2. A RemoveAll function would also be convenient. It would return the list
IR> to
IR> an empty state without destroying any objects.
It is part of Container.
IR> 3. Riders need to traverse the list in a known order -- the order of the
IR> items in
IR> the list! MVA's suggestion that
IR>
IR> MVA>In particular, no assumptions are made over the particular order in
IR> MVA>which data elements are visited. The only guarantee is this: as long
IR>
IR> seems to make sense only for unordered data types like sets.
This is true for specializations of the general `Rider' class for
specific implementations. The basic `Rider' does not impose any
ordering constraints. It only requires that a specific Container's
implementation enumerates all items.
IR> 4. Riders should have a Reset function that moves them back to the
IR> beginning
IR> of the list. Much more convenient and efficient than having to create a
IR> new
IR> rider if you need to traverse a list several times.
I have put it into Container.Mod. By now it is halfway on its way out
again. Reason: The LISP-style list implementation "list1", described
by Tim. With these kinds of lists, all list elements are lists of
their own, and a list element can be part of more that one list.
Under these circumstances, it is impossible to do a proper reset.
Because we do not gain much with this method, I am in favor of
removing it again.
I suggest we wait with a decision until Tim gives us a module for
LISPish lists.
IR> 5. Some solution needs to be found for destructive traversal of lists using
IR> Riders.
IR> Often one needs to do something like
IR>
IR> //remove all items in list matching 'target'
IR> r :=mylist.NewRider();
IR> obj := r.Next();
IR> WHILE (obj # NIL) DO (* for each element.. *)
IR> IF obj.foo(blah) = target THEN
IR> mylist.Remove(obj);
IR> END;
IR> obj := r.Next()
IR> END;
[As a side note: There is still a `Select' procedure in my list. Its
specs matches this example quite closely, while being more efficient
and more general.]
IR> Probably this should be done by a sub-type so as not to
IR> burden the ordinary Rider which only supports non-destructive traversal.
IR>
IR> Perhaps NewRider should be follow a factory pattern that creates a
IR> variety of rider types based on its arguments. Then the user could
IR> specify a fast rider or a 'safe' rider (for non-destructive traversal).
IR>
IR> PROCEDURE (l: List) NewRider(whichType: INTEGER): Rider;
Brian has pointed out the problems of destructive riders, and I have
also tried to address them in Containers.Mod. My current estimate is
this: for particular implementations you can do specialized riders
with "insert here" or "remove this" capabilities. But generalizing
this in such a way, that one kind of iterator is applicable to many
data types, may prove difficult. It is even possible that you will
never be able to do more specialized iterators that operator on
something different than a list.
Please note: this iterator business is one part of the library I
consider "low-priority" for myself. This means, someone else has to
take over active coordination for that part, or it will never grow
beyond the minimal stuff in Container.Mod. Suggestion: Create an
abstract class `List' (an ancestor to the double-linked list I posted
earlier), and with it more specialized iterators (i.e., with more
properties than Container.Rider).
-- mva
------------------------------------------------------------------------
MODULE Container;
(* Characteristics of the abstract(!) class Container:
- a container holds a finite number of objects; the number of objects
is called its size
- objects can be added to and removed from the container; an object
can be added to a container multiple times, each addition will
create a new, distinct, container element
- no assumptions are being made how objects are stored in a container;
in particular, there is no concept of a sequence of objects at this
level
- access to objects in a container is provided by means of an iterator;
Please keep in mind, that a specialization of `Container', e.g. in form of
a list or a tree, will implement its own storage arrangement, and will
possibly also impose its own concept of an order of elements.
Characteristics of the container iterator:
(Note: the library uses the term `Rider' for implementations of
iterators.)
- an instance of `Rider' can be attached to a container; a rider is
positioned at a container element, or at the special element NIL
- methods are provided to read and (maybe?) write the data of the
element the rider is currently positioned at
- a rider can be moved to the next container element
- a method is provided to reposition a rider to the "first" element
of the container (i.e., the element at which a rider starts when
it is newly created) [Can this work at all with LISP-style
lists? If not, this property has to be removed. --mva]
- if a rider is attached and then moved over the container elements,
it will visit every element in the container exactly once, before
reaching the special element NIL
- structural changes to an instance of a container (like adding a
new element, removing an old one, or changing the order of
elements) invalidates all riders operating on the container;
afterwards their behaviour is undefined
A concrete implementation will most likely provide additional access
methods. For example, the double linked list exports the `prev' and
`next' fields of its elements for efficient list traversal. A
specialization of `Container' may also provide more specialized
iterators.
*)
IMPORT
StdTypes;
TYPE
Container* = POINTER TO ContainerDesc;
ContainerDesc* = RECORD
(StdTypes.ObjectDec)
END;
(* Note: The interface of riders closely matches that proposed by
Templ in his dissertion "Meta-programming in Oberon". Instead of
restricting the interface to access of data of type `Object', I prefer
to use the more general approach at this point of time. We can always
revise this decision later. *)
TYPE
Rider = RECORD
res-: INTEGER; (* result of last operation *)
type-: INTEGER; (* type of object "beneath" rider, see `typeXXX' below *)
END;
CONST
typeUndef* = 0;
typeBool* = 1; typeChar* = 2; typeShortInt* = 3;
(* ... symbolic names for all basic types, and (with
meta-programming capabilits) also for type constructors *)
typeObject* = 99;
(* Constructors and Deconstructors
======================================================================== *)
PROCEDURE Init* (c: Container);
END Init;
PROCEDURE New*(): Container;
END New;
PROCEDURE Destroy* (VAR c: Container);
END Destroy;
(* Insertion and Removal
======================================================================== *)
PROCEDURE (c: Container) Add* (obj: StdTypes.Object);
END Add;
PROCEDURE (c: Container) Remove* (obj: StdTypes.Object);
(* Removes a element referring to `obj' from the container. If `obj' appears
in more than one element, it is undefined which one is actually removed.
Nothing is done, if no matching element exists. *)
END Remove;
PROCEDURE (c: Container) RemoveAll*;
(* Equivalent to doing `Remove' for all elements of `c'. Afterwards
the container is empty. Unlike `Destroy', the *)
END RemoveAll;
(* Miscellaneous
======================================================================== *)
PROCEDURE (c: Container) Empty* (): BOOLEAN;
END Empty;
PROCEDURE (c: Container) Size* (): LONGINT;
END Size;
(* Riders
======================================================================== *)
PROCEDURE (c: Container) NewRider*(): Rider;
(* Creates new rider, positioned at the "beginning" of the container. *)
END NewRider;
PROCEDURE (r: Rider) ReadBool* (VAR b: BOOLEAN);
END ReadBool;
(* ... variants for all basic types *)
PROCEDURE (r: Rider) ReadObj*(VAR obj: Object);
END ReadObj;
(* Gets value at current position and moves rider forward. If the rider
is not positioned on a data element of the requested type, the field
`r. res' is changed to indicate an error. *)
PROCEDURE (r: Rider) WriteBool* (b: BOOLEAN);
END WriteBool;
(* ... variants for all basic types *)
PROCEDURE (r: Rider) WriteObj* (obj: Object);
END WriteObj;
(* Sets value at current position and moves rider forward. If the rider
is not positioned on a data element of the requested type, the field
`r. res' is changed to indicate an error. In general, a container
cannot be extended by using a `Write' procedure. If the rider is
positioned at the end of the container, calling `Write' will report
an error. Note that some specializations of `Container' do not
allow to replace element values this way (e.g., to maintain order
invariants). *)
PROCEDURE (r: Rider) Forward* (n: LONGINT);
(* Moves `r' forward over `n' objects. *)
END Forward;
PROCEDURE (r: Rider) Reset*;
(* Resets the rider to the element at which a newly created riders
start. *)
END Reset;
END Container.