[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

ADT Lib: Persistence



  For the scope of this document an object is a record on the heap,
accessed through a pointer variable.  A data structure composed of
objects can be envisioned as a directed graph: the records are nodes,
and the pointers are directed edges.  A data structure (or graph) is
persistent, if we can dump the whole thing into a file (externalizing,
or storing), and later on restore it from the file (internalizing, or
loading).  The restored graph has the same contents as the original,
and the same edge structure.  The memory addresses of objects may have
changed, but all restored pointers refer to the object they referred
to in the original.

  Things get a little bit complicated when we try to store a cyclic
graph.  A directed graph is cyclic, if its undirected counterpart
contains a sequence of edges that forms a closed path.  This means
that at least one object appears in more than one pointer variable in
the data structure.  When restoring the graph, this situation has to
be handled properly.

Load/store algorithms are discussed in the ETH technical report 156.
This paper is available online.

Preconditions for the load/store algorithms:
 
 - there is a way to get a type id for an object; e.g., the id can be
   the name of the record type, as provided by Types.TypeOf

 - there is a way to create a new instance of an object from a given
   type id; e.g., Types.NewObj creates an object from the type name

 - every object has a type-bound procedure to store its own data to
   file, and likewise a symmetric procedure to load the data

 - at the time an object is stored, the object has a numeric id unique
   with respect to all other objects in the stored graph; the id
   serves as replacement for the object's pointer value

I have put a first sketch of a load/store mechanism into module
StdTypes.  The sketch compiles, but does not work.  It is for
discussion purposes only.

1. There is no procedure to a reset for writing (that is, to unmark
all objects and initialize the counter), nor for reading (that is, to
initialize the counter and the object table).

2. Type names are written uncompressed.

3. The algorithm requires that every instance of `Object' has a field
`mark'.  Alternative: like reading, writing could use a object table
and do table lookups to determine the marker.  This would keep
`Object' small, but takes longer because of the table lookups.

See the attached module for details.  Obviously, a lot of things have
to be fixed before we have a working persistence mechanism.  Any
comments?

-- mva

------------------------------------------------------------------------

MODULE StdTypes;
(* NOTE: The implementation of ReadObject/WriteObject is just a scetch.  It
         does not work.  Currently it is for discussion purposes only! *)

(* 
    $Log$
*)

IMPORT
  BinaryRider, Kernel, Types;


TYPE
  Object* = POINTER TO ObjectDesc;
  ObjectDesc* = RECORD
    mark: LONGINT;  (* used by ReadObject/WriteObject *)
  END;
  
TYPE
  Operator* = POINTER TO OperatorDesc;
  OperatorDesc* = RECORD
  END;

TYPE
  Function* = POINTER TO FunctionDesc;
  FunctionDesc* = RECORD
  END;

TYPE
  Predicate* = POINTER TO PredicateDesc;
  PredicateDesc* = RECORD
  END;


CONST
  unmarked = 0;  (* used by WriteObject *)
VAR
  objCounter: LONGINT;  (* used by WriteObject and ReadObject *)
VAR
  nodeTab: ARRAY 999(*this should be an open array*) OF Object;



PROCEDURE (op: Operator) Apply (obj: Object);
  END Apply;

PROCEDURE (fct: Function) Apply (obj: Object): Object;
  BEGIN
    RETURN NIL
  END Apply;

PROCEDURE (pred: Predicate) Apply (obj: Object): BOOLEAN;
  BEGIN
    RETURN FALSE
  END Apply;



PROCEDURE (obj: Object) Store* (w: BinaryRider.Writer);
(* Stores data of `obj' to `w'.  Nested record pointers are stored by calling 
   WriteObject.
   pre: This procedure is either activated by a super call, or from the
   procedure `WriteObject'.  *)
  BEGIN
  END Store;

PROCEDURE (obj: Object) Load* (r: BinaryRider.Reader);
(* Loads data of `obj' from `r'.
   pre: This procedure is either activated by a super call, or from the
   procedure `ReadObject'.  *)
  BEGIN
  END Load;

PROCEDURE ReadObject* (r: BinaryRider.Reader; VAR obj: Object);
  VAR
    mark: LONGINT;
    name: ARRAY 256(*should be unlimited*) OF CHAR;
    module: Kernel.Module;
    type: Types.Type;
  BEGIN
    r. ReadNum (mark);
    IF (mark <= 0) THEN
      obj := nodeTab[mark]
    ELSE
      r. ReadString (name);
      module := Kernel.modules;
      WHILE (module. name^ # name) DO
        module := module. next
      END;
      (* for now: trap if `name' does not exist *)
      r. ReadString (name);
      type := Types.This (module, name);
      (* for now: trap if `name' does not exist *)
      Types.NewObj (obj, type);
      nodeTab[mark] := obj;
      obj. Load (r)
    END
  END ReadObject;

PROCEDURE WriteObject* (w: BinaryRider.Writer; obj: Object);
  VAR
    type: Types.Type;
  BEGIN
    IF (obj = NIL) THEN
      w. WriteNum (0)
    ELSIF (obj. mark # unmarked) THEN
      w. WriteNum (-obj. mark)
    ELSE  (* first occurence of this object *)
      INC (objCounter);
      obj. mark := objCounter;
      w. WriteNum (objCounter);
      type := Types.TypeOf (obj);
      w. WriteString (type. module. name^);
      w. WriteString (type. name^);
      obj. Store (w)
    END
  END WriteObject;
  
BEGIN
  objCounter := 0;
  nodeTab[0] := NIL
END StdTypes.