Prototypical ML Implementation of "An Algebraic Dexter-Based Hypertext Reference Model"

In this text we are going to explain how the specification in the report was translated into our prototypical ML implementation.

Implementation of Specification Modules

Roughly speaking, each module of the specification is implemented as an ML structure. If a module of the specification has a parameter that was not implemented ad hoc, then this specification module is implemented as an ML functor instead of an ML structure. E.g., the specification modules ANCHOR and HD are implemented as functors, whereas the following are implemented as structures: LINK, MO, PAGE, HMD.

Additionally to this text there is an alphabetical list of all implementation modules and you can use the following dependency graph to navigate into this list. Note that doubled-line boxes indicate ML functors, while simple box indicate ML structures. In the postscript format, the parameter signatures have dotted boxes.

The modules ATT_ANCHOR, ATT_LINK, ATT_PAGE, ATT_HD ATT_HMD, ATT_FSD, ATT_FSD and ATT_SITE are all implemented as a single module.

As an ML functor cannot have more than one parameter, the parameter ENTRY had to be duplicated into, on the one hand, the ML signature extended_equal_p for single occurrence and, on the other hand, the ML signature domainandrangep for double occurrence; moreover, we had to add functions for equality testing and string conversion to these parameters.

The equality tests that were implicit in the specification are explicitly implemented for all types. The standard name for the equality test is "equal" and you usually have to prefix it with the structure name and a dot because overloading is not possible in ML, which required to suffix some equality test names with an underscore and type name. The name of the string conversion which we added for most of the types is usually "tostring". In some modules the tostring function gets some extra integer first first argument describing the style of the string. If this argument is zero, the result should read exactly as in the specification. Otherwise, a more readable string is produced. The string conversion is used to implement functions named "displaystrict" and "display" that print out their argument. If both display functions occur in a module, "displaystrict" corresponds to "tostring 0" while "display" produces a more readable result.

The following modules were not implemented until now: PAGE_SYMBOLS, HMD_ADDR, CHAPTER_SYMBOLS, CHAPTER, FSD_ADDR, BOOK_SYMBOLS, BOOK, SITE_ADDR.

Data Types and Abstract Types

The constructors of visible sorts were implemented as ML data types, whereas the constructors of hidden sorts were implemented as ML abstract types. Moreover, we have added a constructor for variables to all sorts that are in the focus of our specification in order to be able to have some undefined parts of a data type for lazy definition in a top-down construction of data objects. The details will be explained somewhere else. For the logical intention of the variables cf. the papers "On Notions of Inductive Validity for First-Order Equational Clauses" and "Full First-Order Sequent and Tableau Calculi With Preservation of Solutions and the Liberalized delta-Rule but Without Skolemization".

Implementation Style

Note that we have curried all functions (except data type constructors) for simplicity and compatibility with other functional programming languages like Haskell. Thus the declaration for the equality test in our current top module HMD has the declaration
equal: hmd -> hmd -> bool
and not
equal: hmd * hmd -> bool

Problems with ML

One of our reasons to choose ML as implementation language was our hope to realize each specification module as a single ML module in a very straightforward fashion. The following problems arose during the implementation. If you think that you can give us some advice on this, please send e-mail to Claus-Peter Wirth.
  1. One problem was that we were not able to import the functor ANCHOR into the functor HD without binding the formal parameter with signature DOCUMENT_P of ANCHOR. We solved this by binding this formal parameter to the formal parameter with signature DOCUMENT_P of HD. Slightly simplified, this looks like:
    functor Hd (Document_p : DOCUMENT_P): HD 
    = 
    struct
       structure Anchor_in_Hd = Anchor (Document_p);
       open Anchor_in_Hd;
       ...
    end(*struct functor Hd*);
    

    The price we had to pay for this was that we had to open ANCHOR in HD because this was the only way to get access to the proper instance of ANCHOR without a very ugly double prefixing. The name conflicts that arouse had to be removed by hand, so that some names of the implementation of ANCHOR now differ from those of the specification by the suffix "_anchor". E.g., while the equality test for HMD has the intended name "Hmd.equal", the name of the equality test for anchors is "Hmd.equal_anchor" instead of "Anchor.equal". Similarly, "Hmd.get_att_anchor" is used instead of "Anchor.get_att". As you can see from the dependency graph, the module Anchortable is actually between the above collision of name spaces. Therefore, a function that normally would have the name say "Anchortable.apply" actually has the name "Hmd.apply_anchortable".

  2. The following problem was more serious:

    In a first step we implemented sets as a polymorphic ML type. But we have sets (like "link set") whose elements are not ML equality types, and in ML there it is neither possible to overload user defined functions nor to extend the overloading of built-in overloaded functions nor to write a function call like

    'a.equal
    in the body of a function
    insert : 'a -> 'a set -> 'a set.
    Note that in Haskell we would have both possibilities of overloading.

    In ML, however, we had to provide the equality test as an additional argument to most functions on sets. This procedure is ugly and error-prone. When we then came to the implementation of string conversion for sets, we had to accept that in ML the only proper way to implement sets is to implement sets as an ML functor. In the current implementation, only the functor implementation of sets is used.

    We are not happy with this solution for the following reasons:

    1. We did not want to have set as a functor but as a polymorphic type. In our intention, the difference between polymorphic types and functors is the same as the difference between structural and non-structural modules discussed in the draft "Improving ASF+", namely that each instantiation of a functor (even with the same actual parameter) provides us with a new copy, while a structural module (polymorphic type) shares all its occurrences referring to the same actual parameter.
    2. We had to split several specification modules into a multitude of ML structures or functors because (when Set is a functor) we cannot define the type of sets of elements in the same structure where the type of these elements themselves is defined. Moreover, we cannot define the type of sets of elements in the module using elements (as was the case in our specification) because the use of this kind of sets in different modules would then become incompatible. E.g., we had to split the specification module LINK into separate ML structures for links, for specifiers, and for sets of specifiers. Similarly, we had to split the parameterized specification module HD into three modules, namely one for the functor for hyperdocuments, one for the functor for functions from anchor identifiers to anchors, and one for the structure for sets of links.
    3. The worst consequence of all is that the functor for function objects is unable to return its domain or range or similar data in form of a set but must return it in form of a list instead. This is because a set of integers returned from a function object from integers to integers would be of different type than any set of integers outside the functor and thus useless.