[Prev][Next][Index][Thread]

Re: Type names vs type structure



Kim Bruce wrote:
> The fact that in Java, types with upper bounds do not have least upper bounds
> seems to be yet another argument for the use of structural subtyping rather
> than "by declaration" subtyping in OO languages.

Now that the types mailing list has been sparked into life (where has
everybody been?) I would be interested to know your views on
name-equivalence vs structure-equivalence in a different language: SML.
[I'm particularly hoping that the ML2000 group will have some input
here]. In SML, many types can be described anonymously (that is, by
structure): these include product types, arrow types, and reference
types. But recursive types and labelled sum types must be introduced by
datatype declarations. This creates an asymmetry between record types
(labelled products) and variant types (labelled sums), which in my view
is arbitary. We can describe record types anonymously so why not
variants too? Similarly, type theorists have an anonymous notation for
recursive types (\mu). Is it possible to use something similar in a
programming language so that equivalence of recursive types is based on
structure and not on name? 

In fact, SML's datatype construct can be decomposed into four more
primitive ideas:

(1) type constructors, aka parameterised types;
(2) variant types (labelled sums);
(3) recursive types (in fact, datatypes in ML allow one to define
recursive (type constructors) that cannot be expressed as (recursive
type) constructors, but this is something of an anomaly since without
polymorphic recursion they're of little use);
(4) `new' types, distinct from any other.

Now we could instead provide separate syntax for each of these,
resulting in a simplified, more uniform, yet more expressive language:

(1) We've already got a syntax for parameterised types in SML, namely 
type declarations.

(2) We've got the syntax 

      { lab_1 : ty_1, ..., lab_n : ty_n }

for labelled products so why not introduce an anonymous syntax for
labelled sums too? The existing datatype syntax could be adapted, though
there is the problem of a suitable term construct for sum-introduction
(datatypes introduce a new type _and_ a set of pseudo-functions for
creating values of the type). Also, some kind of polymorphism/subtyping
on variant types is essential to avoid ambiguities analogous to the
"unresolved flex records" of SML.

(3) The existing syntax for recursive datatypes is cumbersome when used
to create (recursive type) constructors, as the parameters to the type
constructors must be repeated on each occurrence in the definition.
Perhaps we could have something like the \mu syntax traditionally used
by type theorists:

    e.g. integer lists would be anonymously described by
            rec intlist (nil | cons of int * intlist)

    and a parameterised type constructor for lists would be defined by
            type 'a list = rec thislist (nil | cons of 'a * thislist)
    (perhaps with some sugar to avoid the list/thislist confusion).

Mutually-recursive types are more of a challenge. And type inference
becomes more difficult: assuming some contractiveness conditions (to
outlaw types like rec t (t) ) is it even possible?

Why would one want anonymous recursive types? Well, one of my colleagues
wrote the following function (in real code implementing SGML
stylesheets):

   fun query f n = f ([], f, n)

and was puzzled at the error message "operator is not a function
[circularity]". A recursive datatype fixed the problem but the wrapping
and unwrapping required to coerce into and out of the datatype seems
unneccessary. Even an explicit (recursive) type constraint in the above
definition would be an improvement if type inference wasn't up to the
job.

(4) In my view, we don't need a `newtype' construct as the concept of an
abstract data type does the same job (existential types in type theory,
opaque signature matching in the SML module system). Instead of using a
datatype declaration to create a new type distinct from any other, one
can make the type abstract but provide conversion functions in both
directions:

e.g. instead of 

  datatype EmployeeNum = EmployeeNum of int

write 

  structure Emp :> 
  sig 
    type EmployeeNum
    val toInt : EmployeeNum -> int
    val fromInt : int -> EmployeeNum
  end =
  struct
    type EmployeeNum = int
    fun toInt x = x
    fun fromInt x = x    
  end


OK. Perhaps I'm being idealistic and type inference just won't work
without the hints provided by naming types. What do other people think?

-- Andrew.
----------------------------------------------------------------------
Dr Andrew Kennedy, Research Scientist | fax:   +44 1223 322501
Persimmon IT, 1st floor, Block C      | phone: +44 1223 578770
The Westbrook Centre, Milton Road     | email: andrew@persimmon.co.uk    
Cambridge, CB4 1YG, UK                |