\documentclass{article} \usepackage{amssymb,xy} \xyoption{all} \sloppy \def\C{{\cal C}} \def\U{{\cal U}} \def\defining#1{{\bf #1}} \def\dom{\mathop{dom}} \def\id{\mathop{id}} \def\N{{\mathbb N}} \def\code#1{\texttt{#1}} \newtheorem{definition}{Definition} \begin{document} \begin{center} {\large\bf Conciliating multimethods with implicit coercion} \vspace{0.6cm} Armin Rigo \end{center} One of the main problems with multimethods is that there is no canonical way to define which multimethod a message must be dispatched to in the presence of inheritance. In practice, the choice is based on ``closest-match'' algorithms (see e.g.\ \cite{Lisp} \cite{Dylan} for run-time dispatching, \cite{Cpp} \cite{Java} for compile-time dispatching). This approach only scales as far as the programmer keeps in mind a global picture of his program, because adding new multimethods for an existing message might globally change or ambiguates existing code. \section{Introduction} In programming parlance, a \emph{type} is a set of terms in some formal system -- i.e.\ a family of value manipulated by a programming language. As an interesting abstraction step, a number of programming languages have introduced some kind of reasoning about the types themselves. This direction is most advanced in languages like ML, whose module system can handle fully abstract types, i.e.\ types that are no longer viewed as sets of values but instead as abstract entities with no element on their own. In mathematics, the abstraction step from sets as containers to set-like objects for their own sake is called Topos theory: a collection of set-like objects, together with functions between these sets, forms a category with specific properties. In the sequel we will present a different category-theoretical abstraction of types, with in mind a modelization of data handling by a computer. This step is performed by generalizing types into \emph{concepts.} Intuitively, a concept captures some point of view about a collection of values which might be formally defined or not. However, the precise definition of \emph{concept} is simply an object of a suitable category. It has no internal structure on its own: there is no value to be ``instance'' of the concept. This should not be confused with the Object-Oriented notion of abstract base class, which can be regarded as the set of all instances of its concrete subclasses. What we have in mind is more abstract: while we should intuitively think about concepts as families of related values, these values may have no formal existence. Our preferred example of concept is ``image''. An image is, well, you know, some rectangular and colorful visual impression that's displayed on your screen. Thus we can make this ``image'' concept an object of our category, without having to define a whole abstract interface to work with it a priori. We did not even specify that an image is a matrix of pixels: most image file formats store more or different information, which all make up what we think an ``image'' is. \section{Representation theory} We assume that the reader is familiar with the basic definition of \defining{category}. The following definitions are valid in any category $\C$. Its \emph{objects} are what we call \defining{concepts} (some of which can be concrete types); its \emph{arrows} are mappings from an input to an output concept (which between types are concrete functions). For the examples, we will assume that $\C$ contains at least all the inductive types. (In practice a good choice for $\C$ is a cartesian closed category.) \subsection{Concept representation} \begin{definition}\label{def_repr_obj} Let $X$ be a concept. A \defining{representation} (or \emph{concept representation}) of $X$ is an arrow $r: X'\rightarrow X$. The concept $X'=\dom(r)$ is called the \defining{domain} of the representation. \end{definition} The name \emph{representation} comes from the fact that $r$ allows the ``values'' in $X$, or at least some of them (the ones that are in the image of $r$), to be ``represented'' by an element of $X'$. A ``value'' of $X'$ represents its ``image'' in $X$ through $r$. We use quotes for these words because $X$ and $X'$ might not be concrete types, and $r'$ might not be any kind of computable function, but just an arrow in the category $\C$. Given enough concreteness we can say that an $x'\in X'$ represents $r(x')\in X$. By analogy we can say in general that $x': X'$ represents $r(x'): X$, though in this case $x'$ and $r(x')$ are just abstract terms with no associated value. For example, starting with the trivial ones: % \begin{itemize} \item The universal representation $\id_X: X \rightarrow X$ represents any object as itself. \item If $X$ is a type and $x$ is a value of type $X$, then we have the constant representation $c_x: \{\cdot\} \rightarrow X$, whose domain is a set with just one (arbitrary) element ``$\ \cdot\ $'', whose image $c_x(\cdot)$ is precisely $x$. It represents the specific value $x$. \item If $X'$ is a subtype of a type $X$, then the inclusion $i: X'\hookrightarrow X$ is a representation. The values of $X'$ stand for their own values seen as elements of the supertype $X$. \item Suppose that $X$ is the set of all first-class objects of an object-oriented programming language, and $X'$ is the set of machine-sized words. Then $r$ maps a machine word to the corresponding integer object in the programming language. This representation might be non-trivial because the interpreter or the compiler might associate meta-data to integer objects. \item A boolean value has the type $\{True,False\}$; some compilers represent it as a full machine-sized word, with a representation $r$ that maps $0$ to $False$ and all other values to $True$. \item Finally: $X$ is the ``image'' concept, and $X'$ is the set of matrices of pixels, with $r$ representing an image by a matrix of pixels. \end{itemize} We blur the classical distinction between concrete and abstract types. One would say in the last example above that ``image'' is an abstract type, whereas ``matrix of pixels'' is a concrete type; but then the latter is itself represented, e.g.\ as a C array, which is itself represented as a raw sequence of bytes, which is itself represented in the state of the RAM of the computer. On the other hand, ``image'' is a perfectly concrete concept from the point of view of the user who doesn't care about how the image is encoded. A representation simply maps between two abstraction levels, and in the hardware and software of a typical computer one can easily count half a dozen stacked abstraction levels. The only precise boundary we could trace would be between the concepts that are formally specified and the ones that are not; however, this categorization will not be meaningful in the sequel, so we will not introduce it.\footnote{There is a deeper reason: a formal specification of a concept only goes as far as mapping it into another pre-existing formal system. A formal language is given semantical meaning by mapping it into another one, building on top of that meaning. The process is not endless, and at some point we just have to resort to english descriptions and hand-waving for the ``first'' formal system.} \subsection{Arrow representation} \begin{definition} Let $f: X\rightarrow Y$ be an arrow. A \defining{representation} (or \emph{arrow representation\footnote{We use the word ``representation'' for both concepts and arrows: an arrow representation is exactly a concept representation in the arrow category.}}) of $f$ is an arrow $f': X'\rightarrow Y'$ together with two concept representations $r: X'\rightarrow X$ and $s: Y'\rightarrow Y$ such that $s\circ f'=f\circ r$: $$\xymatrix{ \txt{X}\ar[r]^f & \txt{Y} \\ \txt{X'}\ar[r]^{f'}\ar[u]^r & \txt{Y'}\ar[u]_s }$$ We say that $f'$ is a representation from $r$ to $s$; $r$ is the \defining{input representation} and $s$ the \defining{output representation}. A \defining{partial representation} is a partial arrow $f'$ with $r$ and $s$ as above, where the commutativity relation holds only where $f'$ is defined. \end{definition} Recall that a \emph{partial arrow} between two concepts $A$ and $B$ is an arrow that is actually only defined on a part of $A$; in other words, it is a (normal) arrow between $A'$ and $B$, with $A'\hookrightarrow A$. For example, if $f: \N\rightarrow\N$ is a mathematical function, it could be partially represented by a partial function $f': M\rightarrow M$ implemented in assembly language, where $M$ is the set of machine-sized words and $r=s: M\rightarrow\N$ represents not-too-large integers using (say, unsigned) machine words. This example also shows how representation expresses relationships between levels of abstractions: $r$ is not an inclusion of a subtype into a type; the type $M$ is much lower-level than a type like $\N$ that one can expect to find in high-level programming languages. \subsection{Conversions} Central to our analysis is the notion of ``conversion'', or ``coercion''. Converting between, say, two different file formats or programming interfaces is usually regarded as a practical problem that requires practical tools, but which is theoretically irrelevant, as much as mathematicians generally consider objects ``up to isomorphism''. On the other hand, it is a major hassle of software engineering (not to mention end users). Automatic conversions must generally be carefully limited in scope; too much implicitness can give unexpected results, like transforming a color image into a black-and-white format and back -- or much worse, composing two software components just because their interfaces have a set of methods with the same names and signatures. Let us analyse the following question: what makes a particular file transformation a conversion between, say, the \code{png} and the \code{jpeg} image formats? Obviously, the output has to satisfy some formally checkable specification of the \code{jpeg} format, as long as the input file satisfies that of the \code{png} format. But so does a transformation that always produces a blank, 1x1 result. Then maybe we should say that the result must ``depend on'' the input; it should probably not loose information. But not only is our conversion function actually loosing some information when encoding the \code{jpeg} file; there are a lot of transformations that do not loose any information and still are not what we expect from a conversion -- you could even take the hexadecimal representation of the input file, display it on your screen, and do a screenshot \code{:-)} Obviously, what we want is that the output is the ``same image'' as the input. Good, we have an ``image'' concept as an object in our category. So when we look at the operation at the ``image'' abstraction level, the operation is doing nothing; it is the identity arrow. This is easily formalized in our framework: \begin{definition} A \defining{conversion} is a representation of an identity. \end{definition} In this way, we are not converting just from a sequence of bytes to another sequence of bytes; we are converting a sequence of bytes \emph{representing an image in some way} into another sequence of bytes \emph{representing the same image in another way:} $$\vcenter{\xymatrix{ \txt{Image}\ar[r]^{\id} & \txt{Image} \\ \txt{File}\ar[r]^{convert}\ar[u]^{png} & \txt{File}\ar[u]_{jpeg} }} \hskip 20pt\txt{or equivalently}\hskip 20pt \vcenter{\xymatrix@C=1pt{ & \txt{Image} & \\ \txt{File}\ar[rr]^{convert}\ar@/^/[ur]^{png} & & \txt{File}\ar@/_/[ul]_{jpeg} }}$$ \end{document}