Main
Logics for linguistic structures
Logics for linguistic structures
Kepser, Stephan, Fritz Hamm, Stephan Kepser
The contributions collected in this voume address central topics in theoretical and computational linguistics, such as quantification, types of context dependence and aspects concerning the formalisation of major grammatical frameworks, among others GB, DRT and HPSG. All contributions have in common a strong preference for logic as the major tool of analysis. The first main issue concerns the combination of DRT and HPSG styles of analysis into a single system for natural language processing. The second central issue concerns the logical and automata  theoretical foundations of descriptive formalisms presently in the focus of attention, for instance minimalism. A third issue is the significance of context and locality within an algorithmic notion of meaning. The last topic addressed concerns subclasses of empirically highly significant quantificational devices like proportionality quantifiers and quantifiers which give rise to sound and complete logics for nontrivial fragments of English. The volume will be of great benefit for theoretical and computational linguists, computer scientists, philosophers, and logicians.
Categories:
Linguistics
Year:
2008
Publisher:
Mouton de Gruyter
Language:
english
Pages:
188
ISBN 13:
9783110211788
ISBN:
311020469X
Series:
Trends in linguistics 201
File:
PDF, 719 KB
Download (pdf, 719 KB)
Preview
 Checking other formats...
 Convert to EPUB
 Convert to FB2
 Convert to MOBI
 Convert to TXT
 Convert to RTF
 Converted file can differ from the original. If possible, download the file in its original format.
 Please login to your account first

Need help? Please read our short guide how to send a book to Kindle.
The file will be sent to your email address. It may take up to 15 minutes before you receive it.
The file will be sent to your Kindle account. It may takes up to 15 minutes before you received it.
Please note you need to add our NEW email km@bookmail.org to approved email addresses. Read more.
Please note you need to add our NEW email km@bookmail.org to approved email addresses. Read more.
You can write a book review and share your experiences. Other readers will always be interested in your opinion of the books you've read. Whether you've loved the book or not, if you give your honest and detailed thoughts then people will find new books that are right for them.
1

2

Logics for Linguistic Structures ≥ Trends in Linguistics Studies and Monographs 201 Editors Walter Bisang Hans Henrich Hock (main editor for this volume) Werner Winter Mouton de Gruyter Berlin · New York Logics for Linguistic Structures Edited by Fritz Hamm Stephan Kepser Mouton de Gruyter Berlin · New York Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin. 앝 Printed on acidfree paper which falls within the guidelines 앪 of the ANSI to ensure permanence and durability. Library of Congress CataloginginPublication Data Logics for linguistic structures / edited by Fritz Hamm and Stephan Kepser. p. cm. ⫺ (Trends in linguistics ; 201) Includes bibliographical references and index. ISBN 9783110204698 (hardcover : alk. paper) 1. Language and logic. 2. Computational linguistics. I. Hamm, Fritz, 1953⫺ II. Kepser, Stephan, 1967⫺ P39.L5995 2008 401⫺dc22 2008032760 ISBN 9783110204698 ISSN 18614302 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de. ” Copyright 2008 by Walter de Gruyter GmbH & Co. KG, D10785 Berlin All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher. Cover design: Christopher Schneider, Berlin. Printed in Germany. Contents Introduction Fritz Hamm and Stephan Kepser 1 Type Theory with Records and uniﬁcationbased grammar Robin Cooper 9 Oneletter automata: How to reduce k tapes to one Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz 35 Two aspects of situated meaning Eleni Kalyvianaki and Yiannis N. Moschovakis 57 Further excursions in natural logic: The MidPoint Theorems Edward L. Keenan 87 On the logic of LGB type structures. Part I: Multidominance structures Marcus Kracht 105 Completeness theorems for syllogistic fragments Lawrence S. Moss 143 List of contributors 175 Index 179 Introduction Fritz Hamm and Stephan Kepser Logic has long been playing a major role in the formalization of linguistic structures and linguistic theories. This is certainly particularly true for the area of semantics, where formal logic has been the major tool ever since the Fregean program. In the area of syntax it was the rising of principles based theories with the focus shifting away from the generation process of structures to deﬁning general wellformedness conditions of structures that opened the way for logic. The naturalness by which many types of wellformedness conditions can be expressed in some logic or other led to different logics being proposed and used in diverse formalizations of syntactic theories in general and the ﬁeld of model theoretic syntax in particular. The contributions collected in this volume address central topics in theoretical and computational linguistics, such as quantiﬁcation, types of context dependence and aspects concerning the formalisation of major grammatical frameworks, among others GB, DRT and HPSG. All contributions have in common a strong preference for logic as the major tool of analysis. Two of them are devoted to formal syntax, three to aspects of logical semantics. The paper by Robin Cooper contributes both to syntax and semantics. We therefore grouped the description of the papers in this preface in a syntactic and a semantic section with Cooper’s paper as a natural interface between these two ﬁelds. The contribution by Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz belongs to the ﬁeld of ﬁnite state automata theory and provides a method how to reduce multi tape automata to single tape automata. Multi tape ﬁnite state automata have many applications in computer science, they are particularly frequently used in many areas of natural language processing. A disadvantage for the usability of multi tape automata is that certain automata constructions, namely composition, projection, and cartesian product, are a lot more complicated than for single tape automata. A reduction of multi tape automata to single tape automata is thus desirable. The key construction by Ganchev, Mihov, and Schulz in the reduction is the deﬁnition of an automaton type that bridges between multi and single tape automata. The authors introduces socalled oneletter automata. These 2 Fritz Hamm and Stephan Kepser are multi tape automata with a strong restriction. There is only one type of transitions permitted and that is where only a single letter in the ktuple of signature symbols in the transition differs from the empty word. In other words, all components of the tuple are the empty word with the exception of one component. Ganchev, Mihov, and Schulz show that oneletter automata are equivalent to multi tape automata. Interestingly, oneletter automata can be regarded as single tape automata over an extended alphabet which consists of complex symbols each of which is a ktuple with exactly one letter differing from the empty word. One of the known differences between multi dimensional regular relations and one dimensional regular relations is that the latter are closed under intersection while the former are not. To cope with this difference, Ganchev, Mihov, and Schulz deﬁne a criterion on essentiality of a component in a kdimensional regular relation and show that the intersection of two kdimensional regular relations is regular if the two relations share at most one essential component. This result can be extended to essential tapes of oneletter automata and the intersection of these automata. On the basis of this result Ganchev, Mihov, and Schulz present automata constructions for insertion, deletion, and projection of tapes as well as composition and cartesian product of regular relations all of which are based on the corresponding constructions for single tape automata. This way the effectiveness of oneletter automata is shown. It should hence be expected that this type of automata will be very useful for practical applications. The contribution by Marcus Kracht provides a logical characterisation of the datastructures underlying the linguistic frameworks Government and Binding and Minimalism. Kracht identiﬁes socalled multidominance structures as the datastructures underlying these theories. A multidominance structure is a binary tree with additional immediate dominance relations which have a restricted distribution in the following way. All parents of additional dominance relations must be found on the path from the root to the parent of the base dominance relation. Additional dominance relations provide a way to represent movement of some component from a lower part in a tree to a position higher up. The logic chosen by Kracht to formalize multidominance structures is propositional dynamic logic (PDL), a variant of modal logic that has been used before on many occasions by Kracht and other authors to formalize linguistic theories. In this paper, Kracht shows that PDL can be used to axiomatize multidominance structures. This has the important and highly desirable Introduction 3 consequence that the dynamic logic of multidominance structures is decidable. The satisﬁability of a formula can be decided in in 2EXPTIME. In order to formalize a linguistic framework it is not enough to provide an axiomatisation of the underlying datastructures only. The second contribution of this paper is therefore a formalisation of important grammatical concepts and notions in the logic PDL. This formalisation is provided for movement and its domains, single movement, adjunction, and crossserial dependencies. In all of these, care is taken to ensure that the decidability result of multidominance structures carries over to grammatical notions deﬁned on these structures. It is thereby shown that large parts of the linguistic framework Government and Binding can be formalized in PDL and that this formalisation is decidable. The contribution by Robin Cooper shows how to render uniﬁcation based grammar formalisms with type theory using record structures. The paper is part of a broader project which aims at providing a coherent uniﬁed approach to natural language dialog semantics. The type theory underlying this work is based on set theory and follows Montague’s style of recursively deﬁning semantic domains. There are functions and function types available in this type theory providing a version of the typed λcalculus. To this base records are added. A record is a ﬁnite set of ﬁelds, i.e., ordered pairs of a label and an object. A record type is accordingly a ﬁnite set of ordered pairs of a label and a type. Records and record types may be nested. The notions of dependent types and subtype relations are systematically extended to be applicable to record types. The main contribution of this paper is a type theoretical approach to uniﬁcation phenomena. Feature structures of some type play an important role in almost all modern linguistic frameworks. Some frameworks like LFG and HPSG make this rather explicit. They also provide a systematic way to combine two feature structures partially describing some linguistic object. This combination is based on ideas of uniﬁcation even though this notion need no longer be explicitely present in the linguistic frameworks. In a type theoretical approach, records and their types render feature structures in a rather direct and natural way. The type theoretical tools to describe uniﬁcation are meet types and equality. Cooper assumes the existence of a meet type for each pair of types in his theory including record types. He provides a function that recursively simpliﬁes a record type. This function is particularly applicable to record types which are the result of the construction of a meet record type and should be interpreted as the counterpart of uniﬁcation in type 4 Fritz Hamm and Stephan Kepser theory with records. There are though important differences to feature structure uniﬁcation. One of them is that type simpliﬁcation never fails. If the meet of incompatible types was constructed, the simpliﬁcation will return a distinguished empty type. The main advantage of this approach is that it provides a kind of intensionality which is not available for feature structures. This intensionality can be used, e.g., to distinguish equivalent types such as the source of grammatical information. It can also be used to assign different empty types with different ungrammatical phrases. This may provide a way to support robust parsing in that ungrammatical phrases can be processed and the the consistent parts of their record types may contain useful informations for further processing. Type theory with records also offers a very natural way to integrate sematic analyses into syntactic analyses based on feature structures. The paper by Eleni Kalyvianaki and Yiannis Moschovakis contains a sophisticated application of the theory of referential intensions developed by Moschovakis in a series of papers (see for instance (Moschovakis 1989a,b, 1993, 1998)), and applied to linguistics in (Moschovakis 2006). Based on the theory of referential intensions the paper introduces two notion of context dependent meaning, factual content and local meaning, and shows that these notions solve puzzles in philosophy of language and linguistics, especially those concerning the logic of indexicals. Referential intension theory allows to deﬁne three notions of synonymy, namely referential synonymy, local synonymy, and factual synonymy. Referential synonymy, the strongest concept, holds between two terms A and B iff there referential intensions are the same; i.e., int(A) = int(B). Here the referential intension of an expression A, int(A) is to be understood as the natural algorithm (represented as a set–theoretical object) which computes the denotation of A with respect to a given model. Thus referential synonymy is a situation independent notion of synonymy. This contrasts with the other two notions of synonymy which are dependent on a given state a. Local synonymy is synonymy with regard to local meaning where the local meaning of an expression A is computed from the referential intension of A applied to a given state a. It is important to note that for the constitution of the local meaning of A the full meanings of the parts of A have to be computed. In this respect the concept of local meaning differs signiﬁcantly from the notion factual content and for this reason from the associated notion of synonymy as well. This is best explained by way of an example. Introduction 5 If in a given state a her(a) = Mary(a) then the factual content of the sentence John loves her is the same as the factual content of John loves Mary. The two sentences are therefore synonymous with regard to factual content. But they are not locally synonymous since the meaning of her in states other than a my well be different from the meaning of Mary. The paper applies these precisely deﬁned notions to Kaplan’s treatment of indexicals and argues for local meanings as the most promising candidates for belief carriers. The paper ends with a brief remark on what aspects of meaning should be preserved under translation. The paper by Edward L. Keenan tries to identify inference patterns which are speciﬁc for proportionality quantiﬁers. For instance, given the premisses (1a), (1b) in (1) we may conclude (1c). (1) a. b. c. More than three tenths of the students are athletes. At least seven tenths of the students are vegetarians. At least one student is both an athlete and a vegetarian. This is an instance of the following inference pattern: (2) a. b. c. More than n/m of the As are Bs. At least 1 − n/m of the As are Cs. Ergo: Some A is both a B and a C. Although proportionality quantiﬁers satisfy inference pattern (2), other quantiﬁers do so as well, as observed by Dag Westerståhl. Building on (Keenan 2004) the paper provides an important further contribution to the question whether there are inference patterns speciﬁc to proportionality quantiﬁers. The central result of Keenan’s paper is the MidPoint Theorem and a generalization thereof. The MidPoint Theorem Let p, q be fractions with 0 ≤ p ≤ q ≤ 1 and p + q = 1. Then the quantiﬁers (BETW EEN p AND q) and (MORE T HAN p AND LESS T HAN q) are ﬁxed by the postcomplement operation. The postcomplement of a generalized quantiﬁer Q is that generalized quantiﬁer which maps a set B to Q(¬B). The following pair of sentences illustrates this operation: 6 Fritz Hamm and Stephan Kepser (3) a. b. Exactly half the students got an A on the exam. Exactly half the students didn’t get an A on the exam. The midpoint theorem therefore guarantees the equivalence of sentences (4a) and (4b); and analogously the equivalence of sentences formed with MORE T HAN p AND LESS T HAN q). (4) a. b. Between one sixth and ﬁve sixth of the students are happy. Between one sixth and ﬁve sixth of the students are not happy. However, this and the generalization of the midpoint theorem are still only partial answers to the question concerning speciﬁc inference patterns for proportionality quantiﬁers, since nonproportional determiner exist which still satisfy the conditions of the generalized midpointtheorem. The paper by Lawrence S. Moss studies syllogistic systems of increasing strength from the point of view of natural logic (for a discussion of this notion, see Purdy (1991)). Moss proves highly interesting new completeness results for these systems. More speciﬁcally, after proving soundness for all systems considered in the paper the ﬁrst result states the completeness of the following two axioms for L (all) a syllogistic fragment containing only expressions of the form All X are Y: All X are X All X are Z All Z are Y All X are Y In addition to completness the paper studies a further related but stronger property, the canonical model property. A system which has the canonical model property is also complete, but this does not hold vice versa. Roughly, a model M is canonical for a fragment F , a set Γ of sentences in F and a logical system for F if for all S ∈ F , M = S iff Γ S. A fragment F has the canonical model if every set Γ ⊆ F has a canonical model. The canonical model property is a rather strong property. Classical propositional logic, for instance, does not have this property, but the fragment L (all) has it. Some but not all of the systems in the paper have the canonical model property. Other system studied in Moss’ paper include Some X are Y, combinations of this system with L (all) and sentences involving proper names, systems with Boolean combinations, a combination of L (all) with There are at least s many X as Y, logical theories for Most and Most + Some. The largest logical system for which completeness is proved adds ∃≥ to the theory Introduction 7 L (all, some, no, names) with Boolean operations, where ∃≥ (X ,Y ) is considered true in case X contains more elements than Y . Moss’ paper contains two interesting digressions as well. The ﬁrst is concerned with sentences of the form All X which are Y are Z, the second with most. For instance, Moss proves that the following two axioms are complete for most. Most X are Y Most X are X Most X are Y Most Y are Y Moreover, if Most X are Y does not follow from a set of sentences Γ then there exists a model of Γ with cardinality ≤ 5 which falsiﬁes Most X are Y. All papers collected in this volume grew out of a conference in honour of Uwe Mönnich which was held in Freudenstadt in November 2004. Since this event four years elapsed. But another important date is now imminent, Uwe’s birthday. Hence we are in the lucky position to present this volume as a Festschrift for Uwe Mönnich on the occasion of his 70th birthday. Tübingen, July 2008 Fritz Hamm and Stephan Kepser References Keenan, Edward L. 2004 Excursions in natural logic. In Claudia Casadio, Philip J. Scott, and Robert A.G. Seely, (eds.), Language and Grammar: Studies in Mathematical Linguistics and Natural Language. Stanford: CSLI. Moschovakis, Yiannis 1989a The formal language of recursion. The Journal of Symbolic Logic 54: 1216–1252. 8 Fritz Hamm and Stephan Kepser 1989b A mathematical modeling of pure recursive algorithms. In Albert R. Meyer and Michael Taitslin, (eds.), Logic at Botik ’89, LNCS 363. Berlin: Springer. 1993 Sense and denotation as algorithm and value. In Juha Oikkonen and Jouko Väänänen, (eds.), Logic Colloquium ’90. Natick, USA: Association for Symbolic Logic, A.K. Peters, Ltd. 1998 On founding the theory of algorithms. In Harold Dales and Gianluigi Oliveri, (eds.), Truth in Mathematics. Oxford University Press. 2006 A logical calculus of meaning and synonymy. Linguistics and Philosophy 29: 27–89. Purdy, William C. 1991 A logic for natural language. Notre Dame Journal of Formal Logic 32: 409–425. Type Theory with Records and uniﬁcationbased grammar Robin Cooper Abstract We suggest a way of bringing together type theory and uniﬁcationbased grammar formalisms by using records in type theory. The work is part of a broader project whose aim is to present a coherent uniﬁed approach to natural language dialogue semantics using tools from type theory. 1. Introduction Uwe Mönnich has worked both on the use of type theory in semantics and on formal aspects of grammar formalisms. This paper suggests a way of bringing together type theory and uniﬁcation as found in uniﬁcationbased grammar formalisms like HPSG by using records in type theory which provide us with feature structure like objects. It represents a small offering to Uwe to thank him for many kindnesses over the years sprinkled with insights and rigorous comments. This work is part of a broader project whose aim is to present a coherent uniﬁed approach to natural language dialogue semantics using tools from type theory. We are seeking to do this by bringing together Head Driven Phrase Structure Grammar (HPSG) (Sag et al. 2003), Montague semantics (Montague 1974), Discourse Representation Theory (DRT) (Kamp and Reyle 1993; van Eijck and Kamp 1997, and much other literature), situation semantics (Barwise and Perry 1983) and issuebased dialogue management (Larsson 2002) into a single typetheoretic formalism. A survey of our approach to the semantic theories (i.e., Montague semantics, DRT and situation semantics) and HPSG can be found in (Cooper 2005b). Other work in progress can be found on http://www.ling.gu.se/˜ cooper/records. We give a brief summary here: Record types can be used as discourse representation structures (DRSs). Truth of a DRS corresponds to there being an object of the appropriate record type and this gives us the effect of simultaneous binding of discourse referents (corresponding to labels in records) familiar from the 10 Robin Cooper semantics of DRSs in (Kamp and Reyle 1993). Dependent function types provide us with the classical treatment of donkey anaphora from DRT in a way corresponding to the type theoretic treatment proposed by Mönnich (1985), Sundholm (1986) and Ranta (1994). At the same time record types can be used as feature structures of the kind found in HPSG since they have recursive structure and induce a kind of subtyping which can be used to mimic uniﬁcation. Because we are using a general type theory which includes records we have functions available and a version of the λcalculus. This means that we can use Montague’s λcalculus based techniques for compositional interpretation. From the HPSG perspective this gives us the advantage of being able to use “real” variable binding which can only be approximately simulated in pure uniﬁcation based systems. From the DRT perspective this use of compositional techniques gives us an approach similar to that of Muskens (1996) and work on λDRT (Kohlhase et al. 1996). In this paper we will look at the notion of uniﬁcation as used in uniﬁcationbased grammar formalisms like HPSG from the perspective of the type theoretical framework. This work has been greatly inﬂuenced by work of Jonathan Ginzburg (for example, Ginzburg in prep, Chap. 3). In Section 2 we will give a brief informal introduction to our view of type theory with records. The version of type theory that we discuss has been made more precise in (Cooper 2005a) and in an implementation called TTR (Type Theory with Records) which is under development in the Oz programming language. In Section 3 we will discuss the notion of subtype which records introduce (corresponding to the notion of subsumption in the uniﬁcation literature). We will then, in Section 4, propose that linguistic objects are to be regarded as records whereas feature structures are to be regarded as corresponding to record types. Type theory is “functionbased” rather than “uniﬁcationbased”. However, the addition of records to type theory allows us to get the advantages of uniﬁcation without having to leave the “functionbased” approach. We show how to do this in Section 5 treating some classical simple examples which have been used to motivate the use of uniﬁcation. Section 6 deals with the way in which uniﬁcation analyses are used to allow the extraction of linguistic generalizations as principles in the style of HPSG. The conclusion (Section 7) is that by using record types within a type theory we can have the advantages of uniﬁcationbased approaches together with an additional intensionality not present in classical uniﬁcation approaches and without the disadvantage of leaving the “functionbased” approach which is necessary in order to deal adequately with semantics (at least). TTR and uniﬁcationbased grammar 11 2. Records in type theory In this section1 we give a very brief intuitive introduction to the kind of type theory we are employing. A more detailed and formal account can be found in (Cooper 2005a) and work in progress on the project can be found on http://www.ling.gu.se/˜ cooper/records. While the type theoretical machinery is based on work carried out in the MartinLöf approach (Coquand et al. 2004; Betarte 1998; Betarte and Tasistro 1998; Tasistro 1997) we are making a serious attempt to give it a foundation in standard set theory using Montague style recursive deﬁnitions of semantic domains. There are two main reasons for this. The ﬁrst is that we think it important to show the relationship between the Montague model theoretic tradition which has been developed for natural language semantics and the prooftheoretic tradition associated with type theory. We believe that the aspects of this kind of type theory that we need can be seen as an enrichment of Montague’s original programme. The second reason is that we are interested in exploring to what extent intuitionistic and constructive approaches are appropriate or necessary for natural language. For example, we make important use of the notion “propositions as types” which is normally associated with an intuitionistic approach. However, we suspect that our Montaguelike approach to deﬁning the type theory to some extent decouples the notion from intuitionism. We would like to see type theory as providing us with a powerful collection of tools for natural language analysis which ultimately do not commit one way or the other to philosophical notions associated with intuitionism. The central idea of records and record types can be expressed informally as follows, where T (a1 , . . . , an ) represents a type T which depends on the objects a1 , . . . , an . If a1 : T1 , a2 : T2 (a1 ), . . . , an : Tn (a1 , a2 , . . . , an−1 ), a record: ⎤ ⎡ = a1 l1 ⎢ l2 = a2 ⎥ ⎥ ⎢ ⎥ ⎢ ... ⎥ ⎢ ⎣ ln = an ⎦ ... is of type: ⎡ : T1 l1 ⎢ l2 : T2 (l1 ) ⎢ ⎣ ... : Tn (l1 , l2 , . . . , ln−1 ) ln ⎤ ⎥ ⎥ ⎦ 12 Robin Cooper A record is to be regarded as a ﬁnite set of ﬁelds , a , which are ordered pairs of a label and an object. A record type is to be regarded as a ﬁnite set of ﬁelds , T which are ordered pairs of a label and a type. The informal notation above suggests that the ﬁelds are ordered with types being dependent on previous ﬁelds in the order. This is misleading in that we regard record types as sets of ﬁelds on which a partial order is induced by the dependency relation. Dependent types give us the possibility of relating the values in ﬁelds to each other and play a crucial role in our treatment of both feature structures and semantic objects. Both records and record types are required to be the graphs of functions, that is, if , α and , β are members of a given record or record type then = . A record r is of record type R just in case for each ﬁeld , T in R there is a ﬁeld , a in r (i.e., with the same label) and a is of type T . Notice that the record may have additional ﬁelds not mentioned in the type. Thus a record will generally belong to several record types and any record will belong to the empty record type. This gives us a notion of subtyping which we will explore further in Section 3. Let us see how this can be applied to a simple linguistic example. We will take the content of a sentence to be modelled by a record type. The sentence a man owns a donkey corresponds to a record type: ⎤ ⎡ x : Ind ⎥ ⎢ c1 : man(x) ⎥ ⎢ ⎥ ⎢ y : Ind ⎥ ⎢ ⎣ c2 : donkey(y) ⎦ c3 : own(x,y) A record of this type will be: ⎤ ⎡ x = a ⎢ c1 = p1 ⎥ ⎥ ⎢ ⎢ y = b ⎥ ⎥ ⎢ ⎣ c2 = p2 ⎦ c3 = p3 where a, b are of type Ind, individuals p1 is a proof of man(a) p2 is a proof of donkey(b) p3 is a proof of own(a, b). TTR and uniﬁcationbased grammar 13 Note that the record may have had additional ﬁelds and still be of this type. The types ‘man(x)’, ‘donkey(y)’, ‘own(x,y)’ are dependent types of proofs (in a convenient but not quite exact abbreviatory notation – we will give a more precise account of dependencies within record types in Section 3). The use of types of proofs for what in other theories would be called propositions is often referred to as the notion of “propositions as types”. Exactly what type ‘man(x)’ is depends on which individual you choose in your record to be labelled by ‘x’. If the individual a is chosen then the type is the type of proofs that a is a man. If another individual d is chosen then the type is the type of proofs that d is a man, and so on. What is a proof? MartinLöf considers proofs to be objects rather than arguments or texts. For nonmathematical propositions proofs can be regarded as situations or events. For useful discussion of this see (Ranta 1994, p. 53ff). We discuss it in more detail in (Cooper 2005a). There is an obvious correspondence between this record type and a discourse representation structure (DRS) as characterised in (Kamp and Reyle 1993). The characterisation of what it means for a record to be of this type corresponds in an obvious way to the standard embedding semantics for such a DRS which Kamp and Reyle provide. Records (and record types) are recursive in the sense that the value corresponding to a label in a ﬁeld can be a record (or record type)2 . For example, ⎡ ⎡ ⎤ ⎤ ff = a f = ⎢ f = ⎣ ⎦ ⎥ gg = b ⎢ ⎥ ⎥ r=⎢ ⎢ ⎥ g = c ⎣ ⎦ g = a g = h = h = d is of type ⎡ ⎢ f ⎢ R=⎢ ⎢ ⎣ g ⎡ : : ⎣ f g h : ff gg : T3 g : h : : : : T1 T2 T1 T4 ⎤ ⎤ ⎦ ⎥ ⎥ ⎥ ⎥ ⎦ given that a : T1 , b : T2 , c : T3 and d : T4 . We can use pathnames in records and record types to designate values in particular ﬁelds, e.g. ⎡ ⎤ ff = a f = ⎦ gg = b r.f = ⎣ g = c R.f.f.ff = T1 14 Robin Cooper The recursive nature of records and record types will be important later in the paper when we use record types to correspond to linguistic feature structures. Another important aspect of the type theory we are using is that types themselves can also be treated as objects.3 A simple example of how this can be exploited is the following representation for a girl believes that a man owns a donkey. This is a simpliﬁed version of the treatment discussed in (Cooper 2005a). ⎡ ⎤ x : Ind ⎢ c1 : girl(x) ⎥ ⎢ ⎡ ⎤ ⎥ ⎢ ⎥ y : Ind ⎢ ⎥ ⎢ c3 : man(y) ⎥ ⎥ ⎢ ⎢ ⎥ ⎥ ⎢ ⎥) ⎥ ⎢ c2 : believe(x, ⎢ z : Ind ⎢ ⎥ ⎥ ⎢ ⎣ c4 : donkey(z) ⎦ ⎦ ⎣ c5 : own(y, z) The treatment of types as ﬁrst class objects in this way is a feature which this type theory has in common which situation theory and it is an important component in allowing us to incorporate analyses from situation semantics in our type theoretical treatment. The theory of records and record types is embedded in a general type theory. This means that we have functions and function types available giving us a version of the λcalculus. We can thus use Montague’s techniques for compositional interpretation. For example, we can interpret the common noun donkey as a function which maps records r of the type x:Ind (i.e. records which introduce an individual labelled with the label ‘x’) to a record type dependent on r. We notate the function as follows: λr: x:Ind ( c:donkey(r.x) ) The type of this function is P = ( x:Ind )RecType This corresponds to Montague’s type e,t (the type of functions from individuals (entities) to truthvalues). In place of individuals we use records introducing individuals with the label ‘x’ and in place of truthvalues we use record types which, as we have seen above, correspond to an intuitive notion of proposition (in particular a proposition represented by a DRS). Using the power of the λcalculus we can treat determiners Montaguestyle as functions which take two arguments of type P and return a record type. For example, we represent the indeﬁnite article by TTR and uniﬁcationbased grammar 15 λR1 :( x:Ind )RecType λR2 :( x:Ind )RecType ⎡ par : (⎣ restr : scope : x : Ind R1 @ par R2 @ par ⎤ ⎦) Here we use F @ a to represent the result of applying function F to argument a. The type theory includes dependent function types. These can be used to give a classical treatment of universal quantiﬁcation corresponding to DRT’s ‘⇒’. For example, an interpretation of every man owns a donkey can be the following record type: ⎤ ⎤ ⎡ ⎡ y : Ind ⎣ f : (r : x : Ind ) ⎣ c2 : donkey(y) ⎦ ⎦ c1 : man(x) c3 : own(r.x,y) Records of the type x : Ind (r : c1 : man(x) ⎡ y ) ⎣ c2 c3 : : : ⎤ Ind donkey(y) ⎦ own(r.x,y) map records r of type x : Ind c1 : man(x) to records of type ⎤ ⎡ y : Ind ⎣ c2 : donkey(y) ⎦ c3 : own(r.x,y) Our interpretation of every man owns a donkey requires that there exist a function of this type. Why do we use the record type with the label ‘f’ rather than the function type itself as the interpretation of the sentence? One reason is to achieve a uniform treatment where the interpretation of a sentence is always a record type. Another reason is that the label gives us a handle which can be used to anaphorically refer to the function. This can, for example, be exploited in socalled paycheck examples (Karttunen 1969) such as Everybody receives a paycheck. Not everybody pays it into the bank immediately, though. The ﬁnal notion we will introduce which is important for the modelling of HPSG typed feature structures as record types is that of manifest ﬁeld. 16 Robin Cooper This notion is introduced in (Coquand et al. 2004). It builds on the notion of singleton type. If a : T , then Ta is a singleton type and b : Ta iff b = a. A manifest ﬁeld in a record type is one whose type is a singleton type, e.g. x : Ta written for convenience as x=a : T This notion allows record types to be “progressively instantiated”, i.e. intuitively, for values to be speciﬁed within a record type. A record type that only contains manifest ﬁelds is completely instantiated and there will be exactly one record of that type. We will allow dependent singleton types, where a in Ta can be represented by a path in a record type. Manifest ﬁelds are important for the modelling of HPSGstyle uniﬁcation in type theory with records. 3. Dependent types and the subtype relation We are now in a position to give more detail about the treatment of dependencies in record types. Dependent types within record types are treated as pairs consisting of functions and sequences of pathnames providing corresponding to the arguments required by the functions. Thus the type on p. 12 corresponding to a man owns a donkey is in ofﬁcial, though less readable, notation: ⎤ ⎡ x : Ind ⎥ ⎢ c1 : λv :Ind(man(v)), x ⎥ ⎢ ⎥ ⎢ y : Ind ⎥ ⎢ ⎦ ⎣ c2 : λv :Ind(donkey(v)), y c3 : λv :Ind(λw :Ind(own(v, w))), x,y This enables us to give scope to dependencies outside the object in which the dependency occurs. Thus if, on the model of the type for a girl believes that a man owns a donkey on p. 14, we wish to construct a type corresponding to a girl believes that she owns a donkey (where she is anaphorically related to a girl), this can be done as follows: ⎤ ⎡ x : Ind ⎥ ⎢ c1 : λv :Ind(girl(v)), x ⎢ ⎤ ⎥ ⎡ ⎥ ⎢ z : Ind ⎥ ⎢ ⎢ c2 : λu :Ind(believe(u, ⎣ c4 : λv :Ind(donkey(v)), z ⎦)), ⎥ ⎥ ⎢ ⎦ ⎣ c5 : λv :Ind(own(u, v)), z x TTR and uniﬁcationbased grammar 17 There are two kinds of pathnames which can occur in such types: relative pathnames which are constructed from labels, 1 . . . . .n , and absolute pathnames which refer explicitly to the record, r in which the pathname is to be evaluated, r.1 . . . . .n . A dependent record type is a set of pairs of the form , T where is a label and T is either a type, a dependent record type or a pair consisting of a function and a sequence of pathnames as characterized above. An anchor for a dependent record type T of the form ⎡ ⎤ 1 : T1 ⎣ ... ⎦ : Tn n is a record, h, such that for each Ti of the form f , π1 , . . . , πm , πi is either an absolute path or a path deﬁned in h, and for each Ti which is a dependent record type, h is also an anchor for Ti . In addition we require that the result of anchoring T with h as characterized below is welldeﬁned, i.e., that the anchor provides arguments of appropriate types to functions and provides objects of appropriate types for the construction of singleton types as required by the anchoring. The result of anchoring T with h, T [h] is obtained by replacing 1. each Ti in T of the form f , π1 , . . . , πm with f (π1 [h]) . . . (πm [h]) (where πi [h] is the value of h.πi if πi is a relative pathname and the value of πi if πi is an absolute pathname) 2. each Ti in T which is a dependent record type with Ti [h] 3. each basic type which is the value of a path π in T and for which h.π is deﬁned, with Ta , where a is the value of h.π, i.e. the singleton type obtained from T and the value of h.π. A dependent record type T is said to be closed just in case each path which T requires to be deﬁned in an anchor for T is deﬁned within T itself. It is the closed dependent record types which belong to our type universe. If T is a closed dependent record type then r : T if and only if r : T [r]. Let us return to our type above for a girl believes that she owns a donkey. An anchor for this type is x = m (where m is an object of type Ind) and the result of anchoring the type with this record is 18 Robin Cooper ⎡ x=m ⎢ c1 ⎢ ⎢ ⎢ ⎣ c2 ⎤ : : Ind girl(m) : z believe(m, ⎣ c4 c5 ⎡ : : : Ind λv :Ind(donkey(v)), z λv :Ind(own(m, v)), z ⎥ ⎤ ⎥ ⎥ ⎥ ⎦) ⎦ Notice that the anchor has no effect on dependencies with scope within the argument type corresponding to that she owns a donkey but only the dependency with scope external to it. We now turn our attention to the subtype relation. Record types introduce a notion of subtype which corresponds to what is known as subsumption in the uniﬁcation literature. The subtype relation can be characterized model theoretically as follows: If T1 and T2 are types, then T1 is a subtype of T2 (T1 case {a  a :M T1 } ⊆ {a  a :M T2 } for all models M. T2 ) just in where the righthand side of this equivalence refers to sets in the sense of classical set theory and models as deﬁned in (Cooper 2005a). These models assign sets of objects to the basic types and sets of proofs to prooftypes constructed from predicates with appropriate arguments, e.g. if k is a woman according to model M then M will assign a nonempty set of proofs to the type woman(k). Such models correspond to sorted ﬁrstorder models. If this notion of subtype is to be computationally useful, we need some way of computing whether two types stand in the subtype relation without having to compute the sets of objects which belong to those types in all possible models. Thus we deﬁne another relation c which is computed without reference to the models. The approach taken to this in the implementation TTR is to instantiate (dependent) record types, R, recursively as an anchor for R introducing arbitrary formal objects guaranteed to be of the appropriate type. Basic types and types constructed with predicates are instantiated to arbitrary formal objects guaranteed to be of the type (in the implementation, pairings of gensym atoms with the type); singleton types are instantiated to the object used to deﬁne the type; record type structures are instantiated to records containing the instantiations of the types (or anchors for the dependent record types) in each ﬁeld; similar instantiations are given for other complex types. Thus the instantiation of the record type TTR and uniﬁcationbased grammar 19 ⎡ f ⎢ g=b ⎢ ⎣ h : : T1 T 2 : ⎤ i j can be represented as: ⎡ f = a0#T1 ⎢ g = b ⎢ ⎣ i = h = j = : : r1 (f,g) r2 (g,f) ⎥ ⎥ ⎦ ⎤ a1#r1 (a0#T1 , b) a2#r2 (b, a0#T1 ) ⎥ ⎥ ⎦ We use Inst(T ) to represent such an instantiation of type T . T1 c T2 just in case Inst(T1 ) : T2 . One advantage of this approach is that the computation of the subtype relation will be directly dependent on the oftype relation. If T1 is a record type containing a superset of the ﬁelds of the record type T2 then T1 c T2 as desired for the modelling of subsumption in uniﬁcation systems. Thus, for example, ⎤ ⎡ f:T1 ⎣g:T2 ⎦ c f:T1 g:T2 h:T3 This method of computing the subtype relation appears to be sound with respect to the models but not necessarily complete since it does not take account of the logic associated with predicates. For example, later in the paper we will make use of an equality predicate. The predicate eq is such that a : eq(T, x, y) iff a = x, y , x, y : T , and x = y. Now consider that the type ⎡ ⎤ x:T ⎣y:T ⎦ c:r(x) is not a subtype of ⎡ ⎤ x:T ⎣y:T ⎦ c:r(y) whereas according to the model theoretic deﬁnition ⎡ ⎤ ⎡ ⎤ x:T x:T ⎢y:T ⎥ ⎢ ⎥ ⎣y:T ⎦ ⎣c:r(x) ⎦ c:r(y) d:eq(T ,x,y) 20 Robin Cooper since anything of the ﬁrst type must also be of the second type. The instantiation of the ﬁrst type ⎡ ⎤ x=a0#T ⎢y=a1#T ⎥ ⎢ ⎥ ⎣c=a2#r(a0#T ) ⎦ d=a3#eq(T, a0#T, a1#T ) will not, however, be computed as being of the second type unless we take account of the import of the equality predicate. This is easily ﬁxed, for example by normalizing the instantiation so that all arbitrary objects which are required to be identical are represented by the same symbols, in this case, for example, substituting a0 for all occurrences of a1: ⎡ ⎤ x=a0#T ⎢y=a0#T ⎥ ⎢ ⎥ ⎣c=a2#r(a0#T ) ⎦ d=a3#eq(T, a0#T, a0#T ) This will then have the desired effect on the computation of the subtype relation. However, there is no guarantee that it will always be possible to give a complete characterization of the subtype relation if the logic of the predicates is incomplete. But we should not let this stop us exploiting those inferences about subtyping which we can draw in a computational implementation. 4. Records as linguistic objects We will consider linguistic objects to be records. Here is a simple linguistic object which might correspond to the word man. ⎡ ⎤ phon = [man] ⎢ cat ⎥ = n ⎢ ⎡ ⎤ ⎥ ⎢ ⎥ num = sg ⎢ ⎥ ⎣ ⎦ ⎣ agr ⎦ gen = masc = pers = third It is a record with three ﬁelds. The ﬁeld for phon(ology) has as value a (singleton) list of words (following the traditional HPSG simplifying assumption about phonology). For the cat(egory) ﬁeld we will use atomic categories like n(oun), although nothing in our approach excludes the complex categories normally used in HPSG analyses. We include three agr(eement) features: num(ber), in this case with the value s(in)g(ular), gen(der), in this TTR and uniﬁcationbased grammar 21 case with the value masc(uline) and pers(on) in this case with the value third (person). Not all words correspond to a single linguistic object. For example, the English word ﬁsh can be singular or plural and masculine, feminine or neuter. This means that there will be six records corresponding to the single record for man. Here are three of them: ⎡ ⎤ phon = [ﬁsh] ⎢ cat ⎥ = n ⎢ ⎡ ⎤ ⎥ ⎢ ⎥ num = sg ⎢ ⎥ ⎣ ⎦ ⎣ agr ⎦ gen = neut = pers = third ⎡ ⎤ phon = [ﬁsh] ⎢ cat ⎥ = n ⎢ ⎤ ⎥ ⎡ ⎢ ⎥ num = pl ⎢ ⎥ ⎣ agr = ⎣ gen = neut ⎦ ⎦ pers = third ⎡ ⎤ phon = [ﬁsh] ⎢ cat ⎥ = n ⎢ ⎡ ⎤ ⎥ ⎢ ⎥ num = sg ⎢ ⎥ ⎣ ⎦ ⎣ agr ⎦ gen = masc = pers = third Now let us consider types of linguistic objects. Nouns correspond to objects which have a phonology, the category n and the agreement features for number, gender and person. That is we can deﬁne a record type Noun as follows: ⎡ ⎤ phon : Phon ⎢ cat=n : Cat ⎥ ⎢ ⎡ ⎤ ⎥ ⎢ ⎥ num : Number Noun ≡ ⎢ ⎥ ⎣ agr : ⎣ gen : Gender ⎦ ⎦ pers : Person where: Phon ≡ [Lex] (i.e. type of list of objects of type Lex) the, a, ﬁsh, man, men, swim, swims, swam, . . . : Lex n, det, np, v, vp, s, . . . : Cat sg, pl : Number masc, fem, neut : Gender ﬁrst, second, third : Person We can further deﬁne types for determiners, verbs and agreement: 22 Robin Cooper ⎡ phon ⎢ cat=det ⎢ Det ≡ ⎢ ⎢ ⎣ agr ⎡ phon ⎢ cat=v ⎢ V≡⎢ ⎢ ⎣ agr ⎡ num ⎣ Agr ≡ gen pers : : : : : Phon Cat ⎡ num ⎣ gen pers Phon Cat ⎡ num ⎣ gen pers : : : : : : : ⎤ ⎤ : : : ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person ⎤ ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person Number Gender ⎦ Person Now we can deﬁne the type man. ⎡ phon=[man] ⎢ cat=n ⎢ Man ≡ ⎢ ⎢ ⎣ agr of linguistic objects corresponding to the word : : : Phon Cat ⎡ num=sg ⎣ gen=masc pers=third ⎤ : : : ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person This type identiﬁes a unique linguistic object (namely the record corresponding to man which we introduced above). It is a singleton (or fully speciﬁed) type. It is also a subtype (or speciﬁcation) of Noun in the sense that if an object is of type Man it is also of type Noun. We deﬁne the type for the plural men in a similar way. ⎤ ⎡ phon=[men] : Phon ⎥ ⎢ cat=n : Cat ⎢ ⎡ ⎤ ⎥ ⎥ ⎢ num=pl : Number Men ≡ ⎢ ⎥ ⎣ ⎦ ⎦ ⎣ agr gen=masc : Gender : pers=third : Person The type Fish corresponding to the noun ﬁsh is a less speciﬁed type, however: ⎡ ⎤ phon=[ﬁsh] : Phon ⎢ cat=n ⎥ : Cat ⎢ ⎡ ⎤ ⎥ ⎥ num : Number Fish ≡ ⎢ ⎢ ⎥ ⎣ ⎦ ⎣ agr ⎦ gen : Gender : pers=third : Person TTR and uniﬁcationbased grammar 23 The objects which are of this type will be the six records which we identiﬁed earlier. Fish is also a subtype of Noun. We can also deﬁne two types IndefArt and DefArt corresponding to the indeﬁnite and deﬁnite articles which have different degrees of speciﬁcation: ⎡ phon=[a] ⎢ cat=det ⎢ IndefArt ≡ ⎢ ⎢ ⎣ agr ⎡ phon=[the] ⎢ cat=det ⎢ DefArt ≡ ⎢ ⎢ ⎣ agr : : Phon Cat ⎡ num=sg ⎣ gen pers=third : : : Phon Cat ⎡ num ⎣ gen pers=third : ⎤ ⎥ ⎤ ⎥ ⎥ Number ⎥ Gender ⎦ ⎦ Person ⎤ : : : ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person : : : Both of these are subtypes of Det. IndefArt is speciﬁed for both number and person whereas DefArt is only speciﬁed for person. Similar differences in speciﬁcation arise in verbs:4 ⎡ phon=[swims] ⎢ cat=v ⎢ Swims ≡ ⎢ ⎢ ⎣ agr ⎡ phon=[swim] ⎢ cat=v ⎢ Swim ≡ ⎢ ⎢ ⎣ agr ⎡ phon=[swam] ⎢ cat=v ⎢ Swam ≡ ⎢ ⎢ ⎣ agr : : : : : Phon Cat ⎡ num=pl ⎣ gen pers=third : : : : ⎤ Phon Cat ⎡ num=sg ⎣ gen pers=third Phon Cat ⎡ num ⎣ gen pers=third These three types are all subtypes of V. : : : ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person ⎤ ⎥ ⎤ ⎥ ⎥ Number ⎥ Gender ⎦ ⎦ Person ⎤ : : : : : : ⎥ ⎤ ⎥ ⎥ Number ⎥ ⎦ ⎦ Gender Person 24 Robin Cooper 5. A type theoretical approach to uniﬁcation phenomena The types that we introduced in Section 4 lay the basis for a kind of uniﬁcation phenomenon in language which has been discussed in the classical literature on uniﬁcation approaches to natural language grammar (e.g. Shieber 1986). The sentence (1) is underspeciﬁed with respect to number. (1) The ﬁsh swam. Note, however, that either all the words are singular or all the words are plural. It cannot be the case that ﬁsh is regarded as singular while swam is regarded as plural, for example. This is because there are requirements that the determiner and the noun agree in number and that the subject nounphrase and the verb agree in number. In a uniﬁcationbased grammar this is expressed by requiring that the number features of the relevant phrases unify. In the terms of our previous discussion it means that there are two linguistic objects corresponding to (1) rather than eight. Note that the sentences in (2) are all singular. (2) a. a ﬁsh swam. b. the man swam. c. the ﬁsh swims. However, the “source” of the singularity is different in each case. It is the fact that a, man, and swims respectively are speciﬁed for singular, together with the requirements that all the number features unify which have as a consequence that all the words are speciﬁed for singular in the single linguistic object corresponding to each of these sentences. Uniﬁcation is regarded as a useful tool in the linguistic analysis because it reﬂects the lack of “directionality” of the agreement phenomenon, that is, as long as one of the words is speciﬁed for number they all have to be speciﬁed for the same number. Uniﬁcation is traditionally regarded as partial, that is, it can fail and this is used to explain why the strings of words in (3) are not sentences of English, that is, they do not correspond to linguistic objects allowed by the grammar of English. (3) a. *a ﬁsh swim. b. *the man swim. c. *a men swims. TTR and uniﬁcationbased grammar 25 On our type theoretical view the intuitive notion of uniﬁcation is related to meet (as in the meet, or conjunction, of two types) and equality. In the deﬁnition of our type theory in Cooper(2005b) we introduce meettypes in the following way: If T1 and T2 are types, then T1 ∧ T2 is also a type. a : T1 ∧ T2 iff a : T1 and a : T2 If T1 and T2 are record types then there will always be a record type (not a meet) T3 which is equivalent to T1 ∧ T2 (in the sense that a : T3 iff a : T1 ∧ T2 ). Let us consider some examples: f:T1 f:T1 ∧ g:T2 ≈ g:T2 f:T1 ∧ f:T2 ≈ f:T1 ∧ T2 Below we present some informal pseudocode for a function μ which will simplify meets of records types, returning an equivalent record type. The algorithm is similar in essential respects to the graph uniﬁcation algorithm used in classical implementations of feature based grammar systems (Shieber 1986). One important respect in which it differs from the classical uniﬁcation algorithm is that it never fails. In cases where the corresponding uniﬁcation would have failed it will return a record type which is equivalent to the distinguished empty type ⊥.5 Another way in which it differs from the classical uniﬁcation algorithm is that it applies to all types, reducing meets of records types to nonmeet types and recursively performing this reduction within record types and otherwise returning the original type. The algorithm that is informally presented here is a simpliﬁcation of the one that is implemented in TTR which has some additional special cases and also has an additional level of complication for the handling of dependent types, using the technique of environments which we referred to in Section 3. In order to understand the intention of this pseudocode it is important to remember that record types are considered to be ﬁnite sets of ordered pairs (representing the ﬁelds) as described above in Section 2. When we write Map(T , λl, v[Φ]) we mean that each ﬁeld , T in T is to be replaced by the result of applying the function λl, v[Φ] to and T . When we say that T. is deﬁned we mean that for some T , , T ∈ T . We assume that the type theory will deﬁne an incompatibility relation which holds between certain basic types such that if T1 and T2 are incompatible then there will be no a such that a : T1 and a : T2 . For example, one might require that all basic types are pairwise incompatible. 26 Robin Cooper μ(T ) = if for some T1 , T2 , T = T1 ∧ T2 then let T1 = μ(T1 ) T2 = μ(T2 ) in T2 then T1 if T1 elseif T2 T1 then T2 elseif T1 and T2 are incompatible, then ⊥ elseif T1 and T2 are record types then Map(T1 , λl, v[if T2 .l is deﬁned then l, μ(v ∧ T2 .l) else l, v ]) ∪ (T2 − {l, v ∈ T2  T1 .l is deﬁned}) else T1 ∧ T2 end end elseif T is a record type, then Map(T , λl, v[l, μ(v) ]) else T end If we know a : T1 and b : T2 and in addition know a = b then we know a : T1 ∧ T2 and a : μ(T1 ∧ T2 ). Intuitively, equality of objects corresponds to meet (or “uniﬁcation”) of types. This can be expressed in terms of the following rules of inference. a : T1 b : T2 a=b a : T1 ∧ T2 a : T1 ∧ T2 a : μ(T1 ∧ T2 ) We can exploit this in characterizing a type NP which allows nounphrases consisting of a determiner and a noun which agree in number. TTR and uniﬁcationbased grammar 27 NP ≡ ⎡ ⎤ phon=append(daughters.ﬁrst.phon, daughters.rest.ﬁrst.phon):Phon ⎢cat=np:Cat ⎥ ⎢ ⎥ ⎡ ⎤ ⎢ ⎥ ﬁrst : Det ⎢ ⎥ ⎢daughters:⎣ ⎥ ⎦ ﬁrst : Noun ⎢ ⎥ rest : ⎢ ⎥ rest=nil : [Sign] ⎢ ⎥ ⎣agr=daughters.rest.ﬁrst.agr:Agr ⎦ c:eq(Number, daughters.ﬁrst.agr.num, daughters.rest.ﬁrst.agr.num) In the deﬁnition of NP, Sign is to be thought of as a recursively deﬁned type deﬁned by: 1. if a : Det then a : Sign 2. if a : Noun then a : Sign 3. if a : NP then a : Sign . . . (similarly for other word and phrase types) n. no object is of type Sign except as required by the above clauses The predicate eq is as deﬁned in Section 3. The type NP can be further speciﬁed to a type where the ﬁrst daughter is of type DefArt and the second daughter is of type Man since these types are subtypes of Det and Noun respectively. ⎤ ⎡ phon=append(daughters.ﬁrst.phon, daughters.rest.ﬁrst.phon):Phon ⎥ ⎢cat=np:Cat ⎢ ⎤⎥ ⎡ ⎤ ⎡ ⎥ ⎢ phon=[the]:Phon ⎥ ⎢ ⎥ ⎥ ⎢cat=det ⎥ ⎢ ⎢ :Cat ⎥⎥ ⎢ ⎢ ⎢ ⎡ ⎤⎥ ⎥ ⎥ ⎥ ⎢ ﬁrst:⎢ ⎢ num : Number ⎥ ⎥⎥ ⎢ ⎢ ⎢ ⎥⎥ ⎢ ⎢ ⎣agr : Gender ⎦⎦ :⎣ gen ⎥⎥ ⎢ ⎢ ⎥⎥ ⎢ ⎢ pers=third : Person ⎢ ⎢ ⎡ ⎡ ⎤⎤⎥⎥ ⎥⎥ ⎢daughters:⎢ phon=[man]:Phon ⎥⎥ ⎢ ⎢ ⎥ ⎢ ⎢cat=n:Cat ⎥ ⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎡ ⎤⎥⎥⎥⎥ ⎥⎥⎥ ⎢ ⎢ ⎢ num=sg : Number ⎥ :⎢ ⎢ rest:⎢ﬁrst ⎢ ⎢ ⎥⎥⎥⎥ ⎢ ⎢ ⎢ ⎣agr:⎣ gen=masc : Gender ⎦⎦⎥⎥⎥ ⎥⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎦⎦⎥ ⎣ ⎣ pers=third : Person ⎥ ⎢ ⎥ ⎢ rest=nil:[Sign] ⎥ ⎢ ⎦ ⎣agr=daughters.rest.ﬁrst.agr:Agr c:eq(Number, daughters.ﬁrst.agr.num, daughters.rest.ﬁrst.agr.num) Note that this type represents that the singularity of the phrase has its source 28 Robin Cooper in man. Similarly we can create the type corresponding to a ﬁsh where the source of the singularity is the determiner. ⎡ ⎤ phon=append(daughters.ﬁrst.phon, daughters.rest.ﬁrst.phon):Phon ⎢cat=np:Cat ⎥ ⎢ ⎡ ⎡ ⎤ ⎤⎥ ⎢ ⎥ phon=[a]:Phon ⎢ ⎥ ⎢ ⎢ ⎢cat=det :Cat ⎥ ⎥⎥ ⎢ ⎢ ⎢ ⎥⎥ ⎡ ⎤⎥ ⎢ ⎢ ﬁrst:⎢ ⎥⎥ num=sg : Number ⎥ ⎢ ⎢ ⎢ ⎥ ⎥⎥ ⎢ ⎢ ⎥⎥ ⎣ ⎦ ⎣agr ⎦ gen : Gender : ⎢ ⎢ ⎥⎥ ⎢ ⎢ ⎥⎥ pers=third : Person ⎢ ⎢ ⎡ ⎡ ⎤⎤⎥⎥ ⎢daughters:⎢ ⎥⎥ phon=[ﬁsh]:Phon ⎢ ⎢ ⎥⎥ ⎢ ⎢ ⎢ ⎢cat=n:Cat ⎥⎥⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎡ ⎤⎥⎥⎥⎥ ⎢ ⎢ ⎢ ⎥⎥⎥ num : Number ⎥ :⎢ ⎢ ⎢ rest:⎢ﬁrst ⎢ ⎥⎥⎥⎥ ⎢ ⎢ ⎢ ⎥⎥ ⎣agr:⎣ gen : Gender ⎦⎦⎥ ⎢ ⎢ ⎢ ⎥⎥⎥ ⎢ ⎣ ⎣ ⎦⎦⎥ pers=third : Person ⎢ ⎥ ⎢ ⎥ rest=nil:[Sign] ⎢ ⎥ ⎣agr=daughters.rest.ﬁrst.agr:Agr ⎦ c:eq(Number, daughters.ﬁrst.agr.num, daughters.rest.ﬁrst.agr.num) A difference between our record types and feature structures is that the record types preserve the information of the source of the number information in these examples whereas in feature structures this information is lost once the feature structures have been uniﬁed. An additional difference is that we are able to form types corresponding to ungrammatical phrases such as *a men. ⎤ ⎡ phon=append(daughters.ﬁrst.phon, daughters.rest.ﬁrst.phon):Phon ⎥ ⎢cat=np:Cat ⎢ ⎤⎥ ⎡ ⎤ ⎡ ⎥ ⎢ phon=[a]:Phon ⎥ ⎢ ⎥⎥ ⎢cat=det :Cat ⎥ ⎢ ⎢ ⎥⎥ ⎢ ⎢ ⎢ ⎡ ⎤⎥ ⎥⎥ ⎢ ﬁrst:⎢ ⎢ num=sg : Number ⎥ ⎥⎥ ⎢ ⎥ ⎢ ⎢ ⎥⎥ ⎢ ⎢ ⎣ gen ⎦⎦ ⎣agr : Gender : ⎥⎥ ⎢ ⎢ ⎥⎥ ⎢ ⎢ pers=third : Person ⎢ ⎢ ⎡ ⎡ ⎤⎤⎥⎥ ⎥⎥ ⎢daughters:⎢ phon=[men]:Phon ⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢cat=n:Cat ⎥⎥⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎡ ⎤⎥⎥⎥⎥ ⎥⎥⎥ ⎢ ⎢ ⎢ num=pl : Number ⎥ :⎢ ⎢ ⎥⎥⎥⎥ ⎢ rest:⎢ﬁrst ⎢ ⎢ ⎢ ⎢ ⎣agr:⎣ gen=masc : Gender ⎦⎦⎥⎥⎥ ⎥⎥⎥ ⎢ ⎢ ⎢ ⎢ ⎦⎦⎥ ⎣ ⎣ pers=third : Person ⎥ ⎢ ⎥ ⎢ rest=nil:[Sign] ⎥ ⎢ ⎦ ⎣agr=daughters.rest.ﬁrst.agr:Agr c:eq(Number, daughters.ﬁrst.agr.num, daughters.rest.ﬁrst.agr.num) TTR and uniﬁcationbased grammar 29 This is a wellformed type but one that cannot have any elements since sg and pl are not identical. This type is thus equivalent to ⊥. Such types might be usefully exploited in robust parsing. Note that even though the type is empty it contains a great deal of information about the phrase. In particular if our types were to include information about the meaning or content of a phrase it might be possible to extract information about the meaning of a phrase even though it does not actually correspond to any wellformed linguistic object. This could potentially be exploited in a way similar to that suggested by Fouvry (2003) for weighted feature structures. 6. Using uniﬁcation to express generalizations Uniﬁcation is also exploited in uniﬁcation grammars to extract generalities from individual cases. For example, the nounphrase agreement phenomenon that we discussed in Section 5 requires that the agreement features on the noun be the same as those on the NP. This is an instance of the head feature principle which requires that the agreement features of the mother be the same as those of the head daughter. If we identify the head daughter in phrases then we can extract this principle out by creating a new type HFP which corresponds to the head feature principle. HFP ≡ hddaughter agr=hddaughter.agr : : Sign Agr We also deﬁne a new version of the type NP, NP , which identiﬁes the head daughter but does not contain the information corresponding to the head feature principle. NP ⎡ ≡ ⎤ phon=append(daughters.ﬁrst.phon, daughters.rest.ﬁrst.phon) :Phon ⎢cat=np :Cat ⎥ ⎢ ⎥ ⎢hddaughter=daughters.rest.ﬁrst :Noun ⎥ ⎢ ⎥ ⎡ ⎤ ⎢ ⎥ ﬁrst : Det ⎢ ⎥ ⎢daughters:⎣ ⎥ ⎦ ﬁrst : Noun ⎢ ⎥ rest : ⎢ ⎥ rest=nil : [Sign] ⎢ ⎥ ⎣agr :Agr ⎦ c:eq(Number, daughters.ﬁrst.agr.num, daughters.rest.ﬁrst.agr.num) The record type that characterizes nounphrases is now μ(NP ∧ HFC). 30 Robin Cooper 7. Conclusions We have shown how a type theory with records gives us a notion of subtyping corresponding to subsumption in the uniﬁcation literature and a way of reducing meets of record types to record types which is similar to the graph uniﬁcation used in uniﬁcationbased grammar formalisms. Using record types instead of feature structures gives us a kind of intensionality which is not available in feature structures. This intensionality allows us to distinguish equivalent types which preserve information which is lost in the uniﬁcation of feature structures, such as the source of grammatical information associated with a phrase. This intensionality can also be exploited by associating empty types with ungrammatical phrases. Such types may contain information which could be used in robust parsing. While it may appear odd to refer to this property of as “intensionality” in the context of parsing, we do so because it is the same kind of intensionality which is important for our approach to the semantic analysis of attitudes such as know and believe. The type theory provides us with a level of abstraction which permits us to make generalizations across phenomena in natural language that have previously been treated by separate theories. Finally, this approach to uniﬁcation is embedded in a rich “functionbased” type theoretical framework which provides us with the kind of tools that are needed for semantics while at the same time allowing us to import uniﬁcation into our semantic analysis. Acknowledgements This work was supported by Swedish Research Council projects numbers 20024879 Records, types and computational dialogue semantics and 20054211 Librarybased Grammar Engineering. I am grateful to Thierry Coquand, Dan Flickinger, Jonathan Ginzburg, Erhard Hinrichs, Bengt Nordström and Aarne Ranta for discussion in connection with this work and to an anonymous referee for this volume for making a number of useful suggestions. Notes 1. This section contains revised material from (Cooper 2005a). 2. There is a technical sense in which this recursion is nonessential. These records could also be viewed as nonrecursive records whose labels are sequences of atomic labels. See (Cooper 2005a) for more discussion. TTR and uniﬁcationbased grammar 31 3. In order to do this safely we stratify the types. We deﬁne the type system as a family of type systems of order n for each natural number n. The idea is that types which are not deﬁned in terms of other types are of order 0 and that types which are deﬁned in terms of types of order n are of order n + 1. We will not discuss this in detail here but rely on the discussion in (Cooper 2005a). In this paper we will suppress reference to order in the speciﬁcation of our types. 4. We are making the simplifying assumption that all the verb forms represented here are third person. 5. Any record type which has ⊥ in one of its ﬁelds will be such that there are no records of that type and thus the type will be equivalent to ⊥. References Barwise, Jon and John Perry 1983 Situations and Attitudes. Bradford Books. Cambridge, Mass.: MIT Press. Betarte, Gustavo 1998 Dependent Record Types and Algebraic Structures in Type Theory. Ph.D. thesis, Department of Computing Science, Göteborg University and Chalmers University of Technology. Betarte, Gustavo and Alvaro Tasistro 1998 Extension of MartinLöf’s type theory with record types and subtyping. In Giovanni Sambin and Jan Smith, (eds.), TwentyFive Years of Constructive Type Theory, number 36 in Oxford Logic Guides. Oxford: Oxford University Press. Cooper, Robin 2005a Austinian truth, attitudes and type theory. Research on Language and Computation 3: 333–362. 2005b Records and record types in semantic theory. Journal of Logic and Computation 15(2): 99–112. Coquand, Thierry, Randy Pollack, and Makoto Takeyama 2004 A logical framework with dependently typed records. Fundamenta Informaticae XX: 1–22. Fouvry, Frederik 2003 Constraint relaxation with weighted feature structures. In IWPT 03, International Workshop on Parsing Technologies. Nancy (France). Gabbay, Dov and Franz Guenthner, (eds.) 1986 Handbook of Philosophical Logic, Vol. III. Dordrecht: Reidel. Ginzburg, Jonathan in prep Semantics and interaction in dialogue. Draft available from http://www.dcs.kcl.ac.uk/staff/ginzburg/papers.html. 32 Robin Cooper Kamp, Hans and Uwe Reyle 1993 From Discourse to Logic. Dordrecht: Kluwer. Karttunen, Lauri 1969 Pronouns and variables. In Robert I. Binnick, Alice Davison, Georgia M. Green, and Jerry L. Morgan, (eds.), Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, 108–115. Department of Linguistics, University of Chicago, Chicago, Illinois. Kohlhase, Michael, Susanna Kuschert, and Manfred Pinkal 1996 A typetheoretic semantics for λDRT. In Paul Dekker and Martin Stokhof, (eds.), Proceedings of the 10th Amsterdam Colloquium, 479–498. ILLC, Amsterdam. Larsson, Staffan 2002 Issuebased Dialogue Management. Ph.D. thesis, University of Gothenburg. Mönnich, Uwe 1985 Untersuchungen zu einer konstruktiven Semantik für ein Fragment des Englischen. Habilitationsschrift, Universität Tübingen. Montague, Richard 1974 Formal Philosophy: Selected Papers of Richard Montague. New Haven: Yale University Press. Ed. and with an introduction by Richmond H. Thomason. Muskens, Reinhard 1996 Combining Montague semantics and discourse representation. Linguistics and Philosophy 19(2): 143–186. Ranta, Aarne 1994 TypeTheoretical Grammar. Oxford: Clarendon Press. Sag, Ivan A., Thomas Wasow, and Emily M. Bender 2003 Syntactic Theory: A Formal Introduction. Stanford: CSLI Publications, 2nd edition. Shieber, Stuart 1986 An Introduction to UniﬁcationBased Approaches to Grammar. Stanford: CSLI Publications. Sundholm, Göran 1986 Proof theory and meaning. In Gabbay and Guenthner (1986), chapter 8, 471–506. Tasistro, Alvaro 1997 Substitution, record types and subtyping in type theory, with applications to the theory of programming. Ph.D. thesis, Department of Computing Science, University of Gothenburg and Chalmers University of Technology. van Benthem, Johan and Alice ter Meulen, (eds.) 1997 Handbook of Logic and Language. North Holland and MIT Press. TTR and uniﬁcationbased grammar 33 van Eijck, Jan and Hans Kamp 1997 Representing discourse in context. In van Benthem and ter Meulen (1997), chapter 3, 179–237. Oneletter automata: How to reduce k tapes to one Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz Abstract The class of kdimensional regular relations has various closure properties that are interesting for practical applications. From a computational point of view, each closure operation may be realized with a corresponding construction for ktape ﬁnite state automata. While the constructions for union, Kleenestar and (coordinatewise) concatenation are simple, speciﬁc and nontrivial algorithms are needed for relational operations such as composition, projection, and cartesian product. Here we show that all these operations for ktape automata can be represented and computed using standard operations on conventional onetape ﬁnite state automata plus some trivial rules for tape manipulation. As a key notion we introduce the concept of a oneletter ktape automaton, which yields a bridge between ktape and onetape automata. We achieve a general and efﬁcient implementational framework for ntape automata. 1. Introduction Multitape ﬁnite state automata and especially 2tape automata have been widely used in many areas of computer science such as Natural Language Processing (Karttunen et al. 1996; Mohri 1996; Roche and Schabes 1997) and Speech Processing (Mohri 1997; Mohri et al. 2002). They provide an uniform, clear and computationally efﬁcient framework for dictionary representation (Karttunen 1994; Mihov and Maurel 2001) and realization of rewrite rules (Gerdemann and van Noord 1999; Kaplan and Kay 1994; Karttunen 1997), as well as text tokenization, lexicon tagging, partofspeech disambiguation, indexing, ﬁltering and many other text processing tasks (Karttunen et al. 1996; Mohri 1996; Roche and Schabes 1995, 1997). The properties of ktape ﬁnite state automata differ signiﬁcantly from the corresponding properties of 1tape automata. For example, for k ≥ 2 the class of relations recognized by ktape automata is not closed under intersection and complement. Moreover there is no general determinization procedure for ktape automata. On the other side the class of relations recognized by ktape ﬁnite state automata is closed under a number of useful relational operations like composition, cartesian product, projection, inverse etc. It is this latter property that 36 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz makes ktape automata interesting for many practical applications such as the ones listed above. There exist a number of implementations for ktape ﬁnite state automata (Karttunen et al. 1996; Mohri et al. 1998; van Noord 1997). Most of them are implementing the 2tape case only. While it is straightforward to realize constructions for ktape automata that yield union, Kleenestar and concatenation of the recognized relations, the computation of relational operations such as composition, projection and cartesian product is a complex task. This makes the use of the ktape automata framework tedious and difﬁcult. We introduce an approach for presenting all relevant operations for ktape automata using standard operations for classical 1tape automata plus some straightforward operations for adding, deleting and permuting tapes. In this way we obtain a transparent, general and efﬁcient framework for implementing ktape automata. The main idea is to consider a restricted form of ktape automata where all transition labels have exactly one nonempty component representing a single letter. The set of all ktuples of this form represents the basis of the monoid of ktuples of words together with the coordinatewise concatenation. We call this kind of automata oneletter automata. Treating the basis elements as symbols of a derived alphabet, oneletter automata can be considered as conventional 1tape automata. This gives rise to a correspondence where standard operations for 1tape automata may be used to replace complex operations for ktape automata. The paper is structured as follows. Section 2 provides some formal background. In Section 3 we introduce oneletter ktape automata. We show that classical algorithms for union, concatenation and Kleenestar over oneletter automata (considered as 1tape automata) are correct if the result is interpreted as a ktape automaton. Section 4 is central. A condition is given that guarantees that the intersection of two kdimensional regular relations is again regular. For ktape oneletter automata of a speciﬁc form that reﬂects this condition, any classical algorithm for intersecting the associated 1tape automata can be used for computing the intersection of the regular relations recognized by the automata. Section 5 shows how to implement tape permutations for oneletter automata. Using tape permutations, the inverse relation to a given kdimensional regular relation can be realized. In a similar way, Section 6 treats tape insertion, tape deletion and projection operations for kdimensional regular relations. Section 7 shows how to reduce the computation of composition and cartesian product of regular relations to intersections of the kind discussed in Section 4. plus tape insertion and projection. In Sec Oneletter automata 37 tion 8 we add some ﬁnal remarks. We comment on problems that may arise when using ktape automata and on possible solutions. 2. Formal Background We assume that the reader is familiar with standard notions from automata theory (see, e.g., (Aho et al. 1983; Roche and Schabes 1995)). In the sequel, with Σ we denote a ﬁnite set of symbols called the alphabet, ε denotes the empty word, and Σε := Σ ∪ {ε}. The length of a word w ∈ Σ∗ is written w. If L1 , L2 ⊆ Σ∗ are languages, then L1 · L2 := {w1 · w2  w1 ∈ L1 , w2 ∈ L2 } denotes their concatenation. Here w1 ·w2 is the usual concatenation of words. Recall that Σ∗ , ·, ε is the free monoid with set of generators Σ. If v = v1 , . . . , vk and w = w1 , . . . , wk are two ktuples of words, then v w := v1 · w1 , . . . , vk · wk denotes the coordinatewise concatenation. With ε̂ we denote the ktuple ε, . . . , ε . The tuple (Σ∗ )k , , ε̂ is a monoid that can be described as the kfold cartesian product of the free monoid Σ∗ , ·, ε . As set of generators we consider Σ̂k := {ε, . . . , a, . . . , ε  1 ≤ i ≤ k, a ∈ Σ}. ↑ i Note that the latter monoid is not free, due to obvious commutation rules for generators. For relations R ⊆ (Σ∗ )k we deﬁne R0 := {ε̂}, Ri+1 := Ri R, R∗ := ∞ [ Ri (Kleenestar). i=0 Let k ≥ 2 and 1 ≤ i ≤ k. The relation R (i) := {w1 , . . . , wi−1 , wi+1 , . . . , wk  ∃v ∈ Σ∗ : w1 , . . . , wi−1 , v, wi+1 , . . . , wk ∈ R} is called the projection of R to the set of coordinates {1, . . . , i − 1, i + 1, . . . , k}. 38 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz If R1 , R2 ⊆ (Σ∗ )k are two relations of the same arity, then R1 R2 := {v w  v ∈ R1 , w ∈ R2 } denotes the coordinatewise concatenation. If R1 ⊆ Σ∗ k and R2 ⊆ Σ∗ l are two relations, then R1 × R2 := {w1 , . . . , wk+l  w1 , . . . , wk ∈ R1 , wk+1 , . . . , wk+l ∈ R2 } is the cartesian product of R1 and R2 and R1 ◦ R2 := {w1 , . . . , wk+l−2  ∃w : w1 , . . . , wk−1 , w ∈ R1 , w, wk , . . . , wk+l−2 ∈ R2 } is the composition of R1 and R2 . Further wellknown operations for relations are union, intersection, and inversion (k = 2). Deﬁnition 1 The class of kdimensional regular relations over the alphabet Σ is recursively deﬁned in the following way: – ∅ and {v} for all v ∈ Σ̂k are kdimensional regular relations. – If R1 , R2 and R are kdimensional regular relations, then so are – R1 R2 , – R1 ∪ R2 , – R∗ . – There are no other kdimensional regular relations. Note 1 The class of kdimensional regular relations over a given alphabet Σ is closed under union, Kleenestar, coordinatewise concatenation, composition, projection, and cartesian product. For k ≥ 2 the class of regular relations is not closed under intersection, difference and complement. Obviously, every 1dimensional regular relation is a regular language over the alphabet Σ. Hence, for k = 1 we obtain closure under intersection, difference and complement. Deﬁnition 2 Let k be a positive integer. A ktape automaton is a sixtuple A = k, Σ, S, F, s0 , E , where Σ is an alphabet, S is a ﬁnite set of states, F ⊆ S Oneletter automata 39 is a set of ﬁnal states, s0 ∈ S is the initial state and E ⊆ S × (Σε )k × S is a ﬁnite set of transitions. A sequence s0 , a1 , s1 , . . . , sn−1 , an , sn , where s0 is the initial state, si ∈ S and ai ∈ (Σε )k for i = 1, . . . , n, is a path for A iff si−1 , ai , si ∈ E for 1 ≤ i < n. The ktape automaton A recognizes v ∈ (Σ∗ )k iff there exists a path s0 , a1 , s1 , . . . , sn−1 , an , sn for A such that sn ∈ F and v = a1 a2 . . . an−1 an . With R(A) we denote the set of all tuples in (Σ∗ )k recognized by A, i.e., R(A) := {v ∈ (Σ∗ )k  A recognizes v}. For a given ktape automaton A = k, Σ, S, F, s0 , E the generalized transition relation E ∗ ⊂ S × (Σ∗ )k × S is recursively deﬁned as follows: 1. s, ε, . . . , ε , s ∈ E ∗ for all s ∈ S, 2. if s1 , v, s ∈ E ∗ and s , a, s2 ∈ E, then s1 , v a, s2 ∈ E ∗ , for all v ∈ (Σ∗ )k , a ∈ (Σε )k , s1 , s , s2 ∈ S. Clearly, if A is a ktape automaton, then R(A) = {v ∈ (Σ∗ )k  ∃ f ∈ F : s0 , v, f ∈ E ∗ }. Note 2 By a wellknown generalization of Kleene’s Theorem (see Kaplan and Kay (1994)), for each ktape automaton A the set R(A) is a kdimensional regular relation, and for every kdimensional regular relation R , there exists a ktape automaton A such that R(A ) = R . 3. Basic Operations for oneletter automata In this section we introduce the concept of a oneletter automaton. Oneletter automata represent a special form of ktape automata that can be naturally interpreted as onetape automata over the alphabet Σ̂k . We show that basic operations such as union, concatenation, and Kleenestar for oneletter automata can be realized using the corresponding standard constructions for conventional onetape automata. Deﬁnition 3 A ktape ﬁnite state automaton A = k, Σ, S, F, s0 , E is a oneletter automaton iff all transitions e ∈ E are of the form e = s, ε, . . . , ε , a, ε , . . . , ε , s ↑ ↑ ↑ i−1 for some 1 ≤ i ≤ k and a ∈ Σ. i i+1 ↑ k 40 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz Proposition 1 For every ktape automaton A we may effectively construct a ktape oneletter automaton A such that R(A ) = R(A). Proof. First we can apply the classical εremoval procedure in order to construct an ε̂free ktape automaton, which leaves the recognized relation unchanged. Let Ā = k, Σ, S, F, s0 , E be an ε̂free ktape automaton such that R(A) = R(Ā). Then we construct A = k, Σ, S , F, s0 , E using the following algorithm: S = S, E = ∅ FOR s ∈ S DO: FOR s, a1 , a2 , . . . , ak , s ∈ E DO LET I = {i ∈ N  ai = ε} (I = {i1 , . . . , it }); LET S = {si1 , . . . , sit−1 }, SUCH THAT S ∩ S = ∅; S = S ∪S ; E = E ∪ {si j , ε, . . . , ε, ai j , ε, . . . , ε , si j+1  0 ≤ j ≤ t − 1, ↑ ij si0 = s and sit = s }; END; END. Informally speaking, we split each transition with label a1 , a2 , . . . , ak with t > 1 nonempty coordinates into t subtransitions, introducing t − 1 new intermediate states. Corollary 1 If R ⊆ (Σ∗ )k is a kdimensional regular relation, then there exists a ktape oneletter automaton A such that R(A) = R. Each ktape oneletter automaton A over the alphabet Σ can be considered as a onetape automaton (denoted by Â) over the alphabet Σ̂k . Conversely, every εfree onetape automaton over the alphabet Σ̂k can be considered as a ktape automaton over Σ. Formally, this correspondence can be described using two mappings. Deﬁnition 4 The mapping ˆ maps every ktape oneletter automaton A = k, Σ, S, F, s0 , E to the εfree onetape automaton Â := Σ̂k , S, F, s0 , E . Oneletter automata 41 The mapping ˇ maps a given εfree onetape automaton A = Σ̂k , S, F, s0 , E to the ktape oneletter automaton Ǎ := k, Σ, S, F, s0 , E . Obviously, the mappings ˆ and ˇ are inverse. From a computational point of view, the mappings merely represent a conceptual shift where we use another alphabet for looking at transitions labels. States and transitions are not changed. Deﬁnition 5 The mapping φ : Σ̂∗k → Σ∗ k : a1 · · · an → a1 · · · an , ε → ε̂ is called the natural homomorphism between the free monoid Σ̂∗k , ·, ε and the monoid Σ∗ k , , ε̂ . It is trivial to check that φ is in fact a homomorphism. We have the following connection between the mappingsˆ ,ˇand φ. Lemma 1 Let A = k, Σ, S, F, s0 , E be a ktape oneletter automaton. Then 1. Âˇ = A. 2. R(A) = φ(L(Â)). ˆ Furthermore, if A is an εfree onetape automaton over Σ̂k , then Ǎ = A . Thus we obtain the following commutative diagram: A ˇ ˆ R R(A) Â L φ L(Â) We get the following proposition as a direct consequence of Lemma 1 and the homomorphic properties of the mapping φ. Proposition 2 Let A1 and A2 be two ktape oneletter automata. Then we have the following: 1. R(A1 ) ∪ R(A2 ) = φ(L(Â1 ) ∪ L(Â2 )). 2. R(A1 ) R(A2 ) = φ(L(Â1 ) · L(Â2 )). 3. R(A1 )∗ = φ(L(Â1 )∗ ). 42 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz Algorithmic constructions From Part 1 of Proposition 2 we see the following. Let A1 and A2 be two ktape oneletter automata. Then, to construct a oneletter automaton A such that R(A) = R(A1 ) ∪ R(A2 ) we may interpret Ai as a onetape automaton Âi (i = 1, 2). We use any unionconstruction for onetape automata, yielding an automaton A such that L(A ) = L(Â1 ) ∪ L(Â2 ). Removing εtransitions and interpreting the resulting automaton A as a ktape automaton A := Ǎ we receive a oneletter automaton such that R(A) = R(A1 ) ∪ R(A2 ). Similarly Parts 2 and 3 show that “classical” algorithms for closing conventional onetape automata under concatenation and Kleenestar can be directly applied to ktape oneletter automata, yielding algorithms for closing ktape oneletter automata under concatenation and Kleenestar. 4. Intersection of oneletter automata It is wellknown that the intersection of two kdimensional regular relations is not necessarily a regular relation. For example, the relations R1 = {an bk , cn  n, k ∈ N} and R2 = {as bn , cn  s, n ∈ N} are regular, but R1 ∩R2 = {an bn , cn  n ∈ N} is not regular since its ﬁrst projection is not a regular language. We now introduce a condition that guarantees that the classical construction for intersecting onetape automata is correct if used for ktape oneletter automata. As a corollary we obtain a condition for the regularity of the intersection of two kdimensional regular relations. This observation will be used later for explicit constructions that yield composition and cartesian product of oneletter automata. A few preparations are needed. Deﬁnition 6 Let v = b1 . . . bn be an arbitrary word over the alphabet Ξ, i.e., v ∈ Ξ∗ . We say that the word v is obtained from v by adding the letter b iff v = b1 . . . b j bb j+1 . . . bn for some 0 ≤ j ≤ n. In this case we also say that v is obtained from v by deleting the symbol b. Proposition 3 Let v = a1 . . . an ∈ Σ̂∗k and φ(v) = a1 a2 · · ·an = w1 , . . . , wk . Let also a = ε, . . . , b, . . . , ε ∈ Σ̂k . Then, if v is obtained from v by adding the ↑ i letter a, then φ(v ) = w1 , . . . , wi−1 , wi , wi+1 , . . . , wk and wi is obtained from wi by adding the letter b. Oneletter automata 43 Deﬁnition 7 For a regular relation R ⊆ (Σ∗ )k the coordinate i (1 ≤ i ≤ k) is inessential iff for all w1 , . . . , wk ∈ R and any v ∈ Σ∗ we have w1 , . . . , wi−1 , v, wi+1 , . . . , wk ∈ R. Analogously, if A is a ktape automaton such that R(A) = R we say that tape i of A is inessential. Otherwise we call coordinate (tape) i essential. Deﬁnition 8 Let A be a ktape oneletter automaton and assume that each coordinate in the set I ⊆ {1, . . . , k} is inessential for R(A). Then A is in normal form w.r.t. I iff for any tape i ∈ I we have: 1. ∀s ∈ S, ∀a ∈ Σ : s, ε, . . . , a, . . . , ε , s ∈ E, ↑ i / E. 2. ∀s, s ∈ S, ∀a ∈ Σ : (s = s) ⇒ s, ε, . . . , a, . . . , ε , s ∈ ↑ 1 ↑ i ↑ k Proposition 4 For any ktape automaton A and any given set I of inessential coordinates of R(A) we may effectively construct a ktape oneletter automaton A in normal form w.r.t. I such that R(A ) = R(A). Proof. Let A = k, Σ, S, F, s0 , E . Without loss of generality we can assume that A is in oneletter form (Proposition 1). To construct A = k, Σ, S, F, s0 , E we use the following algorithm: E =E FOR s ∈ S DO FOR i ∈ I DO FOR a ∈ Σ DO IF ((s, ε, . . . , ε, a, ε, . . . , ε , s ∈ E ) & (s = s)) THEN ↑ i E = E \ {s, ε, . . . , ε, a, ε, . . . , ε , s }; ↑ i E = E ∪ s, ε, . . . , ε, a, ε, . . . , ε , s ; ↑ i END; END; END. The algorithm does not change any transition on an essential tape. Transitions between distinct states that affect an inessential tape in I are erased. For 44 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz each state we add loops with all symbols from the alphabet for the inessential tapes in I. The correctness of the above algorithm follows from the fact that for any inessential tape i ∈ I we have w1 , . . . , wi , . . . , wn ∈ R(A) iff w1 , . . . , ε, . . . , wn ∈ R(A). Corollary 2 Let R ⊆ (Σ∗ )k be a regular relation with a set I of inessential coordinates. Then there exists a ktape oneletter automaton A in normal form w.r.t. I such that R(A) = R. The following property of ktape automata in normal form will be useful when proving Lemma 2. Proposition 5 Let A = k, Σ, S, F, s0 , E be a ktape oneletter automaton in normal form w.r.t. the set of inessential coordinates I. Let i0 ∈ I and let v = a1 . . . an ∈ L(Â). Then for any a = ε, . . . , b, . . . , ε ∈ Σ̂k and any word ↑ i0 v ∈ Σ̂∗k obtained from v by adding a we have v ∈ L(Â). Proof. The condition for the automaton A to be in normal form w.r.t. I yields that for all s ∈ S the transition s, a, s is in E, which proves the proposition. Now we are ready to formulate and prove the following sufﬁcient condition for the regularity of the intersection of two regular relations. With K we denote the set of coordinates {1, . . . , k}. Lemma 2 For i = 1, 2, let Ai be a ktape oneletter automaton, let Ii ⊆ K denote a given set of inessential coordinates for Ai . Let Ai be in normal form w.r.t. Ii (i = 1, 2). Assume that K \(I1 ∪ I2 ) ≤ 1, which means that there exists at most one common essential tape for A1 and A2 . Then R(A1 ) ∩ R(A2 ) is a regular kdimensional relation. Moreover R(A1 )∩R(A2 ) = φ(L(Â1 )∩L(Â2 )). Proof. It is obvious that φ(L(Â1 ) ∩ L(Â2 )) ⊆ R(A1 ) ∩ R(A2 ), because if a1 . . . an ∈ L(Â1 ) ∩ L(Â2 ), then by Lemma 1 we have a1 · · · an ∈ R(A1 ) ∩ R(A2 ). We give a detailed proof for the other direction, showing that R(A1 ) ∩ R(A2 ) ⊆ φ(L(Â1 ) ∩ L(Â2 )). For the proof the reader should keep in mind that the transition labels of the automata Ai (i = 1, 2) are elements of Σ̂k , which means that the sum of the lengths of the words representing the components is exactly 1. Oneletter automata 45 Let w1 , w2 , . . . , wk ∈ R(A1 ) ∩ R(A2 ). Let j0 ∈ K be a coordinate such that for each j0 = j ∈ K we have j ∈ I1 or j ∈ I2 . Let E1 = K \ I1 . Recall that for i ∈ E1 , i = j0 always i ∈ I2 is an inessential tape for A2 . Then by the deﬁnition of inessential tapes the tuples w1 , . . . , wk and w1 , . . . , wk , where wi = ε, if i ∈ I1 wi , if i ∈ E1 wi = ε, if i ∈ E1 and i = j0 wi , otherwise respectively are in R(A1 ) and R(A2 ). Then there are words = a1 . . . an ∈ L(Â1 ) = a1 . . . am ∈ L(Â2 ) v v such that φ(v ) = w1 , . . . , wk and φ(v ) = w1 , . . . , wk . Note that n = ∑ki=1 wi  and m = ∑ki=1 wi . Furthermore, w j0 = w j0 = w j0 . Let l = w j0 . We now construct a word a1 a2 . . . ar ∈ L(Â1 ) ∩ L(Â2 ) such that a1 a2 . . . ar = w1 , . . . , wk , which imposes that r = n + m − l. Each letter ai is obtained copying a suitable letter from one of the sequences a1 . . . an and a1 . . . am . In order to control the selection, we use the pair of indices ti ,ti (0 ≤ i < n + m − l), which can be considered as pointers to the two sequences. The deﬁnition of ti ,ti and ai proceeds inductively in the following way. Let t0 = t0 := 1. Assume that ti and ti are deﬁned for some 0 ≤ i < n + m − l. We show how to deﬁne ai+1 and the indices ti+1 and ti+1 . We distinguish four cases: 1. if ti = n + 1 and ti = m + 1 we stop; else 2. if at = ε, . . . , b, . . . , ε for some j = j0 , then ai+1 := at , ti+1 := ti + 1, ↑ i i j ti+1 := ti , 3. if at = ε, . . . , b, . . . , ε or ti = n + 1, and at = ε, . . . , c, . . . , ε for some ↑ i i j0 ↑ j j = j0 , then ai+1 := at , ti+1 := ti , and ti+1 := ti + 1. i 4. if at = at = ε, . . . , b, . . . , ε for some b ∈ Σ, then ai+1 := at , ti+1 := i i ↑ i j0 ti + 1 and ti+1 := ti + 1. From an intuitive point of view, the deﬁnition yields a kind of zigzag construction. We always proceed in one sequence until we come to a transition 46 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz that affects coordinate j0 . At this point we continue using the other sequence. Once we have in both sequences a transition that affects j0 , we enlarge both indices. From w j0 = w j0 = w j0 it follows immediately that the recursive definition stops exactly when i + 1 = n + m − l. In fact the subsequences of a1 . . . an and a1 . . . am from which w j0 is obtained must be identical. Using induction on 0 ≤ i ≤ n + m − l we now prove that the word a1 . . . ai is obtained from a1 . . . at −1 by adding letters in Σ̂k which have a nonε symbol i in an inessential coordinate for R(A1 ). The base of the induction is obvious. Let the statement be true for some 0 ≤ i < n + m − l. We prove it for i + 1: The word a1 . . . ai ai+1 is obtained from a1 . . . ai by adding the letter ai+1 = ε, . . . b, . . . , ε , and according to the induction hypothesis a1 . . . ai is obtained ↑ j from a1 , . . . ati −1 by adding letters with a nonε symbol in a coordinate in I1 . If j ∈ E1 (Cases 2 and 4), then ai+1 = at , ti+1 = ti + 1 and ti+1 − 1 = ti , hence i a1 . . . ai ai+1 is obtained from a1 , . . . at −1 at −1 by adding letters satisfying i i+1 the above condition. On the other side, if j ∈ I1 (Case 3) we have ai+1 = at and ti+1 := ti , which means that a1 . . . at −1 = a1 . . . at −1 and ai+1 is a i i i+1 letter satisfying the condition. Thus a1 . . . ai ai+1 is obtained from a1 . . . at −1 i+1 adding letters which have nonε symbol on an inessential tape for A1 , which means that the statement is true for i + 1. Analogously we prove for 0 ≤ i ≤ n + m − l that a1 . . . ai is obtained from a1 . . . at −1 by adding letters in Σ̂k which have a nonε symbol in an inessential i coordinate for R(A2 ). By Proposition 5, a1 . . . an+m−l ∈ L(Â1 ) and a1 . . . an+m−l ∈ L(Â2 ). From Proposition 3 we obtain that a1 · · · an+m−l = u1 , . . . , uk , where ui = wi if i ∈ E1 and ui = wi otherwise. But now remembering the deﬁnition of wi and wi we obtain that a1 · · · an+m−l = w1 , . . . , wk , which we had to prove. Corollary 3 If R1 ⊆ (Σ)k and R2 ⊆ (Σ)k are two kdimensional regular relations with at most one common essential coordinate i (1 ≤ i ≤ k), then R1 ∩ R2 is a kdimensional regular relation. Algorithmic construction From Lemma 2 we see the following. Let A1 and A2 be two ktape oneletter automata with at most one common essential tape i. Assume that both automata are in normal form w.r.t. the sets of inessential tapes. Then the relation R(A1 ) ∩ R(A2 ) is recognized by any εfree 1tape automaton A accepting L(Â1 ) ∩ L(Â2 ), treating A as a ktape oneletter automaton A = Ǎ . Oneletter automata 47 5. Tape permutation and inversion for oneletter automata In the sequel, let Sk denote the symmetric group of k elements. Deﬁnition 9 Let R ⊆ (Σ∗ )k be a regular relation, let σ ∈ Sk . The permutation of coordinates induced by σ, σ(R), is deﬁned as σ(R) := {wσ−1 (1) , . . . , wi , . . . , wσ−1 (k)  w1 , . . . , wk ∈ R}. ↑ σ(i) Proposition 6 For a given ktape oneletter automaton A = k, Σ, S, F, s0 , E , let σ(A) := k, Σ, S, F, s0 , σ(E) where σ(E) := {s, ε, . . . , ai , . . . , ε , s  s, ε, . . . , ai , . . . , ε , s ∈ E}. ↑ ↑ σ(i) i Then R(σ(A)) = σ(R(A)). Proof. Using induction over the construction of E ∗ and σ(E)∗ we prove that for all s ∈ S and w1 , . . . , wk ∈ (Σ∗ )k we have ⇔ s0 , w1 , . . . , wk , s ∈ E ∗ s0 , wσ−1 (1) , . . . , wi , . . . , wσ−1 (k) , s ∈ σ(E)∗ . ↑ σ(i) “⇒”. The base of the induction is obvious since s0 , ε, . . . , ε , s0 ∈ E ∗ ∩ σ(E)∗ . Now suppose that there are transitions s0 , w1 , . . . , wi−1 , wi , wi+1 , . . . , wk , s ∈ E ∗ s, ε, . . . , a, . . . , ε , s ∈ E. ↑ i Then, by induction hypothesis, s0 , wσ−1 (1) , . . . , wi , . . . , wσ−1 (k) , s ∈ σ(E)∗ . ↑ σ(i) The deﬁnition of σ(E) shows that s, ε, . . . , a , . . . , ε , s ∈ σ(E). Hence ↑ σ(i) s0 , wσ−1 (1) , . . . , wi a, . . . , wσ−1 (k) , s ∈ σ(E)∗ , ↑ σ(i) 48 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz which we had to prove. “⇐”. Follows analogously to “⇒”. Corollary 4 Let R ⊆ (Σ∗ )k be a kdimensional regular relation and σ ∈ Sk . Then also σ(R) is a kdimensional regular relation. Algorithmic construction From Proposition 6 we see the following. Let A be a 2tape oneletter automaton. If σ denotes the transposition (1, 2), then automaton σ(A) deﬁned as above recognizes the relation R(A)−1 . 6. Tape insertion, tape deletion and projection Deﬁnition 10 Let R ⊆ (Σ∗ )k be a kdimensional regular relation. We deﬁne the insertion of an inessential coordinate at position i (denoted R ⊕ (i)) as R ⊕ (i) := {w1 , . . . , wi−1 , v, wi , . . . , wk  w1 , . . . , wi−1 , wi , . . . , wk ∈ R, v ∈ Σ∗ }. Proposition 7 Let A = k, Σ, S, F, s0 , E be a ktape oneletter automaton. Let A := k + 1, Σ, S, F, s0 , E where E := {s, ε, . . . , a, . . . , ε, ε , s  s, ε, . . . , a, . . . , ε , s ∈ E} ↑ ↑ i ↑ i k+1 ↑ k ∪ {s, ε, . . . , ε, a , s  s ∈ S, a ∈ Σ}. ↑ k+1 Then R(A ) = R(A) ⊕ (k + 1). Proof. First, using induction on the construction of E ∗ we prove that for all s ∈ S and w1 , . . . , wk , wk+1 ∈ (Σ∗ )k+1 we have s0 , w1 , . . . , wk , wk+1 , s ∈ E ∗ ⇔ s0 , w1 , . . . , wk , ε , s ∈ E ∗ . “⇒”. The base of the induction is obvious. Assume there are transitions s0 , w1 , . . . , wi−1 , wi , wi+1 , . . . , wk , wk+1 , s ∈ E ∗ s, ε, . . . , a, . . . , ε , s ∈ E . ↑ i First assume that i ≤ k. By induction hypothesis, s0 , w1 , . . . , wi−1 , wi , wi+1 , . . . , wk , ε , s ∈ E ∗ . Oneletter automata 49 Using the deﬁnition of E ∗ we obtain s0 , w1 , . . . , wi−1 , wi a, wi+1 , . . . , wk , ε , s ∈ E ∗ . If i = k + 1, then s = s . We may directly use the induction hypothesis to obtain s0 , w1 , . . . , wk , ε , s ∈ E ∗ . “⇐”. Let s0 , w1 , . . . , wk , ε , s ∈ E ∗ . Let wk+1 = v1 . . . vn ∈ Σ∗ where vi ∈ Σ. The deﬁnition of E shows that for all vi (1 ≤ i ≤ n) there exists a transition s , ε, . . . , ε, vi , s ∈ E . Hence s0 , w1 , . . . , wk , wk+1 , s ∈ E ∗ . ↑ k+1 To ﬁnish the proof observe that the deﬁnition of E yields s0 , w1 , . . . , wk , s ∈ E ∗ s0 , w1 , . . . , wk , ε , s ∈ E ∗ . ⇔ Corollary 5 If R ⊆ (Σ∗ )k is a regular relation, then R ⊕ (i) is a (k + 1)dimensional regular relation. Proof. The corollary directly follows from Proposition 7 and Proposition 6 having in mind that R ⊕ (i) = σ(R ⊕ (k + 1)) where σ is the cyclic permutation (i, i + 1, . . . , k, k + 1) ∈ Sk+1 . It is wellknown that the projection of a kdimensional regular relation is again a regular relation. The following propositions show how to obtain a (k − 1)tape oneletter automaton representing the relation R (i) (cf. Section 2) directly from a ktape oneletter automaton representing the relation R. Proposition 8 Let A = k, Σ, S, F, s0 , E be a ktape oneletter automaton. Let A := k − 1, Σ, S, F, s0 , E be the (k − 1)tape automaton where for i ≤ k − 1 we have s, ε, . . . , a, . . . , ε , s ∈ E ↑ i ↑ ⇔ s, ε, . . . , a, . . . , ε , s ∈ E ↑ 1 k−1 ↑ ↑ i k and furthermore s, ε, . . . , ε , s ∈ E ↑ k−1 Then R(A ) = R(A) (k). ⇔ ∃ ak ∈ Σk : s, ε, . . . , ε , ak , s ∈ E. ↑ k−1 ↑ k 50 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz Note 3 The resulting automaton A is not necessarily a oneletter automaton because A may have some ε̂transitions. It could be transformed into a oneletter automaton using a standard εremoval procedure. Proof of Proposition 8. It is sufﬁcient to prove that for all w1 , . . . , wk−1 , wk ∈ (Σ∗ )k and s ∈ S we have s0 , w1 , . . . , wk−1 , wk , s ∈ E ∗ s0 , w1 , . . . , wk−1 , s ∈ E ∗ . ⇔ Again we use an induction on the construction of E ∗ and E ∗ . “⇒”. The base is trivial since s0 , ε, . . . , ε , s0 ∈ E ∗ and ↑ k s0 , ε, . . . , ε , s0 ∈ E ∗ . ↑ k−1 Let s0 , w1 , . . . , w j , . . . , wk , s ∈ E ∗ and s, ε, . . . , a, . . . , ε , s ∈ E for some ↑ j 1 ≤ j ≤ k. First assume that j < k. The induction hypothesis yields s0 , w1 , . . . , w j , . . . , wk−1 , s ∈ E ∗ . Since s, ε, . . . , a, . . . , ε , s ∈ E ↑ ↑ j k−1 we have s0 , w1 , . . . , w j a, . . . , wk−1 , s ∈ E ∗ . If j = k, then the induction hypothesis yields s0 , w1 , . . . , wk−1 , s ∈ E ∗ . We have s, ε, . . . , ε , s ∈ E , ↑ k−1 hence s0 , w1 , . . . , wk−1 , s ∈ E ∗ . “⇐”. Similarly as “⇒”. Corollary 6 If R ⊆ (Σ∗ )k is a regular relation, then R (i) is (k − 1)dimensional regular relation. Proof. The corollary follows directly from R (i) = (σ−1 (R)) (k), where σ is the cyclic permutation (i, i + 1, . . . , k) ∈ Sk and Proposition 6. Algorithmic construction The constructions given in Proposition 8 and 6 together with an obvious ε̂elimination show how to obtain a oneletter (k − 1)tape automaton A for the projection R (i), given a oneletter ktape automaton A recognizing R. Oneletter automata 51 7. Composition and cartesian product of regular relations We now show how to construct composition and cartesian product (cf. Section 2) of regular relations via automata constructions for standard 1tape automata. Lemma 3 Let R1 ⊆ (Σ∗ )n1 and R2 ⊆ (Σ∗ )n2 be regular relations. Then the composition R1 ◦ R2 is a (n1 + n2 − 2)dimensional regular relation. Proof. Using Corollary 5 we see that the relations R1 := (. . . ((R1 ⊕ (n1 + 1)) ⊕ (n1 + 2)) ⊕ . . .) ⊕ (n1 + n2 − 1) R2 := (. . . ((R2 ⊕ (1)) ⊕ (2)) ⊕ . . .) ⊕ (n1 − 1) are (n1 + n2 − 1)dimensional regular relations. Using the deﬁnition of ⊕ we see that the essential coordinates for R1 are in the set E1 = {1, 2, . . . , n1 } and those of R2 are in the set E2 = {n1 , n1 + 1, . . . , n1 + n2 − 1}. Therefore R1 and R2 have at most one common essential coordinate, namely n1 . Corollary 3 shows that R = R1 ∩ R2 is a (n1 + n2 − 1)dimensional regular relation. Since coordinates in E1 (resp. E2 ) are inessential for R2 (resp. R1 ) we obtain w1 , . . . , wn1 −1 , w, wn1 +1 , . . . , wn1 +n2 −1 ∈ R1 ∩ R2 w1 , . . . , wn1 −1 , w ∈ R1 & w, wn1 +1 , . . . , wn1 +n2 −1 ∈ R2 . ⇔ Using the deﬁnition of and Corollary 2 we obtain that R (n1 ) is a (n1 + n2 − 2)dimensional regular relation such that w1 , . . . , wn1 −1 , wn1 +1 , . . . , wn1 +n2 −1 ∈ R (n1 ) ∃w ∈ Σ∗ : w1 , . . . , wn1 −1 , w, wn1 +1 , . . . , wn1 +n2 −1 ∈ R. ⇔ ↑ n1 Combining both equivalences we obtain ⇔ w1 , . . . , wn1 −1 , wn1 +1 , . . . , wn1 +n2 −1 ∈ R (n1 ) ∃w ∈ Σ∗ : w1 , . . . , wn1 −1 , w ∈ R1 & w, wn1 +1 , . . . , wn1 +n2 −1 ∈ R2 , i.e. R (n1 ) = R1 ◦ R2 . Lemma 4 Let R1 ⊆ (Σ∗ )n1 and R2 ⊆ (Σ∗ )n2 be regular relations. Then the cartesian product R1 × R2 is a (n1 + n2 )dimensional regular relation over Σ. 52 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz Proof. Similarly as in Lemma 3 we construct the (n1 + n2 )dimensional regular relations R1 := (. . . ((R1 ⊕ (n1 + 1)) ⊕ (n1 + 2)) ⊕ . . .) ⊕ (n1 + n2 ) R2 := (. . . ((R2 ) ⊕ (1)) ⊕ (2)) ⊕ . . .) ⊕ (n1 ). The coordinates in {1, 2, . . . , n1 } are inessential for R2 and those in {n1 + 1, . . . , n1 + n2 } are inessential for R1 . Therefore R1 and R2 have no common essential coordinate and, by Corollary 3, R := R1 ∩ R2 is a (n1 + n2 )dimensional regular relation. Using the deﬁnition of inessential coordinates and the deﬁnition of ⊕ we obtain ⇔ w1 , . . . , wn1 , wn1 +1 , . . . , wn1 +n2 ∈ R w1 , . . . , wn1 ∈ R1 & wn1 +1 , . . . , wn1 +n2 ∈ R2 , which shows that R = R1 × R2 . Algorithmic construction The constructions described in the above proofs show how to obtain oneletter automata for the composition R1 ◦ R2 and for the cartesian product R1 × R2 of the regular relations R1 ⊆ (Σ∗ )n1 and R2 ⊆ (Σ∗ )n2 , given oneletter automata Ai for Ri (i = 1, 2). In more detail, in order to construct an automaton for R1 ◦ R2 we 1. add n2 −1 ﬁnal inessential tapes to A1 and n1 −1 initial inessential tapes to A2 in the way described above (note that the resulting automata are in normal form w.r.t. the new tapes), 2. intersect the resulting automata as conventional onetape automata over the alphabet Σ̂n1 +n2 −1 , obtaining A, 3. remove the n1 th tape from A and apply an εremoval, thus obtaining A , which is the desired automaton. In order to construct an automaton for R1 × R2 we 1. add n2 ﬁnal inessential tapes to A1 and n1 initial inessential tapes to A2 in the way described above, 2. intersect the resulting automata as normal onetape automata over the alphabet Σ̂n1 +n2 , obtaining A, which is the desired automaton. Oneletter automata 53 At the end we will discuss the problem of how to represent identity relations as regular relations. First observe that the automaton A := 2, Σ, S, F, s0 , E where Σ := {a1 , . . . , an }, S := {s0 , s1 , . . . , sn }, F := {s0 } and E := {s0 , ai , ε si  1 ≤ i ≤ n} ∪ {si , ε, ai s0  1 ≤ i ≤ n} accepts R(A) = {v, v  v ∈ Σ∗ }. The latter relation we denote with Id Σ . Proposition 9 Let R1 be 1dimensional regular relation, i.e., a regular language. Then the set Id R1 := {v, v  v ∈ R1 } is a regular relation. Moreover Id R1 = (R1 ⊕ (2)) ∩ Id Σ . 8. Conclusion We introduced the concept of a oneletter ktape automaton and showed that oneletter automata can be considered as conventional 1tape automata over an enlarged alphabet. Using this correspondence, standard constructions for union, concatenation, and Kleenestar for 1tape automata can be directly used for oneletter automata. Furthermore we have seen that the usual relational operations for ktape automata can be traced back to the intersection of 1tape automata plus straightforward operations for adding, permuting and erasing tapes. We have implemented the presented approach for implementation of transducer (2tape automata) representing rewrite rules. Using it we have successfully realized Bulgarian hyphenation and tokenization. Still, in real applications the use of oneletter automata comes with some speciﬁc problems, in particular in situations where the composition algorithm is heavily used. In the resulting automa