Monday, August 22, 2016

Looking for Chomsky in Paris, 2016:
Early Morning Reflections on Mesoamerican Linguistics and the Preservation of Indigenous Manuscripts

Time and money are spent in collecting the remains in wood and stone, in pottery and in tissue and bone, in laboriously collating isolated words, and in measuring ancient constructions…But closer to the very self, to thought and being, are the connected expressions of men in their own tongues.
--D. Brinton (1883)
There is in reality never a permanent, stable equilibrium in any language. We thus pose the principle of the incessant transformation of languages as absolute. The case of an idiom which finds itself in a state of rest and equilibrium cannot be found.
--F. Saussure 

Paris may seem like an unlikely place to find yourself pondering the origins, diversity, and collecting of ancient Mesoamerican languages, but the sun, the quiet, and the caffeine of the cafés appear to have, at least on me, a profound focusing effect. Perhaps this is why generations of Paris intellectuals have done their best writing in places like Le Select, Le Rostand and the Café de Flore.

If I have not done some of my best writing in such places, I have certainly done some of my best reading and thinking there, staring out into the street from a small table with copious amounts of coffee close by. It is in places like this, whether in Paris or in all-night diners in New York City, that I have absorbed some of the great classics in linguistics and the philosophy of language. The works of Ludwig Wittgenstein, Frits Staal’s essays on the ancient Sanskrit grammarians, Ferdinand de Saussure's Cours de Linguistique Générale from 1916 (the 100th anniversary conference was happening during my visit), Gottlob Frege’s articles on sense and reference, Saul Kripke's amazing Naming and Necessity, and, most importantly, Aspects of the Theory of Syntax and Syntactic Structures, by Noam Chomsky, have all been read slowly over the years with a small notebook close at hand.

Author Imaging the Codex Xolotl in the Bibliothèque nationale de France, June 2016
As a linguist and curator more mathematically inclined than language proficient, I found that these works fired my imagination and made me think of language in more analytical ways. I was amazed to find linguists like Staal arguing that ancient grammarians, trying to find a systematic grammar for Sanskrit, had applied a kind of recursion theory to language and hypothesized a generative system for grammar of the kind that would not be postulated again until Chomsky’s work in the 1950s[1]. The earliest methods of generative construction proposed by these grammarians, especially Pāṇini, who lived in the fourth century BCE, allowed for the development of discrete, potentially infinite language systems. The formal basis for Pāṇini's methods involved the use of what today would be called “auxiliary” markers, first utilized by the logician Emil Post for the description of computer languages[2].

It was Chomsky, however, especially his early writings on the formal and computational structure of grammar, such as “Three Models for the Description of Language” (1956), “On certain formal properties of grammars” (1959) and “The algebraic theory of context-free languages” (1963), that spurred my imagination. I can remember clearly the days when I first read Syntactic Structures, many years ago, while soaking up the sun in the Luxembourg Gardens. I saw in this short yet dense work, for the first time, a profound computational basis for the analysis of grammars that could be applied across time and space, in spite of the performative, external differences found in individual languages.

Author's Old Notes on Chomsky's Lectures, Language and Mind, Delivered at the University of California, Berkeley, in 1967

In Syntactic Structures Chomsky clearly outlines the idea that a grammar is a set of rules, or algorithms, whose goal is twofold. First, these rules can be used to generate the sentences of the associated language, and only those sentences; second, they can be employed as a kind of decision procedure to determine whether a given sentence is an element of a particular language or not.
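To make the two roles concrete, here is a minimal Python sketch. It is not taken from Chomsky's text: the word lists and the fixed determiner-noun-verb pattern are hypothetical, chosen only to show one rule set serving both as a generator and as a decision procedure.

```python
from itertools import product

# Hypothetical toy rules: every sentence is DET + NOUN + VERB.
RULES = {
    "DET": ["the", "a"],
    "NOUN": ["scribe", "codex"],
    "VERB": ["survives", "speaks"],
}
PATTERN = ["DET", "NOUN", "VERB"]


def generate():
    """Enumerate every sentence the toy grammar licenses, and only those."""
    choices = (RULES[category] for category in PATTERN)
    return [" ".join(words) for words in product(*choices)]


def is_sentence(text):
    """Decision procedure: does this string belong to the toy language?"""
    words = text.split()
    return len(words) == len(PATTERN) and all(
        word in RULES[category] for word, category in zip(words, PATTERN)
    )
```

Every string returned by `generate()` passes `is_sentence`, while a scrambled string such as "codex the survives" does not. A real grammar is of course far larger, and this toy language is finite, but the division of labor is the same.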

While this may seem somewhat obvious, it is not easy to find simple grammars that span and generate an entire language, especially in the case of ancient languages, which have been subject to change over the millennia and whose spoken and documentary evidence is quite limited and perhaps even tainted by the passing of time. We are in this position with many New World languages, including those of Mesoamerica, whose cultures and languages have seen a great deal of change over the last 500 years, and whose first grammars and lexicons were written down by non-native Spanish clerics after the conquest[3].

For Chomsky there are three possible forms of grammar of increasing complexity: finite-state, phrase structure and transformational. The first of these, the finite-state form, is capable, just like real languages, of generating an infinite number of sentences from a finite number of primitive elements. These elements can be morphemes, phonemes or words. Morphemes are the smallest grammatically functioning elements of a language, for example ‘girl’ and ‘s’ in ‘girls’. In Chomsky’s terms, a system that produces a language in this way is known as a “finite-state Markov process.” I would later come to specialize in the mathematics of these processes, whose complexity is nearly infinite.

One can think of this type of grammar as a kind of machine. Chomsky notes that “this conception of grammar is an extremely powerful and general one,” and that “In producing a sentence, the speaker begins in some initial state, produces the first word of the sentence, thereby switching into the second state which limits the choice of the second word…[4]”. This continues with each new state the speaker passes through having different limits and restrictions imposed on the choice of the next word by the grammar.
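The machine picture can be sketched in a few lines of Python. The example is loosely modeled on the "the man comes / the men come" illustration in Syntactic Structures; the state names and the table layout are my own assumptions, not Chomsky's notation. The loop on "old" is what lets a finite table produce an unbounded number of sentences.

```python
# Each state maps to the words it may emit and the state each word leads to.
# State names ("start", "det", ...) are hypothetical labels for illustration.
TRANSITIONS = {
    "start": [("the", "det")],
    "det": [("old", "det"), ("man", "sing"), ("men", "plur")],
    "sing": [("comes", "end")],
    "plur": [("come", "end")],
}


def sentences(max_words):
    """Enumerate every sentence of at most max_words words."""
    results = []
    stack = [("start", [])]
    while stack:
        state, words = stack.pop()
        if state == "end":
            results.append(" ".join(words))
            continue
        if len(words) == max_words:
            continue  # length bound reached before the final state
        for word, next_state in TRANSITIONS[state]:
            stack.append((next_state, words + [word]))
    return sorted(results)


def accepts(text):
    """Walk the machine word by word; accept only if it halts in 'end'."""
    state = "start"
    for word in text.split():
        options = dict(TRANSITIONS.get(state, []))
        if word not in options:
            return False
        state = options[word]
    return state == "end"
```

Raising `max_words` by one adds two more sentences each time, one for "man" and one for "men", because the machine can circle through "old" once more. Note how the choice of noun restricts the verb that follows, which is the kind of state-dependent limit the quoted passage describes.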

There are no spoken human languages that can be completely modeled using finite-state grammars, and hence higher levels of formal complexity must be added. Phrase structure grammars are far more complex, and they can generate sentences that cannot be given by finite-state machines. In one sense you can think of finite-state machines as generating the infinite component of language and phrase structure grammars as generating the complexity and variability.
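A standard textbook illustration of this gap, offered here only as a sketch, is the language made of n copies of "a" followed by exactly n copies of "b". No finite-state machine can keep an unbounded count, but two phrase structure rules, S -> a S b and S -> a b, generate it directly. The function names below are hypothetical.

```python
def derive(n):
    """Apply S -> a S b (n - 1) times, then finish with S -> a b."""
    if n < 1:
        raise ValueError("n must be at least 1")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aSb")
    return form.replace("S", "ab")


def in_language(text):
    """Recursive recognizer that mirrors the two rewrite rules."""
    if text == "ab":
        return True
    return (
        len(text) > 2
        and text[0] == "a"
        and text[-1] == "b"
        and in_language(text[1:-1])
    )
```

The recognizer peels one "a" and one "b" off the outside at each step, which is the nested dependency that a machine moving strictly left to right through a fixed set of states cannot track.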

Although it took me some time to understand Chomsky’s grammatical hierarchy[5] and his presentation of phrase-structure grammars and transformation rules, I was drawn deeply to this approach and its possible applicability to real languages[6].

There is another, related reason, however, besides the contemplation of formal grammars and the minimalist program[7], for a linguist, ethnohistorian or curator interested in Mesoamerican languages to make his or her way to the City of Light. Housed in that old and storied library on the rue de Richelieu is a collection of rare and seldom seen pictorial codices, grammatical manuscripts, vocabularies, orthographic handbooks, and other materials brought together in the 19th century by a small cadre of French manuscript hunters whose interest in the languages of Mexico and Central America bordered on the obsessive.

This small group, composed of eccentrics and travelers like the Abbé Charles Étienne Brasseur de Bourbourg, Léonce Angrand and Joseph Aubin, collected, stole or somehow absconded with some of the most important records relating to the history of ancient Mesoamerican languages to survive from the period of contact and into the 17th and 18th centuries[8].

Figure 1: Title Page of Bibliothèque nationale de France American Manuscript 63, Arte de la lengua quiche [o] utlatecat, with the Ex Libris of Brasseur de Bourbourg. Everything from de Landa’s Relación to the Popol Vuh and the Madrid Codex passed through his hands.
These collectors were, moreover, not merely trophy hunters; through their collecting and in their writings they expressed a genuine concern for the preservation of manuscripts and for the ethnohistory and archaeology of the New World. Brasseur de Bourbourg writes in his letter from Guatemala to Rabinal that,

When the current president Rafael Carrera arrived to power his first concern was to recall the clergy, who although their influence was on the decline, were still a reminder of order and civilization.  Their monasteries, libraries and papers were all returned to them—but in what condition!  Most of the works from the libraries were no longer complete and had been eaten by worms.  Sullied manuscripts were falling to pieces covered in a pungent dust.  These are the gains that had been made by knowledge thanks to the revolution and liberalism in Guatemala. […]

For three months I was engaged in searching through what remained of these monastic riches. In the library of the University, I found, among the manuscripts of Father Francisco Ximénez a history in Kiché, with an incomplete translation by Ximénez.  I had copied both versions, while also trying to acquire an elementary knowledge of indigenous languages, when His Grace the Archbishop offered me the possibility of administrating the parish of Rabinal...[9]

The grammars and vocabularies that they collected form, in many instances, the basis for the study of the historical development of important Central American languages like K’iche’, Cakchiquel and Q’eqchi’. Ephemeral dictionaries, word lists, and grammars, like that of Domingo de Vico (Figure 2), which the anthropologist Robert M. Carmack has called “the most important source for the study of K’ichean culture,” have survived only because of the activities of these collectors and bibliophiles.

Many of these important linguistic manuscripts have yet to be published and have been little studied in a systematic way. Although some scholars, most notably Munro S. Edmonson, have looked at all of the available major sources, no collation of the contents or of the linguistic variations found within them has appeared. Edmonson’s now classic, but flawed, Quiche-English Dictionary does give some indication of the main dialectal and regional variations in K’ichee, breaking them up into the Eastern (centered on Rabinal), the Central (centered on Chichicastenango) and the Western (centered on Quezaltenango), but he goes no further. Several authors have called for a comprehensive colonial K’ichee dictionary, one that would preserve the original forms while transcribing them in the now conventional orthography of the Academia de las Lenguas Mayas de Guatemala and noting mistakes of transcription and other possible hints to the manuscripts' transmission found in the originals (see Frauke Sachse, Documentation of Colonial K’ichee Dictionaries and Grammars, FAMSI, 2007).

The manuscripts found in France are described, for the most part, in the Catalogue des Manuscrits Américains de la Bibliothèque Nationale, compiled by the longtime curator of manuscripts Henri Omont and published in 1925, but his list gives little indication of the importance of these materials and no systematic details about the contents of the originals. Other important grammars and vocabularies are found in the collection of Léonce Angrand, whose archive of notes and letters also survives.

Figure 2: A page from Angrand manuscript 9 in the Bibliothèque nationale de France

For many of these early French collectors and Americanists this was not simply an intellectual exercise performed from some comfortable library or institute in Paris. Most, especially Brasseur de Bourbourg, traveled widely and collected the manuscripts during long and sometimes difficult journeys in Central America. Some of the linguistic work accomplished by these early travelers and collectors is still relevant. Marc Zender, in his often-quoted paper One Hundred and Fifty Years of Nahuatl Decipherment, published in the PARI Journal in the spring of 2008, writes that,

Aubin's monumental Mémoires sur la peinture didactique et l' écriture figurative des anciens Mexicains, first published in 1849, included lexical identification of over a hundred Nahuatl signs, the recognition of alternating logographic and phonetic spellings of the same names and a detailed study of the glyphic compounds [...] It remains even today a critical reference...

My own journey to these manuscripts started in Paris years ago, but recently, in fact just a few days ago, I finished looking at some of the rarest of them and imaging them from the ultraviolet to the infrared for the first time since their discovery more than a century ago. A group of us, led by Jerry Offner from the Houston Museum of Art and Antonino Cosentino from Cultural Heritage Science Open Source, studied several manuscripts purchased by Joseph Marius Alexis Aubin (another of the great Mesoamerican manuscript hunters) and brought to Paris in the 19th century. For several days we photographed and imaged the very rare and almost never seen Codex Xolotl and Mapa Quinatzin, both written in Nahuatl.

Fragment of the Codex Xolotl, Bibliothèque nationale de France
The hope is that by using these images we can begin to computationally reconstruct the manuscripts as they were originally made, revealing previously hidden and unread sections that might be of interest to scholars studying the history of the Valley of Mexico in the 15th century.

Antonino Cosentino from Cultural Heritage Science Open Source looking at a section of the Codex Xolotl (BnF, Paris) in ultraviolet light

[1] For more on this see Frits Staal, Universals: Studies in Indian Logic and Linguistics (Chicago: University of Chicago Press, 1988); R. Briggs, “Knowledge Representation in Sanskrit and Artificial Intelligence,” AI Magazine 6 (1985) 22-38; and especially Saroja Bhate and Subhash Kak, “Panini’s Grammar and Computer Science,” Annals of the Bhandarkar Oriental Research Institute 72 (1993) 79-94.
[2] Kadvany, John (2007). "Positional Value and Linguistic Recursion". Journal of Indian Philosophy 35: 487–520.
[3] For an important case study on these lexicons and the difficulties in using this important source of linguistic information see Frauke Sachse, “Reconstructing the Anonymous Franciscan K’ichee Dictionary,” Mexicon 31 (2009) 10-18.
[4] Noam Chomsky, Syntactic Structures (New York: Mouton de Gruyter, 1957) 20.
[5] The Chomsky hierarchy is sometimes referred to as the Chomsky-Schutzenberger Hierarchy. See Noam Chomsky and Marcel P. Schutzenberger, “The algebraic theory of context-free languages,” in Computer Programming and Formal Languages, edited by P. Braffort (Amsterdam: North-Holland Publishing, 1963) 118-161.
[6] For more on the theoretical grammars outlined by Chomsky see Howard Lasnik, Syntactic Structures Revisited: Contemporary Lectures on Classic Transformational Grammar (Cambridge: MIT Press, 2000) and Geoffrey Poole, Syntactic Theory (New York: Palgrave Macmillan, 2002).
[7] Minimalism is the name of the current form of the generative enterprise and the Principles and Parameters approach. For more see Noam Chomsky, The Minimalist Program (Cambridge: MIT Press, 1995).
[8] A good and up-to-date description of these collectors, which is also sympathetic to the ethical boundaries of the 19th century, can be found in Wendy Kramer and George Lovell’s paper “Pillage in the Archives: The Whereabouts of Guatemalan Documentary Treasures,” Latin American Research Review 48 (2013) 153-167.
[9] Translation of Brasseur de Bourbourg’s letters provided by Katia Sainson from her forthcoming collection of Brasseur’s travel writings (University of Oklahoma Press, Fall 2017).