Tuesday, November 13, 2012

Conceptual Quicksand:
Vagueness, Topology, and Mereology
 An Experiment in Bio-Biblio-Geographic Writing

An introduction to my forthcoming book, Cartography in the Age of Computer Simulation: lectures on the conceptual and topological foundations of GIS

It is obvious that an imagined world, however different it may be from the real one, must have something—a form—in common with it.
                 --Ludwig Wittgenstein,
                  Tractatus Logico-Philosophicus

It all begins for me in a bookstore. I can still remember the day that I became interested in the underlying topology of space. I was in graduate school, a physics student, so it was not geographic space that first held my interest, but rather, space in the abstract and purely mathematical sense. A new book called 300 Years of Gravitation had just come out, celebrating the 300th anniversary of the publication of Newton's Principia Mathematica. I picked up the book while in the Princeton University Bookstore, and when I opened it, all that I can remember seeing is a series of illustrations that showed something called Everett Branching Space-time. I had never seen diagrams like this before.

The branching of space-time into different possible worlds made such an impression on me that I can still, more than 25 years later, draw them from memory. As it turns out they were part of a radical re-thinking of the mathematics of space-time by Hugh Everett, called the relative-state formulation, which is based on what has become known as the many-worlds interpretation and on a great deal of topology.

Although I never looked into the Everett diagrams any further, topology came to be my main subject of study, and over the next three years I devoured the classic works on the subject. In particular, Felix Hausdorff’s Set Theory and Nicolas Bourbaki’s General Topology became my close friends, as I started spending more time in mathematics departments than in physics ones. Topology, especially in its algebraic form, would later become quite important to me in my geographic work. It can be formally defined as the study of the qualitative properties of certain objects, called topological spaces, that are invariant under certain kinds of transformations. I have written about the intersection of classic theorems, like the Brouwer Fixed-Point and Borsuk-Ulam Theorems, with geographical problems. (See my paper How to Map a Sandwich: Surfaces, Topological Existence Theorems and the Changing Nature of Modern Thematic Cartography, 1966-1972.)

Most of these have to do with applications in which the properties that we are interested in are invariant under a certain kind of equivalence, called a homeomorphism. To put it quite simply, for geographic purposes, topology is the study of the continuity and connectivity of continuous fields, networks and discrete objects.
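A toy computational illustration of this point: whether a road network is connected depends only on which segments meet, never on where their endpoints sit on the map, which is why connectivity survives any homeomorphism. The sketch below uses invented place names; it is meant only to show connectivity as a purely combinatorial, coordinate-free property.

```python
from collections import deque

def is_connected(edges):
    """Check whether a network (a list of node-pair edges) is connected.

    Connectivity depends only on which nodes share an edge, never on
    node coordinates -- it is a topological, not geometric, property.
    """
    if not edges:
        return True
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Breadth-first search from an arbitrary node.
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbor in adjacency[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)

# A toy road network: stretching or bending the map (a homeomorphism)
# changes no adjacencies, so the answer is unchanged.
roads = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(is_connected(roads))                   # True
print(is_connected(roads + [("X", "Y")]))    # False: an isolated segment
```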

Years later the same sort of questions brought up by the Everett diagrams and of the many-worlds interpretation came back into my thinking through a seminar with David Lewis, which concentrated on his theory of modal realism. Modal realism also deals with questions surrounding the plurality of worlds, although from a much more logical and less mathematical perspective. Of all the professors that I have had the pleasure of learning from it was Lewis who had the most profound effect on me. Lewis was a mathematical and philosophical renegade, and although firmly part of academia, was always putting forth new ideas that pushed the limits.

Today I still re-read his four books (On the Plurality of Worlds, Counterfactuals, Convention, and Parts of Classes) and his essays almost yearly, as the depth of their insights is boundless. I always think of cartography, at least in its modern computer incarnation, as a theory of possible worlds: a place where counterfactual simulations can be carried out. My own sense is that the actual maps that exist are but a tiny subset of the theoretical maps that could exist. These real maps are the products of a very small number of trajectories through cartographic space, each with its own unique place in this mathematical construction. Every real map is surrounded by a tiny cluster of real or unreal neighbors that are its ancestors and descendants. Sorting out the real from the unreal is the purpose of geographic analysis.

Even though it is not directly applicable to geography and cartography, my interaction with the logician Saul Kripke would prove decisive for what I would spend many years engaged in reading. Kripke, while at Princeton, gave a series of seminars on Gödel’s theorems which have become famous not only for their density but also for their stunning originality. Saul Kripke is one of those creative geniuses who come along only once in a person’s lifetime. He taught graduate-level logic at MIT while still an undergraduate at Harvard. His published writings are few and difficult to understand, and his lectures are even more so; most of what he has written circulates around the mathematical logic community in manuscript.

Kripke’s seminar concentrated on how, in 1931, the young Kurt Gödel single-handedly changed the face of mathematics through his proof that its basic foundations could not be derived from the axioms of logic alone. Gödel’s theorems are simple in their conclusions, but the insights that Gödel needed in order to prove them are the stuff of any mathematician’s dreams. Gödel’s first incompleteness theorem continues to fascinate me and I try to keep up on anything written about it, as it has deep implications for the foundations of computation and the development of algorithms. That the perfectly natural notion that we can completely axiomatize simple arithmetic turns out to be wrong still seems strange and otherworldly to me.

In studying Gödel’s work Kripke found several alternative proofs, and his lectures and unpublished manuscript, ‘Non-Standard Models and Gödel's Theorem: A Model-Theoretic Proof of Gödel's Theorem,’ have circulated so widely that the philosopher of science Hilary Putnam felt it necessary to publish a summary of the paper in 2000. Putnam showed that while today we know purely algebraic techniques that could be used to show the same thing, Kripke used techniques to establish incompleteness that could, in principle, have been understood by nineteenth-century mathematicians. This kind of thing, at least to me, is truly beautiful stuff. It is this kind of retrograde analysis that makes looking back at the history of geographic analysis so rewarding.

To get back to geography: it was Lewis who first introduced me to mereology, the subject that forms such a large part of my book on the foundations of GIS. Lewis, in Parts of Classes, sets out to provide a mereological foundation for the richer and more abstract theory of classes found in set theory. Mereology in Lewis’ sense is simply a formal and mathematical theory that tries to discern general principles regarding the relationships of parts and wholes, principles that provide the starting point for most of pure mathematics. On first approach this level of abstraction may seem to have little relationship to cartography, but in fact it is critically important to the foundations of modern GIS and mapmaking, as these activities take place within an algorithmic and mereotopological formal framework.

In a Geographic Information System we are worried about keeping track of different kinds of objects and fields that inhabit our lived space and that have different dimensional and topological structure. So, for example, we have zero-dimensional points for cities, one-dimensional lines for roads and other networks, two-dimensional polygons for regions and territories, three-dimensional spaces for the earth itself, and four-dimensional space-time structures when we add in temporality. As we build in other thematic forms of data we add continuous fields into the mix. (For more on the topology of GIS see the Esri white paper, GIS Topology.)
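A minimal sketch of how such dimensionally typed features might be represented; the class name, attributes, and the Tunisian examples are purely illustrative, not the schema of any actual GIS.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Coord = Tuple[float, float]  # (longitude, latitude)

@dataclass
class Feature:
    """An illustrative geographic feature: a geometry tagged with its dimension."""
    name: str
    dimension: int                      # 0 = point, 1 = line, 2 = polygon
    coordinates: List[Coord] = field(default_factory=list)

# Zero-dimensional: a city as a point.
tunis = Feature("Tunis", 0, [(10.18, 36.81)])

# One-dimensional: a road as a polyline of vertices.
road = Feature("coastal road", 1, [(10.18, 36.81), (10.10, 36.40)])

# Two-dimensional: a region as a closed ring (first vertex repeated last).
region = Feature("survey region", 2,
                 [(10.0, 36.0), (10.5, 36.0), (10.5, 36.5), (10.0, 36.0)])
```

Temporality could be added the same way, as a fourth coordinate or a validity interval per feature, which is exactly where the representational questions discussed later arise.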

Keeping track of all this in a computer is quite different from the line drawing of traditional cartography. We need to know deep mathematical things about the world’s spatial structure, like the overlap of roads with regions, the temporal extent of events, and how boundaries are spatially related to the regions they bound. It might have been Nick Chrisman, in his insightful 1978 article ‘Concepts of Space as a Guide to Cartographic Data Structures,’ who first pointed out the deep conceptual connections between the mathematical structure of space and the data structures of computer mapping. Today cartography is at its base mereological, and is quite different from its former printed incarnation. These foundations have profound mathematical and philosophical import that is just beginning to be sorted out by people like me who are interested in such problems.

Another of the biggest influences on my thinking about geographical mereology is Achille Varzi, professor of philosophy at Columbia University. Varzi’s two books, Holes and Other Superficialities and Parts and Places: The Structures of Spatial Representation, treat in great detail the various formal and mathematical systems of mereology and topology.

In his books and papers Varzi gives the various formalizations and discusses the logical structure and philosophical import of each of these systems. One of the deepest conclusions of mereology is that there can be, just as in geometry, many different axiom systems, each internally consistent. The important point Varzi makes, in examining these various systems, is that none of them alone is axiomatically and logically strong enough to contain a full theory of spatial objects. It is here that topology comes into play and provides the link to a full formal theory of GIS, something he calls mereotopology. (For more on this see Casati and Varzi, Ontological Tools for Geographic Representation.)
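A minimal axiomatic core of such a system, in the spirit of the formulations Varzi surveys (the notation here is mine), takes parthood P as a partial order, connection C as a reflexive and symmetric relation, and then bridges the two:

```latex
% Ground mereology: parthood P is a partial order
\forall x\,P(x,x) \qquad
\forall x\forall y\,\bigl(P(x,y)\wedge P(y,x)\rightarrow x=y\bigr) \qquad
\forall x\forall y\forall z\,\bigl(P(x,y)\wedge P(y,z)\rightarrow P(x,z)\bigr)

% Ground topology: connection C is reflexive and symmetric
\forall x\,C(x,x) \qquad
\forall x\forall y\,\bigl(C(x,y)\rightarrow C(y,x)\bigr)

% Mereotopological bridge: whatever connects to a part connects to the whole
\forall x\forall y\,\bigl(P(x,y)\rightarrow\forall z\,(C(z,x)\rightarrow C(z,y))\bigr)
```

From these, derived notions such as overlap, external connection, and tangential part can be defined, and it is in terms of these that the qualitative spatial relations of a GIS are usually formalized.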

Varzi, in a long discussion we once had about the mathematics books that had been most important to us, told me that if he were stranded on a desert island the title he would want with him would be Lattice Theory by Garrett Birkhoff. I agreed, provided we added The Elementary Concepts of Topology by Paul Alexandroff.

Mereotopology is composed of two parts, and for logicians and mathematicians studying spatial structure at this level of abstraction these two parts are really two ways of looking at spatial entities. The first considers part/whole distinctions, which is the job of mereology. Modern mereology is very much connected with the various forms of ontology that philosophers have studied since Plato and Aristotle, and that were a bit of an obsession for medieval philosophers like Abelard and Aquinas. The problems of parts and wholes and their relationship to the identity of objects would not receive formal treatment, however, until after Edmund Husserl published his Logical Investigations around 1900.

Classical mereology takes as its foundation the fact that any theory of spatial representation, geographic or otherwise, must consider the structure of the entities that inhabit the space. For geographers this is a critical point, as one could doubt the usefulness of representing space, either logically or mathematically, independently of the entities that are in it.

The second part is that of connection and continuity. How are the various types of entities connected to the space they inhabit and to each other? This is the territory of topology, which studies the mathematics of connection. We can begin asking mathematical and ontological questions like, “What is the difference between the glass on the table and the glass spread all over the floor after we drop it?” These kinds of questions are important to geographers, as they give us insight into how events are connected physically, and how objects retain or lose their material identity over time. One must remember that all of this conceptual thinking must be formalized into algebraic structures within some computational framework.

Geographical space is quite different from any abstract notion of bare space, which is infinitely extended and infinitely divisible as a continuum. That conception of space has proved enormously fruitful in providing a framework for the physical sciences. Geographical space, on the other hand, is divided into regions and populated with many kinds of objects. Regions themselves can be treated as abstract objects whose existence is entirely dependent on the existence of other, more concrete objects. As soon as space is partitioned like this the mathematical continuum loses its purity but acquires a degree of richness, represented by sets of relations in which space itself is composed of discrete and identifiable objects. It is these complex conceptual connections that mereotopology sets out to explore.

Formal mereotopological treatments, which really form the basic ontology of today’s GIS, have their roots in the debates surrounding the axiomatic foundations of geometry that took place at the turn of the last century. In the midst of the problems stemming from the discovery and application of non-Euclidean geometry, logicians like Alfred Tarski and Stanislaw Lesniewski wrote classic papers on the mereology of objects. One of Lesniewski’s in particular, “Foundations of the General Theory of Sets,” from 1916, started me thinking about some of the problems of integrating time into geographic analysis as a continuum rather than as a series of discrete values. A kinematics of cartography is the way I like to look at it.

Many researchers are now looking into these kinds of representations and have started to think in terms of “geographic flows” and the kind of dynamical systems that would be required to picture these in a GIS. In mathematical terms this can be thought of as the difference between Eulerian and Lagrangian approaches. Eulerian models watch the evolution of a system, or a piece of geographic space, from fixed locations, recording a series of discrete snapshots through time. Lagrangian models, on the other hand, continuously follow some moving part of the system being modeled.
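The contrast can be made concrete with a toy one-dimensional advection model; the tracer shape, speed, and sampling below are invented for illustration only.

```python
def tracer(x):
    """Initial tracer concentration along a one-dimensional transect."""
    return max(0.0, 1.0 - abs(x - 5.0))   # a triangular plume centred at x = 5

V = 1.0  # constant advection speed (distance units per time step)

def field_value(x, t):
    """Concentration at position x and time t under pure advection."""
    return tracer(x - V * t)

# Eulerian view: stand still at x = 10 and record snapshots as time passes.
eulerian_record = [field_value(10.0, t) for t in range(11)]

# Lagrangian view: follow one parcel released at x = 5. Its position moves,
# but the concentration it carries is constant under pure advection.
parcel_positions = [5.0 + V * t for t in range(11)]
parcel_values = [field_value(x, t) for t, x in zip(range(11), parcel_positions)]

print(eulerian_record[5])   # 1.0: the plume peak passes the fixed observer at t = 5
print(set(parcel_values))   # {1.0}: the followed parcel's value never changes
```

The Eulerian record is exactly the "series of discrete snapshots" of traditional time-sliced GIS; the Lagrangian record is what a flow-based representation would have to store instead.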

On the geographic but still analytical side, perhaps the most important person for me, and for many others who have over the years become interested in the foundations of GIS, was William Warntz. In the late 1960s and early 1970s Warntz ran the Harvard Laboratory for Computer Graphics and Spatial Analysis, and with an extremely creative group of mathematicians and programmers took the first steps towards creating modern GIS. Warntz and other members of the lab produced a series of important but now largely forgotten papers called The Harvard Papers in Theoretical Geography.

The papers themselves, which are really short books of about 75 pages, treated geographical and cartographic problems with a new level of mathematical sophistication. They used theorems from algebraic topology, abstract algebra and other areas of pure mathematics to try to solve real-world problems. Reading them, even today, is not for the timid, as they are extremely dense and require much more mathematics than most geographers ever see. But they are thought-provoking, their creativity is stunning, and they were the first things I ever read that made me want to become a geographer. I have read through the entire run many times.

The work done at the Harvard Lab was controversial and revolutionary, and it changed the face of cartography forever. Warntz’s words about the changing face of the discipline still ring true to me, even though geographers in most academic departments today might not fully agree.

We now look upon maps not only as stores for spatially ordered information, but also as a means for the graphical solution of certain spatial problems for which the mathematics proves to be intractable, and to produce the necessary spatial transformations for hypothesis testing....The modern geographer conceives of spatial structures and spatial processes as applying not only to such things as landforms....but also to social, economic, and cultural phenomena portraying not only conventional densities but other things such as field quantity potentials, probabilities, refractions etc. Always these conceptual patterns may be regarded as overlying the surface of the real earth and the geometrical and topological characteristics of these patterns, as transformed mathematically or graphically, thus describe aspects of the geography of the real world,

We recognize yet another role for maps. In the solution of certain problems for which the mathematics, however elegantly stated, is intractable, graphical solutions are possible. This is especially true with regard to "existence theorems". There are many cases in which the graphical solution to a spatial problem turns out to be a map in the full geographical sense of the term, "map." Thus a map is a solution to the problem.

These days my interests have become even more theoretical, and I mostly find myself reading and writing about the two poles of geographic information known as the object and field approaches. These two approaches to the questions of what geographic objects are and how they can be portrayed algorithmically are now at the forefront of geographic research.

These distinctions are important because the one element that has been missing from GIS analysis is time. GIS has typically treated time in discrete units, asking what the temperature or population is at some moment and then displaying the answer graphically on a map. For some simple problems this works quite well and is a common form of thematic mapping. But when one is dealing with rapidly changing fields (in the fullest mathematical sense) and huge data sets that can be modeled with complex non-linear differential equations, these discrete units tell us very little. We want to visualize the spatial evolution of whatever we are studying in all its temporal richness. We really want to model, predict and visually represent real-world events. This kind of modeling does away with all the traditional forms of cartography and makes GIS a true computational tool.

Adding temporality to geographic information systems poses serious problems from both a philosophical and a technical perspective. From the philosophical point of view there are questions of identity. How do we represent geographic objects that change over time? How much change can take place before these objects are no longer the same? Do spatial objects have temporal parts? How do we keep track of these parts? Strangely enough, these are some of the very problems that Plato takes up in dialogues like the Theaetetus and the Meno, and that philosophers like David Wiggins, in his groundbreaking Sameness and Substance Renewed, have been theorizing about for many years.

The Ship of Theseus paradox, which Plutarch describes in his Life of Theseus, brings up the question of whether an object that has had all its component parts replaced remains fundamentally the same object. In geography this is important because we are constantly seeing the objects of our study change and evolve. This kind of change takes place continuously in the real world, in the material, conceptual and bio-geographical senses, and is the source of many of the philosophical and analytical conundrums that confront the foundations of GIS.

The other, closely related, question that interests me, and that is quite difficult to explain to someone not versed in the theoretical underpinnings at this level of abstraction, is geographic vagueness. Vagueness enters into geographical analysis when we ask questions like, “Are the world’s forests disappearing?” or “Is desertification increasing?”

Vagueness is ubiquitous in spatial and geographical concepts, and it tends to persist even where steps are taken to give precise definitions. To answer questions like those above we need formal and topological definitions of what a forest and a desert are. When one begins to think about questions of this type, other deeper questions appear. What do we mean by a forest’s boundary? How does it grow or shrink? How many trees make up a forest? Must a forest or desert be self-contained, or can it consist of several disjoint parts? These questions may seem trivial, but in fact they are quite difficult. The definitions we come up with must be subject to quantification in order for us to build algorithms that allow a GIS to give us real-world answers.
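A minimal sketch of what such quantification might look like, assuming, purely for illustration, that "forest" is stipulated to mean a 4-connected patch of raster cells whose tree density exceeds a threshold. The raster values and the threshold are invented; the point is that the vagueness is pushed into the choice of threshold, not eliminated.

```python
def forest_regions(density, threshold=0.5):
    """Partition a raster of tree densities into maximal 4-connected
    'forest' patches, given a stipulated density threshold."""
    rows, cols = len(density), len(density[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if density[r][c] >= threshold and not seen[r][c]:
                # Flood-fill one connected patch.
                patch, stack = [], [(r, c)]
                seen[r][c] = True
                while stack:
                    i, j = stack.pop()
                    patch.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and density[ni][nj] >= threshold
                                and not seen[ni][nj]):
                            seen[ni][nj] = True
                            stack.append((ni, nj))
                regions.append(patch)
    return regions

raster = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.8],
]
patches = forest_regions(raster)
print(len(patches))  # 2: one forest, or two disjoint parts of the same forest?
```

Notice that the algorithm answers "how many forests?" only relative to the stipulated threshold and connectivity rule; change either, and the count changes, which is precisely the vagueness problem in computational dress.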

These are exciting and theoretically complicated problems which are also formally very difficult to program. How we answer them in the future will determine the strength of our newly evolving spatial models and the role that geographical analysis will play in policy decisions regarding things like global warming, resource allocation and urban planning. They are also the very questions whose theoretical core I hope to deal with in this book.

Saturday, September 15, 2012

Fourier Finds Caesar:

A Study in the Physical Evidence of Roman Surveying and Land Usage Using Image Analysis and Periodic Functions

Landscapes are dynamic constructions, with each community and each generation imposing its own cognitive map on an anthropogenic world of interconnected morphology, arrangement, and coherent meaning.

--Kurt Anschuetz, An Archaeology of Landscapes

Finding the physical and epigraphical remains of Roman surveying and centuriation throughout the Roman world remains an area of research that currently engages only a few historians of cartography. In the past the practice of Roman surveying was studied by many important Roman historians like Theodor Mommsen and Max Weber[1]. There remain, however, many difficult and unanswered questions about Roman cartography, and the lack of actual extant maps has made me begin to look elsewhere for information that might shed light on its origins and methods. My current research on this problem applies GIS and image analysis to historical aerial photography and remote sensing imagery. It is my hope that in the near future it will produce the first complete map of North African sites showing both the extent and orientation of Roman mapping. Several authors, such as Rita Compatangelo [2], J.W.M. Peterson [3], and D.J. Bescoby [4], have pioneered the use of various mathematical transforms in the analysis of remote sensing imagery for the purpose of finding new sites and orientations. I have started to apply these methods, in combination with edge detection algorithms, in order to calculate the extent of Roman surveying and the various orientations associated with the physical remains of limites.

The physical remains of Roman centuriation take on a number of sizes and orientations, but they are typically discovered through the outlines of the limites that separated the various regions from one another. A limes (plural limites) can be defined as a man-made boundary or balk, uncultivated and wide enough to form a road or pathway, which divided centuriae or other land-division units from one another. These can take many forms, from simple paths all the way to larger structures like the main roads of the decumanus and kardo maximus that were centrally located in a surveyed region. The feature that makes these remains discoverable through the use of transformational techniques is the fact that they appear on the landscape as periodic phenomena. This simply means that the pattern of centuriation repeats itself over areas of the landscape, showing up as linear features that appear in remote sensing imagery as periodic grid patterns with fixed spacings. One of the most useful ways to study periodic phenomena, at least from a mathematical perspective, is through the use of Fourier series and transforms [5]. These model any periodic phenomenon that we might be interested in as an infinite series of cosine and sine functions of varying frequencies.
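Concretely, for a transect f(x) across the landscape whose features repeat with spatial period L, the standard expansion is

```latex
f(x) \;=\; \frac{a_0}{2} \;+\; \sum_{n=1}^{\infty}
\left[ a_n \cos\!\left(\frac{2\pi n x}{L}\right)
     + b_n \sin\!\left(\frac{2\pi n x}{L}\right) \right],
\qquad
a_n = \frac{2}{L}\int_{0}^{L} f(x)\cos\!\left(\frac{2\pi n x}{L}\right) dx,
\quad
b_n = \frac{2}{L}\int_{0}^{L} f(x)\sin\!\left(\frac{2\pi n x}{L}\right) dx .
```

The coefficients measure how much of the landscape's variation repeats at each spatial frequency, which is what the periodogram techniques below exploit.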

This sum can be expressed more conveniently through the use of complex exponentials, which are easier to work with algebraically. Using a discrete and algorithmic version of the Fourier transform, known as the fast Fourier transform (FFT), Peterson and Compatangelo, in truly groundbreaking papers, showed that one could calculate the most common period found in a group of periodic linear features on more modern maps. I say the most common period because many of the linear features found on the landscape today are subdivisions of modern and medieval origin, and it is sometimes extremely difficult to determine the date of the features whose period the transforms are measuring.

As an illustrative example, one can think of the linear features found in the landscape as a more complex superposition of the images in the figure shown below. In the figure we see linear features that repeat themselves, each set with a different period of repetition; one of the figures also has a different orientation from the other two. What we see in the landscape is typically a combination of all of these in the same region and on the same remotely sensed image.

Peterson used the FFT to generate periodograms of the most common harmonics in a series of regions displaying linear features, which he took from 19th-century Ordnance Survey maps of the Scole-Dickleburgh area in South Norfolk. What the periodogram does is allow one to pick out the frequency of the linear features and the larger harmonics. A simple example of this is shown in the figure below. The periodograms not only show the most common distances between the linear features found on the map or on the satellite image, but they also yield a series of harmonics that may have physical meaning. Harmonics beyond the most common one may reveal subdivisions in the original survey or grid patterns different from the type one is looking for.
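The periodogram idea can be sketched numerically. The transect below is synthetic, with an invented spacing and noise level, and stands in crudely for a line of pixel values crossing a centuriated grid; the FFT recovers the dominant period.

```python
import numpy as np

# Synthetic transect: boundaries repeating every 8 samples, plus noise.
rng = np.random.default_rng(42)
n = 512
period = 8
transect = (np.sin(2 * np.pi * np.arange(n) / period)
            + 0.3 * rng.standard_normal(n))

# Periodogram: squared FFT magnitude at each spatial frequency.
power = np.abs(np.fft.rfft(transect)) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)      # cycles per sample

peak = np.argmax(power[1:]) + 1        # skip the zero-frequency (mean) term
print(1.0 / freqs[peak])               # recovered period in samples: 8.0
```

On real imagery the peak is of course much less clean, and, as noted above, nothing in the spectrum itself dates the features that produce it.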
Because we are interested not only in the distances between linear features in the landscape but also in their orientation, we have employed a second technique known as the Radon transform. This technique has been used by Bescoby to detect Roman boundaries in aerial photographs of Albania. The strength of this method is that it allows for the calculation of the angle, and hence the orientation, of a series of linear features. When combined with the two-dimensional version of the Fourier transform, this allows a complete characterization of the grid formed by the limites of Roman surveying. The Radon transform can be expressed by the equation below and its operation can be seen in the graph shown in the figure.

Finding the size and orientation of linear features in a landscape lets us compare what we have calculated with known patterns of centuriation found in literary and epigraphical evidence, such as that found in the 5th- or 6th-century Corpus Agrimensorum. According to Hyginus, one of the authors found in this compilation of Roman surveying texts, the typical layout for an area of surveyed land is shown below. The letters and numbers define the parcel of land and very often appear as epigraphic inscriptions on surviving boundary stones. The main intersection shown in the figure is that of the kardo and decumanus maximus, which are the beginning points of any Roman survey. A typical kardo or decumanus can be seen in the photograph below, which I took in a heavily centuriated area around Carthage just north of Tunis.

The distances that the Romans typically used, and their various names, are shown in the schematic, with a century measuring 2400 Roman feet, or about 705 meters. Many of these grids would, however, have been further subdivided in a variety of schemes that are not easily dated using physical evidence. My research has concentrated on the non-coastal regions of North Africa, taking in parts of Tunisia, Algeria and Libya.
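For reference, the Radon transform applied in this kind of work is the standard one: it integrates an image intensity f(x, y) along each straight line, parameterized by its angle θ and signed distance s from the origin,

```latex
Rf(\theta, s) \;=\; \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
f(x,y)\,\delta\!\bigl(x\cos\theta + y\sin\theta - s\bigr)\,dx\,dy .
```

A bright linear feature in the image concentrates into a sharp peak at its particular (θ, s), so the shared angle of a whole family of parallel limites can be read off directly from the transform.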
Below are two satellite images of the areas around Dougga and El Jem in Tunisia both of which are the sites of important Roman towns and ruins.

In both of these photographs one can see a variety of linear features that may or may not be associated with Roman activity.

It would be interesting to know the extent to which the Romans actually produced maps of these areas, considering their overall importance to the history of Roman colonization and occupation in the region. El Jem, for example, contains one of the best preserved Roman amphitheaters in all of Africa.

To apply these algorithms to remote sensing imagery it was necessary to clean the images up and enhance their linear features using edge detection algorithms. An example of this is shown in the figure below.
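A minimal sketch of the gradient idea behind such edge detection; real work would use a library implementation (a Canny detector with smoothing and thresholding, for instance), and the toy image below is invented.

```python
import numpy as np

def sobel_edges(image):
    """Gradient-magnitude edge map of a 2-D array via Sobel convolution."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(image.astype(float), 1, mode="edge")
    gx = np.zeros(image.shape)
    gy = np.zeros(image.shape)
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            window = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(window * kx)   # horizontal intensity gradient
            gy[i, j] = np.sum(window * ky)   # vertical intensity gradient
    return np.hypot(gx, gy)

# A toy 'image' with one vertical boundary: the edge map peaks along it.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
print(edges[:, 3:5].max() > edges[:, :2].max())  # True: response at the boundary
```

Once linear features are sharpened into high-gradient ridges like this, the Fourier and Radon machinery above has much cleaner input to work on.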

Once this is accomplished and the various transforms have been applied we can begin to compare the results with known grids based on our knowledge of Roman practice derived from the epigraphic and literary evidence.

Using ArcGIS I have generated maps with overlays showing the orientation and extent of the surveyed region under study. The first map below shows a single division into centuriae around Enfida, Tunisia. The second shows both a division into centuriae and a second subdivision, which probably dates from a later time.

[1] There are many studies of Roman surveying. For a modern bibliography see Brian Campbell, The Writings of the Roman Land Surveyors, Society for the Promotion of Roman Studies, 2000.

[2] Rita Compatangelo, Un Cadastre de Pierre: Le Salento Romain, Annales Littéraires de l'Université de Besançon, 1989.

[3] John Peterson, 'Fourier Analysis of Field Boundaries,' in G. Lock and J. Moffett (eds.), CAA91: Computer Applications and Quantitative Methods in Archaeology 1991, BAR International Series S577, Oxford, 149-156.

[4] D.J. Bescoby, 'Detecting Roman land boundaries in aerial photographs using Radon transforms,' Journal of Archaeological Science 33 (2006), 735-743. See also J.S. Bailly et al., 'Agrarian landscapes linear features detection: application to artificial drainage networks,' International Journal of Remote Sensing 29 (2008), 3489-3508, and E. Magli et al., 'Pattern recognition by means of the Radon transform and the continuous wavelet transform,' Signal Processing 73 (1999), 277-289.

[5] See any of the recommended books on Fourier Analysis on this blog or for a good introduction to the subject see L. Solymar, Lectures on Fourier Series, Oxford University Press, 1988.

Sunday, August 12, 2012

Epigraphic Evidence for Large-Scale Roman Mapping

What survives of their treatises [of the Roman surveyors] can appeal to few readers now, but so diverse are the manuscripts that preserve it, so many the names associated with its preservation, that no text opens the window wider on the transmission of Latin literature from Antiquity to print…
--L.D. Reynolds
Texts and Transmission

Besides the epigraphic cadastres from the colony of Orange in the south of France, a small fragment of which is shown in the figure below, there is other epigraphic evidence that the Romans actually made detailed maps of their territories. Although extremely rare, there are several epigraphic inscriptions in which explicit mention is made of the word "map".

In the Corpus Agrimensorum, a compilation of Roman surveying manuals from the 6th century, there are several words used for map. In the text the surveyor Siculus Flaccus writes:

“The maps are given various names: some are set up on wooden tablets, others on bronze, still others on skins. Although ‘map’ is their generic term, they are sometimes called ‘territory,’ ‘centuriation,’ ‘demarcation,’ ‘limitation,’ ‘grid-pattern,’ ‘figures.’”
Hence Latin words such as forma, tabula, pertica, typon, and metatio all appear to mean map.

Epigraphic evidence from Tunisia shows other examples of the word forma being used in this fashion. In the Corpus Inscriptionum Latinarum (CIL) we find two examples in which maps are mentioned as having been made or are referred to. The figure below shows CIL 22788, an inscription from Henchir Chenah, carved on the four sides of a stone.

CIL 22788

The part of the inscription that we are interested in reads:

sec]undu(m) [f]orma(m) missa(m) sibi ab posu[it]

and records a boundary settlement made "in accordance with the map".

A second inscription, from Henchir ez Zoubia, CIL 23910, shown below, records a longer text referring to a boundary stone set up between the lands of two communities.

The inscription reads:

positum sic [secum] dum forman [um mar]

This refers to the fact that the settlement was set up, once again, "according to the map". From the remainder of the inscription we might infer that this was done by a soldier (perhaps a surveyor) attached to the XIII Urban Cohort based in Carthage.

CIL 23910

Tuesday, February 07, 2012

Modeling Roman Land Use and Environment:
Epigraphy, Servitudes, and Game Theory

...we have all too often lacked, or failed to consider, conceptual frameworks of theory in which to examine Man's relationship to his environment, the manner in which he weighs the alternatives presented, and the rationality of his choices once they have been made....
---Peter Gould

In the study of Roman agricultural patterns it is important to have a conceptual framework in which to place the fragmentary information and evidence that is available from epigraphy, Roman law, and landscape archaeology. For the past few months I have been experimenting with game theoretical models and the concept of Nash equilibrium, trying to see what types of land use models would arise.

The basis of game theory was laid down in 1944 by the mathematician John von Neumann and the economist Oskar Morgenstern in their now classic book, the Theory of Games and Economic Behavior. In the book von Neumann gives the proof of the Minimax Theorem, which is central to game-theoretic reasoning and which he had first approached in 1928. In the 1944 book, von Neumann placed the theorem within the context of linear inequalities and the theory of convexity, a framework later extended by John Forbes Nash with more formal proofs of equilibrium states.

My current work on modeling land use and some of the environmental decisions made by Roman farmers takes its real start, however, from a conversation that I had with Waldo Tobler, Emeritus Professor of Geography at the University of California, Santa Barbara, about 8 years ago. I had just read Peter Gould's paper on African farmers in General Systems Theory, a paper that would later lead me to his seminal work, "Man Against His Environment". I knew that Tobler was close to Gould and that he was also playing around with some game theory during these years, and so I asked Tobler about the paper. What was most impressive to me in all this was not really Gould's mathematics, but rather his vision of what game theory might be able to do in the geographical sciences: that even simple matrix games had a spatial component that few geographers had thought to utilize.

One of the things that Gould wrote that struck me as profound was that "we have all too often lacked, or failed to consider, conceptual frameworks of theory in which to examine Man's relationship to his environment, the manner in which he weighs the alternatives presented, and the rationality of his choices once they have been made." The rationality part instantly jumped out at me. As you may or may not know, the idea of rationality is an area of hot debate when it comes to questions of the Roman economy. There are many scholars, especially after Finley's seminal book, The Ancient Economy, who believe that to consider Roman farmers and landowners as 'rational', in the sense of maximizing the yield from their farms and thinking about market forces, is to project too much of a modern conception of a market economy onto the past.

More recently, however, some scholars like Dennis Kehoe, Cynthia Jordan Bannon and D. W. Rathbone, using legal inscriptions and the everyday account books of farms that survive as papyrus fragments, have started to use economic models and ideas like the theory of the commons to talk about Roman markets and agricultural estate management. Each of them in their own way incorporates many of the terms and categories of game and decision theory in their analysis. Perhaps the best book that accepts and summarizes the presence of 'rational' actors in the Roman economy is Paul Erdkamp's The Grain Market in the Roman Empire: a social, political and economic study. Erdkamp puts forward many different models in the book, and summarizes the economic theory in his historical examinations and reconstructions. His is the sort of book that makes you anxious when you read it, as it gives you a good idea of how much you do not know and how long it takes to make any real progress in this area.

My own models are simply extensions of this kind of work. One group of papers from which my research certainly takes its inspiration was written by Gould in the 1960's. His papers, "Wheat on Kilimanjaro: the perception of choice in game and learning model frameworks" and "Man Against His Environment: a game theoretic framework", were among the first attempts to use the concepts of game theory and Nash equilibrium to look into agricultural land use. These papers, and a few others, were also discussed in an early review article on these methods written by David Harvey, "Theoretical Concepts and the Analysis of Agricultural Land-Use Patterns in Geography." It is in fact from Harvey's book, Models in Geography, that my concept of a geographic model derives.

Harvey asserts, in his review article on agricultural land use, that at the time he was writing, many geographers tended to ignore theoretical breakthroughs from other disciplines, mainly on the "grounds that they proved too abstract to help in the search for unique causes of specific events." To counter this he quotes from William Bunge, whose book Theoretical Geography transformed geography and opened up an analytical window for the field, suggesting a more theoretical and inherently mathematical approach to the study of geographical and spatial distributions. To me Bunge’s book is the most important work of geography in the 20th century and I still mine it for inspiration.

Most of Harvey's paper is dedicated to outlining the requirements for a set of theoretical and conceptual elements to constitute a model in geography. A model, according to Harvey, requires a set of relationships to be established that somehow link the input, status and output variables in a specific way. This linkage must quantify the model mathematically in order for it to be tested. For Harvey, the relationships of the variables in the model can be of three distinctive types:

1. Deterministic relationships which specify cause and effect sequences.
2. Probabilistic relationships which specify the likelihood of a particular cause leading to a particular effect.
3. Functional relationships which specify how two variables are related or correlated without necessarily having any causal connection at all.

For agricultural models Harvey makes a distinction between two types of frameworks: one in which the underlying structure is normative and therefore describes what ought to be under certain assumptions; the second is descriptive, and describes what actually exists. These distinctions are extremely important when we try to interpret game theoretical models, especially in something as difficult to conceptualize as the Roman economy.

In his early research Gould, using a normative game theoretic model, studied a group of African farmers around Kilimanjaro and analyzed how they decided what to plant under varying environmental conditions. Gould understood that the patterns of land use and the choices made by farmers are the result of decisions made either individually or collectively, and that it might be useful to model those decisions in a game theoretical framework. In Gould's models the environment is one player and the farmer is another. Each player is faced with a number of different strategies, the solution of which is the game's equilibrium. Using simple matrix games he was able to construct cartographic representations of various equilibrium alternatives that could be compared to what was actually in the fields.
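Gould's matrix formulation is easy to reproduce. The sketch below is a minimal illustration of the farmer-versus-environment game; the crops and yields are a hypothetical example of my own invention, not Gould's Kilimanjaro data:

```python
def maximin_2x2(a, b, c, d):
    """Farmer's maximin mixed strategy for the 2x2 zero-sum game with
    payoff matrix [[a, b], [c, d]]: rows are the farmer's crops,
    columns the environment's states (say, wet year and dry year).
    Returns (p, v): the fraction p of acreage given to crop 1 and the
    guaranteed expected yield v."""
    denom = (a - b) + (d - c)
    if denom == 0:
        raise ValueError("degenerate game: use a pure strategy")
    # Equalize the farmer's expected payoff against both environments:
    #   p*a + (1-p)*c  =  p*b + (1-p)*d
    p = (d - c) / denom
    if not 0.0 <= p <= 1.0:
        raise ValueError("saddle point: a pure strategy is optimal")
    v = p * a + (1 - p) * c
    return p, v

# Hypothetical yields: crop 1 does well in wet years, crop 2 in dry ones.
p, v = maximin_2x2(40, 10, 20, 30)
print(p, v)  # crop 1 on a quarter of the acreage; expected yield 25
```

The equalization step is exactly the algebraic counterpart of the graphical solutions Gould drew: the mixed strategy sits where the two payoff lines cross.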

Importantly, Gould realized that the game theory of the time was still algorithmically primitive and that his results determined neither how the farmers actually behaved nor how they should have behaved in an absolute sense, but rather how they should behave if they want to achieve particular results. In strategic games, such as the one Gould proposed in his papers, Nash equilibria are a set of actions among the players that lead to a steady state. It is a position in the game in which each player holds correct expectations about the other player and acts rationally according to his choices. Gould uses the simple graphical solutions to matrix games that I first found so attractive in Harold Kuhn's lectures. For more on Kuhn and John Nash, watch the video of a recent seminar they gave together at Princeton, here.

The concept of equilibrium is not so straightforward here as one might think, and it can be interpreted in several ways. For example, when we say that a physical system is in equilibrium we might mean that it is in a stable state, one in which all the causal forces internal to the system are in balance. This is the traditional economic meaning of equilibria. The variables are dynamic, however, and the balance between them that makes up the equilibrium can be thought of as networks of mutually constraining relations. Equilibria can then be considered as endogenously stable states of the model. Some scholars, however, interpret game theoretic equilibria as being explanatory of the process of strategic reasoning alone. For them a solution must be an outcome that a rational agent would predict using the mechanisms of rational computation alone. The meaning of equilibrium states is still a matter of discussion in the literature of game theory and has interesting philosophical implications for how we view and interpret what the models tell us outside of their mathematical formalism. (For more on the interpretation of game theoretical results see Ariel Rubinstein's seminal paper Comments on the Interpretation of Game Theory or the Philosophy of Game Theory by Grune-Yanoff.)

The current models I am working with are of course much more complex than anything Gould could have considered, as he lacked both the mathematics and the computing power. New techniques like quantal response functions, which allow us to look at probable actions, are much more powerful and yield far more interesting results. First introduced by McKelvey and Palfrey in the 1990s, they account mathematically for the possibility that the players will make mistakes, and therefore give more realistic results than anything Gould imagined, at least we hope they do.
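The quantal response idea can be illustrated with a toy sketch. What follows is a generic logit fixed-point iteration with made-up payoffs, my own simplification rather than McKelvey and Palfrey's implementation:

```python
import math

def logit_choice(utilities, lam):
    """Softmax over expected utilities, with rationality parameter lam:
    lam = 0 is pure noise; large lam approaches an exact best response."""
    m = max(utilities)
    weights = [math.exp(lam * (u - m)) for u in utilities]
    z = sum(weights)
    return [w / z for w in weights]

def logit_qre_2x2(A, B, lam, iters=2000):
    """Approximate logit quantal response equilibrium of a 2-player,
    2-action game. A[i][j] is the row player's payoff, B[i][j] the
    column player's. Returns (p, q): the probability each player
    assigns to their first action."""
    p, q = 0.5, 0.5
    for _ in range(iters):
        eu_row = [q * A[0][0] + (1 - q) * A[0][1],
                  q * A[1][0] + (1 - q) * A[1][1]]
        eu_col = [p * B[0][0] + (1 - p) * B[1][0],
                  p * B[0][1] + (1 - p) * B[1][1]]
        # Damped update keeps the fixed-point iteration stable.
        p = 0.5 * p + 0.5 * logit_choice(eu_row, lam)[0]
        q = 0.5 * q + 0.5 * logit_choice(eu_col, lam)[0]
    return p, q
```

At lam = 0 both players mix uniformly over their actions; as lam grows, the mixing probabilities approach a Nash equilibrium of the underlying game, which is what makes the model attractive for noisy, error-prone decision makers like our hypothetical farmers.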

The power of models in historical geography is that you can look at many different scenarios and compare them with the little actual historical data you have. I would never assert that what I am doing gives me any definitive answers on what decisions Roman farmers made or how they planted; rather, the models show me what possibilities there were and how to rank them. Most importantly, however, they greatly inform my thinking about the Roman economy in its most empirical form, and since I do not have the disciplinary constraints on my ideas that an economic historian might, I can push the limits of the models for purely theoretical reasons and out of curiosity.

My hope is that these methods will yield an 'experimental' historical geography, an acceptance of simulation as a method in historical studies. These simulations have the potential to shed light on the decision alternatives that faced farmers and estate owners acting within primitive or developing economies. They give us a glimpse into how farmers historically interacted with their environment on a mainly cognitive level, allowing us to consider the choices they made and how their decisions affected the landscape around them. This to me, and to other geographers before me, like Gould and Harvey, is certainly a central geographical question.

For those interested, I am using GAMBIT, an open source software package that can calculate the Nash equilibria for games with large numbers of players, or in this case environmental variables. You can have a great deal of fun experimenting with the variables and seeing how they change the equilibrium outcome.

Thursday, January 05, 2012

Random Walks Across the Atlantic:
Stochastic Processes and the Geometry of the Early Renaissance Portolan Chart

What the historian of cartography should be concerned with is a systematic study of the factors effecting error, and seek to establish their cause and variability and the statistical parameters by which error is characterized...
--J.B. Harley

When one is trying to model the accuracy of the Medieval and Renaissance Portolan chart it is useful to reflect upon what types of data might have been used to construct these charts, such as the one from around 1300-1320 shown below, which is part of the collections of the Geography and Map Division at the Library of Congress. If we assume that these charts are simply graphic displays of information measured by sailors and navigators, we can ask what type of information might have been used in the chart's construction and what statistical error might reside in such measurements [1].

Calculated isolines of rotation that mimic lines of magnetic declination

see my presentation at the LOC's Conference on Portolan Charts at:

Any model of measurement that includes measuring instruments, such as the compass or the hour-glass, can be described using classic measurement models, which are composed of three parts:

1. a family of observables M (physical magnitudes like declination and longitude), each with a range of possible values.
2. a set of states S: the physical states of both the system measured and of the measuring system.
3. a stochastic response function P for each m in M and s in S, which is a probability measure on the range of m, with P(E) interpreted as the probability that a measurement of m will give a value in the set E, if performed when the state is s [2].
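For a Gaussian instrument the response function P is easy to write down explicitly. The sketch below is my own illustration of part 3 of the scheme; the declination value and noise level are hypothetical figures, not measurements:

```python
import math

def response_probability(true_value, noise_sd, lo, hi):
    """P(E): probability that a measurement of a magnitude m, whose
    true value in state s is `true_value`, taken with an instrument
    having Gaussian noise of standard deviation `noise_sd`, falls in
    the interval E = [lo, hi]."""
    def cdf(x):
        # Normal CDF via the error function.
        return 0.5 * (1.0 + math.erf((x - true_value) / (noise_sd * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

# Hypothetical example: true declination 5 degrees east, 2-degree
# instrument noise; chance the navigator reads between 4 and 6 degrees.
print(round(response_probability(5.0, 2.0, 4.0, 6.0), 3))  # roughly 0.38
```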

So what does this mean for Portolan charts? If, as I mentioned earlier, we think of a nautical chart as the graphic expression of empirical sailing measurements, we can ask how accurate the data that went into the cartographic representation was, and whether there is a way to compare the actual data, assuming it survives, with the chart in a way that is mathematically consistent and admits significance tests. There are many surviving examples of log books from transatlantic voyages, but few if any statistical studies of the actual data compiled in them. One important example of such a compilation was made by the cartographer Guillaume Delisle in 1705. He collected about 10,000 positional and declination measurements in a series of notebooks that still survive in the National Archives in Paris. This information has never been published, but it contains a wealth of historical measurements taken at sea during the 16th and 17th centuries that might give us some idea of how accurate the data available to Portolan makers was.

Considering the date of Delisle's data, we must recognize that the positional measurements he cites are mostly based on dead-reckoning and astronomical navigation, coming as they do before the invention of the nautical chronometer. The fact that these measurements are based on dead-reckoning helps us to model the error involved because we can think of the error in positional measurement as serially correlated. Practically, this means that as a voyage proceeds the error in a particular positional measurement also incorporates the error found in the previous positional readings. The positional measurements were then corrected by the navigator when land was sighted, and hence the error forms a series of independent legs[3].

If we look at Delisle's data and make a scatter plot of the individual errors in longitude accumulated as a function of time between points of land fixing, we can see from the figure below that the error forms a Gaussian distribution.

If we graph the error in positional fixation geographically using the difference in actual (modern) versus measured position we get a figure of the type shown below.

The fact that the scatter in the data is Gaussian and that it can be represented in figures like that shown above leads one to believe that the data on Portolan charts can be modeled using particular stochastic processes such as those known as the Random Walk and the Brownian Bridge[4].

According to the historical data found in most log books, two situations arise in the numbers, corresponding to the two models mentioned above:

1. a leg of a voyage starts from a known location, a number of positional observations are made, and the leg finishes with no geographical endpoint noted in the log book. This type of data fits the model of the classic random walk.

2. a leg of a journey begins at a known starting point, followed by a number of positional observations and land-sightings, and concludes at a known location. This type of data fits the pattern of a stochastic model known as a Brownian bridge[5].

In the case of the random walk, the error pattern that emerges is the result of the accumulation of errors that are independent of each other, so-called independent increments. The idea of independent increments is applicable to voyages where dead-reckoning was employed, because each error contributes cumulatively to the positional uncertainty, and the errors are not systematic but random, with many causes.
We can therefore express the error accumulation as a running sum: each time a positional measurement is made it increases the overall error by a small increment, so the accumulated error after n measurements is E_n = e_1 + e_2 + ... + e_n, where the increments e_i are independent. Assuming a Gaussian distribution for each increment, with mean zero and standard deviation sigma, makes E_n itself a sum of independent normal variables.

The variance of this summation is the quantity that we are looking to model, as it will allow us to correlate the root mean square error for positional measurements taken from the log books with that which would have been incorporated into the charts themselves. Solving for the variance gives Var(E_n) = n*sigma^2, which increases with the length of the voyage, as we would of course expect from serially correlated errors.
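The growth of this variance, and the contrast with the Brownian bridge case, can be checked with a short Monte Carlo simulation. The sketch below is my own illustration with a hypothetical daily error of 0.3 degrees, not a figure taken from Delisle's notebooks:

```python
import math
import random

def voyage_rms(days, daily_sd, bridge=False, trials=4000, seed=1):
    """RMS positional error on each day of a simulated voyage.
    Daily dead-reckoning errors are independent N(0, daily_sd^2)
    increments that accumulate as a random walk. With bridge=True the
    endpoint is treated as a known landfall and the walk is pinned to
    zero there, giving a Brownian bridge."""
    rng = random.Random(seed)
    sq = [0.0] * days
    for _ in range(trials):
        pos, walk = 0.0, []
        for _ in range(days):
            pos += rng.gauss(0.0, daily_sd)
            walk.append(pos)
        if bridge:
            # Subtract the linear drift to the known endpoint.
            end = walk[-1]
            walk = [x - (t + 1) / days * end for t, x in enumerate(walk)]
        for t, x in enumerate(walk):
            sq[t] += x * x
    return [math.sqrt(s / trials) for s in sq]

rms = voyage_rms(50, 0.3)
# For the pure random walk the RMS error grows like daily_sd * sqrt(day),
# about 2.1 degrees after 50 days under these assumed numbers.
```

The bridge variant shows the characteristic effect of a known landfall: the uncertainty is largest mid-voyage and collapses to zero at both ends, which is exactly why the two log-book situations call for different models.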

When we actually graph the positional error from the log books against the models we find that the root mean square error for a voyage of, say, 50 days is between 1 and 4.4 degrees, with larger numbers for very long journeys. These RMS minimums and maximums are shown in the log-log plot below. As it is very rare for a journey to last more than 50 days without a known positional fixation or land sighting, the random walk model is probably representative of the error that might be found using the more mathematically complex Brownian bridge.

If we look at Portolan charts that show the transatlantic regions, the Cantino Planisphere for example, we can compare our stochastic models with the calculations of scale error that we have made using Huber transformations.

Rotation and scale distortion on the Cantino Planisphere. For more on this see the Washington Post's article on my research at:

The RMS found in the data is comparable with the RMS found on the charts which, depending on local variations, is between 3 and 5.2 degrees. Although this shows us clearly that the data found in the log books matches the error found on the charts, it does not say anything about how the Portolan makers compiled the information they used...a problem that still awaits a real theory.

[1]. For more on Medieval Portolan Charts see Tony Campbell's seminal article and survey in Volume 1 of the History of Cartography, "Portolan Charts from the Late Thirteenth Century to 1500".
[2]. See A.R.T. Jonkers, Earth's Magnetism in the Age of Sail, Johns Hopkins University Press, 2003, for more on the surviving forms of positional and magnetic data, and his models in "Four Centuries of Geomagnetic Secular Variation," Philosophical Transactions (2000) 957-990.
[3]. The philosophy and probability of measurement processes have been the subject of any number of articles. A good modern treatment can be found in Bas van Fraassen's Scientific Representation: Paradoxes of Perspective, Oxford University Press, 2008. For a more formal treatment one should also consult the relevant sections on probability in his book The Scientific Image, Oxford Library of Logic and Philosophy, 1980.
[4]. Rabi N. Bhattacharya and Edward C. Waymire, Stochastic Processes with Applications, SIAM Classics in Applied Mathematics, 2009.
[5]. An interesting example of the type of data that we are concerned with here can be found in J.M. Vaquero's study "A note on some measurements of geomagnetic declination in 1776 and 1778," Physics of the Earth and Planetary Interiors 152 (2005) 62-66.