Thursday, January 05, 2012

Random Walks Across the Atlantic:
Stochastic Processes and the Geometry of the Early Renaissance Portolan Chart

What the historian of cartography should be concerned with is a systematic study of the factors effecting error, and seek to establish their cause and variability and the statistical parameters by which error is characterized...
--J.B. Harley

When trying to model the accuracy of the Medieval and Renaissance Portolan chart, it is useful to reflect on what types of data might have been used to construct these charts, such as the one from around 1300-1320 shown below, which is part of the collections of the Geography and Map Division at the Library of Congress. If we assume that these charts are simply graphic displays of information measured by sailors and navigators, we can ask what type of information might have been used in the charts' construction and what statistical error might reside in such measurements [1].

Calculated isolines of rotation that mimic lines of magnetic declination

see my presentation at the LOC's Conference on Portolans Charts at:

Any model of measurement that includes measurement instruments, such as the compass or the hour-glass, can be described using classic measurement models, which are composed of three parts.

1. a family of observables M (physical magnitudes like declination and longitude) each with a range of possible values.
2. a set of states S: physical states of both the system measured and of the measuring system.
3. a stochastic response function P for each m in M and s in S, which is a probability measure on the range of m, with P(E) to be interpreted as the probability that a measurement of m will give a value in E (a set of possible values of m), if performed when the state is s [2].
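
As a minimal sketch of the three-part model above, the response function can be written as a function that returns the probability of a reading falling in an interval E. The Gaussian form of the instrument error, and the names `State`, `true_value`, `instrument_sigma` and `response_probability`, are assumptions made here for illustration and are not part of the formal definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A state s: the true magnitude plus the condition of the instrument."""
    true_value: float        # e.g. true longitude or declination, in degrees
    instrument_sigma: float  # assumed Gaussian spread of the instrument, in degrees


def response_probability(state: State, lower: float, upper: float) -> float:
    """Probability that a measurement in `state` yields a value in E = [lower, upper].

    Assumes (hypothetically) that readings are normally distributed around
    the true value with standard deviation `instrument_sigma`.
    """
    def cdf(x: float) -> float:
        z = (x - state.true_value) / (state.instrument_sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

    return cdf(upper) - cdf(lower)


# Example: true longitude 10 degrees, instrument spread 2 degrees.
s = State(true_value=10.0, instrument_sigma=2.0)
print(response_probability(s, 8.0, 12.0))  # probability of reading within one sigma
```

Any other probability measure on the range of the observable could be substituted for the Gaussian without changing the structure of the model.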

So what does this mean for Portolan Charts? If, as I mentioned earlier, we think of a nautical chart as the graphic expression of empirical sailing measurements, we can ask how accurate the data that went into the cartographic representation was, and whether there is a way to compare the actual data, assuming it survives, with the chart in a way that is mathematically consistent and admits significance tests. There are many surviving examples of log books from transatlantic voyages, but few if any statistical studies of the actual data that was compiled in them. One important example of such a compilation was made by the cartographer Guillaume Delisle in 1705. He collected about 10,000 positional and declination measurements in a series of notebooks that still survive in the National Archives in Paris. This information has never been published, but it contains a wealth of historical measurements taken at sea during the 16th and 17th centuries that might give us some idea of how accurate the data available to Portolan makers was.

Considering the date of Delisle's data, we must recognize that the positional measurements he cites are mostly based on dead-reckoning and astronomical navigation, coming as they do before the invention of the nautical chronometer. The fact that these measurements are based on dead-reckoning helps us to model the error involved because we can think of the error in positional measurement as serially correlated. Practically, this means that as a voyage proceeds the error in a particular positional measurement also incorporates the error found in the previous positional readings. The positional measurements were then corrected by the navigator when land was sighted, and hence the error forms a series of independent legs[3].
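
The serial correlation described above can be sketched with a short simulation in which each day's position inherits the previous day's error and the error is reset to zero when land is sighted. The daily error size `sigma_deg`, the landfall days, and the function name are hypothetical values chosen for illustration, not figures from Delisle's notebooks.

```python
import random


def simulate_longitude_error(n_days, landfall_days, sigma_deg=0.3, seed=0):
    """Simulate accumulated dead-reckoning error in longitude, in degrees.

    Each day adds an independent Gaussian increment to the running error
    (so successive errors are serially correlated). On a landfall day the
    navigator corrects the position and the running error returns to zero.
    """
    rng = random.Random(seed)
    landfalls = set(landfall_days)
    error = 0.0
    errors = []
    for day in range(1, n_days + 1):
        error += rng.gauss(0.0, sigma_deg)  # today's error builds on yesterday's
        if day in landfalls:
            error = 0.0                     # land sighted: position corrected
        errors.append(error)
    return errors


# A 60-day voyage with land sighted on days 20 and 45 gives three independent legs.
track = simulate_longitude_error(60, landfall_days=[20, 45])
print(track[18], track[19], track[20])  # before, at, and after the first landfall
```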

If we look at Delisle's data and make a scatter plot of the individual errors in longitude accumulated as a function of time between points of land fixing, we can see from the figure below that the errors follow a Gaussian distribution.

If we graph the error in positional fixation geographically using the difference in actual (modern) versus measured position we get a figure of the type shown below.

The fact that the scatter in the data is Gaussian and that it can be represented in figures like that shown above leads one to believe that the data on Portolan charts can be modeled using particular stochastic processes such as those known as the Random Walk and the Brownian Bridge[4].

According to the historical data found in most log books, two situations arise in the numbers that correspond to each of the models mentioned earlier:

1. a leg of a voyage starts from a known location and then a number of positional observations are made before the leg of the journey finishes with no geographical endpoint noted in the log book. This type of data fits the model of the classic Random walk.

2. a leg of a journey begins at a known starting point followed by a number of positional observations and land-sightings, concluding at a known location. This type of data fits the pattern of a stochastic model known as a Brownian bridge[5].

In the case of the Random walk, the error pattern that emerges is the result of the accumulation of errors that are independent of each other, so-called independent increments. The idea of independent increments is applicable in the case of voyages where dead-reckoning was employed because the error contributes cumulatively to the positional uncertainty and the errors are not systematic but rather random with many causes. We can therefore express the error accumulation as a running sum: each time a positional measurement is made, it increases the overall error by a small increment. Assuming a Gaussian distribution of the error leads to a summation of independent Gaussian increments.

The variance of this summation is the quantity that we are looking to model, as it will allow us to compare the root mean square error of the positional measurements taken from the log books with that which would have been incorporated into the charts themselves. If we solve for the variance we can see that it increases with the length of the voyage, as we would of course expect from serially correlated errors.
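
A sketch of the two models in standard textbook form may help here. The symbols are notation assumed for illustration: \varepsilon_i is the error added by the i-th positional measurement, \sigma its standard deviation, n the number of measurements since the last land fix, and N the total number of measurements in a leg that ends at a known location.

```latex
% Accumulated error after n measurements, as a sum of independent increments
E_n = \sum_{i=1}^{n} \varepsilon_i , \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{independent}

% Random walk: variance and root mean square error grow with the length of the leg
\operatorname{Var}(E_n) = n\,\sigma^2 , \qquad
\mathrm{RMS}(E_n) = \sigma \sqrt{n}

% Brownian bridge: a leg of N measurements pinned to zero error at both ends
\operatorname{Var}(E_k \mid E_N = 0) = \sigma^2 \, \frac{k\,(N-k)}{N} , \qquad 0 \le k \le N
```

Under these assumptions the RMS error of the Random walk grows as the square root of the number of measurements, which is a straight line of slope 1/2 on a log-log plot, while the variance of the bridge vanishes at both ends of the leg and peaks at its midpoint.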

When we actually graph the positional error from the log books against the models, we find that the root mean square error for a voyage of, say, 50 days is between 1 and 4.4 degrees, with larger numbers for very long journeys. The minimum and maximum RMS values are shown in the log-log plot below. As it is very rare for a journey to last more than 50 days without a known positional fixation or land sighting, the Random walk model is probably representative of the error that might be found using the more mathematically complex Brownian bridge.
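
As a rough worked example, the RMS bounds quoted above can be turned into an implied per-day error, assuming one independent Gaussian increment per day (an assumption made here for the arithmetic, not a property established from the log books). The function names are hypothetical. The same per-day error is then fed into the standard Brownian bridge formula for comparison.

```python
import math


def implied_daily_sigma(rms_deg, days):
    """Per-day error (degrees) implied by a random-walk RMS of `rms_deg` after `days` days."""
    return rms_deg / math.sqrt(days)


def random_walk_rms(sigma, day):
    """RMS error on a given day for a leg with a known start only."""
    return sigma * math.sqrt(day)


def bridge_rms(sigma, day, leg_days):
    """RMS error on a given day for a leg pinned to zero error at both ends."""
    return sigma * math.sqrt(day * (leg_days - day) / leg_days)


LEG_DAYS = 50
for rms_at_end in (1.0, 4.4):  # the RMS bounds quoted for a 50-day voyage
    sigma = implied_daily_sigma(rms_at_end, LEG_DAYS)
    midpoint = LEG_DAYS // 2
    print(
        f"end RMS {rms_at_end:.1f} deg: sigma/day {sigma:.3f}, "
        f"walk at day {midpoint} {random_walk_rms(sigma, midpoint):.2f}, "
        f"bridge at day {midpoint} {bridge_rms(sigma, midpoint, LEG_DAYS):.2f}"
    )
```

Under this assumption the bounds of 1 and 4.4 degrees correspond to roughly 0.14 and 0.62 degrees of error per day, and the bridge's largest RMS error, reached at the midpoint of the leg, is half of what the Random walk gives at the end of the leg.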

If we look at Portolan Charts that show the transatlantic regions, the Cantino Planisphere for example, we can compare our stochastic models with the calculations of scale error that we have accomplished using Huber transformations.

Rotation and Scale Distortion on the Cantino Planisphere....for more on this see the Washington Post's Article on my research at:

The RMS found in the data is comparable with the RMS found on the charts which, depending on local variations, is between 3 and 5.2 degrees. Although this shows us clearly that the data found in the log books matches the error found on the charts, it does not say anything about how the Portolan makers compiled the information they used...a problem that still awaits a real theory.

[1]. For more on Medieval Portolan Charts see Tony Campbell's seminal article and survey in Volume 1 of the History of Cartography, "Portolan Charts from the Late Thirteenth Century to 1500".
[2]. See A.R.T. Jonkers, Earth's Magnetism in the Age of Sail, Johns Hopkins University Press, 2003 for more on the surviving forms of positional and magnetic data and his models in "Four Centuries of Geomagnetic Secular Variation", Philosophical Transactions (2000) 957-990.
[3]. The philosophy and probability of measurement processes have been the subject of any number of articles. A good modern treatment can be found in Bas van Fraassen's Scientific Representation: Paradoxes of Perspective, Oxford University Press, 2008. For a more formal treatment one should also consult the relevant sections on probability in his book, The Scientific Image, Oxford Library of Logic and Philosophy, 1980.
[4]. Rabi N. Bhattacharya and Edward C. Waymire, Stochastic Processes with Applications, SIAM Classics in Applied Mathematics, 2009.
[5]. An interesting example of the type of data that we are concerned with here can be found in J.M. Vaquero's study "A note on some measurements of geomagnetic declination in 1776 and 1778", Physics of the Earth and Planetary Interiors 152 (2005) 62-66.