Ancient Norfolk. Norfolk before the Norman Conquest.

CAUTION! I did this study in 2003-4, as a pilot for looking at place name associations with known Roman roads. I looked at it again in 2007 and noted a number of problems in the original data, which I have now put right. Because it is a pilot study, you should treat the results with caution. I will follow up with a better study at a future date.

Please read the "criticisms" section near the end of this page.

a pilot statistical study of potential roman sites by place names.

introduction.

Note: I have been unable to meet all accessibility standards here. This is because I need to use special statistical terms. I have tried to simplify the page from the original one though.

Starting Christmas 2003, I carried out a grid square by grid square survey of Norfolk. I used the OS Explorer™ series of maps[1]. I looked for place names that may have an association with the Roman roads also shown on the maps. I finished this in August 2004. I publish a summary of the findings here, which I modified in August.

My aim was to look for any correlations between general Roman sites and modern place names.

summary.

I looked at place names on the same grid squares as Roman roads. Only one of the place name types I looked at showed a statistically significant association at the 0.01 significance level. This was the type covering roads, tracks and associated features containing "Stone", "Stony", and similar. Place names containing "Town" showed an association at the 0.1 level. Other place name types showed lower significance. Minor features containing "Street" gave a probability of 0.102. But hamlets and villages containing "Street" in their name showed no more than chance association, at 0.22. Note that these findings do not cover place names in adjacent grid squares, nor those at greater distances from Roman roads.

The survey also gives the frequency of the place names examined. You may use this to assess the statistical significance of clusters of place name types. I have also included download links to the source data.

methodology.

I studied each grid square on the OS Explorer™ series of maps. I noted each grid square containing a marked, known Roman road or course thereof. I also noted each grid square containing a target place name of one of the types in table 1 below. I did this both for grid squares in general and for those on which a Roman road also occurred. I tallied these results and used binomial statistical analysis[2,3] to assess them. I also checked the analysis by computational simulation. Finally, I used these results to estimate the statistical significance of the observed associations, shown in table 3 below.

The algorithm for simulated comparison[4] ran 10,000 independent tests. Each test had the same number of simulated grid squares as there are real ones in Norfolk; that is 7,087. Each of the tests generated an integer number of simulated matches. These were where a square contained both a simulated Roman road and a simulated place name of the given type. The observed frequencies were used in the simulation. This produced a distribution of 10,000 data points, each simulating all 7,087 grid squares in Norfolk, for each place name type. (This takes less than 1 minute per place name on a high end PC, as of 2007.) The assumption behind the model was that both place names and Roman roads were randomly and evenly distributed.
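The simulation described above can be sketched in a few lines. This is a minimal illustration in Python, not the original Visual Basic 6 program: the function names are hypothetical, and drawing the road and the place name independently for every square is an assumption based on the description of the model. Only the figures 7,087 and 285 come from the study.

```python
import random

TOTAL_SQUARES = 7087   # grid squares covering Norfolk (from the text)
ROAD_SQUARES = 285     # squares containing a known Roman road (table 2)


def simulate_matches(name_squares, trials=10_000, seed=None):
    """Return one simulated match count per trial.

    Each trial visits every grid square and decides independently whether it
    holds a simulated Roman road and a simulated place name, using the
    observed frequencies as probabilities. A match is a square holding both.
    """
    rng = random.Random(seed)
    p_road = ROAD_SQUARES / TOTAL_SQUARES
    p_name = name_squares / TOTAL_SQUARES
    results = []
    for _ in range(trials):
        matches = 0
        for _ in range(TOTAL_SQUARES):
            if rng.random() < p_road and rng.random() < p_name:
                matches += 1
        results.append(matches)
    return results


def share_at_least(results, observed):
    """Fraction of trials with at least `observed` matches."""
    return sum(1 for m in results if m >= observed) / len(results)


if __name__ == "__main__":
    # "ston/e/y lane": 21 squares overall, 4 of them on Roman road squares.
    # Fewer trials than the 10,000 described above, to keep the run short.
    runs = simulate_matches(21, trials=500, seed=1)
    print("mean matches per trial:", sum(runs) / len(runs))
    print("share of trials with >= 4 matches:", share_at_least(runs, 4))
```

Under this model the expected number of matches per trial is 7,087 multiplied by both frequencies, which for "ston/e/y lane" is 285 × 21 / 7,087, or about 0.84. Note that the number of simulated road squares varies from trial to trial here, whereas the binomial calculation fixes it at 285.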

The probability of any one grid square containing the target place name is given by dividing the number of observed place name squares by the total number of grid squares. These are given in the "Gen. Freq." column of table 2 below. The sample size for the binomial distribution is the number of Roman road squares, 285. The observed frequencies of grid squares for the target place name that coincide with Roman road squares are given in the "Freq. Roman" column in table 2 below. The binomial distribution can then be used to work out the cumulative probabilities for chance association given the observed frequencies. A one-tailed hypothesis is used for significance, at the 0.01 level. If the cumulative probability for the observed frequency is greater than 0.01, then the association is rejected as being "chance or negative association".
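As a worked illustration of the binomial calculation, the sketch below uses the "ston/e/y lane" counts from table 2 (21 squares in all, 4 of them on Roman road squares). It is a hedged example in Python rather than the spreadsheet actually used, and the function names are hypothetical. It prints both the point probability P(X = k) and the upper-tail probability P(X >= k); for "ston/e/y lane" these come to roughly 0.009 and 0.011 respectively.

```python
from math import comb

TOTAL_SQUARES = 7087   # grid squares covering Norfolk
ROAD_SQUARES = 285     # sample size: squares containing a known Roman road


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """Probability of k or more successes, P(X >= k)."""
    return 1.0 - sum(binomial_pmf(i, n, p) for i in range(k))


if __name__ == "__main__":
    p = 21 / TOTAL_SQUARES            # chance that any one square holds the name
    point = binomial_pmf(4, ROAD_SQUARES, p)
    tail = upper_tail(4, ROAD_SQUARES, p)
    print(f"P(X = 4)  = {point:.4f}")
    print(f"P(X >= 4) = {tail:.4f}")
```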

findings.

I first noted visually a possible association between the approximate place names shown in table 1 below, and the Roman roads marked on the maps.

Table 1. Candidates for place names associated with known Roman roads.
Abbreviation      Description
beacon            any place name including "beacon"
eccles, eagles    any place name including "eccles", or "eagles"
fort              any place name including "fort"
magna             any place name including "magna"
major ston/e/y    village or town name including "stone", "stony", "stan", etc.
major street      village or town name including "street"
minor street      road, track or other minor feature including "street" in name
parva             any place name including "parva"
ston/e/y lane     road or track including "stone", "stony", etc. in name
town              any place name including "town"

At the time, I was unaware of:

  • the conventional association of names including "Strad-" or "Strat-" with Roman roads[5,6,7], and so these were not included in this study.
  • the etymology for the "Parva" and "Magna" place names[8], which is why I included them in the original study.

My first study of grid squares produced raw statistics for:

  • occurrences of the place name types,
  • frequency of grid squares containing known Roman roads.

I show these statistics in table 2 below. Please note that Norfolk is covered by 7,087 grid squares. I used this in working out frequencies in the "Gen. Freq." column. The frequencies in the "Freq. Roman" column are out of the 285 grid squares containing a known Roman road.

Table 2. Frequencies for place names, and of those associated with known Roman roads.
Abbr.             Gen. Freq.              Freq. Roman
roman road        0.0402 : 285 squares    1.0000 : 285 squares
beacon            0.0014 : 10 squares     0.0000 : 0 squares
eccles, eagles    0.0016 : 11 squares     0.0035 : 1 square
fort              0.0003 : 2 squares      0.0000 : 0 squares
magna             0.0003 : 2 squares      0.0000 : 0 squares
major ston/e/y    0.0134 : 95 squares     0.0175 : 5 squares
major street      0.0093 : 66 squares     0.0105 : 3 squares
minor street      0.0106 : 75 squares     0.0175 : 5 squares
parva             0.0004 : 3 squares      0.0000 : 0 squares
ston/e/y lane     0.0030 : 21 squares     0.0140 : 4 squares
town              0.0021 : 15 squares     0.0070 : 2 squares
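The two frequency columns follow directly from the square counts. The short Python sketch below shows the arithmetic for a few rows; the dictionary and function names are hypothetical, and rounding to four decimal places is an assumption about how the table was prepared.

```python
TOTAL_SQUARES = 7087   # all grid squares covering Norfolk
ROAD_SQUARES = 285     # grid squares containing a known Roman road

# (squares containing the name, of which on a Roman road square), from table 2
COUNTS = {
    "beacon": (10, 0),
    "eccles, eagles": (11, 1),
    "ston/e/y lane": (21, 4),
    "town": (15, 2),
}


def frequencies(name_squares, on_road):
    """Return (general frequency, frequency on Roman road squares)."""
    general = round(name_squares / TOTAL_SQUARES, 4)
    roman = round(on_road / ROAD_SQUARES, 4)
    return general, roman


for label, (name_squares, on_road) in COUNTS.items():
    general, roman = frequencies(name_squares, on_road)
    print(f"{label:16s} {general:.4f}   {roman:.4f}")
```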

The results of the full study are shown in table 3. This gives the statistical significance of each place name type's association with a Roman road on the same grid square. You should not confuse this with the probability of finding a Roman road on a grid square with the place name type. The significance is all or nothing. This means that we either accept the association, or else reject it, across the whole map of Norfolk and not just on any individual grid square.
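To illustrate the distinction drawn above, the following Python sketch works out the observed share of "ston/e/y lane" squares that also contain a Roman road, using the counts in table 2. The function name is hypothetical. The share is a different quantity from the significance value reported for the same place name type.

```python
TOTAL_SQUARES = 7087   # all grid squares covering Norfolk
ROAD_SQUARES = 285     # grid squares containing a known Roman road


def share_on_road(name_squares, on_road):
    """Observed share of a place name type's squares that also hold a Roman road."""
    return on_road / name_squares


# "ston/e/y lane": 21 squares in all, 4 of them on Roman road squares.
share = share_on_road(21, 4)
baseline = ROAD_SQUARES / TOTAL_SQUARES

print(f"share of 'ston/e/y lane' squares with a Roman road: {share:.3f}")
print(f"share of all squares with a Roman road:            {baseline:.3f}")
```

Here about 19% of "ston/e/y lane" squares contain a Roman road, against about 4% of squares in general. Neither figure is the significance value: that measures how likely the observed count would be by chance, not how likely a road is on any one square.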

Table 3. Statistical significance of observed association with known Roman roads, at the 0.01 significance level.
Abbreviation      Estimated significance : result
beacon            0.669 : chance or negative association
eccles, eagles    0.285 : chance or negative association
fort              0.923 : chance or negative association
magna             0.923 : chance or negative association
major ston/e/y    0.150 : chance or negative association
major street      0.220 : chance or negative association
minor street      0.102 : chance or negative association
parva             0.886 : chance or negative association
ston/e/y lane     0.009 : significant association
town              0.100 : chance or negative association

In summary, at the 0.01 significance level, only place names like "Stone Lane", "Stony Road", or similar show a statistically significant association with known Roman roads[2]. Place names including "Town" showed an unexpectedly high association, significant at the 0.1 level, though this would not conventionally be taken as statistically significant, and is also rejected in this study.

Most notable is that place names including "Street", which are conventionally taken as indicating the presence of a nearby Roman road, were even less statistically significant. Minor features such as "Street Plantation" or "Street Farm" gave a probability of 0.102, while villages and hamlets including "Street" came out at just 0.22. I suspect that, as with "Stone Lane" and so on, this suggests a smaller average distance from a Roman road, rather than a lack of association.

The study only considered place names on the same grid square as a known Roman road. This has several shortcomings. It may be that certain place name types have a greater statistical association, but at a greater average distance. A follow up study may investigate this possibility. I also suggest that a comprehensive, national study may be more reliable than a small scale study concentrating on one county.

criticisms.

I personally note these shortcomings of my study:-

  • The model assumed that Roman road grid squares were randomly scattered, which they are not.
  • I looked only for occurrences on the same grid square as a known Roman road. This approach therefore approximates distances from Roman roads. It also does not allow for associations at greater distances.
  • To this end, an approach such as examining the distributions of perpendicular distances from known Roman roads for each place name type may offer a better solution. I am unsure how significance could be worked out from such an approach. Others may wish to follow up on this.
  • I conducted the survey before realising that place names beginning "Strad-" or "Strat-" were conventionally taken to signify proximity to a Roman road. I also noted, but only after conducting this study, that place names ending "-mer" or "-mere" appeared to have a high association. Others may wish to follow up on this. Note that "Cromer" is a 13th century renaming[5]. "Parva" and "Magna" are conventionally accepted as later mediaeval renamings[8].
  • I suggest that a more exacting and national survey by experts in statistics and mapping may be more reliable. Since conducting this survey, I have found only one study approaching this: http://keithbriggs.info/distance_to_roman_roads.html
  • I did not use any controls in this study; by this I mean place name types not considered to have any association with Roman roads.
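The perpendicular distance approach mentioned in the list above could start from a calculation like the one sketched below. This is only an illustration in Python, under assumptions that are not part of the study: each place name is reduced to a single point, each Roman road to a chain of straight segments, and all coordinates are planar (for example grid eastings and northings in the same units). The function names are hypothetical, and the sketch does not address how significance would be worked out from the distances.

```python
from math import hypot


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment from A to B."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # A and B coincide, so the segment is a single point.
        return hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_road(point, road):
    """Shortest distance from a point to a road given as a list of (x, y) vertices."""
    px, py = point
    return min(
        distance_to_segment(px, py, ax, ay, bx, by)
        for (ax, ay), (bx, by) in zip(road, road[1:])
    )


if __name__ == "__main__":
    # Hypothetical coordinates, purely for illustration.
    road = [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0)]
    print(distance_to_road((2.0, 1.0), road))
```

Collecting these distances for every place name of a given type, and for a control type, would give the distributions that the follow-up study would need to compare.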

conclusion.

The study suggests, somewhat tentatively, these possibilities:-

  • Roads, tracks and associated features containing "Stone", "Stony" and similar suggest, at the 0.01 significance level, the presence of a Roman road on the feature itself. This would conventionally be taken as statistically significant[2].
  • A follow up study may clarify the possibility that place names containing "Town" have an association to a proximate Roman road. I also noted a similar possibility with minor features including the name "Street".
  • The proximity of Roman roads to village or hamlet place names containing "Stone", "Stony", "Stan", and similar is possible. But it is not shown to have a significant association on the same grid squares. I can only speculate that a more rigorous follow-up study would decide whether these have an association to a greater distance. The same is noted for place names containing "Eccles".
  • A follow up study including greater distances may be able to explain the unexpected finding that villages or hamlets containing the name "Street" do not have any more than a chance association. Again, I can only speculate that a more rigorous follow-up study would decide whether these have an association to greater distance.
  • A more rigorous follow-up study may wish to also consider place names containing "-mer" or "-mere" or beginning "Strad-" and "Strat-", or similar, for an association to proximate Roman roads. I would also suggest considering including control types of place names expected not to have any association.

source data.

I have added these to the website:

  • source data,
  • the source code and packaged setup files for computational confirmation of the binomial statistical method used.

The links for downloading the zipped files are given below. CAUTION! Please note:

  • I retain copyright of the source materials. They may be downloaded for private or academic study, otherwise all rights reserved.
  • Although I do not expect there to be viruses or spyware contained in the files, I do not certify that they are virus or spyware free. Please be sure to use up-to-date virus and spyware scanners to check the files before using.
  • Using the computational method for calculating the binomial distributions is not strictly necessary. This simply confirmed that the binomial, theoretical approach I used was right. The outputs from my runs are contained in various CSV files; as is the comparison between the computed results and the theoretical ones.
  • The computational files are provided AS IS. I cannot guarantee that they will work on your computer, or that use of them will not result in loss or other damage. I cannot offer any technical support in installing or running the program.
  • The source code was written in Microsoft® Visual Basic® 6 (Learning Edition). The source code is included.

Source data, computed results and comparison to theoretical distribution: click here.

Source code for the computational study: click here.

Setup package for the computational study: click here.

That concludes the report on this study.

references.

[1] Ordnance Survey® Explorer™ series of maps, Norfolk only; numbers: 229, 230, 236, 237, 238, 250, 251, 252, OL40.

[2] Wackerly, Dennis D., Mendenhall III, William & Scheaffer, Richard L. (2002). Mathematical Statistics with Applications, 6th Edition. Duxbury.

[3] Calculated using the Open Office Calc application, v2.2.

[4] Calculated using a bespoke application written, tested and compiled in Microsoft® Visual Basic® 6 (Learning Edition).

[5] Mills, A.D. (ed.) (1991). Oxford Dictionary of British Place Names. OUP.

[6] Rye, James (1991). A Popular Guide to Norfolk Place-Names. Larks Press.

[7] Nottingham University.

[8] Whynne-Hammond, Charles (2005). English Place-Names Explained. Countryside Books.
