1. Introduction
One rapidly growing application area of Geographical Information
Systems (GIS) is the analysis of crime. Crime mapping has made rapid advances
in recent years with regard to data availability and analytical techniques. Many of
the papers presented to the annual US National Institute of Justice (NIJ) Crime
Mapping Research Center (now the Mapping and Analysis for Public Safety
program) conferences (www.ojp.usdoj.gov/nij/maps/) regularly discuss analytical
techniques and analyses performed on high volumes of individual crime locations.
The mapping of high volume crime data is made possible through the automated
geocoding of address-level data extracted from police recorded crime databases. As
a result, point-level analytical tools are now coming to the fore. New aggregation
techniques for point pattern data, based on Local Indicators of Spatial
Association (LISA), have been developed (Chakravorty 1995, Ratcliffe and
McCullagh 1999, Unwin 1996) and spatial crime analysis by law enforcement is
now a substantial market for GIS companies.
Crime is an inherently spatial phenomenon and crime mapping tends to be
point-specific. While some crimes are more difficult to map (internet fraud, tax
evasion and some motoring offences such as driving without a licence), the majority
of criminal activity and day-to-day incidents that police are required to respond to
can be analysed spatially. The locations of incidents that have to be mapped are
usually well known: businesses have thefts at specific sites, residential burglaries
occur at houses, and street crimes (assaults and vehicle crimes) often occur outside
premises with known addresses. The process of geocoding—turning an address into
a point on a map—is therefore of vital importance in crime mapping. Any error in
the initial geocoding process will translate into compounding errors as the
analytical and dissemination stages of police intelligence work are undertaken.
Moreover, some crime sites are not geocodable because the address information
presented to the crime analyst is insufficient to determine the
incident location. The reality of modern crime analysis is that while crime mapping
is an enlightening and practical intelligence tool at many levels, the analyst rarely
has time to track down the location of ungeocoded incidents and completely
successful geocoding is not the norm. Crime maps, although they may not say so
on any output, are rarely created from 100% of the original data.
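To make the idea of incomplete geocoding concrete, the following is a minimal sketch of address matching against a lookup table, assuming a toy gazetteer. The names GAZETTEER, normalise, geocode and hit_rate, the addresses and the coordinates are all hypothetical and chosen for illustration only; operational geocoders use far more elaborate matching rules.

```python
# Hypothetical gazetteer: normalised address -> (x, y) map coordinates.
# All addresses and coordinates are invented for illustration.
GAZETTEER = {
    "12 HIGH ST": (1520.0, 884.0),
    "3 STATION RD": (1710.5, 902.3),
    "45 PARK AVE": (1388.2, 1120.9),
}


def normalise(address):
    """Upper-case and collapse whitespace so trivial variations still match."""
    return " ".join(address.upper().split())


def geocode(address):
    """Return (x, y) for a known address, or None when no match is found."""
    return GAZETTEER.get(normalise(address))


def hit_rate(addresses):
    """Percentage of records that were successfully geocoded."""
    if not addresses:
        raise ValueError("no records to geocode")
    hits = sum(1 for a in addresses if geocode(a) is not None)
    return 100.0 * hits / len(addresses)


# Two of the five records carry too little information to be located.
records = [
    "12 High St",
    "3  station rd",
    "outside the pub",
    "45 Park Ave",
    "rear of shops",
]
rate = hit_rate(records)
print(f"{rate:.0f}% of records geocoded")
```

Any map built from these records would silently omit the two unmatched incidents.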
This paper statistically tests the accuracy of thematic crime maps generated
from data sets with incomplete geocoding in order to arrive at a first estimate of a
reliable minimum geocoding level. A Monte Carlo simulation of a declining
geocoding hit rate (the percentage of unit records in a crime database that are
successfully geocoded) is combined with a statistical analysis of aggregated
outcomes to determine the point at which the output differs significantly from
maps created with 100% geocoded records. While the discussion and
data sets employed have a crime focus, there are technical and policy implications
for the spatial analysis of any address-based data, from hospital records and
insurance claims to newspaper subscription and voter registers. The paper starts
with a brief overview of the use of spatial data within law enforcement.
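
The simulation idea outlined above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the procedure used in this paper. The records are synthetic, ungeocoded records are assumed to be lost uniformly at random (real losses may be spatially biased), and the comparison measure (mean absolute difference in each area's share of crime) is a simple stand-in for a formal significance test. The names area_shares and simulate are hypothetical.

```python
import random
from collections import Counter


def area_shares(records, areas):
    """Share of all records falling in each area (the thematic-map values)."""
    counts = Counter(records)
    total = len(records)
    return [counts[a] / total for a in areas]


def simulate(records, areas, hit_rate, trials, rng):
    """Mean absolute deviation of area shares from the fully geocoded map.

    Assumption: ungeocoded records are dropped uniformly at random.
    """
    full = area_shares(records, areas)
    keep = round(len(records) * hit_rate / 100)
    deviations = []
    for _ in range(trials):
        sample = rng.sample(records, keep)  # records that geocoded this trial
        part = area_shares(sample, areas)
        diff = sum(abs(p - f) for p, f in zip(part, full)) / len(areas)
        deviations.append(diff)
    return sum(deviations) / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
areas = list(range(20))  # 20 hypothetical census areas
weights = [a + 1 for a in areas]  # synthetic, uneven crime distribution
records = rng.choices(areas, weights=weights, k=2000)  # one area id per crime

results = {
    rate: simulate(records, areas, rate, 200, rng)
    for rate in (100, 95, 85, 75, 65, 55)
}
for rate, deviation in results.items():
    print(f"hit rate {rate:3d}%: mean deviation in area share {deviation:.5f}")
```

At a 100% hit rate the deviation is zero by construction. As the hit rate declines, the aggregated map drifts further from the complete one, and a formal test would be needed to decide where that drift becomes statistically significant.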