Spatial Interpolation Methods

Geostatistical Analyst
Prepared by Parikshit Ranade, Dr Ayse Irmak and David R. Maidment
Spatial Interpolation Methods
[pic]
Figure 1 : The interpolated value at the unmeasured yellow point is a
function of the neighboring red points (From ArcGIS Help Menu).
A very basic problem in spatial analysis is interpolating a spatially
continuous variable from point samples. Many spatially explicit
hydrologic/watershed models require continuous surfaces of temperature.
Three commonly used methods for modeling spatial distributions from point
data are Inverse Distance Weighting (IDW), spline and ordinary kriging.
IDW is a simple and intuitive deterministic interpolation method based on
the principle that sample values closer to the prediction location have
more influence on the predicted value than sample values farther away.
Using a higher power assigns more weight to closer points, resulting in a
less smooth surface. A lower power spreads the weight more evenly between
near and distant points, resulting in a smoother surface. We optimized the
power parameter using ArcGIS. The major disadvantages of IDW are the
"bull's eye" effect (concentric patterns of high or low values around
observed locations) and an edgy surface.
Spline is a deterministic interpolation method that fits a mathematical
function through the input data to create a smooth surface. Splines can
generate sufficiently accurate surfaces from only a few sampled points, and
they retain small features (Anderson, 2008). Spline works best for gently
varying surfaces like temperature. In ArcGIS, spline interpolation is
implemented as a Radial Basis Function.
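To make the role of the IDW power parameter concrete, here is a minimal
Python sketch of the weighting idea. It is an illustration only, not the
ArcGIS implementation; the function name idw_predict and the three station
values are hypothetical.

```python
import math

def idw_predict(samples, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    samples: list of (x, y, value) tuples; target: (x, y).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # the estimate equals the sample at a sampled location
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x in km, y in km, temperature in deg C)
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 8.0), (0.0, 40.0, 6.0)]
target = (2.0, 0.0)  # 2 km from the first station

low_power = idw_predict(stations, target, power=1.0)
high_power = idw_predict(stations, target, power=4.0)
print(low_power, high_power)
```

With the higher power, the estimate is pulled almost entirely toward the
nearest station (10 deg C); with the lower power, the distant stations
contribute more and the estimate moves toward the overall average.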
Unlike IDW and spline, kriging is a statistical method based on spatial
autocorrelation, which it models using a semivariogram.
Basics of Kriging
Kriging was developed in the 1960s by the French mathematician Georges
Matheron, building on the work of the South African mining engineer Danie
Krige, after whom the method is named. The motivating application was to
estimate gold deposited in a rock from a few random core samples. Kriging
has since found its way into the earth sciences and other disciplines. It
is an improvement over inverse distance weighting because prediction
estimates tend to be less biased and because predictions are accompanied by
prediction standard errors (quantification of the uncertainty in the
predicted value).
The basic tool of geostatistics and kriging is the semivariogram. The
semivariogram captures the spatial dependence between samples by plotting
semivariance against separation distance (semivariance is explained in the
next paragraph). The premise of any spatial interpolation is that close
samples tend to be more similar than distant samples (this is also called
spatial autocorrelation). This property of spatial data is implicitly used
in IDW. In kriging, one must model the spatial autocorrelation using a
semivariogram instead of assuming a fixed inverse-distance relationship
with separation distance.
Semivariance equals one-half the average squared difference between the
values at pairs of points separated by a distance d ± Δd (assuming no
direction preference). As the distance between samples increases, we expect
the semivariance to also increase (again, because near samples are more
similar than distant samples). This is true, however, only up to some given
separation distance. At this distance and beyond, points are unrelated.
Stated another way, if 50 m is this critical separation distance, two
points separated by 50 m are likely to be just as similar (or dependent on
one another) as samples separated by 100, 200, 300, or any distance greater
than 50 m.
Suppose we have the semivariogram shown in Figure 2. What information does
the plot provide? Well, the semivariance between samples separated by no
distance is about 1.5E-4. This is called the nugget. What it says is that
if you measure the variable at locations very, very close to one another,
the values measured might be quite different. Why would this happen?
Suppose you had a gold nugget in the middle of an otherwise gold-free rock.
If you sample just on the edge of the nugget you get a high gold estimate.
If you sample just outside of the edge, you get no gold in your estimate.
The presence of a nugget in the semivariogram therefore tells you that,
assuming no measurement error, the variable is not spatially continuous.
The semivariogram also tells us that points separated by 60,000 m are
likely to have the same average difference as points separated by 100,000,
150,000, 200,000 m or any distance above 60,000 m. 60,000 m is the range of
the semivariogram and suggests the area of influence for any given point.
An unmeasured location can be predicted based on its neighboring samples
closer than 60,000 m. A sample collected 61,000 m away from the unmeasured
location will likely have no influence on the predicted value there.
When you look at the model of a semivariogram, you'll notice that at a
certain distance, the model levels out. The distance where the model first
flattens out is known as the range. Sample locations separated by distances
closer than the range are spatially autocorrelated, whereas locations
farther apart than the range are not. The value that the semivariogram
model attains at the range (the value on the y-axis) is called the sill.
The partial sill is the sill minus the nugget.
[pic]
Figure 2 : The semivariogram is used to model the spatial relationships
between samples separated by some distance, d
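The points plotted in a semivariogram such as Figure 2 can be computed
directly from the definition of semivariance. The following Python sketch
bins all sample pairs by separation distance d ± Δd (here Δd is half the
lag width) and takes half the mean squared difference in each bin. The
function name and the four transect samples are hypothetical, and this is
only an illustration of the calculation, not the Geostatistical Analyst
code.

```python
import math
from collections import defaultdict

def empirical_semivariogram(samples, lag_width):
    """Return {lag distance: semivariance} for (x, y, value) samples.

    Each pair goes into the lag bin nearest its separation distance
    (no directional preference); a bin's semivariance is half the mean
    squared difference of the pairs it holds.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            dist = math.hypot(xi - xj, yi - yj)
            k = int(dist / lag_width + 0.5)  # nearest lag bin
            sums[k] += (zi - zj) ** 2
            counts[k] += 1
    return {k * lag_width: sums[k] / (2.0 * counts[k]) for k in sorted(counts)}

# Hypothetical samples along a transect: (x, y, value)
transect = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0), (3.0, 0.0, 5.0)]
semivariogram = empirical_semivariogram(transect, lag_width=1.0)
print(semivariogram)  # {1.0: 1.0, 2.0: 4.5, 3.0: 8.0}
```

In this small example the semivariance grows with separation distance, as
expected for spatially autocorrelated data. A real dataset would level out
at the sill once the separation exceeds the range.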
For kriging estimation, the semivariogram model (the yellow line in Figure
2) is used to obtain estimates for the weighting parameters of Equation 1.
This process is done automatically by the Geostatistical Analyst once the
user is satisfied with the semivariogram. If you are interested in the
derivation of the weighting parameters (or any of the other topics
discussed here), Applied Geostatistics by Edward H. Isaaks and R. Mohan
Srivastava is an excellent resource. Or for the more mathematical folks,
try Statistics for Spatial Data by Noel A.C. Cressie.
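For readers who want to see how a semivariogram model turns into weights,
the sketch below solves the standard ordinary kriging system for a handful
of points. It rests on several assumptions that are not from this
exercise: a spherical model is chosen for illustration, the partial sill
of 1.0E-3 and the three station locations are hypothetical, and the small
linear solver stands in for whatever the Geostatistical Analyst uses
internally. Only the nugget (1.5E-4) and range (60,000 m) are taken from
the discussion of Figure 2.

```python
import math

def spherical(h, nugget, partial_sill, range_):
    """Spherical semivariogram model; flat at the sill beyond the range."""
    if h == 0.0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(samples, target, gamma):
    """Return (prediction, weights) at `target` from (x, y, value) samples."""
    n = len(samples)
    # Semivariances between all sample pairs, plus the unbiasedness row/column.
    a = [[gamma(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in samples] + [1.0]
         for xi, yi, _ in samples]
    a.append([1.0] * n + [0.0])
    # Semivariances between each sample and the prediction location.
    b = [gamma(math.hypot(x - target[0], y - target[1])) for x, y, _ in samples]
    b.append(1.0)
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    prediction = sum(w * z for w, (_, _, z) in zip(weights, samples))
    return prediction, weights

def gamma(h):
    # Nugget and range from the Figure 2 discussion; partial sill is assumed.
    return spherical(h, nugget=1.5e-4, partial_sill=1.0e-3, range_=60000.0)

# Hypothetical stations: (x in m, y in m, temperature in deg C)
stations = [(0.0, 0.0, 10.0), (30000.0, 0.0, 8.0), (0.0, 40000.0, 7.0)]
prediction, weights = ordinary_kriging(stations, (10000.0, 10000.0), gamma)
print(prediction, weights)
```

The weights sum to one, which is the unbiasedness condition of ordinary
kriging, and predicting at a sampled location returns the measured value.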
Case study
Now we know the basics of spatial interpolation. Let's use our knowledge to
estimate mean annual air temperature for each county of Nebraska.
Study Area
Figure 3 : Location of major river basins in Nebraska.
Nebraska covers a total area of about 200,000 square kilometers, making it
the 16th largest of the 50 states in the United States of America. It is a
midwestern state between longitudes 95°25'W and 104°W (~690 km) and
latitudes 40°N and 43°N (~340 km). The geographic center of the state is
located in Custer County at longitude 99° 51.7'W and latitude 41° 31.5'N
(Figure 1). The state spans UTM zones 13, 14 and 15. The highest point is
Panorama Point, at 1653 meters above sea level, and the lowest point is 256
meters above sea level at the Missouri River in southeastern Richardson
County. The mean elevation of the state is 792 meters above sea level. The
major basins in the state are the Missouri, Niobrara, Platte, and
Republican River basins. The state has 93 counties. In this case study we
use 215 NWS weather stations in and around Nebraska to model the spatial
variation of mean annual temperature.
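The UTM zones quoted above follow from the state's longitude extent. As a
quick check, the standard 6-degree zone numbering can be computed as in the
Python sketch below; the function name utm_zone is hypothetical, and
western longitudes are entered as negative decimal degrees.

```python
import math

def utm_zone(longitude_deg):
    """Standard 6-degree UTM zone number for a longitude (west is negative)."""
    return int(math.floor((longitude_deg + 180.0) / 6.0)) + 1

west_edge = -104.0                 # 104 deg W
east_edge = -(95.0 + 25.0 / 60.0)  # 95 deg 25 min W
zones = list(range(utm_zone(west_edge), utm_zone(east_edge) + 1))
print(zones)  # [13, 14, 15]
```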
Data Download
Download the
http://www.ce.utexas.edu/prof/maidment/giswr2008/geostat/Nebraska.zip
folder to your computer. You will see the following subfolders in the
folder.
1. Tmean - This is a shapefile of National Weather Service (NWS) and
cooperative observer network weather stations. The attributes of the
file are mean monthly and mean annual air temperature for each
station.
2. NE_Boundary - This is a shapefile of the Nebraska border.
3. NE_County - This is a shapefile of all the counties in Nebraska.
4. NE_Cities - This is a shapefile of all the cities in Nebraska.
Getting Started
In this exercise we need the Spatial Analyst and Geostatistical Analyst
toolbars. First we need to enable both extensions. Open Tools →
Extensions.
Check the 'Spatial Analyst' and 'Geostatistical Analyst' boxes. Now we can
use all the functions in both toolbars.
[pic]
Add the State shapefile from the NE_Boundary folder to your map. Add the
Tmean file to your map. Add the County shapefile. The following view will
appear on your screen.
[pic]
Exploratory Spatial Data Analysis
Exploratory Spatial Data Analysis (ESDA) is a process of understanding the
properties of a spatial dataset in order to best model the data using
geostatistics. The word "Explore" should tell you that ESDA is more of an
adventure than a strict - you must follow the path at all times -
procedure. In this exercise we show how the Geostatistical Analyst tools
can be used to understand the population distribution of the attribute of
interest and how to understand the large-scale patterns in the dataset
through 3D visualization. This is not a comprehensive list of ESDA
procedures, but a good start. Through this process, keep in mind that the
better one understands the spatial characteristics of the data, the better
kriging model one can build to interpolate the data, and, consequently, the
better estimates one will produce.
Histogram
Open Geostatistical Analyst → Explore Data → Histogram.
[pic]
Select 10 bars and check the Statistics option to view the statistics of
mean annual air temperature. Keep the transformation as None. Select the
layer as Tmean and the attribute as ANNUAL. A window similar to the one
below will appear on screen.
[pic]
We will now analyze the histogram for mean annual air temperature of
Nebraska. The upper right corner shows the statistics of mean annual air
temperature. The histogram shows that our data is not perfectly normally
distributed, but it is fairly close. One cross-check for a normal
distribution is that the mean should be close to the median. In our case
the mean is 6.9 °C and the median is 6.6 °C, so we can treat our data as
approximately normally distributed. The user can transform the data into