New in Geostatistical Analyst 10.1: Areal Interpolation

For version 10.1, we’ve taken on a classic problem in GIS: how to reallocate data from one set of polygons to a different set of polygons.  For example, demographers frequently collect data from various sources, so their data might be a mixture of census block groups, postal codes, and county boundaries.  However, to perform an accurate analysis, they might need all of their data in the same administrative units.

While there are various methods for going from small polygons to large polygons (from census blocks to postal codes, for example), the benefit of areal interpolation is that it additionally provides a statistically accurate framework for going from large polygons to small polygons.  By convention, the starting polygons are called the “source” polygons, and the ending polygons are called the “target” polygons.

Reallocating polygonal data is a two-step process.  First, a smooth prediction surface is created from the source polygons.  This step is done in the Geostatistical Wizard. Then, this surface is reallocated to each of the target polygons using the new Areal Interpolation Layer To Polygons geoprocessing tool in the Geostatistical Analyst toolbox.
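
To make the two steps concrete, here is a minimal sketch on a one-dimensional toy problem, where the "polygons" are just intervals along a line. It is not the kriging model that Geostatistical Analyst fits: the surface here simply spreads each source count evenly over its interval (plain area weighting), and the function names and data are hypothetical. It only illustrates the shape of the workflow: build a surface from the source polygons, then sum that surface over each target polygon.

```python
def density_surface(source, cell=1.0):
    """Step 1 (stand-in for the Geostatistical Wizard): build a surface.

    source: list of (start, end, count) intervals along a line.
    Returns (cell_start, density) pairs on a regular grid, with each
    source count spread evenly over its own interval.
    """
    surface = []
    for start, end, count in source:
        density = count / (end - start)
        x = float(start)
        while x < end:
            surface.append((x, density))
            x += cell
    return surface


def reallocate(surface, targets, cell=1.0):
    """Step 2: sum the surface over each target interval."""
    totals = []
    for t_start, t_end in targets:
        total = sum(d * cell for x, d in surface if t_start <= x < t_end)
        totals.append(total)
    return totals


# Hypothetical data: two large source intervals, three target intervals.
source = [(0, 4, 40), (4, 10, 30)]
targets = [(0, 2), (2, 6), (6, 10)]

surface = density_surface(source)
predicted = reallocate(surface, targets)
print(predicted)  # [20.0, 30.0, 20.0]
```

Note that the predicted counts still add up to the original total of 70. The real tool replaces the flat, per-polygon density with a smooth kriging surface, which is where the statistical framework comes in.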

The graphic below uses obesity rates among fifth grade students in the Los Angeles area.  Blue areas indicate low obesity rates, and red areas indicate high rates.

[Figure: Areal Interpolation workflow]

Areal interpolation can work with three different types of data:

  1. Counts, such as population counts.
  2. Rates, such as cancer rates.
  3. Continuous variables, such as median age.
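
The data type matters because the types do not combine the same way when polygons are merged or split. The short sketch below uses made-up numbers to show the difference between counts and rates: counts simply add, while a rate has to be weighted by the population behind it. This is basic arithmetic rather than anything specific to the areal interpolation model.

```python
# Hypothetical small polygons that fall inside one larger polygon.
blocks = [
    {"population": 1000, "cases": 50},  # rate 0.05
    {"population": 3000, "cases": 30},  # rate 0.01
]

# Counts are additive: the large polygon's count is the sum of its parts.
total_population = sum(b["population"] for b in blocks)
total_cases = sum(b["cases"] for b in blocks)

# The combined rate is total cases over total population,
# i.e. a population-weighted average of the block rates.
combined_rate = total_cases / total_population

# A plain average of the block rates ignores population and is misleading.
naive_rate = sum(b["cases"] / b["population"] for b in blocks) / len(blocks)

print(total_cases, round(combined_rate, 4), round(naive_rate, 4))
```

Here the combined rate is 0.02, while the unweighted average of the two block rates is 0.03, because the smaller block has the higher rate.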

Areal interpolation also helps solve the problem of missing data in polygons.  By using the same polygons as the source and the target, areal interpolation will predict the data values in any polygons that are missing data.
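
As a rough illustration of the idea, the sketch below fills a missing polygon value with the average of its known neighbors. This is a deliberately crude stand-in, not the kriging predictor the tool uses, and the polygon names, values, and neighbor lists are all hypothetical. The point is only that the same set of polygons acts as both source (where values are known) and target (where a prediction is wanted).

```python
def fill_missing(values, neighbors):
    """Predict missing values (None) as the mean of known neighboring polygons.

    values:    dict of polygon name -> value, or None if missing
    neighbors: dict of polygon name -> list of adjacent polygon names
    """
    filled = dict(values)
    for name, value in values.items():
        if value is None:
            known = [values[n] for n in neighbors[name] if values[n] is not None]
            if known:  # leave the gap if no neighbor has data
                filled[name] = sum(known) / len(known)
    return filled


# Hypothetical polygons in a row: A - B - C - D, with B missing.
values = {"A": 12.0, "B": None, "C": 18.0, "D": 20.0}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

filled = fill_missing(values, neighbors)
print(filled["B"])  # 15.0
```

Polygons that already have data keep their original values; only the gaps are predicted.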

You can learn more about areal interpolation here.

This post was contributed by Eric Krause, a product engineer on the analysis and geoprocessing team.


3 Comments

  1. Ted says:

    Fantastic functionality.

  2. tharen says:

    I wonder if you could elaborate on the statement “the benefit of areal interpolation is that it additionally provides a statistically accurate framework for going from large polygons to small polygons”. Depending on the origin of the source polygons and the collection methods of the source data, it’s conceivable that the interpolation could introduce considerable bias and error in the target dataset.

    • Eric Krause says:

      Areal interpolation does not solve the modifiable areal unit problem, if that is what you are asking. It is a problem that cannot be solved in general by statistics. That being said, our implementation of areal interpolation is built from a kriging framework, and the model comes with assumptions that are analogous to point kriging. We assume that the values of the polygons can be represented by a smooth, underlying surface (the “Obesity density surface” in the above graphic) with particular statistical properties. If these assumptions are met, our implementation is unbiased and statistically optimal. We provide several diagnostics for determining whether these assumptions are being violated, so you can decide whether you can trust the results. However, even in the very worst case (i.e., gross violations of the assumptions), areal interpolation will make predictions to the target polygons by using a local average of the source polygons.

      I’m happy to continue this discussion if you have further questions or need clarifications, but I would rather do it on the Geostatistical Analyst forum:
      http://forums.arcgis.com/forums/100-Geostatistical-Analyst