More adventures in overlay: comparing commonly used overlay tools

In the blog about distributing data and analysis, I showed using the Tabulate Intersection tool to overlay archeological sites and land parcels to create a table containing information about each parcel within an archeological site. Figure 1 shows the data and the results of Tabulate Intersection. This “what sites overlay what parcels?” question is a classic overlay query. These types of queries are the bread-and-butter of geographic analysis.

Figure 1: Map of Parcels and Sites and the table output from Tabulate Intersection. In this table, SID is the Site Id, PID is the parcel identifier, and STATUS is an attribute on the Parcel Layer. AREA is in hectares and PERCENTAGE is the percentage of the site that the parcel covers. For a given SID, the PERCENTAGE values add up to 100.

There are several tools that answer overlay queries in ArcGIS, but there are subtle and important differences between them. In fact, while researching my previous blog post, our client pointed out some differences that I wasn’t aware of. It was then that I decided that a review and comparison of some overlay capabilities in ArcGIS would make for a good series of blogs.

Download example models

You can download this geoprocessing sample that contains data and models that demonstrate creating a tabulation table using the tools discussed below.

Context: creating a tabulation table using other tools

The context for this blog comes from this question: “Tabulate Intersection is only available with an Advanced license, which I don’t have. Also, it’s a new tool at version 10.1, which I haven’t installed yet. I tried getting a tabulation table using other tools, but was unsuccessful. Can I create the tabulation table output by Tabulate Intersection using other tools?”

The table below lists the four tools most commonly used for overlay queries. They all work with a Basic or Standard license. In the remainder of this blog, I’ll look at using these four tools to produce a tabulation table (the output of Tabulate Intersection).  The Sample Model column contains the name of the model found in the download that demonstrates the use of the tool.

Name | Tool type | Availability | Sample Model
Intersect | Geoprocessing tool: Analysis toolbox > Overlay toolset | All license levels, but with Basic and Standard licenses, only two layers can be intersected at a time. | Using Intersect
Select By Location | ArcMap tool: from the main ArcMap menu, click Selection > Select By Location. | All license levels | (none)
Select Layer By Location | Geoprocessing tool: Data Management toolbox > Layers and Table Views toolset. This tool does the same work as Select By Location above. Used in ModelBuilder. | All license levels | Using Select Layer By Location
Spatial Join | Geoprocessing tool: Analysis toolbox > Overlay toolset | All license levels | Using Spatial Join

The granddaddy of them all – the Intersect tool

Whenever you have an overlay query, the first tool you should think of using is Intersect. Of all the tools listed above, it’s the only one that creates new features based on the geometric intersection of the input features. The new features have the correct area of intersection, which means that we can calculate the AREA and PERCENTAGE fields shown in Figure 1. In addition, Intersect can apportion any attribute based on area.

Below are the results of Intersect for one of the sites, Ar_8496. The left graphic (A) shows the raw geometry output by Intersect. The right graphic (B) shows the output features in relation to the parcels, showing the PID (Parcel ID) and SID (Site ID) labels as well as the portions of parcels both inside and outside of site Ar_8496.
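To make "apportion by area" concrete, here is a minimal sketch of the arithmetic in plain Python. It assumes the apportioning is done as a follow-up calculation on the Intersect output; the function name, the attribute, and the numbers are all hypothetical, and nothing here is an ArcGIS call.

```python
def apportion(value, original_area, piece_area):
    """Scale an attribute by the fraction of the original polygon that a piece covers.

    Hypothetical helper: it assumes the attribute is spread evenly over the parcel.
    """
    if original_area <= 0:
        raise ValueError("original_area must be positive")
    return value * piece_area / original_area


# Made-up example: a 4 hectare parcel carries a value of 120,
# and 1 hectare of that parcel falls inside the site.
value_inside_site = apportion(120.0, original_area=4.0, piece_area=1.0)
print(value_inside_site)  # 30.0
```

The even-spread assumption is the important caveat: it is reasonable for something like an assessed value per unit of land, and misleading for attributes that are concentrated in one corner of the parcel.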

Figure 2: Using the Intersect tool.

Figure 3: Attribute table for the output of Intersect

Here are a few notes about the output table fields:

  • FID_Sites is the OBJECTID of the polygon in the Sites layer and SID is a unique site identifier for the polygon. Since many parcels intersect one site, there are duplicate values for these fields.
  • FID_Parcels is the OBJECTID of a parcel that intersects the site polygon and PID is a unique parcel identifier. For site Ar_8496, there are 10 unique parcels that it intersects.
  • STATUS is a field on the Parcels layer.
  • Shape_Area and Shape_Length are the computed area and boundary length of the polygons created by Intersect. The units of these two fields are inherited from the first input layer. In this case, the units are feet.

The above table has all the necessary information to create the tabulation table shown in Figure 1 — it just takes a bit of work, work that the Tabulate Intersection tool does for you. The Using Intersect model available in the download shows how it’s done and outputs a table identical to that in Figure 1. In brief, here are the steps for transforming the above table into the tabulation table output by Tabulate Intersection:

  • Add a field named AREA and calculate its values using Calculate Field with !shape.area@hectares! as the expression.
  • Copy the feature attribute table to a stand-alone table using the Table To Table tool. This tool has a Field Map parameter which you can use to determine the fields you want copied to the output table. In the model, I remove all fields except SID, PID, STATUS, and AREA. (See below for an example of the Table To Table tool.)
  • Use the Summary Statistics tool to calculate total area for each unique SID. The total area is used to calculate percentages.
  • Add a field named PERCENTAGE and use Calculate Field to calculate percentage by dividing AREA by the total area and multiplying by 100.
  • Finally, use the Summary Statistics tool to clean up one loose end: an individual parcel, due to its shape, may intersect a site polygon many times, as shown in Figure 4. Summing AREA and PERCENTAGE for each SID and PID pair combines those pieces into a single row.

Figure 4: polygon intersecting twice
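The bookkeeping in the steps above can be sketched in plain Python. This is not the Using Intersect model and it makes no ArcGIS calls; it only mirrors the logic, assuming the Intersect output has already been reduced to rows of SID, PID, STATUS, and AREA. The parcel names and areas are made up.

```python
from collections import defaultdict

# Hypothetical rows standing in for the Intersect output after AREA is calculated:
# (SID, PID, STATUS, AREA in hectares).
pieces = [
    ("Ar_8496", "p101", "Private", 2.0),
    ("Ar_8496", "p102", "Public", 1.0),
    ("Ar_8496", "p101", "Private", 1.0),  # the same parcel crossing the site a second time
    ("Ar_9000", "p200", "Public", 5.0),
]


def tabulate(rows):
    """Return {(SID, PID): {"STATUS", "AREA", "PERCENTAGE"}} from intersect pieces."""
    # Total intersected area per site (the first Summary Statistics step).
    site_total = defaultdict(float)
    for sid, _pid, _status, area in rows:
        site_total[sid] += area

    # Sum the pieces for each SID/PID pair (the final Summary Statistics step).
    table = {}
    for sid, pid, status, area in rows:
        entry = table.setdefault(
            (sid, pid), {"STATUS": status, "AREA": 0.0, "PERCENTAGE": 0.0}
        )
        entry["AREA"] += area
        entry["PERCENTAGE"] += 100.0 * area / site_total[sid]
    return table


result = tabulate(pieces)
for (sid, pid), row in sorted(result.items()):
    print(sid, pid, row["STATUS"], row["AREA"], round(row["PERCENTAGE"], 2))
```

Parcel p101 appears twice in the input but only once in the result, with its two pieces added together. That is the loose end the last step takes care of.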

Select By Location

Select By Location is available from the Selection menu in ArcMap and is frequently used to answer overlay queries. Can it be used to create the same output as Tabulate Intersection? The short answer is no: Select By Location selects whole parcels and does not split parcels based on the portion that is inside a polygon. This means that you don’t get the true area of intersection like you do using Intersect. Therefore, you can’t calculate the AREA and PERCENTAGE fields output by Tabulate Intersection.

However, Select By Location does work to give you a list of parcels that overlay sites (it just can’t calculate the area). But you have to pay close attention to the spatial selection method. Figure 5 shows the Select By Location dialog using the spatial selection method of “intersect the source feature layer”.

Figure 5: Using Select By Location

Note that in Figure 5, parcel p127 is selected. This is interesting because parcel p127 was not included in the output of the Intersect tool. As far as the Intersect tool is concerned, p127 doesn’t intersect Ar_8496.

Why is there a difference? The answer lies in the spatial selection methods used by Select By Location. These methods are based on a set of rules for testing spatial relationships known as Clementini rules. In this set of rules, features whose boundaries touch are considered intersecting. The Intersect tool, on the other hand, does not consider polygons whose boundaries only touch to be intersecting. The best way to think about this difference is that the Intersect tool, when intersecting polygons, cannot create polygons from points. Polygons whose boundaries only touch are left out because the result of their intersection is a point, not a polygon. (Further details on these spatial relationships can be found here.)
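The two definitions of "intersect" can be compared with a small plain-Python sketch. It uses axis-aligned rectangles in place of real polygons, and both function names are hypothetical stand-ins for the behavior described above, not ArcGIS functions.

```python
# Rectangles are (xmin, ymin, xmax, ymax); a deliberate simplification of real polygons.

def shares_a_point(a, b):
    """Selection-style test: True if the rectangles share any point, boundaries included."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def overlap_area(a, b):
    """Intersect-style test: the area of the polygon the two rectangles have in common."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


site = (0.0, 0.0, 10.0, 10.0)
touching_parcel = (10.0, 0.0, 20.0, 10.0)    # shares only the edge x = 10 with the site
overlapping_parcel = (8.0, 0.0, 18.0, 10.0)  # covers a 2 x 10 strip of the site

print(shares_a_point(site, touching_parcel), overlap_area(site, touching_parcel))
print(shares_a_point(site, overlapping_parcel), overlap_area(site, overlapping_parcel))
```

The touching parcel passes the first test but has zero shared area, so a tool that only outputs polygons has nothing to write for it. That is the same situation as the parcel that Select By Location selects and Intersect leaves out.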

Use ‘within a distance’ to emulate the Intersect tool

Figure 6: Using Select By Location with the ‘within a distance’ selection method

If you want Select By Location to ignore polygons whose boundaries only touch (like the Intersect tool), use the ‘within a distance’ spatial selection method and enter a small negative distance, as shown in Figure 6.
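One way to picture why a small negative distance works is to assume it behaves like shrinking the source polygon slightly before testing. The plain-Python sketch below makes that assumption explicit. It uses rectangles instead of real polygons, all of the names are hypothetical, and it is not how the tool is implemented.

```python
# Rectangles are (xmin, ymin, xmax, ymax).

def shares_a_point(a, b):
    """True if two rectangles share any point, edges included."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def shrink(rect, amount):
    """Move every edge inward by amount; return None if nothing is left."""
    xmin, ymin, xmax, ymax = rect
    shrunk = (xmin + amount, ymin + amount, xmax - amount, ymax - amount)
    if shrunk[0] > shrunk[2] or shrunk[1] > shrunk[3]:
        return None
    return shrunk


def selected_within_distance(source, target, distance):
    """Model 'within a distance' for a negative distance as an inward buffer (an assumption)."""
    shrunk = shrink(source, -distance)
    return shrunk is not None and shares_a_point(shrunk, target)


site = (0.0, 0.0, 10.0, 10.0)
touching_parcel = (10.0, 0.0, 20.0, 10.0)    # only touches the site boundary
overlapping_parcel = (8.0, 0.0, 18.0, 10.0)  # overlaps the site

print(selected_within_distance(site, touching_parcel, -0.01))     # False
print(selected_within_distance(site, overlapping_parcel, -0.01))  # True
```

The sketch also shows the trade-off: a parcel that overlaps the site by less than the distance you enter would be dropped too, so keep the distance small relative to your data.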

Creating a tabulation table

Figure 7: Tabulation table without AREA

Select By Location simply selects features; it doesn’t transfer attributes between the target and the source layers. In the context of this example, this means that you can’t get a tabulation table of site and parcel IDs unless you’re willing to jump into ModelBuilder or Python and iterate over each Site feature one at a time, use the Select Layer By Location tool to select parcels, and then create and update an SID field. (The Select Layer By Location tool is the same as the Select By Location dialog accessed from the ArcMap Selection menu, and is used anytime you want to automate a Select By Location query with geoprocessing.) The model Using Select Layer By Location in the sample download shows how to do this in ModelBuilder. The output of this model is shown in Figure 7. The table is the same as shown in Figure 1, but with the AREA and PERCENTAGE fields missing, because the necessary information (the portions of parcels that fall within a site) isn’t available to make the calculation.
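The iteration described above has a simple shape, sketched here in plain Python. The selection test is a hypothetical stand-in for Select Layer By Location, the layers are dictionaries of rectangles with made-up IDs, and the result is only the SID, PID, and STATUS columns, since whole parcels are selected and no areas are available.

```python
# Rectangles are (xmin, ymin, xmax, ymax); all IDs and shapes are made up.

def overlaps(a, b):
    """Stand-in selection test: True if the rectangles share some area."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width > 0 and height > 0


def build_site_parcel_table(sites, parcels, selects=overlaps):
    """Visit one site at a time, select its parcels, and stamp each with the SID."""
    rows = []
    for sid, site_shape in sites.items():
        for pid, (parcel_shape, status) in parcels.items():
            if selects(site_shape, parcel_shape):
                rows.append({"SID": sid, "PID": pid, "STATUS": status})
    return rows


sites = {
    "Ar_1": (0.0, 0.0, 10.0, 10.0),
    "Ar_2": (30.0, 0.0, 40.0, 10.0),
}
parcels = {
    "p1": ((8.0, 0.0, 18.0, 10.0), "Private"),
    "p2": ((35.0, 5.0, 45.0, 15.0), "Public"),
    "p3": ((50.0, 50.0, 60.0, 60.0), "Public"),  # overlaps neither site
    "p4": ((-5.0, -5.0, 2.0, 2.0), "Private"),
}

table = build_site_parcel_table(sites, parcels)
for row in table:
    print(row["SID"], row["PID"], row["STATUS"])
```

Every site triggers a full pass over the parcels, which is why this approach gets slow as the number of sites grows.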

The Using Select Layer By Location model is an interesting exercise in using iterators in ModelBuilder, but it’s complicated and of little use otherwise. So, rather than iterating over each site (which is slow), use the Spatial Join tool.

Spatial Join

Figure 8: Spatial Join dialog.

The Spatial Join tool uses the same spatial selection methods as the Select Layer By Location tool. The big difference is that it produces a new feature class instead of a selection of existing features, and attributes from the joined features are output to the new feature class.

  • Spatial Join, as the name implies, joins two feature classes to create a new table. It does not split line or polygon features based on the portion that is inside another polygon, unlike the Intersect tool. This means that you cannot create a tabulation table that has AREA and PERCENTAGE. You can only create a table as shown in Figure 7. Note that Spatial Join is found in the Overlay toolset of the Analysis toolbox, but all the other tools in the Overlay toolset do split features—Spatial Join is the only tool in this toolset that doesn’t.
  • As with Select Layer By Location, use the Match Option of WITHIN_A_DISTANCE and supply a small negative distance if you want to exclude polygons that only touch at boundaries.

Because Spatial Join doesn’t split features or apportion attributes by length or area, I like to delete any length and area fields as well as any field that should be apportioned by length or area. This prevents me from using these fields in calculations at some future time and getting the wrong results.
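Here is a small plain-Python sketch of why I drop those fields. It models a joined row as a dictionary that still carries the whole parcel's Shape_Area, then compares that value with the area actually inside the site. The rectangles, field values, and helper name are hypothetical, and this is not a Spatial Join call.

```python
# Rectangles are (xmin, ymin, xmax, ymax); all values are made up.

def overlap_area(a, b):
    """Area shared by two rectangles (0.0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


site_shape = (0.0, 0.0, 10.0, 10.0)
parcel_shape = (8.0, 0.0, 18.0, 10.0)

# A joined row as a spatial join would carry it: the parcel is not split,
# so Shape_Area is still the area of the whole parcel.
joined_row = {"SID": "Ar_1", "PID": "p1", "STATUS": "Private", "Shape_Area": 100.0}

area_inside_site = overlap_area(site_shape, parcel_shape)
print(joined_row["Shape_Area"], area_inside_site)  # 100.0 20.0

# Keep only the fields that are still meaningful after the join.
keep = ("SID", "PID", "STATUS")
clean_row = {name: joined_row[name] for name in keep}
print(clean_row)
```

Anyone who later sums Shape_Area by SID on the joined rows would count the full 100 units for this parcel instead of the 20 that fall inside the site. Removing the field takes that mistake off the table.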

Figure 9: Table to Table tool dialog

To create the tabulation table, I use the Table To Table tool to convert the feature class output by Spatial Join into a simple table (no Shape column). Table To Table, like Spatial Join, has a Field Map parameter that lets you remove fields from the output table. To remove a field, select the field in the Field Map parameter and delete it, as shown in Figure 9, where all fields other than SID, PID, and STATUS are removed.

The result of Table To Table is the tabulation table shown in Figure 7 above. The model Using Spatial Join in the download strings together Spatial Join and Table To Table.

Summary

Overlay is the bread-and-butter of GIS, and there are many ways you can get answers to overlay queries in ArcGIS; only a few were touched on in this blog. Whenever you have an overlay query to perform, you need to stop and ask yourself if features need to be split so that correct area or length can be calculated. If so, the tools in the Analysis toolbox > Overlay toolset (with the exception of Spatial Join) are what you want to use because they overlay and split features to yield the correct areas and lengths. If you don’t need to split features in order to get area and length, then you have other options such as Spatial Join and Select By Location (and its corresponding geoprocessing tool Select Layer By Location). These tools have a lot of different spatial selection options, and you need to be aware of their definition of ‘intersect’: it’s different from that of the Intersect tool.

The Tabulate Intersection tool can be emulated by using Intersect followed by a series of steps that can be automated in ModelBuilder. If your goal is to get a tabulation table similar to that produced by Tabulate Intersection, but you don’t have an Advanced license, you can start with the Using Intersect model provided in the download. The geoprocessing team created the Tabulate Intersection tool at version 10.1 so you don’t have to create a model that uses Intersect. Tabulate Intersection uses the same code logic as Intersect, but it has a speed advantage: it doesn’t write out features. Instead, it creates them in memory to calculate area and length, and then throws them away instead of writing them to disk.

Next blog about overlay:   More adventures in overlay: splitting polygons with cartographic spaghetti
