More adventures in overlay: spaghetti, meatballs, and the one-to-many problem

In my last blog post, I showed how you could count the number of overlapping polygons using the spaghetti and meatballs technique.  In this post, I want to go a little further and show how you can use this technique to find unique combinations of overlapping polygons based on a common attribute. 

Download

You can download the data and models I used in this blog here.

The scenario

Figure 1 shows some data I downloaded from the NatureServe site (http://www.natureserve.org/) 1.  Using data from the site I created a feature class, Bovidae_Ranges, containing polygons of the distribution range of four different Bovidae species.  Each polygon represents a different species.

Figure 1: Bovidae Ranges feature class

The task is to create non-overlapping polygons and calculate the combination of species within each polygon, as shown in Figure 2. In the figure, the legend shows all the unique combinations of species, with each species separated by a hyphen. The labels in the map are ‘stacked’—each species within the polygon is on a separate line (as opposed to a single line as shown in the legend).

Figure 2: Unique combinations of ranges

First, create some spaghetti and meatballs

Figure 3: Spaghetti and meatballs created from Bovidae_Ranges

The details of creating cartographic spaghetti with meatballs can be found in my previous post.  Basically, the Feature to Polygon  tool creates the spaghetti and the Feature To Point tool creates the meatballs (the centroids of the spaghetti polygon). Figure 3 shows the spaghetti polygons and their point centroids (the meatballs) as well as the point attribute table.

Point-in-polygon overlay

The next step is to overlay the meatballs with the overlapping polygons in the Bovidae_Ranges feature class. This overlay determines which polygons enclose each meatball point and, by reference, the attributes of the enclosing polygon. I could use the Intersect or the Identity tool for this, but whenever I need to do a simple point-in-polygon overlay, I tend to use the Spatial Join tool, as shown in Figure 4. In my last blog, I was only interested in getting the count of overlapping polygons, so I used JOIN_ONE_TO_ONE for the Join Operation parameter. This time, I need more than a count; I need to know about each overlapping polygon (potentially four overlapping polygons), so I use JOIN_ONE_TO_MANY.

Figure 4: Spatial Join dialog

The output of Spatial Join is a new point feature class named Combinations. Figure 5 shows its attribute table. There is one point feature for each overlapping species polygon. This table captures the one-to-many relationship between the spaghetti polygons and the Bovidae_Ranges polygons and can be joined to either the Bovidae_Ranges feature class (using JOIN_FID) or to the spaghetti polygons (using ORIG_FID).  The attributes of the Combinations points are as follows:

  • ORIG_FID is the OBJECTID of the spaghetti polygon.
  • TARGET_FID is the OBJECTID of the meatball (centroid) point. (Because of the way the spaghetti and meatballs was constructed, ORIG_FID and TARGET_FID are equal.)
  • JOIN_FID is the OBJECTID of the Bovidae_Ranges polygon.
  • ENGL_NAME is the ‘English Name’ of the species. Bovidae_Ranges has many attributes, including SCI_NAME (Scientific Name). But for this scenario, I just use ENGL_NAME

In Figure 5, I’ve highlighted some of these one-to-many relationships.

Figure 5: Combinations point feature class

  • ORIG_FID = 7 has two ranges; American Bison and Bighorn Sheep, constructed from the overlap of polygons 1 and 18 in the Bovidae_Ranges feature class.
  • ORIG_FID = 9 has three ranges; American Bison, Bighorn Sheep, and Mountain Goat, constructed from the overlap of polygons 1,2, and 20 in the Bovidae_Ranges feature class.

Transforming 1:M to 1:1

You can join the Combinations table above to the spaghetti polygons with a one-to-many (1:M) join (see About joining and relating tables for more information.) However, you cannot use a 1:M join in ArcMap to create labels as shown above in figure 2. So we need to somehow combine the many values of ENGL_NAME into one value that can be accessed with a 1:1 join. Figure 6 shows what we want, an All_Names field that combines the values of ENGL_NAME:

Figure 6: All_names field contains the concatenation of all ENGL_NAME values for a unique ORIG_FID

The All_Names field is the concatenation of all values of ENGL_NAME for each record with the same ORIG_FID value. In this example, the values are concatenated with a ‘ – ‘ (space/hyphen/space) delimiter. This table can now be used in both 1:M joins and 1:1 joins.

Concatenate Row Values sample tool

To create the table, I used the Concatenate Row Values tool 2 available from the Analysis and Geoprocessing Tool Gallery.

Figure 7: Concatenate Row Values sample tool

Concatenate Row Values works as follows:

  • It first scans the Input table to create a Python dictionary of key:value pairs.  A key is a unique value found in the Case Field parameter (ORIG_FID), and the value for each key is a sorted list of field values in the Read From Field parameter (ENGL_NAME).
  • Once the dictionary is built, it makes another pass over the Input table and writes the value for each key to the Copy To Field (All_Names).  Each element in the value list is separated by the Delimiter (space/hypen/space in this example).

The result is a table that can be used in both 1:1 joins as well as 1:M joins.  In the model shown in Figure 8, I use the Join Field tool to do a 1:1 join of the Combinations table to the spaghetti polygons by ORIG_FID and copy over the All_Names field.

Figure 8: Joining the All_Names field back to the non-overlapping (spaghetti) polygons

(The next post in this series shows how you can create the All_Names field without using the sample tool.)

Removing artifact polygons

In the attribute table shown above, note that for some records, All_names is empty. These are artifact polygons—polygons representing voids—that need to be deleted (see the previous post for more explanation). The last bit of the model (not shown here) is to select all polygons where All_Names is empty and delete them using the Delete Features tool.

Stacking labels in ArcMap

Figure 9: Label Expression dialog box

We now have all the information to produce the map shown in Figure 2. The spaghetti polygons are drawn using the unique values in the All_Names field.  To create the stacked labels shown in Figure 2, open the layer’s properties dialog box and click the Labels tab. In the Labels panel, click the Expression button to open the Layer Expression dialog box shown in Figure 9.

In the Expression text box, enter the following:

[All_Names].replace(' – ', '\n')

Be sure to choose Python in the Parser drop-down list.

This expression replaces the delimiter (space/hyphen/space) with a carriage return (‘\n’) which, in turn, stacks the labels.

Follow these links to learn more about label expressions

Summary

This spaghetti and meatballs technique is not a substitute for standard overlay tools such as Intersect, Identity, and Union.  Spaghetti and meatballs is only for data that has the exact same attribute schema—that is, you have a single feature class with overlapping polygons. If you have two or more feature classes with different schemas, such as one of soil data and another of slope data, you’ll want to use the standard overlay tools.

Next post in this series: More adventures in overlay: Spatial Join and the Field Map

Footnotes

1 Data provided by NatureServe in collaboration with Bruce Patterson, Wes Sechrest, Marcelo Tognelli, Gerardo Ceballos, The Nature Conservancy – Migratory Bird Program, Conservation International – CABS, World Wildlife Fund – US, and Environment Canada – WILDSPACE.

2 The Concatenate Row Values tool is a sample tool and as such the author may revise and improve the sample.   The tool is included with the sample data and models that accompany this blog, but newer versions may be available.

This entry was posted in Analysis & Geoprocessing and tagged , , , , , , . Bookmark the permalink.

Leave a Reply

2 Comments

  1. johanbaard2 says:

    Hi Dale, thx for this interesting info.
    In a similar scenario I have overlapping polygons, in this case fire scars in a natural environment. I am interested in the ‘top most’ polygon to know veld age.
    I go through a process of ‘Split by Attributes’ (on Year); then ‘Batch Union’.
    Then in Excel I manipulate a bit and get to an answer that I link back to the GIS.
    I would like to do the whole process with modelbuilder (which I understand how to use). Can you think of a way to get ‘all the overlaps out’ and only keep the ‘latest year’; this would be veld age.
    Thx!

  2. Dale Honeycutt says:

    This is really easy to do using Spaghetti and Meatballs. First, create the spaghetti polygons and meatball points as described above, then perform a Spatial Join using the meatballs as the Target Features and the overlapping burn area polygons as the Join Features (same as what’s done in the post above). This yields a new point feature class where there are many burn years for each ORIG_FID. The next step is to use the Summary Statistics tool to find the maximum year for each unique ORIG_FID. In Summary Statistics, the Input Table would be the output of Spatial Join, the Statistics Field(s) would be your ‘Year Burned’ attribute (let’s say the field name is YEAR) and the Statistic type is MAX (find maximum value). The Case field parameter is ORIG_FID. The output of Summary Statistics is a table with one row per case field (ORIG_FID) with the maximum value of the YEAR attribute . This table is ready to join back to the spaghetti polygons using the Join Field tool. For the Join Field tool, the Input Table is the spaghetti polygons, the Input Join Field is OBJECTID, the Join Table is the output of Summary Statistics, and the Output Join Field is ORIG_FID.

    You’ll have to clean up artifact polygons as described above (expression would be “MAX_YEAR IS NULL”). And you’ll probably want to do a Dissolve on the final spaghetti polygons with MAX_YEAR as the dissolve attribute. I wouldn’t be surprised that a burn area polygon of, say, 2011, is intersected by another burn area polygon from 1990 to create three non-overlapping polygons. However, since all three polys have a MAX_YEAR of 2011, they can be dissolved together into one polygon.

    You can avoid the use of Summary Stats altogether by manipulating the Field Map parameter of Spatial Join as described in my next blog post. In this case, you’d add a new output field to Spatial Join’s field map. It’s Merge Rule would be ‘Maximum’.