Avoiding GUI headaches: a case for scripting geoprocessing tools

Sometimes it can be a real pain to use a graphical user interface, or GUI. With ArcGIS geoprocessing tools, we have tried to make the tool GUI, the tool dialog, easy to use. But sometimes the repetitive nature of a task can make using a tool dialog time-consuming and inefficient.

One such case is selecting a large number of fields for the Statistics Field(s) parameter of the Summary Statistics (Analysis) tool. With this parameter, you select the fields from your input table that you want to summarize, for example to calculate the mean or sum of a numeric field. For a single field, this interaction is quick and easy. But imagine you had to do this for hundreds of fields in a demographics dataset. Repeatedly picking a single field and its corresponding statistic is tedious, and truly a master's-level course in frustration.

The Summary Statistics (Analysis) tool dialog

Luckily, there is more than one way to run a geoprocessing tool, and one of those ways is very good at making repetitive tasks easy: Python scripting. Consider the task of getting a Sum statistic for every numeric field in a dataset, and how it can be accomplished using Python scripting.

ArcPy, the ArcGIS scripting package, has many functions for doing all kinds of GIS tasks, including one that is especially important for the above scenario, ListFields. The ListFields function, as its name suggests, returns a list of all of the attribute fields in a specified dataset. Listing a dataset’s fields and checking that the field type is numeric are key steps in scripting the task described above.
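
As a minimal sketch of the field-type check, the snippet below filters a list of field objects down to the numeric ones. So that it runs without ArcGIS installed, it uses a hypothetical stand-in `Field` tuple that models only the `name` and `type` attributes read from the objects returned by ListFields; with ArcPy available, you would pass `arcpy.ListFields(intable)` instead. The helper name `numeric_field_names` and the sample field names are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the field objects returned by arcpy.ListFields;
# only the two attributes used here (name, type) are modeled.
Field = namedtuple("Field", ["name", "type"])

# Field type strings treated as numeric (the same four used in this post's script)
NUMERIC_TYPES = ("Double", "Integer", "Single", "SmallInteger")


def numeric_field_names(fields):
    """Return the names of the fields whose type is numeric."""
    return [f.name for f in fields if f.type in NUMERIC_TYPES]


# Made-up sample fields; with ArcPy this would be arcpy.ListFields(intable)
sample_fields = [
    Field("OBJECTID", "OID"),
    Field("CNTY_FIPS", "String"),
    Field("HOUSEHOLDS", "Integer"),
    Field("MALES", "Double"),
]

print(numeric_field_names(sample_fields))
```

Text fields and the object ID field are skipped because their type strings are not in the numeric list.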

The other key part of the script is constructing the Statistics Field(s) parameter with the correct Python syntax. In a Python script, the Statistics Field(s) parameter is best represented by a list of lists. You can think of the outer list as the full Statistics Field(s) parameter table in the tool dialog, and each sub-list as a single row in that table. As you can see in the picture of the Summary Statistics (Analysis) tool dialog, the Statistics Field(s) parameter table has two columns: the first is the field name, and the second is the statistic to calculate for that field. Similarly, in Python each sub-list (row) has two elements, corresponding directly to the columns in the parameter table. The sub-lists are constructed by pairing the field name with the Sum statistic type while iterating through the list of fields. Remember that a pair is added only after checking that the field type is numeric, since it is impossible to calculate the Sum of other types such as text or dates. With just a few lines of code, this repetitive and time-consuming task is automated.

# Script that runs the Summary Statistics tool to Sum every numeric attribute
# of Census tracts by unique County IDs
import arcpy

# Local variables
intable = "C:/Data/f.gdb/CensusTracts"
outtable = "C:/Data/f.gdb/CensusTracts_SumStats_Counties"
casefield = "CNTY_FIPS"
# Create a new empty list to store pairs of field + statistic
stats = []

# Loop through all fields in the Input Table
for field in arcpy.ListFields(intable):
    # Just find the fields that have a numeric type
    if field.type in ("Double", "Integer", "Single", "SmallInteger"):
        # Add the field name and Sum statistic type as a list
        # to the list of fields to summarize (makes a list of lists)
        stats.append([field.name, "Sum"])

# After looping, the Statistics list of lists will look like
# [["HOUSEHOLDS", "Sum"], ["MALES", "Sum"], ...]

# Run the Summary Statistics tool with the Statistics list of lists
arcpy.Statistics_analysis(intable, outtable, stats, casefield)
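
The same loop can be wrapped in a small helper when more than one statistic is wanted per field, for example both the sum and the mean. The sketch below is an illustration under stated assumptions, not part of the original script: `build_stats` and the stand-in `Field` tuple are hypothetical names, and the sample fields are invented so that the code runs without ArcGIS.

```python
from collections import namedtuple

# Hypothetical stand-in for the field objects returned by arcpy.ListFields
Field = namedtuple("Field", ["name", "type"])

# Field type strings treated as numeric (same check as the script above)
NUMERIC_TYPES = ("Double", "Integer", "Single", "SmallInteger")


def build_stats(fields, statistics=("SUM",)):
    """Build a list of [field name, statistic] rows for every numeric field."""
    stats = []
    for field in fields:
        # Skip text, date, and other non-numeric fields
        if field.type in NUMERIC_TYPES:
            # One row per requested statistic, mirroring the dialog's table
            for statistic in statistics:
                stats.append([field.name, statistic])
    return stats


# Made-up sample fields; with ArcPy this would be arcpy.ListFields(intable)
sample_fields = [
    Field("CNTY_FIPS", "String"),
    Field("HOUSEHOLDS", "Integer"),
    Field("MALES", "Double"),
]

stats = build_stats(sample_fields, statistics=("SUM", "MEAN"))
for row in stats:
    print(row)
```

With ArcPy available, the resulting list of lists could be passed as the third argument to arcpy.Statistics_analysis, just as `stats` is in the script above.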

Python scripting is one of several ways to run ArcGIS geoprocessing tools. Scripting a geoprocessing tool can often help work around problems that occur because a tool dialog parameter is tricky or requires repetitive action. You can use the powerful and wide-ranging functions available in ArcPy to help with many GIS tasks, such as using the ListFields function to return a list of all attribute fields in a dataset, which can then be looped through. Using ArcPy functions together with geoprocessing tools can help you be more productive and avoid those awful GUI headaches.


6 Comments

  1. offermann says:

    Nice demonstration of when to use Python scripting. But the indentation is horrible; everyone would agree that one single blank (line 14) is never enough, so please use at least two blanks. After all, this script won’t even start running, because lines 16 to 18 have to be indented after the if statement.

  2. mboeringa2010 says:

    To be honest, representing “Python scripting” as a substitute for bad GUI design, is a bit of a shame. I (have) see(n) many examples in ArcGIS where a little bit of thinking, and the proper use of multi-select listboxes or separating out of column values in different controls, would have gone a long way in making tedious repetitive selecting-and-setting situations all but unnecessary.

    In my opinion, developers should always look at ways of minimizing the need to use mouse-clicks, and ways to allow easy multi-select from lists using standard “shift/ctrl-click” type of operations.

    I think there are still hosts of opportunities for improvement like this in ArcGIS interface design, one prominent example being the Select by Attributes dialog. Why can’t I simply multi-select multiple values from a single field and have ArcGIS build an “OR” type SELECT query on them? This is a very common operation and should be implementable through an interface.

    I am aware that “generic” interface dialogs like the ones used in Geoprocessing tools may represent challenges, but I still think quite a number of GUI interface dialogs of ArcGIS could use a serious “make-over” to make them far more productive and less tedious to use with much less mouse-clicking required.

    • Drew Flater says:

      Thank you for your comment. I agree there is room for improvement on many geoprocessing tool dialogs. Future ArcGIS for Desktop releases and updates will strive to make these important interfaces as good as possible. This post is not meant to expose bad UI, but rather explain how Python scripting can be used to improve productivity. Even with the best UI, scripting presents many opportunities to make our work easier on a daily basis.

  3. wdnr_wetlands1 says:

    I am new to Python scripting. This example provided seems similar to what I am trying to do, but I still need help. I am trying to select by location station points that fall within a watershed boundary, and then label them with the watershed ID from the watershed polygon file. At present I am trying to use CalculateField, but am having a difficult time getting the right scripting to get the data from the polygon file to be transferred to the point file. Is there another command I should use, or if it is CalculateField, how should that be written?