shitijmehta

Recent Posts

Generating a multivalue choice list

Prerequisite Reading
Generating a choice list from a field

This blog is an extension of the blog Generating a choice list from a field, explaining:

How to create a MULTIVALUE parameter choice list from an input feature class/table automatically. In this example, multiple input parameters from a choice list are then iterated through. The model can be easily extended to carry out numerous analyses that require user-selected input parameters.
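When a script tool reads a multivalue parameter as text, the choices typically arrive as one semicolon-delimited string, with values that contain spaces wrapped in single quotes. The following is a minimal plain-Python sketch of splitting such a string so that each choice can be iterated; the function name `split_multivalue` and the sample string are illustrative assumptions and are not taken from the downloadable model.

```python
def split_multivalue(text):
    """Split a semicolon-delimited multivalue string into a list of values."""
    values = []
    for item in text.split(";"):
        item = item.strip()
        # Values containing spaces usually arrive wrapped in single quotes.
        if len(item) >= 2 and item[0] == item[-1] == "'":
            item = item[1:-1]
        if item:
            values.append(item)
    return values


if __name__ == "__main__":
    # Hypothetical multivalue string, used only for illustration.
    for bird in split_multivalue("'Canada Goose';Pelican;Flamingo"):
        print(bird)
```

This naive split assumes that no value itself contains a semicolon.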

Posted in Analysis & Geoprocessing | Tagged , , , , , , , | Leave a comment

2D Beehive Tool a.k.a. Create Hexagons Tool

The 2D Beehive tool creates a hexagonal polygon feature class for applications such as wildlife management, or for data binning, e.g. to map large point-based datasets as explained here.

Download the tool from here. The tool also adds the hexagon ID, based on a polygon’s spatial location, starting with the bottom left hexagon and going left to right and bottom to top (see image below).
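The numbering order described above (bottom left first, then left to right and bottom to top) can be sketched in plain Python by sorting hexagon centroids on their y coordinate and then on their x coordinate. This is only an illustration of the ordering, not the tool's actual code; the function name `assign_hexagon_ids`, the `precision` parameter and the sample centroids are assumptions.

```python
def assign_hexagon_ids(centroids, precision=6):
    """Return {centroid_index: hexagon_id}, numbering from 1.

    centroids is a list of (x, y) tuples. Coordinates are rounded so that
    hexagons in the same row compare as equal despite floating-point noise.
    """
    order = sorted(
        range(len(centroids)),
        key=lambda i: (round(centroids[i][1], precision),
                       round(centroids[i][0], precision)),
    )
    return {index: hex_id for hex_id, index in enumerate(order, start=1)}


if __name__ == "__main__":
    # Hypothetical centroids: two hexagons in a bottom row, two in the row above.
    sample = [(1.5, 1.0), (0.0, 0.0), (3.0, 0.0), (4.5, 1.0)]
    print(assign_hexagon_ids(sample))
```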

Posted in Analysis & Geoprocessing | Tagged , , | 2 Comments

Concatenate Row Values

In a table you can concatenate field values or row values.

Concatenating field values
Learn more about concatenating field values from this blog.

Concatenating row values
Download the script tool from here. This tool is based on a Python script that takes an input table, selects one or more rows based on a field value, and concatenates the row values of any other specified field. The tool does not create a new output; it updates the input data.
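The idea of concatenating row values can be sketched without ArcPy by treating the table as a list of dictionaries. This is a hedged illustration, not the downloadable script: the function name `concatenate_row_values`, its parameters and the sample field names are assumptions.

```python
def concatenate_row_values(rows, key_field, value_field, out_field, delimiter=", "):
    """Update rows in place: rows that share a key_field value receive the
    joined value_field values of that group in out_field."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key_field], []).append(str(row[value_field]))
    for row in rows:
        row[out_field] = delimiter.join(grouped[row[key_field]])
    return rows


if __name__ == "__main__":
    # Hypothetical table used only for illustration.
    table = [
        {"County": "A", "City": "X"},
        {"County": "A", "City": "Y"},
        {"County": "B", "City": "Z"},
    ]
    for record in concatenate_row_values(table, "County", "City", "Cities"):
        print(record)
```

As in the tool described above, no new table is produced; the input rows themselves are updated.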

Posted in Analysis & Geoprocessing | Tagged , , | 21 Comments

Understanding which "if" troubles you

Each entry below lists the question, which "if" it is, which tool to use, and any additional help.

Feature

Question: How do I change the field values based on a condition?
Example: Select rows of field 1 with value X, and for those selected rows add value Y to field 2.
Which If? If field value…
Use which tool? Calculate Field, Select Layer By Attribute, Select

Question: How do I continue running/stop the tool from running in a model if the condition is true/false?
Which If? If condition is true/false…
Use which tool? While, Stop

Question: How do I run the model based on whether the input feature type is a point/polyline/polygon?
Example: Do a conditional check on whether the input feature class is a polygon or a point feature type. If it is a point type, run the next tool, Add Location. If it is a polygon feature, convert the polygon feature to a point feature by using the Feature To Point tool before running the Add Location tool.
Which If? If feature type…
Use which tool? Calculate Value, Script tool
Additional Help: If-then-else blog post 1, If-then-else blog post 2

Question: How do I run the model only if the input has a projection similar to another feature class or to a projection system specified by me?
Which If? If projection is/isNot…
Use which tool? Calculate Value, Script tool
Additional Help: If-then-else blog post 5

Question: How do I run the model only if the SQL query resulted in any selected features?
Example: A common SQL query is used in the Select Layer By Attribute tool to make a selection for all the input layers. Sometimes the SQL query results in selected features in one layer and no selected features in another layer. Do a conditional check on whether any features are selected in each layer.
Which If? If selection exists/Does not exist…
Use which tool? Select Layer By Attribute, Select + Get Count, Script tool
Additional Help: If-then-else blog post 4, If-then-else blog post 2

Question: How do I know if a file exists in a particular directory or not?
Example: Do a conditional check on whether a particular feature class exists in a particular workspace, and if it does, continue running the next tool, Add Field, in the model. If it does not exist, copy the feature class from a backup location before adding a field.
Which If? If feature class exists/Does not exist…
Use which tool? Calculate Value, Script tool
Additional Help: If-then-else blog post 1, If-then-else blog post 2

Raster

Question: How do I perform a conditional if/else evaluation on each cell of the input raster?
Example: If the value is greater than 2000, set the new value to be the same as the input data; else set the remaining cells to NoData. Leaving the Input false raster parameter in the Con tool empty sets the value to NoData. Optionally you can provide another raster or a constant value.
Which If? If cell value…
Use which tool? Con, Raster Calculator

Question: How do I extract the cells of a raster based on a query?
Example: Create a new raster layer with cell values that are greater than 2000.
Which If? If cell value…
Use which tool? Extract By Attribute, Make Raster Layer, Select Layer By Attribute

Question: How do I perform a true/false evaluation of the input raster using an expression?
Example: If the cell value is greater than 2000, then set those cell values to 1 (true); else set the output cell value to 0.
Which If? If cell value…
Use which tool? Test, Con, Raster Calculator

Question: How do I select cells based on an SQL query and set them to NoData?
Example: If the cell value is greater than 2000, then set those cell values to NoData; else use the input data values for the remaining cells.
Which If? If cell value…
Use which tool? Reclassify, Set Null, Raster Calculator

Question: How do I set the NoData values in my input raster to something else?
Example: If the cell value is NoData, then set those cell values to 0 or use values from another raster for those cells. Set the remaining cell values to be the same as the input data.
Which If? If cell value is NoData…
Use which tool? Reclassify, Is Null + Con

Question: How do I check the properties of a raster dataset and run the model only if the property value matches a particular condition?
Example: Check if the mean of all cell values is greater than 100.
Which If? If raster property…
Use which tool? Get Raster Properties
Additional Help: Examples of using the Calculate Value tool to set conditions

Question: How do I check the raster format of the input data and run a model if it is of a particular type?
Example: Check if the raster dataset format is of type Grid, Img, Tif, etc., then branch your model to run tools based on the condition.
Which If? If raster format…
Use which tool? Calculate Value, Script tool using the Describe function
Additional Help: Raster dataset formats + Examples of using the Calculate Value tool to set conditions
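The per-cell logic behind the raster examples above (the value 2000 is the threshold used in those examples) can be sketched in plain Python, with nested lists standing in for a raster and `None` standing in for NoData. This is not Spatial Analyst code; the function names are hypothetical and only mirror what Con, Test, Set Null and Is Null + Con are described as doing.

```python
NODATA = None  # stand-in for a raster's NoData value


def con_keep_above(grid, threshold):
    """Keep cells greater than threshold; set every other cell to NoData."""
    return [[v if v is not NODATA and v > threshold else NODATA for v in row]
            for row in grid]


def flag_above(grid, threshold):
    """Return 1 where the cell is greater than threshold, else 0.
    NoData cells stay NoData."""
    return [[NODATA if v is NODATA else int(v > threshold) for v in row]
            for row in grid]


def set_null_above(grid, threshold):
    """Set cells greater than threshold to NoData; keep the other values."""
    return [[NODATA if v is NODATA or v > threshold else v for v in row]
            for row in grid]


def fill_nodata(grid, replacement=0):
    """Replace NoData cells with a constant; keep the other values."""
    return [[replacement if v is NODATA else v for v in row] for row in grid]


if __name__ == "__main__":
    # Hypothetical 2 x 2 raster used only for illustration.
    raster = [[1500, 2500], [NODATA, 2000]]
    print(con_keep_above(raster, 2000))
    print(flag_above(raster, 2000))
    print(set_null_above(raster, 2000))
    print(fill_nodata(raster, 0))
```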
Posted in Analysis & Geoprocessing | Tagged , , , , , | 2 Comments

If you are stuck at "if" – Part 5- Does Projection Exist model example

Posted in Analysis & Geoprocessing | Tagged , , , , , , , , | Leave a comment

If you are stuck at "if" – Part 4 – Does Selection Exists model example

Recap

If you haven’t already, take a quick look at:

Part 1 on examples of using the Calculate Value tool to create branches using if-else logic.

Part 2 builds on Part1 highlighting how to: create a script tool from the Python script, a value list, expose tool parameters and create model parameters.

Part 3 highlights an example of using if-then-else script tool and a tool to generate custom errors, messages and warnings.

This part 4 highlights:

  • Branching using a script tool with if-then-else logic
  • Using Feature Set, value lists, preconditions and in-line variable substitution
  • Creating complex SQL expressions
  • Creating a model tool

 
Concepts

What is Feature Set?
Feature sets allow features to be entered interactively and immediately used as input to a model. This interactive data entry is done through a process similar to digitizing features on a map. Any tool that uses a feature class as input also accepts a feature set. This means that for any of the tools, the input can be entered interactively by building a simple model around the tool and changing the input data type from feature class to feature set.

 

What is inline variable substitution?
In ModelBuilder, the value of any variable can be used in a tool parameter by enclosing the name of the variable in percent signs (%). Substituting variables in this manner is called inline variable substitution.

For example:
Suppose we create a variable of type String in a model and rename it Bird. The name of the variable is now Bird, but its value can be anything, such as Canada Goose, Pelican or Flamingo. We can use this variable to make a selection in the Select Layer By Attribute tool by using the variable name in the SQL query expression: "Type" = '%Bird%'. When the tool executes, the name of the variable (Bird) is replaced by the value of the variable (Canada Goose), and all the records with the value Canada Goose are selected (see illustration 1 below).

Illustration 1 – Using inline variable substitution
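The substitution step can be imitated in plain Python to see what the expression looks like after the variable names are replaced. ModelBuilder does this internally; the sketch below is only an approximation, and the function name `substitute_inline_variables` and the sample values are assumptions.

```python
import re


def substitute_inline_variables(expression, variables):
    """Replace each %Name% token with the value of the model variable Name."""
    def replace(match):
        name = match.group(1)
        # Unknown names are left untouched so that mistakes stay visible.
        return str(variables.get(name, match.group(0)))
    return re.sub(r"%([^%]+)%", replace, expression)


if __name__ == "__main__":
    query = "\"Type\" = '%Bird%' AND (\"Year\" >= %From Year% AND \"Year\" <= %To Year%)"
    values = {"Bird": "Canada Goose", "From Year": 2005, "To Year": 2010}
    print(substitute_inline_variables(query, values))
```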

 
Setup

Download the data from here.

To follow this example open the DoesSelectionExists.mxd in ArcMap and the Does Selection Exist Model included with your download. This workflow shows an example of using the If-then-else script tool to check if any features exist in a selection based on a query.

 
Data

The sighting location data for different species of birds is hypothetical data generated for this example and set in Morro Bay, a popular bird watching location in California.

 
We start with

A model tool that lets any user of the model interactively provide the grid/quadrant of interest, the bird species, and the beginning and end years.

Illustration 2 – Does Selection Exists Data and Model Tool Dialog

This model tool was created from the model as shown in illustration 3.

Illustration 3 – Does Selection Exists Model

Inside the model the if-then-else script tool checks twice to see if the SQL query resulted in any selection for the selected grid/s. The second conditional check starts from the False output result of the first condition:

Illustration 4– How the if-then-else logic is used in the model

1st if-then-else condition
Condition: Based on 1st SQL query: "Type" = '%Bird%' AND ("Year" >= %From Year% AND "Year" <= %To Year%)
Do what? Select the specified bird species in the specified years for the selected grid/s.
Result True, then: Run Summary Statistics for the specified bird in the specified years for the selected grid/s.
Result False, then: Run Select Layer By Attribute to select all birds in the specified years for the selected grid/s and then check the if-then-else condition on the selection for the second time.

2nd if-then-else condition (only if the 1st if-then-else condition is False)
Condition: Based on 2nd SQL query: "Year" >= %From Year% AND "Year" <= %To Year%
Do what? Select all birds in the specified years for the selected grid/s.
Result True, then: Run Frequency for all the birds in the specified years for the selected grid/s.
Result False, then: Run Frequency for all the birds for all the years for the selected grid/s.

If there are bird sightings for the user-provided bird species during those particular years in that particular quadrant, the Does Selection Exist script tool returns a true value and the model calculates the statistics for that bird species by year to generate a summary table by year.

If the first selection has no features for the user-provided query, then the script tool sets the No Selection Boolean parameter to True and the Selection Exists Boolean parameter to False.

If the first condition is false, a second query is executed and the selection results are checked by the second if-then-else script tool for any bird sightings that occurred in the user-specified years. If true, a frequency table is generated for all the bird species (since the specified bird was not found) in that quadrant for the user-specified years. If there are no bird sightings in those years, then a frequency table is generated for all the birds found in all the years for just the selected grid/s (since no birds were found in the specified years for the grid/s).
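The two-stage decision described above can be sketched in plain Python. This is not the script tool from the download: the function names, the branch labels and the sample records are hypothetical, and the records are assumed to be already limited to the selected grid/s.

```python
def selection_flags(count):
    """Return (selection_exists, no_selection) for a selected-feature count."""
    return count > 0, count == 0


def choose_branch(sightings, bird, from_year, to_year):
    """Return a label for the branch of the model that would run."""
    in_years = [s for s in sightings if from_year <= s["Year"] <= to_year]

    # 1st condition: the specified bird in the specified years.
    first_selection = [s for s in in_years if s["Type"] == bird]
    exists, _ = selection_flags(len(first_selection))
    if exists:
        return "summary_statistics_for_bird"

    # 2nd condition: any bird in the specified years.
    exists, _ = selection_flags(len(in_years))
    if exists:
        return "frequency_for_specified_years"
    return "frequency_for_all_years"


if __name__ == "__main__":
    # Hypothetical sightings used only for illustration.
    records = [{"Type": "Pelican", "Year": 2006},
               {"Type": "Canada Goose", "Year": 2001}]
    print(choose_branch(records, "Pelican", 2005, 2010))
    print(choose_branch(records, "Canada Goose", 2005, 2010))
    print(choose_branch(records, "Flamingo", 2011, 2012))
```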

 
Creating the model

The model was created by:

  • Creating the script tool (see blog Part 2 to learn about creating a script tool). The script tool in this example uses the Get Count tool to find the number of selected features and then sets the True-False logic for the two branches. The branch that is true executes all the downstream tools. The branch that is false stops the execution of tools at that point.
  • Creating one String (Birds) and two Long type (From Year, To Year) variables from Insert menu > Create Variable > String/Long.
  • Renaming the String variable to Bird and the Long type variables to From Year and To Year. It is always good practice to rename the inputs and outputs to something more meaningful than the default variable name.
  • Creating a value list filter for the Birds, From Year and To Year parameters.
  • Adding and connecting the script tool with other tools as shown in illustration 3.
  • Setting preconditions on the Summary Statistics tool and both the Frequency tools.
  • Exposing the Input Feature Layer and the Selecting Features parameters of the Select Layer By Location tool.
  • Changing the data type of Selecting Features to Feature Set.

Illustration 5– Creating Feature Set

  • Renaming model variables as shown in illustration 3.
  • Making Selecting Features, Birds, From Year and To Year model parameters.
  • Using the variables as inline variable substitution in the Select and Select Layer By Attribute tools.

Illustration 6– Using inline variable substitution

  • Connecting the Frequency tool 1 and Frequency tool 2 to the Add Warning tool, which informs the user about which criterion was not met and what output is to be expected, as shown in illustration 7 below. Learn more about this custom warning tool from blog Part 3.

Illustration 7– Add Warning tool

Posted in Analysis & Geoprocessing | Tagged , , , , , | Leave a comment

If you are stuck at "if" – Part 3 – Does Extension Exists model example

This blog was supposed to be the last of a 3-part blog. But we discovered that we had more to say, so this third part isn’t the last; we’ve got a couple more to go.

Recap

If you haven’t already, take a quick look at:

Part 1 on examples of using the Calculate Value tool to create branches using if-else logic.

Part 2 builds on Part1 highlighting how to create a script tool from the Python script, creating a value list, exposing tool parameters and creating model parameters.

This part 3 of the blog highlights:

  • Branching using a script tool with if-then-else logic
  • Working with Value lists values
  • Using custom messaging tool

 
Data

Download the script tool and model from here.

 
We start with

An example model that uses the Hillshade tool from 3D Analyst extension, similar to the example in blog Part 1. This example shows only a part of a larger workflow and can be extended as required.

The model is created for sharing, but it requires the 3D Analyst extension, which might not be available to everyone the model is shared with. To account for such a case, a simple "if" script tool is added to check the availability of the extension. If the extension is not available, the model uses a custom messaging tool to inform the user that the license is unavailable and that the model cannot be run.

Illustration 1- Does Extension Exists model

The Does Extension Exist? script tool uses the CheckExtension ArcPy function to check the availability of the extension and the CheckOutExtension function to check out the extension for use.

Illustration 2- Does Extension Exists Script tool portion highlighting the ArcPy functions

 
Creating the model

The model is created by:

To make this script tool a generic tool that can be used to check the availability of any extension, a value list filter is set on the Extension to Check parameter from the script tool properties as shown below.

Illustration 3- Does Extension Exists value list parameter

The names that appear in the drop-down list on the tool dialog are the actual names of the extensions available with ArcGIS, such as 3D Analyst, Business Analyst, etc. However, each extension also has a code name that the script tool uses to look up the extension license. To account for the difference between the name that appears on the tool dialog and the code name that is used in the ArcGIS system, the script tool uses a list of if-then-else statements as shown below.

Illustration 4- Handling extension names and their codes in the script
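The script in the illustration handles this mapping with if-then-else statements. As an alternative sketch (a dictionary lookup, not the author's method), the same translation from display name to license code could look like the following. The three codes shown are the ones ArcPy's extension functions accept for these extensions; the function name `extension_code` and the choice of raising `ValueError` are assumptions.

```python
# Display name shown in the tool dialog -> extension code used for licensing.
EXTENSION_CODES = {
    "3D Analyst": "3D",
    "Spatial Analyst": "Spatial",
    "Network Analyst": "Network",
}


def extension_code(display_name):
    """Return the license code for an extension's display name."""
    try:
        return EXTENSION_CODES[display_name]
    except KeyError:
        raise ValueError("Unknown extension: {0}".format(display_name))


if __name__ == "__main__":
    print(extension_code("3D Analyst"))
```

In a script tool, the returned code would then be passed to CheckExtension and CheckOutExtension.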

  • Adding and connecting the script tool to other tools in ModelBuilder
  • Setting precondition

The Exists Boolean parameter is set as a precondition to the Hillshade tool. This ensures that the Hillshade tool will run only if the Exists Boolean parameter is True. The Does not Exist Boolean parameter is connected as a precondition to another script tool that adds a custom warning if the extension is not available. This custom script tool can be created easily with just three lines of code as shown below:

Illustration 5- Simple Add Warning script tool

To create a more generic tool that can add an error, a message or a warning see the Add Error, Message or a Warning script tool with the download. This tool has three optional input parameters, one each for a custom error, a message or a warning. The script uses if-then-else statements to check which input parameter has a value and based on that creates an appropriate output.

Illustration 6- Generic Add Error, Message or Warning script tool
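The decision made by such a generic tool can be sketched in plain Python. This is an assumption-laden illustration, not the script from the download: the function name `pick_output`, the order in which the parameters are checked, and the returned labels are all hypothetical. In a real script tool the chosen text would be passed on to ArcPy's AddError, AddMessage or AddWarning function.

```python
def pick_output(error_text="", message_text="", warning_text=""):
    """Return (severity, text) for the first optional parameter that has a value."""
    if error_text:
        return "ERROR", error_text
    elif message_text:
        return "MESSAGE", message_text
    elif warning_text:
        return "WARNING", warning_text
    # Nothing was supplied, so there is nothing to report.
    return None, ""


if __name__ == "__main__":
    print(pick_output(warning_text="The 3D Analyst extension is not available."))
```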

Any inline variable can be used as an input to this script tool's parameters. For example, to add an error that includes the name of the extension the user selected from the drop-down list, the variable Extension to Check is used as an inline variable as shown below:

Illustration 7- Using inline variable substitution in the error message

Posted in Analysis & Geoprocessing | Tagged , , , , , | 5 Comments

Python Multiprocessing – Approaches and Considerations

The multiprocessing Python module provides functionality for distributing work between multiple processes, taking advantage of multiple CPU cores and larger amounts of available system memory. When analyzing or working with large amounts of data in ArcGIS, there are scenarios where multiprocessing can improve performance and scalability. However, there are many cases where multiprocessing can negatively affect performance, and even some instances where it should not be used.

There are two approaches to using multiprocessing for improving performance or scalability:

  • Processing many individual datasets
  • Processing datasets with many features

The goal of this article is to share simple coding patterns for effectively performing multiprocessing for geoprocessing. The article will cover relevant considerations and limitations, which are important when attempting to implement multiprocessing.

1. Processing large numbers of datasets

The first example performs a specific operation on a large number of datasets in a workspace or set of workspaces. In cases where there are large numbers of datasets, taking advantage of multiprocessing can help get the job done faster. The following code uses the multiprocessing module to define a projection, add a field, and calculate the field for a large list of shapefiles. The code creates a pool of processes equal to the number of CPUs or CPU cores available. This pool of processes is then used to process the feature classes.

import os
import re
import multiprocessing
import arcpy

def update_shapefiles(shapefile):
    # Define the projection as WGS84; the factory code is 4326.
    arcpy.management.DefineProjection(shapefile, 4326)

    # Add a field named CITY of type TEXT.
    arcpy.management.AddField(shapefile, 'CITY', 'TEXT')

    # Calculate field CITY, stripping '_base' from the shapefile name.
    # basename() drops the folder so that only the file name is used.
    city_name = os.path.basename(shapefile).split('_base')[0]
    city_name = re.sub('_', ' ', city_name)
    arcpy.management.CalculateField(shapefile, 'CITY', '"{0}"'.format(city_name.upper()), 'PYTHON')

# End update_shapefiles

def main():
    # Create a pool and run the jobs; the number of jobs is equal to the number of shapefiles.
    workspace = r'C:\GISData\USA\usa'
    arcpy.env.workspace = workspace
    fcs = arcpy.ListFeatureClasses('*')
    fc_list = [os.path.join(workspace, fc) for fc in fcs]
    pool = multiprocessing.Pool()
    pool.map(update_shapefiles, fc_list)

    # Synchronize the main process with the job processes to ensure proper cleanup.
    pool.close()
    pool.join()

# End main

if __name__ == '__main__':
    main()

2. Processing an individual dataset with a lot of features and records

This second example looks at geoprocessing tools analyzing an individual dataset with a lot of features and records. In this situation, we can benefit from multiprocessing by splitting data into groups to be processed simultaneously. For example, finding identical features may be faster when you split a large feature class into groups, based on spatial extents. The following code uses a pre-defined fishnet of polygons covering the extent of 1 million points (Figure 1).

Figure 1: A fishnet of polygons covering the extent of one million points.

import multiprocessing
import arcpy

def find_identical(oid):
    # Create a feature layer for the tile in the fishnet.
    tile = arcpy.management.MakeFeatureLayer(r'c:\testing\testing.gdb\fishnet', 'layer{0}'.format(oid[0]),
                                             """OID = {0}""".format(oid[0]))

    # Get the extent of the feature layer and set the extent environment.
    tile_row = arcpy.SearchCursor(tile)
    geometry = tile_row.next().shape
    arcpy.env.extent = geometry.extent

    # Execute Find Identical.
    identical_table = arcpy.management.FindIdentical(r'c:\testing\testing.gdb\random1mil',
                                                     r'c:\cursortesting\identical{0}.dbf'.format(oid[0]), 'Shape')
    return identical_table.getOutput(0)

# End find_identical

def main():
    # Create a list of OIDs used to chunk the inputs.
    fishnet_rows = arcpy.SearchCursor(r'c:\testing\testing.gdb\fishnet', '', '', 'OID')
    oids = [[row.getValue('OID')] for row in fishnet_rows]

    # Create a pool and run the jobs; the number of jobs is equal to the length of the oids list.
    pool = multiprocessing.Pool()
    result_tables = pool.map(find_identical, oids)

    # Merge all the temporary output tables. This is optional; omitting it can increase performance.
    arcpy.management.Merge(result_tables, r'C:\cursortesting\ctesting.gdb\find_identical')

    # Synchronize the main process with the job processes to ensure proper cleanup.
    pool.close()
    pool.join()

# End main

if __name__ == '__main__':
    main()

Some tools do not require the data to be split spatially. The Generate Near Table example below processes the data in groups of 250000 features, selected by object ID ranges.

import multiprocessing
import arcpy

def generate_near_table(oid_range):
    # First and last object ID of the group handled by this job.
    i = oid_range[0]
    j = oid_range[1]

    # Create a feature layer containing only the features in the object ID range.
    lyr = arcpy.management.MakeFeatureLayer(r'c:\testing\testing.gdb\random1mil', 'layer{0}'.format(i),
                                            """OID >= {0} AND OID <= {1}""".format(i, j))

    # Execute Generate Near Table for the group.
    gn_table = arcpy.analysis.GenerateNearTable(lyr, r'c:\testing\testing.gdb\random10000',
                                                r'c:\testing\outnear{0}.dbf'.format(i))
    return gn_table.getOutput(0)

# End generate_near_table function

def main():
    oid_ranges = [[0, 250000], [250001, 500000], [500001, 750000], [750001, 1000001]]
    arcpy.env.overwriteOutput = True

    # Create a pool and run the jobs.
    pool = multiprocessing.Pool()
    result_tables = pool.map(generate_near_table, oid_ranges)

    # Merging the resulting tables is optional. It can add overhead if not required.
    arcpy.management.Merge(result_tables, r'c:\cursortesting\ctesting.gdb\generate_near_table')

    # Synchronize the main process with the job processes to ensure proper cleanup.
    pool.close()
    pool.join()

# End main

if __name__ == '__main__':
    main()
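The object ID ranges in the example above are typed in by hand. As a hedged sketch, they could also be generated with a small helper; the function name `make_oid_ranges` and its parameters are assumptions, and the ranges it returns are inclusive and non-overlapping, which differs slightly from the hard-coded list.

```python
def make_oid_ranges(first_oid, last_oid, chunk_size):
    """Return inclusive [start, end] object ID ranges covering first_oid..last_oid."""
    ranges = []
    start = first_oid
    while start <= last_oid:
        end = min(start + chunk_size - 1, last_oid)
        ranges.append([start, end])
        start = end + 1
    return ranges


if __name__ == "__main__":
    # One million features in groups of 250000, as in the example above.
    for oid_range in make_oid_ranges(1, 1000000, 250000):
        print(oid_range)
```

Each returned pair can be passed to a worker function in the same way as the entries of `oid_ranges`.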

Considerations

Here are some important considerations before deciding to use multiprocessing:

The scenario demonstrated in the first example will not work with feature classes in a file geodatabase (FGDB), because each update must acquire a schema lock on the workspace. A schema lock effectively prevents any other process from simultaneously updating the FGDB. This example will work with shapefiles and ArcSDE geodatabase data.

For each process, there is a start-up cost loading the arcpy library (1-3 seconds). Depending on the complexity and size of the data, this can cause the multiprocessing script to take longer to run than a script without multiprocessing. In many cases, the final step in the multiprocessing workflow is to aggregate all results together, which is an additional cost.
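A rough way to reason about this trade-off is a back-of-the-envelope estimate. The sketch below is a deliberately simplistic model, not a measurement: it assumes the work divides evenly across processes, that all processes start at the same time, and that start-up costs 2 seconds (the midpoint of the 1-3 seconds mentioned above). The function name and parameters are hypothetical.

```python
def estimated_times(work_seconds, processes, startup_seconds=2.0, merge_seconds=0.0):
    """Return (serial_estimate, parallel_estimate) in seconds."""
    serial = float(work_seconds)
    parallel = startup_seconds + work_seconds / float(processes) + merge_seconds
    return serial, parallel


if __name__ == "__main__":
    # A long job benefits; a short job with a merge step does not.
    print(estimated_times(3600, 4, merge_seconds=60))
    print(estimated_times(2, 4, merge_seconds=1))
```

Real timings depend on the data and on disk and memory contention, so measuring with your own data remains the only reliable test.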

Determining if multiprocessing is appropriate for your workflow can often be a trial and error process. This process can invalidate the gains made using multiprocessing in a one off operation; however, the trial and error process may be very valuable if the final workflow is to be run multiple times, or applied to similar workflows using large data. For example, if you are running the Find Identical tool on a weekly basis, and it is running for hours with your data, multiprocessing may be worth the effort.

Whenever possible, take advantage of the “in_memory” workspace for creating temporary data to improve performance. However, depending on the size of data being created in-memory, it may be necessary to write temporary data to disk. Temporary datasets cannot be created in a file geodatabase because of schema locking. Deleting the in-memory dataset when you are finished can prevent out of memory errors.

Summary

These are just a few examples showing how multiprocessing can be used to increase performance and scalability when doing geoprocessing. However, it is important to remember that multiprocessing does not always mean better performance.

The multiprocessing module was included in Python 2.6 and the examples above will work in ArcGIS 10.0. For more information about the multiprocessing module, refer to the Python documentation.

Please provide any feedback and comments to this blog posting, and stay tuned for another posting coming soon about “Being successful processing large complex data with the geoprocessing overlay tools”.

 

This post contributed by Jason Pardy, a product engineer on the Analysis and Geoprocessing team

Posted in Analysis & Geoprocessing | Tagged , , | 7 Comments

Generating a choice list from a field

Very often we are asked

  1. “How to generate a choice list for a parameter?” and
  2. “How to generate a choice list for a parameter from an input feature class/table automatically?” i.e. how to create a new choice list of unique field values each time the input feature class/table changes.

 
Generating a choice list for a parameter

For the first question, all you have to do is to update the value list property of a tool parameter.

A value list is a predefined set of input choices for a parameter. Only values contained in the value list (commonly called the drop-down list) are allowed as inputs. Values not in the list raise an error and the tool does not execute. In a model, a value list filter can be used for String, Long, Double and Boolean data types. For Long and Double types, you enter the allowable numeric values. For Boolean data types, the value list contains two values: the true value and the false value. The true value is always the first value in the list.

  • Learn more about setting a value list from here.
  • See an example of using the value list.

 
Generating a choice list for a parameter from an input feature class/table automatically

For the second question, life is simple again. All you need to do is:

  1. Download an example script tool from here.
  2. Add the script tool to the model.
  3. Expose the input parameters (input feature class, field, value) of the script tool in the model.
  4. Make the input parameters model parameters.
  5. Use as described in the example below:

In the following example, the user can define an input feature class (Bird Locations in this example) and a field (Type in this example) from the model tool dialog to generate a value list. The value list parameter is then populated with a choice list of all the unique values in the field. The output of the script tool is the selected value (Canada Goose in this example). The Select tool uses the output of the script tool (the derived output parameter as shown in illustration 1) as an inline variable in the expression to select features ("Type" = '%Output Value%').

Illustration 1 – Shows a model example of using the dynamically generated value list
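The core step, turning the values of a field into a sorted list of unique choices, can be sketched in plain Python with the table represented as a list of dictionaries. This is not the script tool's validation code; the function name `unique_choices` and the sample records are assumptions.

```python
def unique_choices(rows, field):
    """Return the sorted unique, non-empty values found in one field."""
    return sorted({row[field] for row in rows if row[field] is not None})


if __name__ == "__main__":
    # Hypothetical bird location records used only for illustration.
    bird_locations = [
        {"Type": "Pelican"},
        {"Type": "Canada Goose"},
        {"Type": "Pelican"},
        {"Type": None},
    ]
    print(unique_choices(bird_locations, "Type"))
```

Recomputing this list whenever the input table or field changes is what keeps the choice list in step with the data.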

 
Understanding the script and script tool

  1. A Python script is created and set as a source file for the script tool.
  2. Three required parameters are created for the script tool: input dataset, input field and input value. After the input dataset and input field values are received, they pass through custom validation code in the script tool, which automatically populates the third parameter (input value) with a choice list of unique values from the input field. The user of the tool can then choose any value from the populated list (see illustration 2 below).
  3. A derived output parameter is created, and set to be equal to the value selected by the user from the automatically generated choice list (see illustration 2 below). This is important to chain the script tool to other tools inside a model.

    Illustration 2 – Shows how the script tool parameter properties and the code in the script are connected

  4. Illustration 3 shows the script tool validation tab. The code can be edited by clicking the Edit button at the bottom of the validation tab. 

    Illustration 3 – Shows the validation code block part under the script tool validation tab

The workflow used in the validation code is as shown below:

Illustration 4 – Validation code workflow

To understand the validation code in detail, see the illustration below:

Illustration 5 – Line by line explanation of the validation code

The script tool was contributed by Jason Scheirer, a developer on the Analysis and Geoprocessing team.

Posted in Analysis & Geoprocessing | Tagged , , , , | 5 Comments

Tutorial on Iterators in ModelBuilder

Shitij Mehta has uploaded a tutorial on working with Iterators in ModelBuilder.  Click here to download the tutorial.

Posted in Analysis & Geoprocessing | Tagged , , , | Leave a comment