Automating data validation workflows

Several times I’ve been asked, “what’s the best way to automate data validation workflows?” With the release of ArcGIS 10, this can be achieved by integrating the ArcGIS Workflow Manager and ArcGIS Data Reviewer extensions. Data Reviewer has added custom steps that can be used within a Workflow Manager workflow. In this blog topic, I’ll describe how you can configure these custom steps.

If you don’t already know, using Workflow Manager you can define workflows that model your business processes and track the progress of assigned tasks. The steps in the workflow can be configured to execute functions like sending notifications, opening applications like ArcMap, and running geoprocessing tools.

Data Reviewer allows you to manage and track quality control of your data. You can automate data validation using rules stored in a batch job. A Reviewer session stores information about any errors that are found and tracks each issue through its life cycle to ensure it is fixed and verified.

Three Data Reviewer functions can be executed automatically in a Workflow Manager workflow using the newly added custom steps. The first, Create Reviewer Session, creates a Reviewer session to store errors. The second, Run Reviewer Batch Job, runs pre-defined validation checks, in the form of batch jobs, on your data. The third, Start Reviewer Session in ArcMap, opens the Reviewer session when ArcMap is launched from the workflow. Now, let’s see how to configure these steps.

Sample Workflow

As an example workflow, let’s consider the case where we have outsourced our data migration to a contractor and we want to validate their work before accepting final delivery. The figure below illustrates this sample workflow. If you’re not working with contractors you can still use the concepts and configurations described here to build your own workflows.

While the workflow contains all the steps required to validate the data delivery, we are only going to focus on three steps – Create Reviewer Session, Validate Changed Features, and Fix Errors – that take advantage of the custom steps described above.

 

The Create Reviewer Session step in our workflow uses the first custom step, Create Reviewer Session. Let’s say we want a separate Reviewer session for each job. In that case each session needs a unique name, and it is often useful to tie that name to the job ID or job name. You can do this by using tokens such as [JOB:ID] or [JOB:NAME] in the session name when you configure the step.
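
To make the token idea concrete, here is a minimal Python sketch of how a session name template could expand into a unique name per job. This is only an illustration of the concept: Workflow Manager resolves the tokens itself, and the `expand_tokens` function, the `job` dictionary, and the `QC_` prefix are all hypothetical names made up for this example.

```python
import re

# Hypothetical job properties; in Workflow Manager these come from the job itself.
job = {"ID": 1042, "NAME": "Contractor_Delivery_3"}

def expand_tokens(template, job_properties):
    """Replace each [JOB:KEY] token with the matching job property."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown tokens untouched so a typo stays visible.
        return str(job_properties.get(key, match.group(0)))
    return re.sub(r"\[JOB:([A-Z_]+)\]", substitute, template)

session_name = expand_tokens("QC_[JOB:ID]_[JOB:NAME]", job)
print(session_name)  # QC_1042_Contractor_Delivery_3
```

Because the job ID is unique, any template that includes [JOB:ID] yields a session name that cannot collide with another job's session.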

 

After the Reviewer session has been created, we move to the validation portion of our workflow. This part is a loop: we validate the data and, if errors are found, notify the contractor so they can fix them. We repeat this process until all the errors are fixed.

The Validate Changed Features step uses the Run Reviewer Batch Job custom step to validate the data. Here are a few things to keep in mind when configuring the step.

  • What data workspace or version will be validated? When configured, batch jobs store a fixed path to the database. To override this stored path and validate the version created for the job, choose the [JOB:VERSION] token when you configure the validation step.
  • What features will be validated? The Run Reviewer Batch Job custom step has a property that limits the features validated to only those within the job’s area of interest (AOI). This is especially useful if you plan on assigning different areas of interest, within the data delivery, to various editors or QC technicians. Another option is to limit the features validated to just the changed features, that is, those features that were created or edited within the current job version.
  • Where will the errors be stored? If you used the Create Reviewer Session step to create a new session for the job, you will want to enter the same values for the Reviewer Session Name and Reviewer Workspace when you set up the Run Reviewer Batch Job step.

Note: If you didn’t use the Create Reviewer Session step, make sure these properties point to a Reviewer session and workspace that already exist.
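
The point about matching values can be sketched as a simple consistency check. In the Python below, the two steps are written as plain dictionaries purely for illustration: the key names, the workspace path, and the `mismatched_settings` helper are hypothetical and are not the actual argument names of the custom steps.

```python
# Hypothetical step settings; the key names and the path are illustrative only.
create_session_step = {
    "session_name": "QC_[JOB:ID]",
    "reviewer_workspace": r"C:\Data\Reviewer.gdb",
}
run_batch_job_step = {
    "session_name": "QC_[JOB:ID]",
    "reviewer_workspace": r"C:\Data\Reviewer.gdb",
    "data_workspace": "[JOB:VERSION]",   # validate the job's own version
    "limit_to": "changed_features",      # the other option described is the AOI
}

def mismatched_settings(create_step, batch_step):
    """Return the shared settings whose values differ between the two steps."""
    shared = ("session_name", "reviewer_workspace")
    return [key for key in shared if create_step.get(key) != batch_step.get(key)]

print(mismatched_settings(create_session_step, run_batch_job_step))  # []
```

An empty list means the batch job will write its errors to the same session that the earlier step created.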

After executing the Validate Changed Features step, the workflow provides you with two different paths based on the results of the validation. To configure these paths, you’d use return codes. In this sample workflow, a return code of 1 means the data validation has failed and the contractor needs to be notified of the errors so they can be fixed. A return code of 0 means the validation has passed and the workflow automatically moves to the next step, which is to create a sample set of features for visual verification.
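
As a rough sketch of the branching, the Python below maps the two return codes from the sample workflow to the step that follows and then walks the validate-and-fix loop. The step names in `NEXT_STEP` and both function names are hypothetical; in Workflow Manager the paths are configured in the workflow itself rather than in code.

```python
# Return codes as used in the sample workflow: 0 = passed, 1 = failed.
VALIDATION_PASSED = 0
VALIDATION_FAILED = 1

# Hypothetical step names for the two outgoing paths.
NEXT_STEP = {
    VALIDATION_PASSED: "Create Sample Features",
    VALIDATION_FAILED: "Notify Contractor",
}

def next_step(return_code):
    """Pick the next workflow step from the validation return code."""
    if return_code not in NEXT_STEP:
        raise ValueError(f"Unexpected return code: {return_code}")
    return NEXT_STEP[return_code]

def run_until_clean(validation_results):
    """Follow the validate/fix loop over a sequence of return codes."""
    path = []
    for code in validation_results:
        path.append(next_step(code))
        if code == VALIDATION_PASSED:
            break  # validation passed, so the loop ends
    return path

print(run_until_clean([1, 1, 0]))
# ['Notify Contractor', 'Notify Contractor', 'Create Sample Features']
```

Here two failed validation runs each send the job back to the contractor, and the third run passes and moves the job forward.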

Because we found errors in our data during validation, the workflow moves to the Fix Errors step that uses the third custom step, Start Reviewer Session in ArcMap. Workflow Manager provides a Launch ArcMap step that has an optional parameter to run a command when ArcMap is opened. By adding the WMXReviewer.StartSession command in the step properties, the Reviewer session associated with the job automatically loads when ArcMap is launched. This enables you to simply open the Reviewer table and begin reviewing the errors.

While the overall sample workflow contains many additional steps, we’ve focused on three custom steps for integrating Data Reviewer and Workflow Manager to automate your data validation process. I hope you find this information helpful and are now thinking about integrating QC tasks into your workflows (if you haven’t already!).

Content contributed by Amber Bethell

One Comment

  1. abethell says:

    I just want to add a little information about the Launch ArcMap step. When using the Run ArcMap Command, you can use either the ProgID text description of the command (in this case WMXReviewer.StartSession) or the GUID for the command. If you are using Workflow Manager 10 SP2, you can only use the GUID. However, when Workflow Manager 10 SP3 is released, you will again have the option to use either. So if you have 10 SP2 installed, you may need to enter 9B23804E-DF53-4EA6-A5A3-704BD88FBA42 instead of WMXReviewer.StartSession.

    – Amber Bethell, Esri Production Mapping Team