Field mapping and Python scripting

A number of geoprocessing tools, including Spatial Join (Analysis), Append (Management), Merge (Management), Feature Class To Feature Class (Conversion), and Table To Table (Conversion), have a Field Map parameter that controls how fields from the input dataset(s) are processed and written, or mapped, to the output dataset. Beyond simply moving attributes from input to output, field mapping is also useful for common tasks such as concatenating fields and calculating statistics like mean, sum, and standard deviation.
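The statistics and concatenation behavior is controlled by a FieldMap object's mergeRule property (with joinDelimiter used for concatenation). The sketch below is a minimal illustration, assuming a hypothetical feature class with TEST1, TEST2, and TEST3 fields; it averages the three fields into a single output field and writes the result with Feature Class To Feature Class (Conversion):

import arcpy

# Hypothetical input data (not from this article).
in_fc = r"C:\data\schools.gdb\scores"
out_gdb = r"C:\data\schools.gdb"

# One FieldMap that combines three input fields into a single output field.
fm = arcpy.FieldMap()
for fld_name in ["TEST1", "TEST2", "TEST3"]:
    fm.addInputField(in_fc, fld_name)
fm.mergeRule = "Mean"          # or "Sum", "StDev", "Join" (concatenation), etc.

# Define the output field the merged values are written to.
outfld = fm.outputField
outfld.name, outfld.aliasName, outfld.type = "AVG_TEST", "Average test score", "DOUBLE"
fm.outputField = outfld

# Add the FieldMap to a FieldMappings object and pass it to the tool.
fieldmappings = arcpy.FieldMappings()
fieldmappings.addFieldMap(fm)
arcpy.conversion.FeatureClassToFeatureClass(in_fc, out_gdb, "scores_avg", "", fieldmappings)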

Field Map in the Append (Management) tool dialog

If you haven’t used the Field Map before, you should! Understanding and using field mapping will often reduce the number of processing steps in a workflow, and ensure that, in any scenario, attributes are handled in an appropriate way. Yes, the Field Map parameter is a complicated one, but it is well worth the time it takes to figure it out.

Because the Field Map is a complicated parameter, working with it in Python can also be complicated. The best way to interact with field mapping in Python scripting is with the FieldMappings object. In Python, most geoprocessing tool parameters are specified as simple numbers or strings (specifying a feature class input is as easy as providing the feature class's path), but several of the more complex parameters have objects that exist to help you work with them effectively. The Field Map parameter can accept a long structured string describing the field map settings (you may have seen this string equivalent in geoprocessing messages); however, working with the field mapping string is inefficient and error-prone, so use the FieldMappings object for the best experience.

String equivalent of Field Map parameter

The FieldMappings object has many properties and methods for working efficiently with field mapping
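As a quick illustration of the object-based approach described above, the sketch below (using a hypothetical table path) builds a FieldMappings object from an existing table and prints its long string equivalent; the exportToString() and loadFromString() methods convert between the object and string forms:

import arcpy

# Hypothetical target table (not from this article).
target = r"C:\data\demo.gdb\target_table"

# addTable() pre-populates the object with one FieldMap per field in the table.
fieldmappings = arcpy.FieldMappings()
fieldmappings.addTable(target)

# Inspect the output field of each FieldMap.
for i in range(fieldmappings.fieldCount):
    print(fieldmappings.getFieldMap(i).outputField.name)

# The same settings expressed as the long string seen in geoprocessing messages.
fm_string = fieldmappings.exportToString()
print(fm_string)

# A string produced this way can be loaded back into an object if needed:
# fieldmappings.loadFromString(fm_string)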

Problem

I was recently presented with a data migration problem where field mapping and Python scripting saved me weeks of work. The goal was to convert a collection of hundreds of VPF (Vector Product Format) databases, each containing many feature classes, to a handful of geodatabases, and because of the large scale of the migration it had to be accomplished in an automated fashion (this is where the weeks of work were saved). The schema of the geodatabases was already set up with a number of feature datasets and empty feature classes into which the VPF feature class data would be imported using the Append (Management) tool.

The iteration through the collection of VPF databases was solved with some simple looping techniques involving the arcpy.ListDatasets() and arcpy.ListFeatureClasses() functions. However, there was a fundamental problem that nearly derailed the automation of this process: VPF feature classes can have spaces in their field names, while geodatabase datasets cannot. When the empty geodatabase feature classes were created from the schema of the VPF feature classes, the spaces in the field names were automatically changed to underscores ( _ ). This very subtle difference caused huge ripples in the automated process, because the Append (Management) tool cannot automatically match a field like ‘mcc description’ to ‘mcc_description’; in the output geodatabase feature class, all the values in the ‘mcc_description’ field were NULL because the fields were not matched.
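A quick way to see the mismatch is to list the field names on both sides. The sketch below uses hypothetical paths for one VPF feature class and its geodatabase counterpart (the dataset and field names are only examples):

import arcpy
import os

# Hypothetical locations (not the paths used in the actual migration).
vpf_folder = r"C:\data\dnc13"
arcpy.env.workspace = os.path.join(vpf_folder, "gen02c:ecr")   # a VPF dataset

vpf_fields = [f.name for f in arcpy.ListFields("ecrarea")]
gdb_fields = [f.name for f in arcpy.ListFields(r"C:\data\DNCGeneral.gdb\ecr\ecrarea")]

print(vpf_fields)   # includes names like 'f_code description'
print(gdb_fields)   # includes names like 'f_code_description'

# Fields that differ only by a space versus an underscore are not matched
# automatically by the Append tool.
unmatched = [name for name in vpf_fields
             if name not in gdb_fields and name.replace(" ", "_") in gdb_fields]
print(unmatched)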

VPF and geodatabase feature class fields

Viewing the Field Map parameter on the Append (Management) tool helps illustrate the issue. The fields ‘f_code description’, ‘mcc description’, and ‘mcs description’ are not automatically matched to the geodatabase target schema because of the difference between the space and underscore characters.

Append tool default Field Map, showing the target geodatabase feature class schema

Solution

This situation can be rectified by manually adding the matching input fields using the Field Map controls on the Append tool. For each field that does not have a match, right-click and add the appropriate input field from the VPF feature class.

 

Use the Field Map controls to do manual field mapping

These manual steps resulted in a correct field mapping and import of the VPF feature class data to the geodatabase for a single VPF feature class. To perform the automated migration, I incorporated this additional field mapping work into the migration script using the FieldMappings object. On each iteration, before a VPF feature class is appended into a geodatabase feature class, the script goes through each VPF feature class field, finds the corresponding target field by replacing spaces with underscores, and matches the two fields in the FieldMappings object. A good way to understand the usage of the FieldMappings object is to think about these steps in the same context as the field mapping done in the tool dialog: the script goes through each output field in the Field Map (these are the fields from the Target Dataset) and adds the input field from the VPF feature class that matches, ignoring the difference between space and underscore characters, before finally executing the Append process with the customized field mapping.

import arcpy
import os

folder = r"C:testingfieldmapautodnc13"
gdbfolder = r"C:testingfieldmapautogdbs"

arcpy.env.workspace = folder
listvpf = arcpy.ListDatasets()

for vpf in listvpf:
    # First character of the VPF determines the geodatabase to append to
    #
    if vpf[0] == "h":
        gdb = os.path.join(gdbfolder, "DNCHarbor.gdb")
    elif vpf[0] == "a":
        gdb = os.path.join(gdbfolder, "DNCApproach.gdb")
    elif vpf[0] == "c":
        gdb = os.path.join(gdbfolder, "DNCCoastal.gdb")
    elif vpf[0] == "g":
        gdb = os.path.join(gdbfolder, "DNCGeneral.gdb")

    # The characters after the : in the VPF name determine the feature dataset to append to
    #
    fd = vpf.split(":")[1]
    arcpy.env.workspace = os.path.join(folder,vpf)
    listvpffc = arcpy.ListFeatureClasses()
    for fc in listvpffc:
        targetd = os.path.join(gdb, fd, fc)

        # Create FieldMappings object and load the target dataset
        #
        fieldmappings = arcpy.FieldMappings()
        fieldmappings.addTable(targetd)

        # Loop through each field in the input dataset
        #
        inputfields = [field.name for field in arcpy.ListFields(fc) if not field.required]
        for inputfield in inputfields:
            # Iterate through each FieldMap in the FieldMappings
            #
            for i in range(fieldmappings.fieldCount):
                fieldmap = fieldmappings.getFieldMap(i)
                # If the field name from the target dataset matches to a validated input field name
                # 
                if fieldmap.getInputFieldName(0) == inputfield.replace(" ", "_"):
                    # Add the input field to the FieldMap and replace the old FieldMap with the new
                    # 
                    fieldmap.addInputField(fc, inputfield)
                    fieldmappings.replaceFieldMap(i, fieldmap)
                    break

        # Perform the Append
        # 
        arcpy.management.Append(fc, targetd, "NO_TEST", fieldmappings)

Summary

Several geoprocessing tools use the complex but powerful Field Map parameter to control how attributes from the input dataset(s) are mapped to the output dataset. Understanding and using the field mapping parameter on geoprocessing tools enables time-saving workflows, often reduces the number of required processing steps, and ensures that attributes are never lost and are always handled appropriately. In Python scripting, the FieldMappings object provides efficient, automatable access to all the same functionality as the Field Map control on the tool dialog, and can be used in many scenarios, including programmatically matching fields with different names.

Use field mapping; it will make your job easier!



20 Comments

  1. rviger says:

    Hi Drew,
    Interesting post. Can you explain line 37 a little more?
    Regards,
    Roland

    • Drew Flater says:

      Hi rviger,
      Thanks for your feedback and comment.

      Line 37 (the line that builds the inputfields list) makes a list of the field names in the VPF feature class, excluding any required fields (like the ObjectID and Shape fields). The line uses a shorthand way of making a list, called a list comprehension.
      http://docs.python.org/tutorial/datastructures.html#list-comprehensions

      As you can see in the examples on the Python documentation page above, list comprehensions are a readable and concise way to reduce the amount of code it takes to make a list, where "each element is the result of some operations applied to each member of another sequence". So in this case, it is a shortcut to make a list of field names from each element (field) in another list (the list of Field objects returned by arcpy.ListFields).
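
      For reference, that list comprehension is equivalent to a plain loop like this (a sketch, not part of the original script):

      inputfields = []
      for field in arcpy.ListFields(fc):
          # Skip required fields such as the ObjectID and Shape fields.
          if not field.required:
              inputfields.append(field.name)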

      Thanks again, I hope this makes sense.
      -Drew

  2. curtvprice says:

    Drew, could you show us how you posted the code with the nifty formatting? Might be a handy trick for our own code sharing.

  3. u-kvick says:

    OK, but does this Append use the TEST option or not (that is, does it require matching schemas)?

    • Drew Flater says:

      Hi u-kvick,
      Thanks for the good question. To do any custom field mapping like what is demonstrated above, you must use the NO_TEST option for schema matching. With the TEST option, the schemas (field names, types, etc.) must be an absolute match or the process will raise an error. You can see where the NO_TEST option is specified in the last line of the script sample above (the Append call).
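
      With the TEST option the schemas must match exactly, so one way to check two datasets before running Append is to compare their field names and types. A minimal sketch, with hypothetical paths:

      import arcpy

      def schema(dataset):
          # Return field name, type, and length for every non-required field.
          return set((f.name, f.type, f.length)
                     for f in arcpy.ListFields(dataset) if not f.required)

      input_schema = schema(r"C:\data\demo.gdb\input_fc")
      target_schema = schema(r"C:\data\demo.gdb\target_fc")

      print("Only in input: {0}".format(sorted(input_schema - target_schema)))
      print("Only in target: {0}".format(sorted(target_schema - input_schema)))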

      Thanks again!
      Drew

      • u-kvick says:

        Hi Drew and thanks for your quick reply.

        Is there a way to do the same but with the TEST option? Or do you have an approach to be 100% sure that both feature classes (input and target) have matching schemas before running the Append tool? I have been trying to append an input feature class to an existing target feature class for the last couple of days without success; I get the following error: "000466: does not match the schema of target". Even though I am sure that they have matching schemas, the tool raises an error saying they do not match. I have checked data types, field names, domains, etc.

        Regards,

        Felipe Verdú

  4. bravodelta says:

    Drew, this is very useful, thanks. Can you explain in more detail the process/tools you used to create your empty DNC geodatabases from the DNC schema/data, etc.?

    • Drew Flater says:

      Hello bravodelta,
      Sorry, I cannot be specific about the steps to set up the geodatabase because I was not involved in that part of the project. From what I understand it was very close to the steps you outlined in your next comment.

  5. bravodelta says:

    Drew,

    I need to accomplish a task similar to the one you describe, and have been running your script with DNC data.

    Here is how I prepared the geodatabase templates:
    1. Used ArcCatalog to create the geodatabase
    2. Created all the feature datasets (CUL, HYD, ECR, etc.)
    3. Picked a library (i.e. GEN02B) from a DNC like 02 and imported all of its datasets and feature classes
    4. Went through each dataset and made sure I had all the feature classes for that dataset. Some feature classes were in one DNC but not in others.
    5. Deleted the data for all feature classes within ArcCatalog

    For the most part your script runs very well.
    However, it seems to throw arbitrary exceptions during execution (000224 (failed to insert), 999998, 999).
    By arbitrary I mean the exceptions do not occur in the same spot on successive runs, though they always occur on the arcpy.management.Append() call.
    Sometimes on re-running the script it will append a feature class with no problem at a spot where it had thrown an exception on a previous run.

    I added some print statements to your script. Here is an example of where it faults.
    gen02c:ecr

    Feature Classes…
    ecrarea

    Input Fields…
    Shape
    f_code
    f_code description
    tile_id
    fac_id

    Traceback (most recent call last):
    File "C:\esriargdb\dnc_loader.py", line 62, in
    arcpy.management.Append(fc, targetd, "NO_TEST", fieldmappings)
    File "C:\ArcGIS\Desktop10.1\arcpy\arcpy\management.py", line 3560, in Append
    raise e
    ExecuteError: ERROR 999998: Unexpected Error.
    Failed to execute (Append).

    Any idea what might be causing this problem, or how to fix it?

    • Drew Flater says:

      Hi bravodelta,
      Sorry, no idea what might be causing the problem or how to fix it. In my testing I did not encounter any errors like you describe. Are you on the latest service pack? I would suggest contacting Esri technical support to log an issue if this keeps occurring; they will try to reproduce the issue and then pass it to the development staff to get it fixed. All the best.

      • bdister says:

        There appears to be a problem with ESRI versions 10.x and the VPF import code which apparently did not get updated. When we went back to 9.3.1 it fixed the problem.

  6. se2539 says:

    When I first saw this blog entry I was quite happy, as it seemed to be the perfect solution for a field mapping issue I am having. Basically I am trying to append one feature class to another, where the only difference between the two feature classes is that some field names have a prefix. For instance, one feature class will have a field name like "HOUSE" and the other will have a name like "SAMPLE_HOUSE". My thinking was to omit the prefix entirely so the field names match, something along the lines of:

    inputfield.replace("SAMPLE_", ""):

    Unfortunately, the fields with the prefix do not map; everything else works though. Am I missing something?

    • Drew Flater says:

      Hello se2539,
      Sounds like you are applying the principle correctly to your scenario. I tried this case with some local data using the "SAMPLE_" prefix, and it executed correctly after modifying that line in the script. Unfortunately I can't think of a reason why your scenario doesn't work. My working code looks like this; check your code against it to see if there are any differences.


      import arcpy
      import os

      inputd = r"E:\fieldmap\auto\se2539.gdb\input"
      targetd = r"E:\fieldmap\auto\se2539.gdb\target"

      fieldmappings = arcpy.FieldMappings()
      fieldmappings.addTable(targetd)

      inputfields = [field.name for field in arcpy.ListFields(inputd) if not field.required]
      for inputfield in inputfields:
          for i in range(fieldmappings.fieldCount):
              fieldmap = fieldmappings.getFieldMap(i)
              if fieldmap.getInputFieldName(0) == inputfield.replace("SAMPLE_", ""):
                  fieldmap.addInputField(inputd, inputfield)
                  fieldmappings.replaceFieldMap(i, fieldmap)
                  break

      arcpy.management.Append(inputd, targetd, "NO_TEST", fieldmappings)

    • Drew Flater says:

      Just thought of a possible issue: which feature class in the append operation has the "SAMPLE_" prefix, the Input dataset or the Target dataset? In my scenario described in the blog and the previous comment, the Input dataset is the one that has the prefix or subtle difference from the target: in my test the Input dataset has the "SAMPLE_HOUSE" field and the Target dataset has the "HOUSE" field. If this is reversed in your scenario it will take some slightly different logic. Let me know.
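
      For instance, if the Target dataset had the "SAMPLE_" prefix instead, the comparison could be reversed, something like this (a sketch under the same assumptions as the snippet above):

      for i in range(fieldmappings.fieldCount):
          fieldmap = fieldmappings.getFieldMap(i)
          # Strip the prefix from the target field name before comparing.
          if fieldmap.getInputFieldName(0).replace("SAMPLE_", "") == inputfield:
              fieldmap.addInputField(inputd, inputfield)
              fieldmappings.replaceFieldMap(i, fieldmap)
              break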

  7. ethanone says:

    I'm not understanding how this works. I'm expecting something like table1.fieldA = table2.fieldB. How do you specify which field from your first table goes to which field from the second table? The FieldMap object only has an addInputField(table_dataset, field_name) method; there doesn't seem to be any way to specify the corresponding field from another table.

    • Drew Flater says:

      Hi ethanone, in a simple case you could create an empty FieldMap object, then use addInputField() to add both table1.fieldA and table2.fieldB to it; then you need to set the field map's output field name and type properties at a minimum before adding it to a FieldMappings object. Something like this:

      fieldmappings = arcpy.FieldMappings()
      fieldmap = arcpy.FieldMap()
      fieldmap.addInputField(table1, fieldA)
      fieldmap.addInputField(table2, fieldB)
      fld = fieldmap.outputField
      fld.type, fld.name, fld.aliasName, fld.isNullable = "DOUBLE", "Output Field", "Output Field Alias", True
      fieldmap.outputField = fld
      fieldmappings.addFieldMap(fieldmap)

      In the example in the blog post above, I use the FieldMappings object's addTable() method, which creates one FieldMap object per field from that first table (each of those FieldMap objects has one input field), and then use addInputField() to add the matching input field from the second table to the FieldMap object.

  8. gonzague32 says:

    Can we get an example of appending one dataset to another when the field names in the target dataset are different?

    Dataset1, fields: a, b, c
    Dataset2, fields: d, e, f
    Thanks