A very common task is to concatenate two or more field values into a new field. A classic example of field concatenation occurs with street features. Street features typically have separate fields for the street prefix, street name, and street type, and your task is to concatenate these three field values into another field.
The example below shows this, where FULL_NAME is the concatenation of PREFIX, STREETNAME, and FTYPE.
To concatenate field values, you use the Calculate Field tool with an expression, as illustrated below. The field containing the results of the concatenation must be a text field.
!PREFIX! + ' ' + !STREETNAME! + ' ' + !FTYPE!
instructs Calculate Field to take the value of the PREFIX field, add (concatenate) a space, add the STREETNAME field, add a space, and, finally, add the FTYPE field. The exclamation points (!) surrounding the field names mean “use the value found in this field”. The expression is executed once for each feature in the input feature class. If there are selected records in the input table, only the selected records are updated. You can either type the expression directly into the text box, or click the calculator button to open the calculator dialog box to build your expression with the calculator.
The Expression Type parameter is set to PYTHON_9.3 since all the examples in this post use Python. (An Expression Type of PYTHON would also work for all these examples. The differences between PYTHON and PYTHON_9.3 are subtle, and are documented here. I use PYTHON_9.3 from force of habit.)
Using Code Blocks
The example above showed the most basic use of Calculate Field to perform concatenation; there are only three fields, they are all TEXT fields (as opposed to numeric fields), there are no null values, and the separator is a single space. But what if things aren’t so simple?
The table below is the result of an overlay (using the Intersect tool) of a vegetation and a soils feature class. The Veg_Soil field is a concatenation of VEGTYPE and SoilClass, but with a few special transformations thrown in.
- Leading spaces in VEGTYPE have been removed (” Englemann” transformed to “Englemann”)
- Initial capitals on VEGTYPE (“Whitebark pine” transformed to “Whitebark Pine”)
- Spaces removed in VEGTYPE (“Whitebark Pine” transformed to “WhitebarkPine”)
- Values separated by an underscore.
The illustration below shows the Calculate Field dialog used to create the Veg_Soil values. In this dialog, the Expression parameter is actually calling a Python function defined in the Code Block parameter. The name of this function is concat_veg_and_soil and takes two parameters, the VEGTYPE value and the SoilClass value. The name of the function is arbitrary – you can name it whatever you like, just as long as the name is the same in both the Expression and Code Block.
The first line of the code defines the function and the parameters it uses. In this case, there are only two parameters. Here is the entire code block:
def concat_veg_and_soil(veg, soil): # First strip off any leading/trailing spaces veg = veg.strip() # Remove hyphens ("Non-forest" to "Nonforest") veg = veg.replace('-', '') # Make initial caps ("Lodgepole pine" to "Lodgepole Pine") veg = veg.title() # Remove spaces ("Lodgepole Pine" to "LodgepolePine") veg = veg.replace(' ', '') # Special case for null soiltype if soil == '': return veg else: # Concatenate veg and soil type using "_" as separator return veg + '_' + soil
Like the Expression, the function defined in the Code Block executes once for each selected record. At the bottom of the function, the value for the Veg_Soil field is returned. The code block function must return a value, even if it’s an empty string.
Python has strict rules about indentation; a line of code that ends with a colon (“:”) is the beginning of a logical block, and this block is defined by any number of lines that have the same indentation. If you are writing a long function, it’s best to write the code in a Python IDE (such as IDLE or PythonWin) that auto-indents, then copy and paste the code into the code block.
Common string functions
The above code shows the most common string functions you’ll use in string concatenation. The table below describes these functions as well as some others you may need.
sVal = “Hello world”
|sVal = sVal.strip()
sVal = sVal.lstrip()
sVal = sVal.rstrip()
|strip() Removes leading and trailing spaces. lstrip() removes spaces on the left and rstrip() removes spaces on the right. You can supply specific characters to strip, as in lstrip(‘0‘) to remove leading zeros.|
|sVal = sVal.title()||Makes initial capitals for each separate word in the string.|
|sVal = sVal.upper()
sVal = sVal.lower()
|Makes entire string upper or lower case.|
|sVal = sVal.replace(<old>, <new>)||Replaces all occurrences of the string <old> with <new>. sVal.replace(‘orld‘, ‘alls‘) would result in “Hello walls“.|
|sVal = str(5.8299)||Returns the string representation of any value.|
|sVal = str(int(5.8299))||Returns the integer value of a real number. So, str(int(5.8299)) returns the string ’5′. If you need to round up, add 0.5 to the number before using int.|
|sVal = sVal[begin : end]||Returns the substring between positions begin and end. sVal[0:5]
would return “Hello“
You can use these Python functions in the Expression parameter as well as in the code block. For example, to concatenate a street name field with an integer field containing a zip code:
For more information on string functions in Python, see http://docs.python.org/release/2.6.5/library/string.html
For more information on manipulating substrings using list notation ([begin: end]), see section 5.6 of http://docs.python.org/library/stdtypes.html
Dealing with null values
Fields containing null values cause problems in concatenation. Null values appear as “<Null>” when you view the table in ArcMap. When Calculate Field evaluates your expression, it looks for null values and if any are found, the expression is not evaluated. This prevents divide-by-zero errors and other undefined operations.
If your field contains null values, you need to turn it into a string (using str()) in the expression, even if it’s already a string field. For example, if the VEGTYPE field in the example above contained null values, then the expression would look as follows:
A general purpose function for concatenation
Below is a function you can use for general purpose field concatenation—just copy and paste the code into your code block. It takes any number of fields and puts them together using an underscore as a separator. The nice thing about this function is that you don’t have to specify the number of fields you want to concatenate – simply call the function with any number of fields. You can change the separator to any characters you like, or provide your own logic for removing spaces or special characters, or changing capitalization. Here’s an example table where 6 fields were concatenated into the AllTypes field.
Here’s the Calculate Field dialog for populating the AllTypes field.
And here is the code.
# "*args" allows this routine to accept any number of field values. # the values are passed as a Python tuple, essentially a # non-editable list # def concat(*args): # Initialize the return value to an empty string, # then set the separator character # retval = "" sep = "_" # For each value passed in... # for t in args: # Convert to a string (this catches any numbers), # then remove leading and trailing blanks # s = str(t).strip() # Add the field value to the return value, using the separator # defined above # if s <> '': retval += sep + s # Strip of any leading separators before returning the value # return retval.lstrip(sep)
What is Python? (Includes links to other resources for learning Python)
Barbara Bicking, a product engineer on the geoprocessing team, contributed to this post.