Machine Learning in ArcGIS

What do we mean by Machine Learning?

Machine Learning (ML) refers to a set of data-driven algorithms and techniques that automate the prediction, classification, and clustering of data. Machine learning can play a critical role in spatial problem solving in a wide range of application areas, from image classification to spatial pattern detection to multivariate prediction.

In addition to traditional Machine Learning techniques, ArcGIS also has a subset of ML techniques that are inherently spatial. These spatial methods incorporate some notion of geography directly into their computation, which can lead to deeper understanding. The spatial component often takes the form of some measure of shape, density, contiguity, spatial distribution, or proximity. Both traditional and inherently spatial machine learning can play an important role in solving spatial problems, and ArcGIS supports their use in a number of ways.

Machine learning can be computationally intensive and often involves large and complex data. Esri’s continued advancements in data storage and both parallel and distributed computing make solving problems at the intersection of ML and GIS increasingly possible.

Examples of Machine Learning in ArcGIS


Prediction is about using the known to estimate the unknown. ArcGIS includes a number of regression and interpolation techniques that can be used to perform prediction. Applications include creating an air pollution surface based on sensor measurements and estimating home values based on recent sales data and related home and community characteristics.

In ArcGIS: Empirical Bayesian Kriging, Areal Interpolation, EBK Regression Prediction, Ordinary Least Squares Regression and Exploratory Regression, Geographically Weighted Regression
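
To make the home value example concrete, here is a minimal sketch of ordinary least squares with a single explanatory variable, written in plain Python. It is not the implementation behind the ArcGIS Ordinary Least Squares tool; the function names (`fit_ols`, `predict`) and the sales figures are made up for illustration.

```python
def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Estimate y for a new x using a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical recent sales (not real data):
# living area in square metres, sale price in thousands.
areas = [80.0, 100.0, 120.0, 150.0, 200.0]
prices = [210.0, 250.0, 290.0, 350.0, 450.0]

model = fit_ols(areas, prices)
estimate = predict(model, 130.0)  # estimated price for an unsold home
print(model, estimate)
```

A real analysis would use several explanatory variables and check the residuals for spatial autocorrelation, which is where tools such as Geographically Weighted Regression come in.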


Classification is the process of deciding to which category an object should be assigned based on a training dataset. ArcGIS includes many classification methods focused on remotely sensed data. These tools analyze pixel values and configurations to categorize pixels. Some examples include delineating land use types or identifying areas of forest loss.

In ArcGIS: Maximum Likelihood Classification, Random Trees, Support Vector Machine
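
The idea behind maximum likelihood classification can be sketched in a few lines of plain Python. This is a simplified illustration, not the ArcGIS tool: it assumes a single band, a normal distribution per class, and equal prior probabilities, whereas multiband classifiers work with mean vectors and covariance matrices. The class names and training values are hypothetical.

```python
import math
from statistics import mean, variance


def train(samples):
    """Estimate a (mean, variance) pair per class from labeled pixel values."""
    return {label: (mean(values), variance(values))
            for label, values in samples.items()}


def log_likelihood(x, mu, var):
    """Log of the normal density N(mu, var) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def classify(x, model):
    """Return the class whose fitted distribution makes x most likely."""
    return max(model, key=lambda label: log_likelihood(x, *model[label]))


# Hypothetical training pixels from a vegetation-index-like band (not real data).
training = {
    "water": [-0.20, -0.10, -0.15, -0.05],
    "bare soil": [0.10, 0.15, 0.20, 0.12],
    "forest": [0.70, 0.80, 0.75, 0.65],
}
model = train(training)

# Classify every pixel of a tiny 2 x 3 image.
image = [[0.72, -0.10, 0.14],
         [0.68, 0.11, -0.18]]
classified = [[classify(value, model) for value in row] for row in image]
print(classified)
```

Each pixel is assigned to the class under which its value is most probable, so the quality of the result depends directly on how representative the training samples are.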


Clustering is the grouping of observations based on similarities of values or locations. ArcGIS includes a broad range of algorithms that find clusters based on one or many attributes, location, or a combination of both. These methods can be used, for example, to segment school districts based on socioeconomic and demographic characteristics or to find areas with dense social media activity after a natural disaster.

In ArcGIS: Spatially Constrained Multivariate Clustering, Multivariate Clustering, Density-based Clustering, Image Segmentation, Hot Spot Analysis, Cluster and Outlier Analysis, Space Time Pattern Mining
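
As an illustration of clustering on location alone, the sketch below implements DBSCAN, one well-known density-based algorithm, in plain Python. It is a teaching version with a brute-force neighbor search and makes no claim about how the ArcGIS Density-based Clustering tool is implemented. The point coordinates, `eps`, and `min_pts` values are hypothetical.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Hypothetical geotagged post locations in projected coordinates (not real data).
posts = [(0, 0), (1, 0), (0, 1), (1, 1),   # dense group A
         (10, 10), (11, 10), (10, 11),     # dense group B
         (50, 50)]                         # isolated post
labels = dbscan(posts, eps=1.5, min_pts=3)
print(labels)
```

The two dense groups come out as separate clusters and the isolated post is labeled as noise, which is the behavior that makes density-based methods useful for finding concentrations of activity.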

Behind the Scenes…

In addition to these methods and techniques, machine learning is also used throughout the platform as a means of choosing smart, data-driven defaults, automating workflows, and optimizing results. For instance, EBK Regression Prediction uses principal component analysis (PCA) as a means of dimension reduction to improve predictions. The OPTICS method within Density-based Clustering uses ML techniques to choose a cluster tolerance based on a given reachability plot. And the Spatially Constrained Multivariate Clustering tool uses an approach called evidence accumulation to provide the user with probabilities related to clustering results.


The field of machine learning is both broad and deep, and is constantly evolving. ArcGIS is an open, interoperable platform that allows for the integration of complementary methods and techniques, whether through the ArcGIS API for Python, ArcPy, or the R-ArcGIS Bridge. This integration empowers ArcGIS users to solve complex problems by combining powerful built-in tools with any machine learning package they need, from scikit-learn and TensorFlow in Python to caret in R to IBM Watson and Microsoft AI – all while benefiting from the spatial validation, geoenrichment, and visualization of results in ArcGIS. The combination of these complementary packages and technologies with the system of record, insight and engagement that the ArcGIS platform provides is greater than the sum of its parts.


What’s next

There are many key initiatives within Esri to advance machine learning methods and integration approaches across the platform. Methods such as random forests, neural networks, logistic regression, and time-series forecasting are on the roadmap, as well as simplified user experiences for integrating with popular machine learning libraries and packages. A continued focus on distributed processing also plays a major role in these advancements.

In addition to building on traditional machine learning within ArcGIS and ease of integration, Esri is actively working to broaden the intersection of GIS and ML. This focus on innovation in the realm of spatial ML, where the algorithms and approaches incorporate space into their computation, will continue to empower ArcGIS users to take advantage of the latest advances in technology and computing, while still focusing on solving problems in a fundamentally spatial way.


Leave a Reply


  1. Stratos Tso says:

    Hello Lauren, nice article. The toolset has been there for a while. It might get a fancy name in the future. But, what about performance??? It will be very interesting to compare ML performance between different platforms!

  2. johnmdye says:

    Lauren, can you list some of the ML libs and packages Esri is looking at integration with? We’re big users (and supporters) of

    • Lauren Bennett says:

      Hey John! I’d say the ones listed are good examples of places we’re spending some time/energy, but it’s also really great to get your feedback, so thanks for sharing your use of We’ve done some work with, but I’d be super interested in hearing some of your use cases, so maybe we can connect to talk a bit more about it. Thanks again, and hope all is well with you!

  3. mehdikerbelae says:

    Thanks a lot. I’m very interested in the use of ML in ArcGIS. I need more details; how can I obtain them?

  4. says:

    This is great. As Esri advances into ML, is there any Python script source code available to Desktop users as a guideline for developing a convolutional neural network? This is where I got lost in my online research for better feature extraction tools:
    #!/usr/bin/env python

    # Import the system library
    import sys
    # Import the python Argument parser
    import argparse
    # Import the RIOS applier interface
    from rios import applier
    # Import the RIOS progress feedback
    from rios import cuiprogress
    # Import the numpy library
    import numpy
    # Import the GDAL library
    from osgeo import gdal


    # Define the applier function
    def rulebaseClassifier(info, inputs, outputs):
        # Create an output array with the same dims
        # as a single band of the input file.
        out = numpy.zeros(inputs.image1[0].shape)
        # Use where statements to select the
        # pixels to be classified. Give them an
        # integer value (i.e., 1, 2, 3, 4) to
        # specify the class.
        out[numpy.where((inputs.image1[0] > 0.4) & (inputs.image1[0] < 0.7))] = 1
        out[numpy.where(inputs.image1[0] < 0.1)] = 2
        out[numpy.where((inputs.image1[0] > 0.1) & (inputs.image1[0] < 0.4))] = 3
        out[numpy.where(inputs.image1[0] > 0.7)] = 4
        # Expand the output array to include a single
        # image band and set as the output dataset.
        outputs.outimage = numpy.expand_dims(out, axis=0)


    # A function to define the image as thematic
    def setThematic(imageFile):
        # Use GDAL to open the dataset
        ds = gdal.Open(imageFile, gdal.GA_Update)
        # Iterate through the image bands
        for bandnum in range(ds.RasterCount):
            # Get the image band (GDAL band numbers start at 1)
            band = ds.GetRasterBand(bandnum + 1)
            # Define the meta-data for the LAYER_TYPE
            band.SetMetadataItem('LAYER_TYPE', 'thematic')


    # This is the first part of the script to
    # be executed.
    if __name__ == '__main__':
        # Create the command line options
        # parser.
        parser = argparse.ArgumentParser()
        # Define the argument for specifying the input file.
        parser.add_argument("-i", "--input", type=str,
                            help="Specify the input image file.")
        # Define the argument for specifying the output file.
        parser.add_argument("-o", "--output", type=str,
                            help="Specify the output image file.")
        # Call the parser to parse the arguments.
        args = parser.parse_args()

        # Check that the input parameter has been specified.
        if args.input is None:
            # Print an error message if not and exit.
            print("Error: No input image file provided.")
            sys.exit()

        # Check that the output parameter has been specified.
        if args.output is None:
            # Print an error message if not and exit.
            print("Error: No output image file provided.")
            sys.exit()

        # Create input files file names associations
        infiles = applier.FilenameAssociations()
        # Set image1 to the input image specified
        infiles.image1 = args.input
        # Create output files file names associations
        outfiles = applier.FilenameAssociations()
        # Set outimage to the output image specified
        outfiles.outimage = args.output
        # Create a controls object
        aControls = applier.ApplierControls()
        # Specify that stats shouldn't be calc'd
        aControls.calcStats = False
        # Set the progress object.
        aControls.progress = cuiprogress.CUIProgressBar()

        # Apply the classifier function.
        applier.apply(rulebaseClassifier,
                      infiles,
                      outfiles,
                      controls=aControls)

        # Set the output file to be thematic
        setThematic(args.output)