GIS Tools for Hadoop

On Sunday, David Kaiser and his Big Data crew released the GIS Tools for Hadoop Project on GitHub.

The project contains an open source framework and API that enables big data developers to author custom spatial applications for Hadoop.

The GIS Tools project also enables the ArcGIS platform to leverage big data on Hadoop using tools that combine custom Hadoop applications with the ArcGIS Geoprocessing environment.

The project supports processing of simple vector data (points, lines, polygons) and basic analysis operations on that data, such as relationship analysis, running in a Hadoop distributed processing environment.
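To make "relationship analysis" concrete, here is a minimal sketch of one such operation: counting how many points fall inside each polygon, split into a map step and a reduce step. It is plain Python and does not use the project's API or Hadoop itself. The function names, the "x,y" input format, and the ray-casting containment test are all assumptions made for illustration.

```python
from collections import Counter


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def map_point(line, polygons):
    """Map step (hypothetical): parse an 'x,y' text record and emit
    (polygon_name, 1) for the first polygon that contains the point."""
    x_str, y_str = line.strip().split(",")
    x, y = float(x_str), float(y_str)
    for name, ring in polygons.items():
        if point_in_polygon(x, y, ring):
            yield name, 1
            return


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted counts per polygon name."""
    counts = Counter()
    for name, n in pairs:
        counts[name] += n
    return dict(counts)


def count_points_per_polygon(lines, polygons):
    """Run the map and reduce steps in-process over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_point(line, polygons))
    return reduce_counts(pairs)
```

For example, with two adjacent unit squares named "west" and "east" and the input lines "0.5,0.5", "1.5,0.25", "0.25,0.75" and "5,5", count_points_per_polygon returns {"west": 2, "east": 1}; the last point lies outside both squares and is dropped. In a real Hadoop job the map and reduce functions would run on separate nodes, and the containment test would normally come from a geometry library rather than being hand-written.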

An overview page, including sample tools, can be found here:

Upcoming Presentations
David and Michael Park are also presenting the project, its design and implementation, plus a demo, during a DevSummit talk on Thursday at 10am in Catalina/Madera.
If you’re down in Palm Springs this week, go check it out:
Big Data: Using ArcGIS with Apache Hadoop




  1. dkaiser says:

    This project really is the work of the “Big Data crew”, and I have to call out a whole list of awesome people here:
    Geometry Devs: Sergey, Aaron and Paul
    Geoprocessing Devs: Monica and Alex
    Geodatabase Devs: Mike and Randall
    and many other people… Mansour, Andrew and other remote devs, and a number of other interested people, team leads and product managers.
    I’m just one guy who was part of a conversation that involved this great group of people, and who saw to it that the project was released as we had planned. Thanks.

  2. dhollema says:

    2 questions. Is there any streamlined means to leverage these tools in Amazon EMR?
    The tools currently operate on point, line, and polygon. Are there plans to reach into the raster processing world?

    • schalker says:

      I don’t really know the answers to these questions. But I suspect that raster processing isn’t that easy to do in a Hadoop or MapReduce environment. MapReduce was originally developed for massively parallel text processing. But what do I know…

    • dkaiser says:

      @dhollema: Yes, an AMI for Amazon EMR is in the works. Watch here or follow the GitHub repository pages to see when it is ready.

      re: vector vs. raster: Our initial plan was to let customers use our Hadoop framework to process massive amounts of text data (you can see this in our demos, where we show geometries being extracted from tweets, web server logs, etc.).

      Having said that, we are definitely interested in raster processing. There is work being done in academia and within the greater big data software industry where certain raster computations are already being processed on Hadoop, and we are looking at where our next steps will lead in this space.

      • hlzhang525 says:


        Any updates, two years on, about raster processing (GP or ArcPy) for ArcGIS (Pro) using the Hadoop framework?


        • sambrose88_1 says:

          Hi Larry,

          I believe I may have just replied to your GeoNet post.
          As I mentioned there, the tools are still geared towards vector data, and at this stage we are not adding raster capabilities to this project. We have not ruled them out for the future, though. What functions/capabilities are you interested in having?


          • hlzhang525 says:

            Thanks again, Sarah,
            With the Mosaic Dataset model in ArcGIS Pro, we would like to enable MapReduce jobs on rasters, in particular color balancing across massive numbers of adjacent high-resolution images...

  3. adidassler2011 says:

    Does anyone have any tutorials or information to share on loading Esri raster data onto HDFS and running MapReduce jobs on it? I would like to run MapReduce jobs on LandScan data, but I’m only familiar with running jobs on text data.

  4. tikos91 says:

    I would like to ask: can I use the NetCDF, HDF and GRIB formats in GIS Tools for Hadoop?

    Thanks :)


    BTW: This project is really nice :)