• Lidar Solutions in ArcGIS_part6: Updating a portion of a terrain dataset with new measurements

    This is the sixth blog in a series on Lidar Solutions in ArcGIS. The author of this blog is Clayton Crawford, lead Product Engineer on ESRI's 3D Analyst team in the Software Products group in Redlands.

    Updating a portion of a terrain dataset with new measurements

    The ability to update a surface is important to people responsible for providing accurate, up to date surfaces and people performing analysis on those surfaces. Updates come in different forms:

    • Adding ancillary data (e.g., breaklines)
    • Removing or replacing bad data
    • Using newer or more accurate data
    • Increasing extent with additional data
    • Inserting design/modeled data to perform ‘what-if’ analysis

    These kinds of updates are best performed on the measurements used to construct a surface rather than on derivatives like raster DEMs. Those can be recreated as needed after the measurement edits have taken place. Terrain datasets support this editing model because they maintain a direct link to the source measurement data. When you modify the measurements, you are automatically modifying the terrain in the same process.

    How terrains are edited

    Editing a terrain dataset is really about editing measurements. Using standard feature edit tools, you can manipulate the measurements that reside in feature classes that participate in a terrain.

    Terrains are made from one or more feature classes with some simple rules for each that control how they get used to shape the terrain surface. For example, a multipoint feature class containing lidar points can get added as mass points, a line feature class containing streams and lake shores is used as a source of breaklines, and a polygon feature class can control the data area boundary.

    Most feature classes used to define a terrain are what we call referenced. This means that the terrain maintains a pointer, or handle, to them. The terrain prevents its referenced feature classes from being deleted, and pays attention to any edits that occur on them including the addition, deletion, or modification of feature geometry. You can use the feature editor in ArcMap as well as geoprocessing tools to modify these feature classes. A terrain will automatically flag itself as ‘dirty’ in areas where edits were made. Then the terrain can be rebuilt to bring its pyramid in sync with the updated features. It does this based on the dirty areas so it’s a local, or partial, process; the entire terrain does not need to be reconstructed.

    Multipoint feature classes have the option of being embedded. When multipoints are embedded the terrain build process copies the points into pyramid tables held private by the terrain and it becomes the container for the points. The terrain does not reference the source feature class. That can be deleted, allowing you to retrieve what is typically a substantial amount of disk space; approximately 1GB per 150 million points. Terrain specific tools, Add Terrain Points (which can both add and replace) and Remove Terrain Points, are used to edit the embedded points based on an area of interest. These tools also offer the benefit of being BLOB attribute aware so if you have any LAS attributes stored with those multipoints (for more on this topic see blog on creating intensity images) the tools know how to keep the BLOB based values in sync with the points relative to the edits. For example, if a few vertices of a multipoint are deleted from an embedded feature class, the terrain will delete the corresponding BLOB based attribute values for those points.

    Appending measurements

    Measurements can be added to a terrain via the Append and Add Terrain Points geoprocessing tools. The Append tool operates on regular (referenced) feature classes. The Add Terrain Points tool is used to add or replace points in embedded feature classes. You can also add a feature class to an existing terrain via the Add Feature Class To Terrain tool but be aware that this is treated as a schema edit that invalidates the entire terrain, requiring a full rebuild. If data is to be added incrementally, it’s best to append it to a feature class that already participates in the terrain than add a new feature class to the terrain for each new set of data.

    Let’s take a scenario where data is provided in phases. In this example the bare earth lidar points are made available first. The breaklines come later in several deliveries. Knowing this schedule, you can create a terrain referencing the lidar multipoint feature class plus an empty line feature class held in anticipation of the breaklines. See Figure 1 with a zoomed-in view of a terrain made solely with bare earth lidar points.

     

    When some of the breaklines are made available they are added to the terrain by adding them to the line feature class referenced by the terrain. This is done using the geoprocessing Append tool (Figure 2).

     

    After running Append, the terrain will become ‘dirty’ in the areas where the lines were added. To see the dirty areas you add a dirty area renderer from the Symbology tab of the terrain layer Properties dialog (Figures 3a and 3b).

     

    At this point the terrain needs to be rebuilt. This is done using either the Build Terrain geoprocessing tool or the Build button on the Update tab of the terrain Properties dialog in ArcCatalog. Once the terrain is re-built the improvement made by the breaklines is evident (Figure 4).

     

    Replace

    With lines and polygons in referenced feature classes the replacement of measurements is a two step process. First you delete the old then append the new. If you’re only dealing with a handful of features then you can select and delete them using the Editor in ArcMap. For larger collections rely on geoprocessing tools. For example, use Select By Location followed by Delete Features and Append.

    It’s easiest to replace lidar points if they’re embedded. There’s a geoprocessing tool called Add Terrain Points that has a Replace option. This will replace all the points within a given area. So, if you discover something was wrong with a few source point files that were used to build a terrain you can replace them without needing to rebuild the entire terrain from scratch. Figure 5 shows an example where one area, in an otherwise bare earth model, that was inadvertently loaded with first return data.

    To fix this problem load the replacement data into a new multipoint feature class. Then run the Add Terrain Points tool with the Replace option. By default, the replacement area will come from the extent of the input feature class (Figure 6).

    Once the points have been replaced the terrain needs to be rebuilt to update the affected area. Run the Build Terrain geoprocessing tool or use the Build Terrain button on the Update tab of the terrain Properties dialog in ArcCatalog – either one fixes the terrain.

    Conclusion

    There are times when updates to a surface model are needed, be it for quality improvement or what-if scenario analysis. It’s hard to make these types of updates to derivative products like raster DEMs without ending up with some anomalies around the update area. It’s more appropriate to modify the source measurement data from which the surface model is derived. For larger datasets, like those coming from lidar, it’s also desirable that datasets only be reprocessed where the updates occur rather than rebuilding everything. Terrain datasets support this by maintaining links to their source measurements in the geodatabase and their use of dirty areas.

    That concludes part 6 of Lidar Solutions in ArcGIS. Subscribe to this blog or check back in a month or so for a discussion on how to minimize noise from lidar for contouring and slope analysis.

     

     

     

     

  • Lidar Solutions in ArcGIS_part5: Creating Intensity Images from Lidar

    This is the fifth blog in a series on Lidar Solutions in ArcGIS. It was written by Clayton Crawford, Product Engineer on the 3D Analyst team in ESRI’s Software Products group in Redlands.

    Creating Intensity Images from Lidar

    Intensity is a measure, collected for every point, of the return strength of the laser pulse that generated the point. It’s based, in part, on the reflectivity of the object struck by the pulse. Other descriptions for intensity include ‘return pulse amplitude’ and ‘backscattered intensity of reflection’. Keep in mind, the reflectivity is a function of the wavelength used which is most commonly in the near infrared. Intensity is used as an aid in feature detection and extraction, lidar point classification, and as a substitute for aerial imagery when none is available. If your lidar points include intensity values you can make images from them that look something like black and white aerial photos (Figure 1).

            

    Loading points with intensity 

    You'll need source lidar in either LAS or ASCII XYZI format. It’s common to use first returns. If your data are in LAS format use the LAS To Multipoint tool with Intensity selected as an attribute to import (Figure 2). If your data are in ASCII XYZI format use the ASCII 3D To Feature Class tool.

              

    The data loading process results in a multipoint feature class with an attribute field containing the intensity values (Figure 3). You can't read these directly because they're packed into BLOBs (Binary Large OBjects). Every vertex of a multipoint has its intensity mapped into the BLOB stored for that multipoint record.

              

    Substituting intensity for z

    A custom VBA script is used to replace every multipoint vertex z with intensity. It does this on a copy of the data. Find the script here; to use it follow the instructions included in the script.

    For the ArcObjects geeks out there, take a look at the code to see what it’s doing. Note, there’s a TerrainBlobReader object used to decode the BLOBs. See also that the multipoint vertex ID’s (automatically assigned during import) are used to map into the BLOB value arrays. For the adventurous, note that ArcObjects also provides a TerrainBlobWriter object. If you have a lot of points that need to be stored as multipoints for the sake of efficiency, but also need per-point attribution, these utility objects come in handy.

    Another potentially useful script converts, or explodes, multipoints for an area of interest (AOI) into points. Its real utility is that it decodes the lidar attributes stored in BLOBs into readable/usable numeric values placed in the output point feature class attribute table. A word of warning: make sure your AOI is not too large. Keep it small enough so it contains fewer than roughly three million points (adjust according to how patient you consider yourself to be). Find this script here.

    Note: Starting with ArcGIS 9.4, the portion of the workflow involving the substitution of intensity for z will not be necessary. The Point to Raster tool will be able to rasterize using the intensity BLOB field directly.

    Rasterize the points

    Use the Point to Raster tool on the feature class created by the script that swaps intensity with z and set the Value field to Shape (Figure 4). This tells the tool to rasterize using the shape z's which we know are actually intensity. Select MEAN as a rasterization method. The resulting raster is the intensity image. [As a side note: you may find one of the other options is useful for analysis (e.g., using RANGE of intensity as a variable used in feature detection)].

            

     

    Before moving on, review the presence of NoData. Significant numbers of NoData cells will result if the output cellsize you specified is too small relative to the density of the lidar points. You can see NoData by assigning a color to it on the Symbology tab of the raster layer Properties dialog. If there’s too much, the easiest thing to do is go back and re-run Point to Raster with a larger cellsize. Alternately, you can use a method described in a previous post to fill in missing values using an expression in the Spatial Analyst calculator (see Lidar Solutions in ArcGIS_part2).

    Image Display

    The range of values in the image is hard to predict without knowing details about how your data was collected and processed. For one thing, the original intensity values are sensor dependent. Secondly, the values may have been adjusted by the vendor (e.g., normalized to a range of 0-255). Because of this it’s hard to say what the best display options are. You will need to experiment with the raster layer stretch type and contrast. Turning on bilinear re-sampling is probably a good idea. If you’re looking for more display possibilities consider combining intensity with another variable such as hillshade (Figure 5).

            

    Conclusion

    The return intensity collected for every lidar point can be used to make images. These images have a variety of uses in GIS applications including feature detection and extraction. ArcGIS provides tools to make these images.

    That concludes this part of Lidar Solutions in ArcGIS. Subscribe to this blog or check back in a month or so for a discussion on how to update terrain datasets with new measurements.

     

  • Using Python 2.5.4 with ArcGIS 9.3.1

    This blog post was written by David Wynne, Product Engineer on the Geoprocessing Team in ESRI’s Software Products group in Redlands.

    The Python Community announced the release of Python 2.5.4 late in December of 2008. Python 2.5.4 is a bug fixes release.

    While ArcGIS 9.3.1 was shipped with Python 2.5.1, Python 2.5.4 has now been certified and is supported. In limited testing with ArcGIS 9.3, no issues were encountered.  To take advantage of fixes in Python 2.5.4, it can simply be installed on top of Python 2.5.1. That means, it is not necessary to uninstall Python 2.5.1 first.* 

    To see and to download the release notes go here.

    * If you uninstalled Python 2.5.1 and you work with the Spatial Statistics geoprocessing tools, you must reinstall Numerical Python (NumPy) from the ArcGIS installation DVDs.

    For more information, see FAQ: Can Python 2.5.4 be used with ArcGIS?

  • Spatial Statistics: What’s so HOT about Spatial Pattern Analysis?

    This blog post was written by Lauren Scott, Geoprocessing/Spatial Statistics Product Engineer in the Software Products Group at ESRI in Redlands.

    Hot Spot Analysis is just one of the pattern analysis tools in the Spatial Statistics Toolbox.You can use these tools to explore spatial patterns in order to answer questions like:

    • Where are crime rates unexpectedly high?
    • Are there regions in the country where people live longer
    • Where do we find anomalous spending patterns?
    • Are there sharp boundaries between affluence and poverty?
    • Is the disease remaining geographically fixed or is it spreading?
    • Which features are most concentrated?
    • Does the spatial pattern of the virus mirror the spatial pattern of the population at risk?
    • Which site is most accessible?
    • Where is the population center?
    • Which species has the broadest territory?

    To learn more about spatial pattern analysis, check out some of these resources:

    - The ESRI Guide to GIS Analysis, Volume 2
    - Understanding Spatial Statistics in ArcGIS 9, a free one-hour Web seminar
    - The Spatial Statistics Toolbox online documentation
    - View a five-minute video showing a hot-spot analysis of 911 emergency call data (click on “Using Spatial Statistics Tools”)
    - Download a hot-spot analysis model from the Geoprocessing Resource Center.
    - Extend Crime Analysis with ArcGIS Spatial Statistics Tools , Spatial Statistics Provide New Insights, or Spatial Patterns of Disease Inspire New Ideas on Possible Causes in ArcUser Online
    - Spatial Pattern Analysis concepts are discussed in the ArcGIS 9.3 Web help and include Modeling Spatial Relations, What is a Z Score?  What is a P value?, and Spatial Weights.
    - Technical workshop slides are available from the ESRI Public Health and Homeland Security Conferences.

    Have fun!

  • Spatial Statistics: Free Virtual Campus Web Seminar on Regression Analysis Basics

    This blog post was written by Lauren Scott, Geoprocessing/Spatial Statistics Product Engineer in the Software Products Group at ESRI in Redlands.

    There are a number of good resources where you can learn about the new regression analysis tools in ArcGIS 9.3.  The newest resource is a free one-hour Web seminar,  Regression Analysis Basics in ArcGIS 9.3, available for download from the ESRI Virtual Campus.

     Other Regression Analysis resources include:

    Enjoy!

  • 2009 Developers Summit

    The 2009 Developers Summit has come and gone.  Thanks to all who attended!  We enjoyed meeting with you.  We've uploaded the geoprocessing presentations to the Model and Script tool gallery. Session videos have been posted to the Media Gallery. 

    Slides, scripts, and data from the two workshops on Designing and Building geoprocessing tools are available in the Model and Script tool gallery, or by clicking here.

    Slides and other materials from the Building Geoprocessing Services session are available here.

    Slides and scripts for the Python Scripting Advanced Techniques session are available here

  • Send us your feedback

    You may not have noticed, but you can send us feedback about this site. There's a feedback link at the bottom right of the home page, as illustrated at the bottom of this post.

    We're particularly interested in hearing from you about blog post topics you'd like to see and suggestions on how we could make this site more useful.

     

     

  • Lidar Solutions in ArcGIS_part4: Estimating Forest Density and Height

    This blog post is written by Clayton Crawford, a Product Engineer in the Software Products Group's 3D Team in Redlands.

    This is the fourth in a series on Lidar Solutions in ArcGIS.

    Estimating Forest Canopy Density and Height

    Canopy density and height are used as variables in a number of applications. These include the estimation of biomass, forest extent and condition, and biodiversity. Canopy density, or canopy cover, is the ratio of vegetation to ground as seen from the air. Canopy height measures how far above the ground the top of the canopy is. Lidar can be used to determine both.

    What follows are steps to calculate canopy density and height from lidar points. First, you need lidar that's been classified into ground hits (bare earth) vs. non-ground hits. This type of classification is usually performed by your data provider. Secondly, you need to consider when the lidar was collected and the type of vegetation in the study area. If there are a lot of deciduous trees and the data collection was performed during leaf off, then the density calculation is not going to work.

    Loading points into the Geodatabase
    To calculate canopy density load the ground, or bare earth, lidar points into one multipoint feature class and above ground points into another. Assuming your data are in LAS format, you do this with the LAS To Multipoint tool. Specify the proper class codes to filter on. Here are the LAS class codes as defined in the LAS 1.1 Standard:

    Any LAS file made in the last few years should use these codes if the points have been classified. Unfortunately, there’s still some ambiguity in the standard. For example, we know class ‘2’ is ground but class ‘8’ is ground as well. Class ‘8’, or model key, points are a special set of ground points used for contouring or other application requiring a thinned set of ground points. Whether you have them depends on how the data was processed. If you don’t know specify both classes. If it turns out there aren’t any model key points it won’t hurt. Vegetation has a similar issue. Sometimes vendors place everything that’s above ground into class ‘1’ because they haven’t performed a more detailed classification on them. So, if you’re unsure of the specifics of your data’s classification, load non-ground points using classes 1, 3, 4, and 5; that’s a reasonable catch-all to get your vegetation points. Note: If buildings or other manmade non-ground features are in class ‘1’ you’ll get them too and they’ll skew the results somewhat.

    Calculating the density
    The most effective way to determine the canopy density is to divide the study area into many small equal sized units. Do this through rasterization. In each raster cell you compare the number of above ground hits to total hits. The trick is to figure out an appropriate cell size. It needs to be at least 4 times the average point spacing. You can go larger but not smaller.

    1. Use the Point To Raster tool on the above ground points with the COUNT option.

     

    2. Convert any resulting NoData cells to 0 so that subsequent operations treat zero points in a cell as 0. This is accomplished using the Is Null tool followed by Con.

     

    Next use the Con Tool

    3. Repeat steps 1 and 2 with the ground points.

    4. Add the above ground and ground rasters to get a total count per cell using the Plus tool.

     

    5. All the rasters we’ve made so far are longs. We need one to be floating point in order to get floating point output from the Divide function that we’ll use in step 6. Do this by sending the output from Plus through the Float tool.

     

    6. Now use the Divide tool between the above ground count raster and the floating point total count raster. This gives us the ratio from 0.0 to 1.0 where 0.0 represents no canopy and 1.0 very dense canopy.

    The following image represents canopy density. The lightest areas have little to no vegetation. These are areas where a large percentage of lidar shots could ‘see’ the ground. The dark green areas, where lidar could not penetrate to ground as well, indicate denser vegetation canopy.

    Calculating the height
    To determine canopy height you'll need to subtract the bare earth surface (DEM) from the 1st return surface (DSM). Take a look at a previous blog to learn how to make these surfaces. Find it at this link: Creating raster DEMs and DSMs from large lidar point collections.

    Once you have your first return and bare earth rasters the Minus tool gives you the difference which, over forest, represents the canopy height.

    The following image represents canopy height above ground. It ranges from blue for little to no height, to orange which is the tallest.

    Conclusion
    Lidar can be used to calculate the density and height of vegetation. This is useful for a variety of purposes including biomass and carbon estimates as well as forest management.

    That concludes part four of Lidar Solutions in ArcGIS. Subscribe to this blog or check back in a month or so for a discussion on the creation of intensity images from lidar.

  • Tips and tricks - accessing feature shape in Calculate Field

    With the Calculate Field tool, you can easily create expressions that use some property of a feature's shape, such as length or area. You can also convert length and area to different units. The illustration below shows calculating the MILES field to the length (in miles) of each line feature.

     

    Details about the Expression syntax

    The expression syntax -- !shape.length@miles! -- starts and ends with the special 'decode field' character. This decode field character is different depending on the value of the Expression Type parameter. For Expression Type of PYTHON or PYTHON_9.3, the character is an exclamation point ("!"), as shown in the illustration above. For VB, the decode field characters are "[" and "]".

    Inside the decode field characters, you can put the name of any field found on the input table. You can also use the reserved field name "shape" followed by a period and an attribute of the shape. In this case, we're using the length attribute: !shape.length!

    Finally, we convert shape.length to miles with the use of the special unit conversion syntax '@<unit>'. Unit can be any of the following: CENTIMETERS | DECIMALDEGREES | DECIMETERS | FEET | INCHES | KILOMETERS | METERS | MILES | MILLIMETERS | NAUTICALMILES | POINTS | YARDS.

    If your shape type is polygon, you can use these areal unit keywords: ACRES | ARES | HECTARES | SQUARECENTIMETERS | SQUAREDECIMETERS | SQUAREINCHES | SQUAREFEET | SQUAREKILOMETERS | SQUAREMETERS | SQUAREMILES | SQUAREMILLIMETERS | SQUAREYARDS

    More things you can do with the shape object and its attributes

    The shape field is actually a geometry object with the following properties.

     

    If, for example, you wanted the X-coordinate of the first point of a line, you would use !shape.firstpoint.x! To find the minimum X-coordinate of the line, use !shape.extent.xmin!

    Simple equations

    The expression can contain simple equations. For example, to calculate minutes it would take to traverse a street segment going 30 miles per hour, use:

    !shape.length@miles! / (30.0 / 60.0)

     Note that:

    !shape.length@miles! / (30 / 60)

    won't work - when doing real arithmetic you need the decimal point after 30 and 60.

    Code blocks

    See http://www.esri.com/news/arcuser/0507/files/pythonscript.pdf for information on using code blocks in Calculate Field

  • ModelBuilder posting at the Mapping Center

    Here's a great post on the mapping center about ModelBuilder:

    What I wish I had known about ModelBuilder before I started using it

  • Lidar Solutions in ArcGIS_part3: Data Area Delineation from Lidar Points

    This blogs is written by Clayton Crawford, a Product Engineer in the Software Products Group's 3D Team in Redlands. 

    This is the third blog in a series on Lidar Solutions in ArcGIS. Links to the first and second posts are below. 

    Data Area Delineation from Lidar Points

    It's common for lidar or photogrammetric data for a survey to be delivered without a detailed data area boundary. Often, the xy extents of the survey are defined by a tile system that covers an area of interest and the data fills these tiles (Figure 1) or the data are simply guaranteed to cover some minimal extent and there is no explicit or absolute boundary other than what can be inferred (Figure 2). Either way, the area of coverage is usually not a cleanly filled rectangle.

     

    The problem

    If a surface is made without declaring the data area up front (i.e., by including a clip polygon when defining a terrain dataset or TIN) then some of what are actually voids around the perimeter get treated as data areas. Analytic results in these areas are unreliable because height estimates are based on samples that can be far away.

    Take, for example, the data shown in Figure 3. The graphic on the left depicts a dense collection of lidar points shown in green. The gaps in the interior are water bodies (where lidar is typically omitted). The irregularly shaped data boundary is easy to see but unless an explicit extent is provided, in the form of a clip polygon, TIN and terrain related tools will fill in voids, greatly oversimplifying the actual data extent.

    We know areas outside the data collection extent should be excluded from the surface. The problem is coming up with the polygon that provides an accurate representation of this extent. Hand digitizing is laborious. There’s an easier way.

    The solution

    What we need to do is synthesize a data boundary from the points that can be used to enforce a proper interpolation zone in the surface (Figure 4).

    The key to the equation is the average point spacing. This is the primary variable to use when going after the data area. Surveys usually have explicit minimums on point spacing in order to provide control for interpolators. Areas that don't meet density requirements are exceptions. They usually fall in one of the following categories: water bodies, obscured areas, and ‘holidays’ (send the latter back to the data provider for repair). The vast majority of the data will meet sample density specifications. Point spacing is usually reported in metadata. If you don’t know it try the 3D Analyst Point File Information tool. Alternately, eyeball it using the Measure tool in ArcMap. To learn more about point spacing use the link at the top.

    Once you know the point spacing you follow these steps:

    1. Rasterize the points with Point To Raster.
    2. Assign one value to all data cells with Con.
    3. Fill small NoData areas with Expand.
    4. Reduce the overall extent of data cells with Shrink.
    5. Vectorize the raster with Raster To Polygon.
    6. Use a VBA script to remove small holes (interior rings) from polygon(s).

    Note that a Spatial Analyst license is required to run a number of these tools.

    The remainder of this document goes into more detail on each of these steps.

    Point To Raster

    Rasterization of the points will help aggregate the area covered by the points and provides a good data structure to work with for subsequent steps. You just need to tell the tool what cell assignment type to use and the output cellsize. Use the COUNT as the assignment type. As far as cellsize goes, specify a value that's several times larger than average point spacing. Otherwise, you’ll get a lot of noise because the points aren’t evenly spaced. From the standpoint of processing efficiency and noise reduction, the larger cellsize you use the better, but there will be a tradeoff with the tightness of fit in the end result. A good place to start is 4 times the average point spacing (Figure 5).

    Con

    The use of the Con tool in this workflow is to simply turn any and all data cells into cells with one value. This value defines a raster ‘zone’ that will be expanded in the next step. All that’s needed is to take the output from Point To Raster and provide a constant value for a positive expression. All non-zero value cells will be considered positive and be assigned the constant. Since COUNT was used as the cell assignment method during rasterization, any cell with a point in it must have a value greater than zero (Figure 6).

    Expand

    Unless you used a very coarse cellsize relative to your average point spacing, there's a likelihood many NoData cells remain. Most of these can be eliminated using Expand (Figure 7). You want to remove most of these so the polygon produced during vectorization in a later step isn’t full of a million holes. That would be unnecessarily expensive.

     

     Expand will grow the zone of interest outward. In our case the zone is all the data cells, coded with a value of 1. This effectively eliminates small gaps found in the interior.

    It’s okay if some isolated NoData areas remain in the output. You just don’t want thousands upon thousands. The remainder will be handled in the last step.

    Shrink

    While Expand eliminates isolated NoData cells it also grows the data area outward. It needs to actually be brought in a little. Clip polygons need to be smaller than the actual point extent, so when terrain or TIN tries to estimate z values along the polygon boundary points can be found on both sides. This is needed to get good z estimates. To reduce the raster’s data boundary use Shrink (Figure 9).

    At this point you should have a relatively clean raster with the extent of its data cells slightly within the lidar point extent.

    Raster To Polygon

    The Raster to Vector tool will convert the raster to a polygon feature class. Make sure the Simplify polygons option is checked. If it’s not, the output will be stair-stepped, rather than smooth, and contain more vertices than necessary (Figure 10).

    At this point, the process is almost complete. You need to review the output for correctness. Chances are there’s one more thing that needs to be done: the removal of remaining holes inside your clip polygon.

    Remove Interior Rings from Polygons VBA Script

    If holes are present in the resulting polygon, grab the RemoveInteriorRings VB script from ArcScripts online (http://arcscripts.esri.com/details.asp?dbid=16019). This will edit out the internal rings leaving just the exterior boundaries. To run the script, follow these steps:

    1. Have your polygon feature class output from Raster To Polygon as the first layer in a map document.
    2. Go into VBA inside ArcMap via Tools>Macros>Visual Basic Editor.
    3. From the Project window right click on the project, select Import File, and bring in the VBA module you downloaded from ArcScripts.
    4. Run the RemoveInteriorRings macro.

    You should now have a clip polygon that can be used when defining a terrain dataset or TIN. It should conform to the data extent of the points but fall slightly inside them (Figure 11).

    That concludes this part of Lidar Solutions in ArcGIS. Subscribe to this blog or check back in a month or so for a discussion on the estimation of forest canopy density and height from lidar.

    The first post can be found here: http://blogs.esri.com/Dev/blogs/geoprocessing/archive/2008/10/29/Lidar-Solutions-in-ArcGIS_5F00_part-1_3A00_-Assessing-Lidar-Coverage-and-Sample-Density.aspx 

    The second post can be found here: http://blogs.esri.com/Dev/blogs/geoprocessing/archive/2008/12/15/Lidar-Solutions-in-ArcGIS_5F00_part2_3A00_-Creating-raster-DEMs-and-DSMs-from-large-lidar-point-collections.aspx 

     

  • Lidar Solutions in ArcGIS_part2: Creating raster DEMs and DSMs from large lidar point collections

    This blog is written by Clayton Crawford, a Product Engineer in the Software Products Group’s 3D Team in Redlands.

    This is the second blog in a series on Lidar Solutions in ArcGIS. The first can be found here: Lidar Solutions in ArcGIS_part1: Assessing Lidar Coverage and Sample Density.

    Lidar Solutions in ArcGIS_part2: Creating raster DEMs and DSMs from large lidar point collections

    Raster, or gridded, elevation models are one of the most common GIS data types. They can be used in many ways for analysis and are easily shared. Lidar provides us with the opportunity to make high quality elevation models of two distinct flavors: first return and ground. A first return surface includes tree canopy and buildings and is often referred to as a Digital Surface Model (DSM). The ground, or bare earth, contains only the topography and is frequently called a Digital Elevation Model (DEM).

       

     

    ArcGIS provides tools to take large lidar point collections, and optionally other surface related information like photogrammetric breaklines, and use them to produce high quality raster surfaces.

    Coming up with a plan

    Before continuing some basic factors need to be evaluated:

    • Extent of lidar coverage
    • Number of lidar points and point density
    • Desired output raster resolution
    • Extent of output raster(s)
    • Format of output raster(s)

    Consideration of these factors will help determine whether you’ll be producing one raster or a collection. Part of this process requires you figure out how many rows and columns you’re willing to have in one raster. This depends on what you intend to do with the raster in terms of analysis, display, and potential sharing or distribution of the data. The desire to work off one dataset for analysis can be in conflict with practical constraints associated with dataset size. Another thing to consider is the amount of lidar you have. Trying to process 10 billion lidar points as one dataset, while possible, is likely to prove unwieldy. It’s pretty much a given you’ll be making multiple rasters from this amount of lidar, so consider splitting up the lidar processing as well. Not only does this keep individual datasets at reasonable sizes, it keeps process duration on those datasets shorter. The longer a process takes to execute, the more likely something’s going to go wrong (e.g., power outage). We all want to reduce risk, right?

    If you’ve determined you need to split up your data, the next question is how. Will it be based on a regular grid system, political boundaries, or division based on an anticipated application? Since lidar collections tend to have multiple uses, splitting them up based on a regular grid system or political divisions like county boundaries makes the most sense. An engineer can mosaic the different pieces he or she needs for an individual project. If the intended use is weighted heavily for one type of application, such as hydrology, then use divisions logical for the application. For example, in the case of hydrology, watershed boundaries are a good candidate.

    ArcGIS supports many raster formats, so you have a choice of what format to write to. This decision is best based on the intended use of the product. If it’s to be shared with the general public you might think about distributing in either TIFF or JPEG format. For analysis on the ArcGIS platform, the ESRI GRID format is best. This is the native format for many functions so to improve I/O efficiency, and therefore processing time, using GRID is the way to go.

    The first step in getting from lidar points to raster is loading the points into a GDB. If you haven’t already done so, review the first part of this series. Use the link at the beginning of this blog to get to it.

    Using the Point to Raster tool

    If your only source of data is the lidar you can use the Point to Raster tool to produce a raster elevation models. While this does not produce the highest quality result possible lidar tends to be so dense that for many applications the accuracy is good enough and the convenience and speed of this tool make it worthwhile.

    • If producing a bare earth surface, or DEM, use just the ground lidar points.
    • Set the Value field parameter on the tool to Shape to use the z values from the multipoint vertices.
    • Set the Cell assignment type to either MIN or MEAN. MIN will bias output heights to local lows while MEAN is more general purpose.
    • To produce a first return surface, or DSM, use the 1st return lidar points with the MAX option of the tool since you want to bias the output to local highs.

    The Point to Raster tool produces gridded elevation models from lidar point sets.

    While Point to Raster offers the easiest and fastest way to produce a raster from lidar, it has a significant drawback. You can end up with many NoData cells since it only returns values for cells that have one or more points in them. The problem is exacerbated when only using ground points to make a DEM because gaps in the data occur where there’s vegetation and buildings obscuring the ground. To reduce the salt & pepper effect of NoData vs. data cells you can increase the output cellsize relative to the average point spacing. You can also reduce the number of NoData cells after execution of Point to Raster by using this expression in Spatial Analyst calculator on the output (in this example the output from Point to Raster is called ‘point2ras):

               


    You can run the expression multiple times to fill in larger NoData areas, but I wouldn’t recommend running it more than a couple times. It’s better to just accept larger void areas as a consequence of using this approach.

            

     

    Using the Terrain Dataset

    If you have photogrammetric breaklines to go along with your lidar, or need higher quality results than can be produced with the Point to Raster tool, use the terrain dataset. For an overview of the terrain dataset and related help topics check out the online help here.

         

    On the left is a surface made without breaklines along the river banks. The image on the right has breakline enforcement. Breaklines are important for maintaining the definition of water related features in the elevation model.

    Breaklines are used to capture linear discontinuities in the surface. The most common types are edge of pavement, lake shorelines, single line drains for small rivers and double line drains for large rivers. Sometimes breaklines are also collected to help define and sculpt the surface without necessarily representing discontinuities. Examples of these include contour-like form lines and the crests of rounded ridges.

    Breaklines, while frequently used in bare earth models, tend to be detrimental when used with first return surfaces because they can be in conflict with the above ground points. For example, breaklines capturing road edge of pavement can be coincident in XY but different in Z to points in tree canopy overhanging the road. Because of this, consider excluding breaklines from your first return surface or at least those where you know there’s potential conflict.

    The most efficient means of organizing breaklines for use in a terrain dataset – see table below- is to separate them into different feature classes based on Surface Feature Type (SFType). SFType controls how the features are enforced in the model and how the natural neighbor interpolator, used during rasterization, interprets the surface as it crosses over these features. A distinct break in slope will occur across ‘hard’ features but not across ‘soft’ features.

    Common input measurement types and recommend feature class storage type and SFType settings for terrain dataset definition.

    It’s best for the sake of terrain performance to place all hardlines together in one feature class. It’s understood this might not be possible, for example, if you have the need to keep road and water features separate. It’s OK, just keep in mind the fewer feature classes used to define a terrain the better.

    The ‘Replace’ SFType deserves special mention. This type is used to force everything inside a polygon to be set flat at a constant height. It’s used mostly for lakes when there’s inadvertently other data inside them, such as lidar points, whose heights are not exactly the same as the shoreline and therefore prevent the water bodies from being flat. Use of the Replace SFType does incur more processing cost than normal hard or softlines so it’s best to avoid. Ideally there ought not be lidar samples in your water bodies (consider adding this as a stipulation in the contract with your data provider), but if you do, you can either use the Replace SFType to handle them or get rid of the offending points before building your terrain using the Erase geoprocessing tool.

    If you’ll be producing both bare earth and first return surfaces via terrain datasets, load the lidar points into two different multipoint feature classes, a feature class for the ground points and a feature class for the above ground points. Your bare earth terrain is defined with a reference to just the ground points. Your first return terrain references the same ground point feature class as the bare earth terrain and has the additional reference to the above ground points. Yes, this means two different terrains can reference the same feature class.

    Starting with ArcGIS 9.3, terrain datasets can be pyramided using one of two point thinning filters: z-tolerance and windowsize. For DEM production you can use either. If you intend to rasterize from the full resolution point set, then use the windowsize filter for terrain construction because it’s significantly faster. If you’re willing to use thinned data for analysis, which is reasonable if the lidar is oversampled for your needs, use the z-tolerance filter. While more time consuming, it’s most appropriate because it provides an estimate of vertical accuracy of the thinned representation. For DSM production use the windowsize filter with the MAX option.

    Use the Terrain To Raster tool to produce your rasterized elevation model. This provides options for interpolation, output cellsize, and which pyramid level to use from the terrain.

     


    The Terrain to Raster tool produces gridded elevation models from terrain datasets.

    For interpolation, the natural neighbors options is your best bet. While not as fast as linear interpolation, it generally produces better results both in terms of aesthetics and accuracy. Set the output cellsize relative to the lidar point sample density. You won’t gain anything by using a cellsize that’s substantially smaller than the average point spacing. Also, make sure to set the analysis extent, as set through the environment, for the extraction of subsets where appropriate. The use of a snap raster can also be of use for the sake of alignment of raster outputs.

    Conclusion

    Using either Point To Raster or Terrain To Raster geoprocessing tools you can process hundreds of millions, even billions, of lidar points into hi-resolution gridded DEMs and DSMs. These can then be used with the large collection of raster tools available in ArcGIS for analysis. They’re also great for making maps (see graphic below) and, due to their simple data structure, easy to share.

    Color hillshade for a DSM of Jasper County, South Carolina. Made from a terrain dataset built on top of 1.7 billion lidar points for the county.

    That’s it for the second part of this series on Lidar Solutions in ArcGIS. Check back for the next part: Data Area Delineation from Lidar Points.

  • Tips and tricks - Error handling in Python script tools

    I recently had a project that involved creating lots of Python script tools. Since other people were going to use these tools, I wanted my error handling to be robust and informative. This post is about some tips and tricks I discovered during this project.

    Error handling basics

    The basics of handling Python errors are covered in the 9.3 help topic Error handling with Python. The Python.org document Errors and Exceptions has more detailed information.

    Tip #1 - Use the arcgisscripting.ExecuteError exception

    In version 9.2, we introduced the arcgisscripting object along with a new exception class, arcgisscripting.ExecuteError. (This arcgisscripting.ExecuteError exception class wasn't documented in version 9.2, so few people knew about it.)  The arcgisscripting.ExecuteError exception is thrown whenever a geoprocessing tool or geoprocessing function encounters an error. What this means is that you can divide errors into two groups, geoprocessing errors (those that throw the arcgisscripting.ExecuteError exception) and everything else.  You can then handle the errors differently, as demonstrated in the code below:

    import arcgisscripting
    gp = arcgisscripting.create(9.3)
    try:
    result = gp.getcount("C:/blah.shp")
    # x = y

    # Return GEOPROCESSING specific errors
    #

    except arcgisscripting.ExecuteError:
    gp.AddError(gp.GetMessages(2))

    # Return any PYTHON or system specific errors
    #

    except:
    gp.AddError("Python or system error occurred")

    The code above is used as the source for a script tool. To keep things simple, the script tool has no parameters. When the script tool is executed, the call to getcount produces an error because the dataset "C:/blah.shp" doesn't exist. As a result, the progress dialog looks as follows:

     

    To prove how geoprocessing errors and Python errors are handled differently, change the two lines of code as follows:

        # result = gp.getcount("C:/blah.shp")
    x = y

    Now when the script is executed, an error will occur because the variable 'y' is undefined, and the progress dialog will display as follows:

     

    Tip #2 - Beware of getting error messages from a result object

    Before moving on, a quick word about the result object, shown below:

        result = gp.getcount("C:/blah.shp")

    If the call to getcount above raises an exception, the result object is null. This means you cannot retrieve error messages from the result object. For example:

    import arcgisscripting
    gp = arcgisscripting.create(9.3)
    try:
    result = gp.getcount("C:/blah.shp")

    # Return GEOPROCESSING specific errors
    # (this method is INCORRECT!)

    except arcgisscripting.ExecuteError:
    gp.AddError(result.GetMessages(2))

    # Return any PYTHON or system specific errors
    #

    except:
    gp.AddError("Python or system error occurred")

    The above code will fail with the message "name 'result' is not defined". This is due to the fact that the result object is null. Since the result object is null, a Python error will be raised since a null object doesn't have a GetMessages() method. Note -- a result object created by calling a geoprocessing service on an ArcGIS Server is never null. Null result objects a created only when a tool is run locally and it raises an error.  For more information about using the result object, see Getting results from a geoprocessing tool.

    Tip #3 - Use AddReturnMessage() to preserve links to error codes

    In version 9.3, geoprocessing error numbers shown in the progress dialog are hyperlinks to a help page that further describes the error. To enable hyperlinks for errors in your script, you need to use the AddReturnMessage() method instead of the AddError method, as follows:
    import arcgisscripting
    gp = arcgisscripting.create(9.3)
    try:
    result = gp.getcount("C:\\blah.shp")

    # Return GEOPROCESSING specific errors
    #

    except arcgisscripting.ExecuteError:
    for msg in range(0, gp.MessageCount):
    if gp.GetSeverity(msg) == 2:
    gp.AddReturnMessage(msg)


    # Return any PYTHON or system specific errors
    #

    except:
    gp.AddError("Python or system error occurred")

    Now when the script is executed, the progress dialog will look as follows:


     

    Tip #4 -- Use traceback to return more information about the error

    The help system topic Error handling with Python shows you how to use Python's traceback to retrieve the line number causing the error, the name of the .py file, and, for Python errors, a description of the error.

    At version 9.3, you can run script tools in-process, as described in Running a script in process. Scripts run in-process execute much faster than scripts run out-of-process, so you always want to run the script in-process. When a script is run in-process, ArcGIS reads the .py file into memory and then executes it. Because ArcGIS puts the .py files into memory, the Python traceback module doesn't know the name of the .py file and will report the file name as "<string>". If you want to report the name of the .py file causing the error (very useful for debugging), you'll have to provide it yourself.

    The script below has a local routine, trace(), that uses traceback to return the line number, the file name, and the error description. The routine also contains the the hard-coded file name (shown in bold) which you must change for your script.

    import arcgisscripting
    gp = arcgisscripting.create(9.3)
    try:
    def trace():
    import os, sys, traceback
    tb = sys.exc_info()[2]
    tbinfo = traceback.format_tb(tb)[0] # script name + line number
    line = tbinfo.split(", ")[1]

    # Construct the pathname to this script. Get the pathname
    # to the folder containing the script using sys.path[0].
    # Modify line below and replace 'myscript.py' with the name of
    # this script.
    #

    filename = sys.path[0] + os.sep + "myscript.py"

    # Get Python syntax error
    #

    synerror = traceback.format_exc().splitlines()[-1]
    return line, filename, synerror
    if __name__ == '__main__':
    # y = x
    result = gp.getcount("C:/blah.shp")

    except arcgisscripting.ExecuteError:
    # Call our trace routine to retrieve line number and filename.
    # (returned variable 'err' is ignored since this is a gp error.
    #

    line, filename, err = trace()
    gp.AddError("Geoprocessing error on " + line + " of " + filename + " :")
    for msg in range(0, gp.MessageCount):
    if gp.GetSeverity(msg) == 2:
    gp.AddReturnMessage(msg)

    except:
    line, filename, err = trace()
    gp.AddError("Python error on " + line + " of " + filename)
    gp.AddError(err)

    Below are illustrations of the progress dialog showing the line number and file name for both a geoprocessing error and a Python syntax error (by commenting out the "y = x" statement).

     

     

    Tip #5 -- Use Finally statement to clean up cursors

    Something I discovered by reading the Python doc Errors and Exceptions is the 'finally' statement. No matter what happens in your Python script, code in a 'finally:' block always gets executed. It's a great place to put clean-up actions, such as deleting cursors, as shown in the code below.

    import arcgisscripting
    gp = arcgisscripting.create(9.3)

    def trace():
    import os, sys, traceback
    tb = sys.exc_info()[2]
    tbinfo = traceback.format_tb(tb)[0]
    line = tbinfo.split(", ")[1]
    filename = sys.path[0] + os.sep + "testscript.py"
    synerror = traceback.format_exc().splitlines()[-1]
    return line, filename, synerror

    if __name__ == '__main__':
    # Define variables for a cursor and its row
    #

    cursor, row = None, None
    try:
    cursor = gp.searchcursor("E:/Data/CityOfSanFrancisco.gdb/FireStations")
    row = cursor.Next()
    while row:
    # Code here...
    # for demonstration...this next statement will cause an error

    y = x
    row = cursor.Next()
    except arcgisscripting.ExecuteError:
    line, filename, err = trace()
    gp.AddError("Geoprocessing error on " + line + " of " + filename + " :")
    for msg in range(0, gp.MessageCount):
    if gp.GetSeverity(msg) == 2:
    gp.AddReturnMessage(msg)
    except:
    line, filename, err = trace()
    gp.AddError("Python error on " + line + " of " + filename)
    gp.AddError(err)
    finally:
    # Close cursor and row objects.
    #

    del row, cursor

    # Print a message for demo purposes
    #

    gp.AddMessage("\n ** Cursor and row have been deleted ** \n")

    As illustrated below, the cursor gets deleted even if there is an error in the script.


    In the main body of the script, the cursor and row object variables are declared and initialized to None (shown in bold in the above code).  This ensures that these two variables exist when the del statement in the finally clause is executed; if you try to delete a non-existent variable, an exception will be thrown.  

    About cursors and the del statement

    The Python del statement deletes a variable. You can use it to delete any Python variable, such as a list or dictionary. However, you rarely need to delete native Python variables.  The del statement is mainly used for deleting non-native variables, such as geoprocessing cursors and rows obtained from the cursor. When you delete a cursor and its row, all pending changes to the row (caused by an update or insert cursor) are flushed and all locks on the dataset are removed.  You should always delete cursors and rows obtained from the cursor.  When deleting a cursor, delete the row object first, then the cursor.


  • Lidar Solutions in ArcGIS_part 1: Assessing Lidar Coverage and Sample Density

    This blog post is written by Clayton Crawford, Product Engineer in the Software Products Group’s 3D Team in Redlands.

    This post is the first in a series called “Lidar solutions in ArcGIS”. The series will cover Lidar processing tasks and workflows. And it will show you how to manage these vast point collections and outline approaches for mining information from them.

    Let me state an important point up front: This series is about Lidar point processing. If you have Lidar derived raster data then it won’t be of direct use, but if you need to learn how to make those rasters then read on. Also important to note: the type of Lidar involved in this discussion is collected from plane or helicopter with a laser scanner pointed downward. With this type of Lidar you can make bare earth surfaces for topographic mapping and 1st return surfaces that include vegetation and buildings. It’s not about the type of Lidar where data is collected at side-on angles.

    Note that some of the tasks covered in the series require a 3D Analyst extension license.

    Here are the topics I plan to cover:

    • Assessing Lidar coverage and sample density
    • Creating raster DEMs and DSMs from large Lidar point collections
    • Data area delineation from Lidar points
    • Using Lidar to estimate forest canopy density and height
    • Creating intensity images from Lidar
    • Updating a portion of a terrain dataset with new measurements
    • Minimizing noise from Lidar for contouring and slope analysis
    • Business partner solutions for Lidar

     

    Lidar Solutions in ArcGIS_part 1: Assessing Lidar Coverage and Sample Density

    One basic QA/QC process is to ensure the Lidar points delivered by your data provider have the coverage and density expected. You want to catch problems with this early on and have them resolved before continuing. Two geoprocessing tools are useful in this regard: Point File Information found in the 3D Analyst toolbox and Point To Raster located in core Conversion Tools.

    Point File Information
    The Point File Information tool reports basic statistics about one or more point data files on disk. The tool’s primary purpose is to help you review and summarize the data before loading it into your geodatabase.  LAS (the industry standard format for Lidar data) and ASCII format files are supported as input. Since Lidar projects often utilize collections of data files, sometimes in the hundreds or even thousands, the tool lets you specify folder names in addition to individual files. When given a folder, it reads all files inside it that have the suffix you specify.

     

    For each input point file it outputs one polygon with accompanying attribution to a target feature class. The polygon graphically depicts the xy extent, or bounding box, of the data in the file. Attributes include file name, point count, z-min, z-max, and point spacing.

     


    The point spacing reported by Point File Information is not exact and deserves some discussion. For the sake of performance it uses a rough estimate that simply compares the area of the file’s bounding box with the point count. It’s most accurate when the rectangular extent of the file being examined is filled with data. So, files with significant numbers of points excluded over large water bodies or on the perimeter of a study area, only partially occupied with data, will not have accurate estimates. Therefore, the reported point spacing is more meaningful as a summary when looking at trends for collections of files. Something useful to do with the output feature class is to display it in ArcMap, open its attribute table, and sort the point spacing field in ascending order. You can also symbolize on the point spacing field using a graduated color ramp.


     

    Point File Information works quickly on LAS files because it only needs to scan their headers to obtain the information it’s looking for. It takes significantly longer with ASCII files because with them the tool actually has to read all the data.

    Assuming everything checks out OK, the next thing to do is load your Lidar points into a multipoint feature class with the LAS To Multipoint or ASCII 3D To Feature Class tools. Put this feature class in a feature dataset if you intend to build a terrain dataset from the points. While you have the choice between using LAS or ASCII format files, LAS is generally a better way to go. They contain more information and, being binary, can be read by the importer more efficiently.

    Once the points are loaded into a multipoint feature class you can use the Point to Raster tool to get a more in-depth view of the point distribution.

    Point to Raster
    The Point to Raster tool creates rasters from points and it also supports multipoints. It’s a generic tool with many options and uses. For the sake of evaluating Lidar point density the tool’s COUNT option is the thing to go for. This uses the number of points falling in a raster cell as the cell value. Being able to look at this graphically over the extent of the project area is revealing.

    There’re a couple parameters on the Point to Raster tool whose values for this exercise aren’t obvious. First, is the Value Field parameter. It doesn’t matter what this is set to. That’s because the Value Field is ignored when the Cell Assignment type is set to COUNT. Then there’s the cellsize. You might think the average point spacing is good but this typically results in too many empty, or NoData, cells because Lidar points just aren’t that evenly spaced. Also, the output raster could end up being unnecessarily large. Instead, it’s better to go with a cellsize that’s several times larger than the average point spacing but small enough to identify gaps or voids that warrant further investigation. A reasonable size is four times the point spacing. As an example, let’s say your data is sampled at 1 meter. If you set the cellsize to 4 then you can expect, on average, to get 16 points in a cell.


    You can also evaluate the density for different types of points. While most of the time you’ll probably just check the density for all returns it can be useful to look at those that fall in a certain class like ‘ground’. For example, this can give you an idea of how good your ground penetration is in vegetated areas. The Point to Raster tool doesn’t know how to make the distinction between point types though. So, you control what points get used by how you go about creating the multipoint feature class with the LAS To Multipoint tool. It provides options for loading points by class code and return number.

    Once your raster has been created have a look at it in ArcMap. Use a color ramp renderer to display it so it’s easy to distinguish between cells with high counts and those with low. You can also set the NoData color to something that stands out. Look for variance in density and data voids. Have your vendor explain anything that doesn’t look right.


    Hopefully, you’ll find your data meets specifications and lacks surprises. It’s worth the effort to check.

    That’s it for this installment of Lidar Solutions in ArcGIS. Subscribe to this blog or check back in a couple weeks for a discussion on the creation of raster DEMs/DSMs from Lidar.

     

  • Tips and tricks - Unix, Python, and Geoprocessing

    This posting was written by Mark Zollinger, a Product Engineer and long time UNIX user with the Geoprocessing team.

    Do you like Solaris or Linux? Do you "do GIS" on one of those platforms? Maybe you've been asked to convert some ArcGIS Workstation AML applications over to ArcGIS Engine. Are you a command-line junkie? (There are more of us than you might think)

    If, for one of the above or any other reason, you find yourself wondering if you should try developing GIS on Solaris or Linux, then it's time to stop wondering. Join in with others who include a flavor of *nix in their list of working development platforms.

    There are several approaches to GIS development that work very well on Solaris and Linux.

    If you are into low-level programming, you should consider Java or C++, both of which can access ArcGIS's ArcObjects API. I've used Java quite a bit over the years, and can tell you that the doors are wide open on what you can do with it. If this sound like the thing for you, then you can move along now, there's nothing to see here. This post isn't about ArcObjects at all.

    If you lean towards scripting and higher level abstractions, then geoprocessing's more coarse-grained object model, then you are in the right place. The rest of this post will show you how to set up an IDE for Python development and write a very simple geoprocessing script.

    I mentioned a moment ago that I've written my share of Java ArcObjects code. I've got to confess though, that I've always been a fan of scripting languages. Python and GP is where I spend most of my time these days.

    Setting up

    "Okay", you ask, "How do I start writing GP code in Python on non-Windows platforms?"

    You can do Python and GP from the command line, and I often prefer the focus this brings. I've found, however, that most folks prefer to work in a graphic IDE (Integrated Development Environment).

    Setting up an IDE that you can use

    Probably your best choice is the Eclipse IDE with the Pydev plugin. Eclipse is an Open Source IDE, intended to work with a wide variety of programming languages. Pydev is the plugin that specifically enables Python development in Eclipse.

    The latest version of Eclipse, named Ganymede, has not been fully tested by ESRI for use with ArcGIS 9.3, and is not available for some older UNIX releases. We recommend using version 3.3.2, which you can find in the archive at:

    http://archive.eclipse.org/eclipse/downloads/

    To install Eclipse, follow the instructions in the "Installing Eclipse" paragraph at:

    http://wiki.eclipse.org/FAQ_Where_do_I_get_and_install_Eclipse%3F

    (These instructions are for Ganymede, but they basically the same for version 3.3.2)

    Once you have Eclipse installed and running, you need to add the Pydev plugin. Just follow the "Getting Started" link on the left side of the Pydev homepage.

    http://pydev.sourceforge.net/

    Note that you need the Open Source Pydev plugin. The Pydev Extensions plugin is commercial trialware, and is optional.

    When you get to the "Configuring the interpreter" page, you are told to choose the Python interpreter that Pydev should use to run Python programs. Use the Python installation you use for Engine development/testing. In most cases, this is the one that got installed with the Engine Runtime:

    $ARCGISHOME/python25/bin/python

    Note: you might get an error message while trying to choose the interpreter. If you do, click the error dialog's Details button. The message probably includes something like this:

    ld.so.1: python: fatal: libpython2.5.so.1.0: open failed: No such file or directory

    This is an indication that the python interpreter is not correctly set up. The most likely cause is that your ArcGIS Engine environment has not yet been established. Exit Eclipse and source the init_engine shell script that applies to the shell you are using. When you start Eclipse the problem should go away.

    Also, item 3 on the "Configuring the interpreter" page asks you to set the SYSTEM PYTHONPATH. When you do this, make sure to include $ARCGISHOME/bin. This matches the PYTHONPATH that gets set when you source init_engine.

    Writing a Python script that does GP stuff

    Now that you have a development environment, how do you use all this great GP stuff? Let's create a very simple Python script to get you started on the road to GP development. In most Python scripts, the first thing to do is to import the modules that you will need. For GP, we need to import the arcgisscripting module. Let's also import the os and system modules:

    import arcgisscripting
    import os, sys

    To access GP, we need to create a Geoprocessor object. This is the object through which you will call any GP functions or objects

    gp = arcgisscripting.create(9.3)

    By passing in the 9.3 argument, you are getting the newer version of the Geoprocessor object, which is a little more Python friendly. For this example we won't be using any of the Python friendly features, but it is never too early to start a good habit.

    Now that you have a GP object, what do you do with it? Let's set GP's workspace property and print it out, just to prove to ourselves that it really worked. For this example, let's set the workspace to a "data" directory just underneath the current Python directory.

    gp.workspace = os.path.join (sys.path[0], "data")
    print "GP Workspace = " + gp.workspace

    (Of course you can set the workspace to any accessible directory, not just the one returned by sys.path[0]. I'm just using sys.path[0] because I know it's accessible.)

    sys.path is a list of strings where Python looks for modules. The first item in the list is the current Python directory. Calling os.path.join is a great way to build up filesystem paths, since it correctly handles all the path separator issues. (More on path separators later.) When you run this script (Run -> Run As -> Python Run), you should see "GP Workspace= /some/path/data" printed in the Console tab near the bottom of the Eclipse window.

    You can use Eclipse or the operating system to create the data directory, since it probably doesn't exist yet.

    For this example, let's suppose you have point data representing hawk sitings in a national forest. Your study area is a subset of the forest where a particular plant in known to grow. What you need to do is clip your point data using the polygon of your study area.

    Once you have copied your data into the aforementioned data directory, your script could execute the clip tool like this:

    gp.clip_analysis("nat_for.gdb/hawk_sightings", "study_area.shp", "nat_for.gdb/study_sightings")

    Just to be sure it worked, you can print out the messages that the tool generated like this:

    print (gp.getmessages())

    Notice that in calling the clip tool, we used forward slashes (/). Back slashes (\) and forward slashes work equally well in ArcGIS on ALL platforms, so you don't need to worry about switching back and forth when you change to Windows. You can even use back slashes on Linux and Solaris. The problem is that in Python, back slashes are a special character, so to use them, you have to treat them differently than most other characters.

      - You can escape them: "\\some\\path"
      - You can use raw strings: r"\some\path"
      - You can use unicode strings: u"\some\path"

    If you stick to forward slashes on all platforms, you improve the look of your code and reduce the effort it takes to type it.

    So now you have a script that looks like this:

    import arcgisscripting
    import os, sys

    gp = arcgisscripting.create(9.3)

    gp.workspace = os.path.join (sys.path[0], "data")
    print "GP Workspace = " + gp.workspace

    gp.clip_analysis("nat_for.gdb/hawk_sightings", "study_area.shp", "nat_for.gdb/study_sightings")
    print (gp.getmessages())

    Run it, and it works, creating a new featureclass in the file geodatabase. Now wasn't that easy? To get more complicated, you just string together as many of those tools as you need to get the job done.

    Welcome to the world of Geoprocessing with Python!

    Links

    Automating workflows with scripts - Introduction to writing Python scripts for geoprocessing
    Techniques for sharing Python scripts - Useful techniques for sharing Python scripts
    The Python Tutorial - If you're new to Python, this is a good tutorial to start
    How to duplicate AML functions in Python - For AML programmers
    How to duplicate AML directives in Python- For AML programmers

    Troubleshooting

    Color issues

    Upon using Eclipse on Solaris for the first time, you may see some color problems. For example, the text insertion caret might be white on a white background, making it invisible. Another common symptom is that you cannot see the outline of text input fields. If you run into this or something similar, check the color depth of your display. If it is set to 8 bits, change it to 24.

    init_engine

    You should have your Engine environment already established in the shell in which you start Eclipse. (source init_engine.sh or source init_engine.csh) If you see inexplicable error messages, this is the most likely cause.

More Posts Next page »