Wednesday, August 01, 2007 1:38 PM -
MappingCenterTeam
Symbolizing the Results of a Hot Spot Analysis
A number of spatial statistical analysis tools are now available in ArcGIS. Some, like the Hot Spot Analysis tool, produce specially structured results that can be misinterpreted or misrepresented if you symbolize them in a generic way, such as with the Natural Breaks classification method. Specifically, the Hot Spot Analysis tool produces results in the form of Gi* Z Scores, values that indicate whether a feature is within a statistically significant hot or cold spot. While the version of this tool called Hot Spot Analysis with Rendering produces a layer that is symbolized correctly given the data it represents, you can fine-tune this symbology if you know how to avoid inadvertently misrepresenting the analysis results. If you’re working with point features, you can interpolate a raster surface from those points; you will also need to know how to symbolize the hot spot analysis raster surface properly. Here are some tips to guide you.
First, the Gi* Z Score values have specific meaning; values greater than -1.96 and less than +1.96 are statistically insignificant (at the 95 percent confidence level). Features with these values should be symbolized with a neutral color. It is not appropriate to further classify the Z Scores within this range because the tool does not distinguish truly random data from evenly distributed data; in other words, there is no way to distinguish coincidence from trend.
Second, the values that are less than -1.96 are statistically significant cold spots, and the values above +1.96 are statistically significant hot spots. These values are also comparable, meaning that the lower the value below the break, the colder the cold spot, and the higher the value above the break, the hotter the hot spot. In the Classification dialog below, class breaks have been inserted at ±1.96 and at ±2.72 (1/2 of a standard deviation), allowing the significant Z Scores to be symbolized to show where the hottest and coldest spots are located.
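If you want to check the classification outside of ArcGIS, the breaks described above can be sketched in a few lines of Python. This is only an illustration: the function name and class labels are hypothetical, and treating a value of exactly ±1.96 as significant is an assumption, since the breaks are described only as "less than" and "greater than".

```python
def classify_z(z):
    """Return a class label for one Gi* Z Score using breaks at +/-1.96 and +/-2.72.

    Assumption: a value exactly on a break falls into the more extreme class.
    """
    if z <= -2.72:
        return "coldest"
    if z <= -1.96:
        return "cold"
    if z < 1.96:
        return "not significant"  # symbolize with a single neutral color
    if z < 2.72:
        return "hot"
    return "hottest"


for score in (-2.8, -2.0, 0.3, 2.0, 3.1):
    print(score, classify_z(score))
```

Everything between the two inner breaks lands in one class, so no amount of variation inside that range gets its own color.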
Notice the data range allows for another class break on the hot side of the data distribution at 3.48, but the cold side of the distribution does not. It is entirely appropriate to add another class on the hot side using the same interval (0.76). The result would look like this in the Layer Properties’ Symbology tab:
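The same reasoning can be worked through in Python: start at ±1.96, step outward by the 0.76 interval, and keep only the breaks that fall inside the data range. The function name is hypothetical, and the minimum and maximum used in the example (-3.0 and 4.0) are made-up values chosen to mimic a range that is longer on the hot side; they are not the values from the dataset shown here.

```python
def class_breaks(z_min, z_max, first_break=1.96, interval=0.76):
    """List the class breaks that fit strictly inside the range z_min..z_max.

    Breaks sit at +/-first_break and every `interval` beyond, so the
    hot and cold sides stay comparable even when one side has more classes.
    """
    breaks = []
    step = 0
    while first_break + step * interval < max(-z_min, z_max):
        value = round(first_break + step * interval, 2)
        if -value > z_min:   # the cold side still has data beyond this break
            breaks.append(-value)
        if value < z_max:    # the hot side still has data beyond this break
            breaks.append(value)
        step += 1
    return sorted(breaks)


# Hypothetical data range, longer on the hot side than on the cold side.
print(class_breaks(-3.0, 4.0))  # [-2.72, -1.96, 1.96, 2.72, 3.48]
```

Because both sides use the same starting break and the same interval, the extra class on the hot side does not distort the comparison between hot and cold spots.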
The important thing about setting the colors for these symbols is to maintain symmetry in the lightness (value) of the colors, so that the reds and blues that sit the same number of classes beyond the insignificant range match each other. Notice here how the third class of red is darker than the second class of red, and how the second class of red matches the value (color lightness) of the second class of blue.
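One way to think about this pairing is to pick a lightness for each class first and then derive both the red and the blue from it. The sketch below does that with Python's standard colorsys module. The function name, the hues, and the lightness limits are all illustrative assumptions, and HLS lightness is only a rough stand-in for how light a color appears on screen or in print, so treat the output as a starting point to adjust by eye.

```python
import colorsys


def paired_ramp(n_classes, neutral_lightness=0.9, darkest=0.3):
    """Return (neutral, pairs), where pairs holds one (blue, red) RGB tuple
    per class beyond the break, ordered from lightest to darkest.

    Both colors in a pair are built from the same HLS lightness.
    """
    neutral = colorsys.hls_to_rgb(0.0, neutral_lightness, 0.0)  # light gray
    step = (neutral_lightness - darkest) / n_classes
    pairs = []
    for i in range(1, n_classes + 1):
        lightness = neutral_lightness - i * step
        blue = colorsys.hls_to_rgb(2 / 3, lightness, 1.0)
        red = colorsys.hls_to_rgb(0.0, lightness, 1.0)
        pairs.append((blue, red))
    return neutral, pairs


neutral, pairs = paired_ramp(3)
for blue, red in pairs:
    print("blue", [round(c, 2) for c in blue], "red", [round(c, 2) for c in red])
```

When the hot side has one more class than the cold side, generate three pairs and simply leave the third blue unused.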
If you wish to interpolate a raster surface from point data that you’ve performed a Hot Spot Analysis on, you will also need to address this issue of symmetry in the color ramp you use to symbolize your raster surface. Here is an example:
This alone will not get the job done. The issue is that this color ramp shows only a narrow band of neutral color, while the data range actually has a relatively wide range of neutral values. By clicking the Histograms button on the Symbology tab, you can edit how the color ramp is stretched over the data distribution.
In this histogram, the narrow section of neutral color is stretched along the nearly horizontal line. Further, the bends near either end emphasize the hottest and coldest spots. The gray in this diagram indicates the original data distribution, while the magenta indicates the data being squeezed into the narrow section of neutral color.
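The idea behind this stretch can be written as a simple piecewise function: all values between -1.96 and +1.96 are squeezed into the narrow neutral band in the middle of the color ramp, and the significant values are spread over the rest. The sketch below is a simplified stand-in, not what ArcGIS does internally; the function name and the width of the neutral band are assumptions, it uses straight segments rather than the bends near either end, and it expects the data range to extend beyond ±1.96 on both sides.

```python
def ramp_position(z, z_min, z_max, cut=1.96, neutral_half_width=0.05):
    """Map a Z Score to a position on the color ramp (0.0 = coldest, 1.0 = hottest).

    Assumes z_min < -cut and z_max > cut. Values inside +/-cut are compressed
    into a band of width 2 * neutral_half_width centered on 0.5.
    """
    z = max(z_min, min(z_max, z))  # clamp to the data range
    low = 0.5 - neutral_half_width
    high = 0.5 + neutral_half_width
    if z < -cut:   # cold side: z_min..-cut spread over 0..low
        return low * (z - z_min) / (-cut - z_min)
    if z > cut:    # hot side: cut..z_max spread over high..1
        return high + (1.0 - high) * (z - cut) / (z_max - cut)
    # insignificant range: -cut..cut squeezed into low..high
    return low + (high - low) * (z + cut) / (2 * cut)


# Hypothetical data range of -4.0 to 5.0.
for score in (-4.0, -1.96, 0.0, 1.96, 5.0):
    print(score, round(ramp_position(score, -4.0, 5.0), 3))
```

Plotting ramp position against Z Score gives a nearly flat segment across the insignificant range, which corresponds to the nearly horizontal line in the histogram.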
The Hot Spot 911 map shows what the results can look like and includes more details about how to make a poster depicting the results of a hot spot analysis.
CF