Understanding the Skip Factor when Calculating Statistics

What happens when you use the skip factor to calculate statistics?

The skip factor reduces the time it takes to calculate statistics for a given raster dataset. This can be really helpful when you have a large dataset, or if you’re working remotely. The reality is that you often don’t have to use every single pixel to calculate the stats. However, you may not get the true minimum and maximum values, which might be necessary for your analysis. (Statistics are used primarily to determine how to display the image, and they are necessary if you want to use any of the stretch functions except Dynamic Range Adjustment.)

So how do you know which number you should go with? The X and Y values are the skip factors for each direction, so a skip factor of 1 uses every single pixel, 2 uses 25%, 10 uses 1%, and 100 uses 0.01% of the pixels. The geek in me wanted to see what happens as I increase the Skip Factor, so I ran the tool on the same dataset using skip factors of 1, 10, 100, 250, 350, 500, 750, 1000, 1250, 1500 and 2000 to see if any pattern developed as I decreased the number of pixels sampled. (By no means should these values be used carte blanche as a best practice for using the Skip Factor; there’s another blog next week that gets into that.) The results surprised me: although the error increases as the Skip Factor increases, there isn’t a clear linear pattern in how the errors manifest themselves.
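If you want to check those percentages for other skip factors, the arithmetic is easy to sketch in Python. This is only an illustration of the math, not the tool itself: sampled_fraction is a made-up helper name, and it assumes the skip factor means "take every Nth pixel" in each direction, so the fraction sampled is roughly 1 / (X skip × Y skip).

```python
def sampled_fraction(x_skip: int, y_skip: int) -> float:
    """Approximate share of pixels used for a given X and Y skip factor.

    Assumes every x_skip-th column and every y_skip-th row is read,
    which is an assumption about the sampling, not the tool's documented internals.
    """
    if x_skip < 1 or y_skip < 1:
        raise ValueError("skip factors must be 1 or greater")
    return 1.0 / (x_skip * y_skip)


# Same skip factor in both directions, as in the examples in the text.
for skip in (1, 2, 10, 100):
    print(f"skip factor {skip}: {sampled_fraction(skip, skip):.4%} of the pixels")
```

The result is approximate for rasters whose dimensions aren’t an exact multiple of the skip factor, but it’s close enough to get a feel for how quickly the sample shrinks.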

With a Skip Factor from 100 to 750, you’ll get results that look reasonable enough. This was a pretty large dataset, so these sampling rates are not too extreme. But when you compare the results to one another and to the true image, it’s difficult to predict which pixels will be accentuated. If you look at the means and standard deviations, they oscillate rather than steadily increase or decrease. Interestingly, the outputs don’t get steadily worse as you decrease the sampling rate. I’d argue that a Skip Factor of 750 looks more like the true image than a Skip Factor of 500 or even 250 does. After a certain point, it’s almost random which Skip Factor gives the output that best represents the true values.
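You can get a feel for why this happens with a small experiment on made-up data. The sketch below is not the dataset or the tool from this post: it builds a synthetic grid with Python’s random module, plants one extreme pixel in it, and computes stats from every Nth row and column. The strided_stats name and the "every Nth pixel" sampling are assumptions for illustration only.

```python
import random
import statistics


def strided_stats(grid, x_skip, y_skip):
    """Min, max, mean and standard deviation from every y_skip-th row
    and every x_skip-th column of a list-of-lists grid."""
    sample = [value for row in grid[::y_skip] for value in row[::x_skip]]
    return {
        "count": len(sample),
        "min": min(sample),
        "max": max(sample),
        "mean": statistics.fmean(sample),
        "std": statistics.pstdev(sample),
    }


# Synthetic stand-in for a raster band (hypothetical data, seeded so it is repeatable).
rng = random.Random(0)
grid = [[rng.gauss(100, 15) for _ in range(300)] for _ in range(300)]

# One extreme pixel at a row and column that most strides will step over.
grid[137][211] = 255.0

for skip in (1, 2, 10, 50):
    print(skip, strided_stats(grid, skip, skip))
```

With a skip factor of 1 the planted pixel is the maximum. With a skip factor of 10 the stride never lands on row 137, so the reported maximum comes from whatever ordinary pixels happen to be sampled. Which pixels get hit depends on how the stride lines up with the data, which is one plausible reason the stats bounce around instead of drifting steadily in one direction.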

Here’s a table with the descriptive stats for each Skip Factor:

Skip Factor of 1

Skip Factor of 10

Skip Factor of 100

Skip Factor of 250

Skip Factor of 350

Skip Factor of 500

Skip Factor of 750

Skip Factor of 1000

Skip Factor of 1250

Skip Factor of 1500

Skip Factor of 2000
