Scalable hosted feature layers in ArcGIS Online: Tile queries and response caching

Background

Hosted feature layers come in all shapes and sizes; some start off big, and some grow over time as data is updated. Regardless of their origins, it’s important that datasets remain performant, which brings me to the topic of today’s blog: response caching. The hosted feature service team has been hard at work over the past few releases getting all the pieces in place for it. Response caching might sound a bit boring on the surface, but it’s pretty exciting because it allows hosted feature layers to scale even when your maps go viral.

Before I dive in, though, let’s talk about what a response is. A response is simply the feature layer’s answer to a question you asked it. While you can ask many questions, the most common one is: What features are in the map I’m looking at?

ArcGIS JavaScript API clients, and by extension ArcGIS Online, work well with layers of all sizes. When a feature layer has a small number of features, they can be retrieved in a single request because the response from the feature layer is small. For larger layers, features are requested using tiled requests, which split the query into several smaller spatial queries. Tile requests have the added benefit of being consistent, even across different users and apps. This consistency allows those responses to be cached once on the server and shared between all users. In both cases (single or tiled), the responses can also be cached on the client. Tile requests also use quantization to improve performance, but that’s a whole topic unto itself that we’ll save for another blog post.
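To make the "consistent tile requests" idea concrete, here is a minimal sketch of how a map view could be snapped to a fixed grid. The grid used here (a Web Mercator quadtree with 2^level tiles per side) and the function names `tiles_for_extent` and `tile_envelope` are assumptions chosen for illustration; this post does not describe the exact tiling scheme ArcGIS Online uses.

```python
import math

# Half the width of the Web Mercator world, in meters.
WORLD_HALF = 20037508.342789244


def tiles_for_extent(xmin, ymin, xmax, ymax, level):
    """Return the set of (level, row, col) tiles that cover a view extent."""
    n = 2 ** level                 # tiles per side at this level
    size = 2 * WORLD_HALF / n      # width of one tile in meters

    def clamp(i):
        return max(0, min(n - 1, i))

    col_min = clamp(math.floor((xmin + WORLD_HALF) / size))
    col_max = clamp(math.floor((xmax + WORLD_HALF) / size))
    # Rows are counted downward from the top of the world.
    row_min = clamp(math.floor((WORLD_HALF - ymax) / size))
    row_max = clamp(math.floor((WORLD_HALF - ymin) / size))
    return {
        (level, row, col)
        for row in range(row_min, row_max + 1)
        for col in range(col_min, col_max + 1)
    }


def tile_envelope(level, row, col):
    """Return the fixed (xmin, ymin, xmax, ymax) envelope queried for one tile."""
    size = 2 * WORLD_HALF / (2 ** level)
    xmin = -WORLD_HALF + col * size
    ymax = WORLD_HALF - row * size
    return (xmin, ymax - size, xmin + size, ymax)
```

Two users looking at slightly different parts of the same area end up asking for the same tiles with the same envelopes, and that is what makes a single cached response reusable by both of them.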

Because ArcGIS Online uses server-side response caching, that cache kicks in when multiple users request the same information. This frees up resources on both the server and the underlying database, allowing feature layers to scale out to millions of users and clients without the need to explicitly generate tiles ahead of time (which ArcGIS Online also supports when you need it). These cached responses are automatically invalidated as the data is edited, which ensures that clients using the layer always get the latest information.

For many of you, the simple story is: “ArcGIS Online uses tile queries and response caching strategically to optimize performance and reduce load.”  For some of you, knowing that is enough.  But for the more curious geogeeks, here’s a deeper dive under the hood to understand all the moving parts that work in harmony to deliver performant feature layers.

Client-side caching: Your personal cache in your browser

Client-side caching is like your own personal cache of responses from the server, managed by your web browser.  When you query a feature layer, the feature service responds with a set of features that is downloaded and stored in your browser’s cache.  As you pan around the map, or zoom in and out, the browser uses those downloaded features in the browser cache whenever possible in order to avoid having to re-download the features every time the map view changes.

That browser cache has a shelf life, though. If you pan around your data while it is being edited, the browser asks the servers whether anything has changed since the last time it queried. If so, the browser gets the new features and updates its cache accordingly. This ensures that what you’re seeing on the map is current, and it reduces the load on both your computer and ArcGIS Online. The browser cache only persists on the client and is not stored anywhere else. Therefore, it only benefits you and won’t make anyone else’s experience faster. That brings us to the content delivery network (CDN), whose goal is to improve the experience for everyone.
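The "has anything changed?" check can be sketched with a small simulation. On the web, this kind of check is commonly done with HTTP conditional requests (an `ETag` sent back as `If-None-Match`, answered with `304 Not Modified`); the post does not say which mechanism ArcGIS Online uses, so treat the version counter below as an assumption. `FakeLayerServer` and `BrowserCache` are hypothetical names, not part of any ArcGIS API.

```python
class FakeLayerServer:
    """Hypothetical stand-in for a feature service (not a real ArcGIS API)."""

    def __init__(self, features):
        self.features = list(features)
        self.version = 1           # plays the role of an ETag
        self.full_responses = 0    # how many times features were actually sent

    def edit(self, feature):
        """Simulate an edit: the data changes, so the version changes too."""
        self.features.append(feature)
        self.version += 1

    def query(self, known_version=None):
        # If the client already holds the current version, send no features.
        if known_version == self.version:
            return {"status": 304, "version": self.version}
        self.full_responses += 1
        return {"status": 200, "version": self.version,
                "features": list(self.features)}


class BrowserCache:
    """Keeps the last response and revalidates it on every request."""

    def __init__(self, server):
        self.server = server
        self.version = None
        self.features = None

    def get_features(self):
        reply = self.server.query(known_version=self.version)
        if reply["status"] == 200:       # something changed: refresh the cache
            self.version = reply["version"]
            self.features = reply["features"]
        return self.features             # otherwise reuse what we already have
```

In this sketch the features are only re-downloaded after `edit` has been called; every other request is answered by a tiny "not modified" reply.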

CDN: A cache for everyone, anywhere

Content delivery networks, or CDNs for short, are the backbone of a speedy internet. For publicly shared hosted feature layers, the CDN acts similarly to client-side caching but with a few differences. First, the CDN response cache is reused by anyone using the layer and sending the same query, so everyone benefits from it, not just you. Second, the CDN is distributed all over the world, and the cache is mirrored, which means that even if the servers hosting the data are halfway around the world, the cache is most likely much closer to you. So, in addition to being cached, the data has fewer hops over the internet to reach you, so it’s faster to download.

Server-side caching: Shared and stored in ArcGIS Online

As mentioned in the previous section of this post, only publicly shared layers use the CDN, so layers shared within your organization do not make use of the CDN response cache.  This ensures that your private data is not cached on external servers around the world.  Server-side caching caches the responses to tile requests within ArcGIS Online so that other users in your organization and the public can reuse the cache when the browser cache and the CDN can’t be used.  This cache is shared between all authorized users and maintained internally as part of ArcGIS Online’s infrastructure.  As a result, queries come back quickly, put less load on the underlying databases, and keep everything running smoothly at scale even under heavy load.  As your data changes, the feature tile cache is invalidated to ensure that clients using the feature layer always see the most current information.
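A server-side response cache with edit-driven invalidation can be sketched in a few lines. This is only an illustration of the idea under stated assumptions: the class name `TileResponseCache`, the per-layer edit counter, and the `run_query` callback are all hypothetical, and ArcGIS Online’s internal implementation is not described in this post.

```python
class TileResponseCache:
    """Caches tile query responses and discards them once the layer is edited."""

    def __init__(self, run_query):
        self.run_query = run_query   # callable(layer, tile) that hits the database
        self.edit_stamp = {}         # layer -> counter bumped on every edit
        self.entries = {}            # (layer, tile) -> (stamp, response)

    def record_edit(self, layer):
        """Mark the layer as changed; older cached responses become stale."""
        self.edit_stamp[layer] = self.edit_stamp.get(layer, 0) + 1

    def get(self, layer, tile):
        stamp = self.edit_stamp.get(layer, 0)
        cached = self.entries.get((layer, tile))
        if cached is not None and cached[0] == stamp:
            return cached[1]         # up to date: no database work needed
        response = self.run_query(layer, tile)
        self.entries[(layer, tile)] = (stamp, response)
        return response
```

Comparing a stamp at lookup time, rather than deleting entries at edit time, keeps edits cheap; either approach gives clients the current data.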

Wrap up: Response caching in a nutshell

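When you put all the caching methods together for hosted feature layers, each request is answered by the closest cache that can serve it: first your browser cache, then the CDN (for publicly shared layers), then the server-side response cache within ArcGIS Online, and only then a query against the underlying database.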
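As a rough sketch of how the caching layers described in this post fit together, the function below walks them in order. The plain dictionaries standing in for each cache and the function name `answer_query` are assumptions for illustration, and invalidation is left out to keep the example short.

```python
def answer_query(key, is_public, browser, cdn, server, database):
    """Return (response, source) from the closest layer that can answer."""
    if key in browser:
        return browser[key], "browser cache"

    if is_public and key in cdn:
        response, source = cdn[key], "CDN"
    elif key in server:
        response, source = server[key], "server-side cache"
    else:
        response, source = database[key], "database"
        server[key] = response    # the next user is served from the server cache

    if is_public:
        cdn[key] = response       # only publicly shared layers go through the CDN
    browser[key] = response       # this user's next identical request stays local
    return response, source
```

The first request for a tile goes all the way to the database; after that, the same user is served from the browser cache and other users from the CDN or the server-side cache.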

6 Comments

  1. johnmdye says:

    This is a great article Paul. Love geeky stuff like this that demystifies what’s going on under the hood. Is all of this information accurate as it might pertain to ArcGIS Enterprise?

    • Paul Barker says:

      johnmdye, I’m glad you enjoyed the article. To answer your question, everything mentioned in the blog is possible with ArcGIS Enterprise but isn’t as turnkey. The client-side caching, as you might expect, just works. With respect to the CDN, that would be up to your organization’s IT staff to set up, either through a CDN provider or self-hosted. Lastly, server-side caching could be achieved through the use of SOIs (Server Object Interceptors); however, that is something we want to add as a core feature of ArcGIS Enterprise in a future release.

  2. david.runneals_iowadot says:

    In using hosted feature views over the last month or so for our high-availability, near real-time application, we have had many issues where people were required to log in even though the service was shared with everyone (I don’t know if it’s a caching issue or a view issue, but either way feature collections have seemed to remain solid). The other issue we have with near real-time services/views is that we are unsure how often the caching mechanisms invalidate the stored cache.

    • Paul Barker says:

      david,

      Thanks for the question. The issue you are describing is not related to response caching for feature layers. It was fixed in late December and should now be resolved. If you encounter it again, please reach out to support and they can help investigate further.

      Regarding the caching question: how often really depends on how frequently your data is changing. When you send a query, the server checks to see if the cache has the latest information. If the cache is up to date, it uses the cached response. If not, a new query is executed, and when that query finishes it updates the cache accordingly.