This post describes how we at the Esri Technical Marketing group addressed the architectural challenge of frequently updating web applications hosted on the Amazon Elastic Compute Cloud (EC2). Because you may encounter similar scenarios when hosting your own ArcGIS Server applications on the cloud, we wanted to share our approach.
Esri Technical Marketing maintains a load-balanced, scalable application server in the Amazon cloud that hosts web mapping applications. These apps require frequent revision because of the nature of the work: many are built to provide information about emergency situations, and they are released very soon after the emergency begins, allowing little testing beforehand. Between the bugs inherent in fast development and the changing nature of emergencies, the apps go through a lot of iteration.
Updating these apps with traditional methods would require updating the staging instance, generating an AMI, and then launching new instances to replace the existing live ones: a tedious and time-consuming task. Another option would be to update each machine manually, but that invites human error on live machines. In a load-balanced environment like ours, the result is different experiences for different users, and it can be difficult to determine which machine is causing the problem. Neither solution is practical for our release cycle, which often includes several changes per week, or even per day. We needed a way to update multiple live instances seamlessly while retaining the ability to scale our servers as demand increases.
To address this challenge, we decided to store our application code in Amazon Simple Storage Service (S3) and then transfer it automatically to our live servers. Because our application servers were already deployed in Amazon EC2, it made sense to continue using Amazon's technology. We've also found that Amazon S3 is a relatively inexpensive solution for small amounts of data.
While S3 is capable of hosting simple web applications entirely within its framework, our applications often require functionality it does not provide. Consequently, we just treat S3 as a repository for our most current application code. We synchronize our local updates with our repository in S3, ensuring that our code base contains the most up-to-date code available.
We’ve also configured our live servers to periodically request updates from S3. We do this through a free Python-based tool called S3cmd, provided by the good folks at S3tools.org. We can run this tool locally to push new or updated files to S3, or run it on the live instances to download any updates S3 has received.
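To give a feel for the two directions of the sync, here is a minimal sketch of how the s3cmd invocations might be assembled from Python. The bucket name and local path are hypothetical placeholders, not our actual configuration:

```python
import subprocess

# Hypothetical bucket and local path -- substitute your own.
BUCKET = "s3://our-app-bucket/apps/"
LOCAL_APPS = "apps/"

def build_sync_command(source, destination, delete_removed=True):
    """Assemble an s3cmd sync invocation; s3cmd itself works out what changed."""
    cmd = ["s3cmd", "sync"]
    if delete_removed:
        # Also remove files at the destination that were deleted at the source.
        cmd.append("--delete-removed")
    cmd += [source, destination]
    return cmd

# Push: run locally to upload new or updated files to S3.
push = build_sync_command(LOCAL_APPS, BUCKET)

# Pull: run on each live instance to download updates from S3.
pull = build_sync_command(BUCKET, LOCAL_APPS)

if __name__ == "__main__":
    subprocess.run(push, check=True)
```

Because `s3cmd sync` only transfers files that differ between source and destination, the same command is cheap to run repeatedly.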
Implementation and script:
The implementation of this process is fairly straightforward. On our local file server we set up a folder for staging applications and a second folder for live applications, with a subfolder for each application. We then have a scheduled task run every minute to check whether anything in the staging or live folders has changed.
If changes are detected, we run the command to sync to the S3 bucket in the cloud. Checking the file system locally cuts down on requests to S3 just to determine whether changes were made, which also reduces transfer costs with Amazon.
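The local change check can be as simple as comparing file modification times against the time of the last successful sync. This is a sketch of that idea under our stated assumptions (the folder layout and the stored last-sync timestamp are up to you):

```python
import os

def latest_mtime(root):
    """Return the most recent modification time of any file under root (0.0 if none)."""
    newest = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                newest = max(newest, os.path.getmtime(os.path.join(dirpath, name)))
            except OSError:
                pass  # file vanished between listing and stat; ignore it
    return newest

def changed_since(root, last_sync_time):
    """True if anything under root was modified after the last sync."""
    return latest_mtime(root) > last_sync_time
```

If `changed_since` returns True for the staging or live folder, the scheduled task runs the s3cmd sync; otherwise it exits without touching S3.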
Meanwhile, the application servers have scheduled tasks that run every 5 minutes to check the S3 bucket for updates. Because a server cannot tell locally whether there is anything new to download, we lengthened the interval between runs of this task to cut down on traffic to Amazon.
We built a Python script that can do either of these tasks as well as log any errors that may come up. It’s available for you to download here.
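Our actual script is linked above; as a rough sketch of its error-handling shape, a wrapper like the following runs the sync and logs any failure instead of letting a scheduled task die silently (the log file name is a placeholder):

```python
import logging
import subprocess

# Placeholder log destination; point this wherever your tasks keep their logs.
logging.basicConfig(filename="s3sync.log", level=logging.INFO)
log = logging.getLogger("s3sync")

def run_sync(cmd):
    """Run a sync command, logging failures instead of crashing the scheduled task."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            log.error("sync failed (exit %d): %s", result.returncode, result.stderr.strip())
            return False
        log.info("sync ok: %s", " ".join(cmd))
        return True
    except OSError as exc:  # e.g. s3cmd not installed or not on PATH
        log.error("could not launch %s: %s", cmd[0], exc)
        return False
```

A scheduled task then simply calls `run_sync(...)` with the push or pull command and relies on the log to surface problems.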
Contributed by David McGuire of the Esri Technical Marketing team