I am not a fan of buzzwords, particularly those corporate buzzwords which weasel their way into the business culture to become the jargon-mantras of mediocrity. So I've looked with a dubious eye at the hype surrounding Cloud Computing. But now that I have had a chance to spend some time with it, developing a cloud web app on Amazon Elastic Compute Cloud (EC2), I must confess that this stuff is really exciting.
Building a Cloud Application can truly benefit data- and computation-heavy workflows in profound ways. But cloud development can feel somewhat different from other web development at first, and the Amazon EC2 documentation and tools can be a little daunting. Fortunately, I have found some tools that really help smooth out the process.
Before we get into the tools, though, let's take a quick look at what cloud computing has to offer.
What Cloud Computing does for web applications
- Repetitive computationally-intensive tasks
- Varying scale/quantity of tasks
- Segmentable massive computation
- Redundant data storage that clients can access from any computer
What this means for your application
- Much greater capacity for high intensity and high volume computation.
- The ability to scale on demand and seamlessly absorb exponential growth.
- The ability to meet deadlines by scaling computing bandwidth according to demand.
- Automated redundancy of data and digital assets on the cloud.
Amazon Web Services
At Higher Media we use Amazon Web Services to host our cloud applications. Amazon offers a quiver of complementary services that make the creation of automated cloud-friendly applications straightforward. To make an application ready for the cloud, you need to break it down into specialized tasks that can be spread across multiple computers to distribute the load. Here's a summary of Amazon’s tools to do just that:
EC2 - Elastic Compute Cloud
The backbone of a cloud application is the set of virtualized computer instances running on Amazon’s servers at the data center. It’s elastic because you can deploy as many instances of the same computer “image” as you need to get a job done. Amazon’s GUI tools for configuring and managing instances are easy to use, and Amazon also provides command-line tools and APIs that let you automate control over instances in software. Your web app will use several instances of different computer images to perform the constituent tasks of your application. Each task will require an executable on your custom EC2 image.
S3 / EBS - Simple Storage Service / Elastic Block Store
The next component of a cloud application is a place to store data, particularly the assets that your application will be manipulating. Amazon provides a couple of options here. For scalable storage of files and assets that do not depend upon a specific directory structure, Simple Storage Service provides a very affordable and simple method to store and transfer files. S3 automatically stores redundant copies of all of its data on multiple servers in multiple facilities. Amazon claims “99.999999999% durability over a given year” for S3. It makes data redundancy very easy.
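To put that durability figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the quoted percentage can be read as an independent per-object chance of surviving one year, which is my simplification and not a statement of how Amazon measures it. The function names and the bucket size are made up for illustration.

```python
from fractions import Fraction

# Durability figure quoted above, kept as an exact fraction to avoid float rounding.
DURABILITY = Fraction("0.99999999999")


def expected_annual_losses(object_count: int) -> Fraction:
    """Average number of objects lost per year, under the independence assumption."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical number of stored objects
    print(float(expected_annual_losses(objects)))
    print(int(years_per_lost_object(objects)))
```

Under that reading, ten million stored objects would lose on average one object every ten thousand years.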
Elastic Block Store fulfills a somewhat different role, providing “removable” drives that EC2 instances can mount like a local hard drive. EBS provides fast I/O and maintains a complete directory structure, which can be useful for applications that need fast direct access to a wider range of assets for a single task. EBS volumes can be saved as snapshots on S3 to protect the integrity of your data, and new volumes created from a snapshot can give your other EC2 instances access to the same data.
SQS - Simple Queue Service
Next you need a way to pass commands between EC2 instances so they can push the assets through the automated processes of your application. Simple Queue Service provides a way for instances to send commands asynchronously to one another, and not just between instances: any computer on the internet can issue commands into the queue, which will be distributed out to the instances waiting to receive the message. Because of this, SQS is much simpler than any other queueing software I have dealt with, which generally requires explicit network configuration between computers in the farm. It also makes it very easy to scale applications, because SQS will quickly distribute tasks to as many instances as are launched to take them on. You can even automate your applications to scale in response to the load on the queue by launching additional instances as needed without human intervention.
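The scale-in-response-to-load idea can be reduced to a small rule. The sketch below is only an illustration in plain Python: it does not call SQS or EC2, and the names instances_needed and scaling_action, the per-instance capacity, and the instance limits are all my own hypothetical choices. In a real application the queue depth would come from the queue service and the result would drive instance launches or shutdowns.

```python
def instances_needed(queue_depth: int, tasks_per_instance: int,
                     min_instances: int = 1, max_instances: int = 20) -> int:
    """Return how many worker instances to run for the current backlog.

    tasks_per_instance is a hypothetical tuning value: how many queued
    tasks one instance is expected to absorb before another is launched.
    """
    if tasks_per_instance <= 0:
        raise ValueError("tasks_per_instance must be positive")
    # Integer ceiling division: 95 tasks at 10 per instance needs 10 instances.
    wanted = (queue_depth + tasks_per_instance - 1) // tasks_per_instance
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, wanted))


def scaling_action(running: int, queue_depth: int, tasks_per_instance: int) -> int:
    """Positive result: launch that many instances. Negative: shut that many down."""
    return instances_needed(queue_depth, tasks_per_instance) - running


if __name__ == "__main__":
    # Three instances running, 95 tasks waiting, 10 tasks per instance.
    print(scaling_action(running=3, queue_depth=95, tasks_per_instance=10))
```

The floor and ceiling are there so an empty queue does not shut everything down and a sudden flood of tasks does not launch an unbounded number of instances.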
SDB / RDS - SimpleDB / Relational Database Service
Along the way, your cloud application will probably need to store some information in a database, particularly metadata about the assets on S3 that required the heavier computation. Amazon gives you two options for databases within their cloud. SimpleDB provides a flexible data storage format which essentially amounts to a fast associative array that is stored to disk and can be queried in an SQL-like syntax. It is easy to set up (columns can be added or removed for any given row arbitrarily) and easy to use for small chunks of data, which makes it a convenient way to keep concise metadata for less complex cloud applications. It is also optimized for data stores that get very large. Amazon Relational Database Service provides all the functionality of MySQL relational databases, with the added service of automated maintenance and backup on the cloud. Being on the cloud, RDS is easy to replicate for increased availability if you have a massive number of queries from a large number of users. This option is particularly great if you are migrating an existing relational database application to the cloud.
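To make the "associative array where each row can have its own columns" idea concrete, here is a small in-memory stand-in written in Python. It is not the SimpleDB API; the helper names put_attributes and select, the asset names, and the attribute names are all invented for illustration, and keeping every value as a string is an assumption of this sketch.

```python
from typing import Dict, List

# In-memory stand-in for a schemaless domain: item name -> attribute map.
# This only mirrors the data model described above, not the real service.
Domain = Dict[str, Dict[str, str]]


def put_attributes(domain: Domain, item_name: str, **attributes: str) -> None:
    """Merge attributes into an item, creating the item if it does not exist."""
    domain.setdefault(item_name, {}).update(attributes)


def select(domain: Domain, **where: str) -> List[str]:
    """Return sorted names of items matching every key/value pair in `where`."""
    return sorted(
        name for name, attrs in domain.items()
        if all(attrs.get(key) == value for key, value in where.items())
    )


# Hypothetical asset metadata; note that each item carries different "columns".
assets: Domain = {}
put_attributes(assets, "clip-001.mov", status="rendered", codec="h264")
put_attributes(assets, "clip-002.mov", status="queued")
put_attributes(assets, "still-001.png", status="rendered", width="1920")

if __name__ == "__main__":
    print(select(assets, status="rendered"))
```

Items that lack an attribute simply never match a query on it, which is the flexibility a fixed relational schema does not give you for free.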
Other Helpful Tools
Ubuntu - a great Linux distribution for beginners and power users.
I like Ubuntu because it is well-supported by a community that writes in human-readable sentences. When I have Linux questions, I most often find useful answers on Ubuntu forums. Installation and package management on Ubuntu are also very easy. And if you are just finding your Linux feet, it is very straightforward to boot Ubuntu from a thumb drive so you can practice with its GUI tools on your laptop before switching over to command-line only for your cloud instances. See notes for getting started with Ubuntu on Amazon EC2 here, and the list of public AMIs for the Ubuntu 10.10 "Maverick Meerkat" here.
AWS SDK for PHP on CodeIgniter - a simple and elegant AWS API on PHP
Formerly called CloudFusion (and Tarzan before that), the AWS SDK for PHP makes it very easy to integrate all the AWS tools into a web application. On CodeIgniter, it becomes even smoother. I have been very happy with how quickly and easily I picked up on how to use this SDK. This is how you will likely build the web GUIs for your cloud application, and it’s nice just how easy it is.
We will be describing the process of putting the SDK on CodeIgniter in a future article, but in the meantime you can turn to this blog for a kick-start in getting it set up. Some modification is required for the new version.
AWS SDK for Java - a great way to get started writing cloud executables
A key part of your cloud application is the set of executables that run on your instances. Amazon provides a nice template for SQS-driven executables written in Java, which is easy to extend with your own functionality. The AWS SDK for Java provides you with a powerful toolkit for manipulating the components of your cloud application inside of a queue-driven executable. Although it is not as simple as the PHP SDK (at least for Java-as-a-Second-Language folks like me), these Java tools are very useful.
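Amazon's template is in Java, but the general shape of a queue-driven executable is easy to sketch in a few lines of Python. This is my own simplified illustration, not Amazon's template and not the SDK: a standard-library queue stands in for SQS, and run_worker, handle, and max_retries are hypothetical names. The one behavior it tries to mimic is that a task which fails goes back on the queue for another attempt instead of being lost.

```python
import queue
from typing import Callable, Dict, List


def run_worker(tasks: "queue.Queue[str]", handle: Callable[[str], str],
               max_retries: int = 2) -> List[str]:
    """Drain the queue and return the results of successfully handled tasks.

    A task whose handler raises is put back on the queue, up to
    max_retries extra attempts, then dropped.
    """
    results: List[str] = []
    failures: Dict[str, int] = {}
    while True:
        try:
            # A real worker would poll the remote queue here instead.
            task = tasks.get_nowait()
        except queue.Empty:
            break  # nothing left to do
        try:
            results.append(handle(task))
        except Exception:
            failures[task] = failures.get(task, 0) + 1
            if failures[task] <= max_retries:
                tasks.put(task)  # make the task available again
    return results


if __name__ == "__main__":
    jobs: "queue.Queue[str]" = queue.Queue()
    for name in ["resize clip-001", "resize clip-002"]:
        jobs.put(name)
    print(run_worker(jobs, lambda task: task + " done"))
```

A production executable would loop forever and wait for new messages rather than exit when the queue is empty; the sketch exits so it can be run and checked on a laptop.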