Cloud Computing in Practice
King Chung Huang
Technical Solutions Analyst, University of Calgary
“Using services in the cloud, develop a large-scale video conversion system.”
What is Cloud Computing?
“Cloud computing is simply a buzzword used to repackage grid computing and utility computing, both of which have existed for decades.”
whatis.com
definition of Cloud Computing
“The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do. […] The computer industry is the only industry that is more fashion-driven than women’s fashion. Maybe I’m an idiot, but I have no idea what anyone is talking about. What is it? It’s complete gibberish. It’s insane. When is this idiocy going to stop?”
Larry Ellison
during Oracle’s Analyst Day
Getting Started
Infrastructure Services
• Elastic Compute Cloud
• Simple Storage Service
• Simple Queue Service
• SimpleDB
Payments & Billing
• Flexible Payments Service
• DevPay
Web Search & Information
• Web Search
• Web Information Service
• Top Sites
• Site Thumbnail
Fulfillment & Associates
• Fulfillment Web Service
• Associates Web Service
On-Demand Workforce
• Mechanical Turk
Amazon Mechanical Turk
• Marketplace for work that requires human intelligence
  ■ Enables programmatic distribution of tasks that require human intelligence
• Human Intelligence Tasks (HITs)
  ■ Desired output
  ■ Format of output
  ■ Pay rate
• Only pay for quality work
  ■ Review results and pay for accepted work
Ten Thousand Cents
• Drawing collected from November 2007 to March 2008
• Total labour paid: US$100
  ■ 1¢ per part × 10 000 parts
• Prints available for purchase
  ■ $100 each
Video Conversion Plan
• Upload source content to Simple Storage Service (S3)
• Instantiate virtual machines on Elastic Compute Cloud (EC2) to perform conversions
  ■ Be able to instantiate more conversion machines as needed to scale to demand
  ■ Use Simple Queue Service (SQS) to queue requests
• Log status and results in SimpleDB (SDB)
  ■ Was not able to gain access during beta period
• Deliver conversion results from S3
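The flow in this plan can be sketched locally before touching any real service. The following is a minimal simulation, not the project's code: a dict stands in for an S3 bucket, `queue.Queue` for an SQS queue, and a second dict for the SimpleDB status log. The names `upload`, `run_worker`, and `convert`, and the `converted/` key prefix, are hypothetical.

```python
import queue

storage = {}          # stand-in for an S3 bucket: key -> bytes
jobs = queue.Queue()  # stand-in for an SQS queue of conversion requests
status = {}           # stand-in for the SimpleDB status log


def convert(data: bytes) -> bytes:
    """Hypothetical placeholder for the real video conversion step."""
    return b"converted:" + data


def upload(key: str, data: bytes) -> None:
    """Store source content, then queue a conversion request for it."""
    storage[key] = data
    jobs.put(key)
    status[key] = "queued"


def run_worker() -> int:
    """Drain the queue as one conversion machine would; return jobs handled."""
    handled = 0
    while True:
        try:
            key = jobs.get_nowait()
        except queue.Empty:
            return handled
        # Results go back into the same store, under a different key.
        storage["converted/" + key] = convert(storage[key])
        status[key] = "done"
        handled += 1


upload("source/a.mov", b"aaa")
upload("source/b.mov", b"bbb")
handled = run_worker()
```

Because workers only share the queue and the store, scaling to demand in this model amounts to running more copies of `run_worker` against the same queue.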
Accessing Infrastructure Services
• Web service APIs
  ■ SOAP and REST
• Many applications/plugins support S3
  ■ Transmit on Mac OS X
  ■ S3Fox for Firefox
• AWS Developer Community provides lots of documentation and sample code
  ■ JavaScript, PHP, Python
• Many public machine images for EC2 are available
  ■ Use them as a starting point for custom machines
• In actuality, you rarely have to start from scratch
Pricing
• Pay only for what you use
  ■ Data transfer
  ■ Data storage
  ■ Instance time
  ■ Transactions
Pricing Simple Storage Service (United States)
• Storage: 15¢ per GB-month
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• PUT, POST, or LIST: 1.0¢ per 1000 requests
• GET: 1.0¢ per 10000 requests
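To see how these United States rates combine, here is a rough estimator. It is a sketch under stated assumptions: the function name `s3_monthly_cents` and the sample workload are made up for illustration, requests are billed proportionally, and outbound transfer defaults to the top of the quoted range.

```python
from decimal import Decimal

# Rates in US cents, taken from the United States figures above.
STORAGE_PER_GB_MONTH = Decimal("15")
TRANSFER_IN_PER_GB = Decimal("10")
PUT_PER_1000 = Decimal("1.0")
GET_PER_10000 = Decimal("1.0")


def s3_monthly_cents(stored_gb, in_gb, out_gb, puts, gets,
                     out_rate_per_gb=Decimal("17")):
    """Estimate one month's S3 charge in cents.

    out_rate_per_gb defaults to the top of the 10 to 17 cent range,
    which is an assumption, not a quoted tier.
    """
    return (Decimal(stored_gb) * STORAGE_PER_GB_MONTH
            + Decimal(in_gb) * TRANSFER_IN_PER_GB
            + Decimal(out_gb) * out_rate_per_gb
            + Decimal(puts) / 1000 * PUT_PER_1000
            + Decimal(gets) / 10000 * GET_PER_10000)


# Hypothetical workload: 70 GB uploaded and kept for a month.
estimate = s3_monthly_cents(stored_gb=70, in_gb=70, out_gb=0,
                            puts=1000, gets=10000)
print(f"${estimate / 100:.2f}")  # prints $17.52
```

Storage and inbound transfer dominate this estimate; the request charges add only two cents.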
Pricing Simple Storage Service (Europe)
• Storage: 18¢ per GB-month
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• PUT, POST, or LIST: 1.2¢ per 1000 requests
• GET: 1.2¢ per 10000 requests
Pricing Simple Queue Service
• Storage: none
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• Requests: 1.0¢ per 10000 requests
Pricing Simple Queue Service (Old API, prior to 2008)
• Storage: none
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• Requests: 1.0¢ per 10000 requests
• Messages: 10¢ per 1000 messages
Impact of the Price Change
“We examined the effect that the new pricing would have had on Amazon SQS charges billed at the end of December 2007. Under the new plan, 76% of customers with bills greater than $1 would have received lower bills, saving an average of 71% each compared to their actual bill.”
source: http://aws.amazon.com/sqs/
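A back-of-the-envelope comparison of the two SQS fee structures, as a sketch only. It compares just the old per-message fee (10¢ per 1000 messages) with the new per-request fee (1.0¢ per 10000 requests) and ignores data transfer. The figure of three requests per message (send, receive, delete) is an assumption for illustration, not something stated in the source.

```python
from decimal import Decimal

OLD_CENTS_PER_1000_MESSAGES = Decimal("10")
NEW_CENTS_PER_10000_REQUESTS = Decimal("1.0")


def old_fee_cents(messages):
    """Per-message fee under the old API pricing."""
    return Decimal(messages) / 1000 * OLD_CENTS_PER_1000_MESSAGES


def new_fee_cents(messages, requests_per_message=3):
    """Per-request fee under the new pricing.

    requests_per_message is a hypothetical usage pattern
    (one send, one receive, one delete per message).
    """
    requests = Decimal(messages) * requests_per_message
    return requests / 10000 * NEW_CENTS_PER_10000_REQUESTS


messages = 1_000_000
old = old_fee_cents(messages)  # 10000 cents, i.e. $100.00
new = new_fee_cents(messages)  # 300 cents, i.e. $3.00
```

Under these assumptions the per-request model is far cheaper for message-heavy use, although a client that polls an empty queue often would pay for requests that carry no messages.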
Pricing Elastic Compute Cloud
• Instances: 10¢–80¢ per hour
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
Pricing Freebies
• Data transfer between EC2 instances is free
  ■ Instances must be in the same availability zone
  ■ Instances must address each other by private IP addresses
• Data transfer between EC2 instances and S3 is free
• Data transfer between EC2 instances and SQS is free
Pricing Elastic IP Addresses (EC2)
• IPv4 addresses are scarce
• Instances are dynamically assigned a public IP address and a private IP address
• IPv4 addresses can be statically allocated and used for instances
  ■ No cost while in use
  ■ 1¢ per non-attached Elastic IP address per hour
Nuances
Simple Storage Service (S3)
• Storage is organized in buckets
  ■ Like a namespace for the objects it contains
  ■ Accessible via http://bucketname.s3.amazonaws.com
• It’s not file storage, it’s a key-value store
  ■ Like a big hash table or dictionary
  ■ Key-value pairs
    ■ Movies/My Movie.mp4 ➜ ftypmp42isom5eiomdat9af…
    ■ UserHomePage ➜ http://www.ucalgary.ca/~kchuang/
• Implicit BitTorrent seeding for all keys
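The key-value point can be illustrated with an ordinary dictionary. This is a toy model, not the S3 API: the helper `list_keys` is hypothetical, and the video value is a placeholder rather than real file contents.

```python
# A bucket modelled as a flat dictionary: every key is just a string.
bucket = {
    "Movies/My Movie.mp4": b"<video bytes>",  # placeholder value
    "UserHomePage": b"http://www.ucalgary.ca/~kchuang/",
}


def list_keys(bucket, prefix=""):
    """Return the keys that start with prefix, in sorted order."""
    return sorted(key for key in bucket if key.startswith(prefix))


# The slash is part of the key; there is no "Movies" directory object.
movie_keys = list_keys(bucket, "Movies/")
has_directory = "Movies" in bucket
```

Here `movie_keys` contains the single movie key and `has_directory` is `False`: anything that looks like a folder is only a shared key prefix.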
Elastic Compute Cloud (EC2)
• Machine images are templates
  ■ Run multiple instances from the same image
  ■ Appliance model
• Data can be passed to instances at startup
  ■ Provide input to an instance
  ■ Negates the need to communicate with an instance
• Storage is independent of the machine
  ■ By default, instance storage is transient
  ■ Elastic Block Store can be attached/detached as needed
  ■ Lifecycle is independent of instances
Beyond Raw Utilities
• Additional facilities around traditional infrastructure elements increase their usability
  ■ Fits well with the on-demand, only-what-you-need usage model
  ■ Enables uses that were previously not possible
Speed Bumps
Reality Bites
Simple Storage Service
• S3 objects have a maximum size of 5 GB
  ■ Split objects like machine images
• Changing one byte means reinserting the entire object
• Renaming (re-keying) an object also means reinserting
  ■ New beta API allows for object moves within a bucket
• Various bugs in third-party apps and S3 itself
  ■ Inserting objects between 2 and 4 GB can be difficult
• Bandwidth can be a significant barrier
  ■ Project goal of inserting 1 TB would have taken 168 days
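The 168-day figure implies a particular sustained upload rate, which can be worked out directly. This is a sketch that assumes a decimal terabyte (10^12 bytes) and a constant rate; the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # assumption: decimal terabyte


def implied_rate(total_bytes, days):
    """Average bytes per second needed to move total_bytes in the given days."""
    return total_bytes / (days * SECONDS_PER_DAY)


def days_to_upload(total_bytes, bytes_per_second):
    """Days needed to move total_bytes at a constant rate."""
    return total_bytes / (bytes_per_second * SECONDS_PER_DAY)


rate = implied_rate(TB, 168)
mbit = rate * 8 / 1_000_000
print(f"{rate / 1000:.1f} kB/s, {mbit:.2f} Mbit/s")  # 68.9 kB/s, 0.55 Mbit/s
```

In other words, 1 TB in 168 days corresponds to roughly half a megabit per second of sustained upstream bandwidth; `days_to_upload` can be used to see how quickly the estimate shrinks as the rate grows.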
Reality Bites
Elastic Compute Cloud
• Making a change means recreating an entire image
• Default storage is transient
  ■ Lasts only as long as the instance it is attached to
  ■ Elastic Block Store can only be attached to a single instance at a time
• Instances are billed by the hour
  ■ Constantly starting and stopping short-run instances can be costly
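Why short runs get expensive can be shown with a small calculation. This sketch assumes that every run is rounded up to whole instance-hours and that each start opens a new billable hour; it uses the 10¢ hourly rate from the low end of the quoted range. The function names are illustrative.

```python
def billed_hours(run_minutes):
    """Round a run up to whole hours (assumed billing granularity)."""
    return -(-run_minutes // 60)  # ceiling division on integers


def cost_cents(runs_minutes, rate_cents_per_hour=10):
    """Total charge for a list of separate instance runs, in cents."""
    return sum(billed_hours(m) for m in runs_minutes) * rate_cents_per_hour


# Same 60 minutes of work, packaged two different ways.
short_runs = cost_cents([5] * 12)  # twelve 5-minute runs: 120 cents
one_run = cost_cents([60])         # one 60-minute run: 10 cents
```

Under these assumptions, batching work into one long-lived instance costs a twelfth of starting a fresh instance for every five-minute job.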
All Together, Now!
Cumulus
• Completely implemented in JavaScript *
  ■ Calls Amazon services directly via REST APIs
• Contacts S3 to retrieve stored videos
• Converts video on demand
  ■ Writes jobs to an SQS queue
  ■ Invokes EC2 instances, which read from the job queue and perform the conversion
* time constraints necessitated the use of some existing PHP sample code
Demo
Project Timeline
August 26 to September 29
• 8/26: project kickoff
• 8/29: project planning
• 9/2: work begins
• 9/5: slow upload speeds
• 9/8: daapd server abandoned
• 9/11: attempting video conversion
• 9/12: visiting clouds
• 9/18: bandwidth increased
• 9/18: first video converted
• 9/22: creating front-end web app
• 9/26: starting presentation
• 9/29: still working on presentation
By the Numbers
• 19 working days
• 2 weekends
• 70 GB transferred
• 321 instance hours
• $41.60 spent
• 16324 miles
as of September 27, 2008
Lessons Learned
Cloud Computing is more than Raw Utilities
Cloud Computing Enables Tinkering
Bandwidth Can Be Problematic
Centralization Results in Centralization of Risk
Be Aware of Service Level Agreements
More Information
King Chung Huang
Technical Solutions Analyst, University of Calgary
[email protected]
Patrick Mann
Chief Technology Officer, Cybera
[email protected]
Cloud Computing White Paper coming soon http://www.cybera.ca
Q&A