Cloud Computing in Practice

King Chung Huang

Technical Solutions Analyst, University of Calgary

“Using services in the cloud, develop a large-scale video conversion system.”

What is Cloud Computing?

“Cloud computing is simply a buzzword used to repackage grid computing and utility computing, both of which have existed for decades.”

whatis.com

definition of Cloud Computing

“The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do. […] The computer industry is the only industry that is more fashion-driven than women’s fashion. Maybe I’m an idiot, but I have no idea what anyone is talking about. What is it? It’s complete gibberish. It’s insane. When is this idiocy going to stop?”

Larry Ellison

during Oracle’s Analyst Day

Getting Started

Infrastructure Services
• Elastic Compute Cloud
• Simple Storage Service
• Simple Queue Service
• SimpleDB

Payments & Billing
• Flexible Payments Service
• DevPay

Web Search & Information
• Web Search
• Web Information Service
• Top Sites
• Site Thumbnail

Fulfillment & Associates
• Fulfillment Web Service
• Associates Web Service

On-Demand Workforce
• Mechanical Turk

Amazon Mechanical Turk
• Marketplace for work that requires human intelligence
  ■ Enables programmatic distribution of tasks that require human intelligence
• Human Intelligence Tasks (HITs)
  ■ Desired output
  ■ Format of output
  ■ Pay rate
• Only pay for quality work
  ■ Review results and pay for accepted work

Ten Thousand Cents
• Drawing collected from November 2007 to March 2008
• Total labour paid: US$100
  ■ 1¢ per part × 10 000 parts
• Prints available for purchase
  ■ $100 each

Video Conversion Plan
• Upload source content to Simple Storage Service (S3)
• Instantiate virtual machines on Elastic Compute Cloud (EC2) to perform conversions
  ■ Be able to instantiate more conversion machines as needed to scale to demand
  ■ Use Simple Queue Service (SQS) to queue requests
• Log status and results in SimpleDB (SDB)
  ■ Was not able to gain access during beta period
• Deliver conversion results from S3
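The "scale to demand" step in the plan can be sketched as a rule that maps queue depth to a number of conversion machines. This is only an illustration: the function name, the four-jobs-per-instance figure, and the cap of 20 instances are hypothetical and do not come from the talk.

```python
import math


def instances_needed(queued_jobs: int,
                     jobs_per_instance: int = 4,
                     max_instances: int = 20) -> int:
    """Return how many conversion instances to run for a given queue depth.

    jobs_per_instance and max_instances are made-up tuning knobs.
    """
    if queued_jobs <= 0:
        return 0  # nothing queued, so no instances are needed
    wanted = math.ceil(queued_jobs / jobs_per_instance)
    return min(wanted, max_instances)  # never exceed the configured cap


if __name__ == "__main__":
    for depth in (0, 1, 9, 500):
        print(depth, "queued ->", instances_needed(depth), "instances")
```

A controller would poll the queue length periodically and start or stop instances to match the returned value.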

Accessing Infrastructure Services
• Web service APIs
  ■ SOAP and REST
• Many applications/plugins support S3
  ■ Transmit on Mac OS X
  ■ S3Fox for Firefox
• AWS Developer Community provides lots of documentation and sample code
  ■ JavaScript, PHP, Python
• Many public machine images for EC2 are available
  ■ Use them as a starting point for custom machines
• In actuality, you rarely have to start from scratch
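As a rough illustration of what calling S3 over REST involved at the time, the sketch below signs a request with the legacy HMAC-SHA1 scheme (often called Signature Version 2). It assumes no x-amz-* headers are sent, and the keys, bucket, and date are placeholders. Current AWS services use a different scheme (Signature Version 4), so treat this as historical context rather than working client code.

```python
import base64
import hashlib
import hmac


def sign_s3_request(secret_key: str, verb: str, resource: str, date: str,
                    content_md5: str = "", content_type: str = "") -> str:
    """Return the Base64 HMAC-SHA1 signature for a legacy S3 REST request.

    Assumes the request carries no x-amz-* headers, so the canonicalized
    header section of the string to sign is empty.
    """
    string_to_sign = "\n".join([verb, content_md5, content_type, date]) + "\n" + resource
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key: str, signature: str) -> str:
    """Build the value of the HTTP Authorization header."""
    return f"AWS {access_key}:{signature}"


if __name__ == "__main__":
    # Placeholder credentials and resource, for illustration only.
    sig = sign_s3_request("example-secret", "GET",
                          "/bucketname/Movies/My%20Movie.mp4",
                          "Tue, 27 Mar 2007 19:36:42 +0000")
    print(authorization_header("EXAMPLEACCESSKEY", sig))
```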

Pricing
• Pay only for what you use
  ■ Data transfer
  ■ Data storage
  ■ Instance time
  ■ Transactions

Pricing Simple Storage Service (United States)
• Storage: 15¢ per GB-month
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• PUT, POST, or LIST: 1.0¢ per 1000 requests
• GET: 1.0¢ per 10000 requests

Pricing Simple Storage Service (Europe)
• Storage: 18¢ per GB-month
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• PUT, POST, or LIST: 1.2¢ per 1000 requests
• GET: 1.2¢ per 10000 requests
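To make the S3 price lists concrete, here is a small estimator built from the quoted United States and Europe rates. The usage figures in the example are invented, and because outbound transfer is quoted as a 10¢–17¢ range, the sketch takes that rate as a parameter and defaults to the upper bound.

```python
# Rates in US cents, copied from the two S3 price lists.
S3_RATES = {
    "us": {"storage_gb_month": 15.0, "transfer_in_gb": 10.0,
           "write_per_1000": 1.0, "get_per_10000": 1.0},
    "eu": {"storage_gb_month": 18.0, "transfer_in_gb": 10.0,
           "write_per_1000": 1.2, "get_per_10000": 1.2},
}


def s3_monthly_cost_cents(region: str, stored_gb: float, uploaded_gb: float,
                          write_requests: int, get_requests: int,
                          downloaded_gb: float = 0.0,
                          transfer_out_cents_per_gb: float = 17.0) -> float:
    """Estimate one month of S3 charges in cents for the given usage."""
    r = S3_RATES[region]
    return (stored_gb * r["storage_gb_month"]
            + uploaded_gb * r["transfer_in_gb"]
            + downloaded_gb * transfer_out_cents_per_gb
            + write_requests / 1000 * r["write_per_1000"]
            + get_requests / 10000 * r["get_per_10000"])


if __name__ == "__main__":
    # Hypothetical usage: 100 GB stored and uploaded, 5000 writes, 20000 GETs.
    for region in ("us", "eu"):
        cents = s3_monthly_cost_cents(region, 100, 100, 5000, 20000)
        print(region, f"${cents / 100:.2f}")
```

With this made-up workload, storage and inbound transfer dominate the bill; request charges are a rounding error.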

Pricing Simple Queue Service
• Storage: none
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• Requests: 1.0¢ per 10000 requests

Pricing Simple Queue Service (Old API, prior to 2008)
• Storage: none
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB
• Requests: 1.0¢ per 10000 requests
• Messages: 10¢ per 1000 messages

Impact of the Price Change
“We examined the effect that the new pricing would have had on Amazon SQS charges billed at the end of December 2007. Under the new plan, 76% of customers with bills greater than $1 would have received lower bills, saving an average of 71% each compared to their actual bill.”
Source: http://aws.amazon.com/sqs/

Pricing Elastic Compute Cloud
• Instances: 10¢–80¢ per hour
• Data Transfer In: 10¢ per GB
• Data Transfer Out: 10¢–17¢ per GB

Pricing Freebies
• Data transfer between EC2 instances is free
  ■ Instances must be in the same availability zone
  ■ Instances must address each other by private IP addresses
• Data transfer between EC2 instances and S3 is free
• Data transfer between EC2 instances and SQS is free

Pricing Elastic IP Addresses (EC2)
• IPv4 addresses are scarce
• Instances are dynamically assigned a public IP address and a private IP address
• IPv4 addresses can be statically allocated and used for instances
  ■ No cost while in use
  ■ 1¢ per non-attached Elastic IP address per hour

Nuances

Simple Storage Service (S3)
• Storage is organized in buckets
  ■ Like a namespace for the objects it contains
  ■ Accessible via http://bucketname.s3.amazonaws.com
• It’s not file storage, it’s a key-value store
  ■ Like a big hash table or dictionary
  ■ Key-value pairs
  ■ Movies/My Movie.mp4 ➜ ftypmp42isom5eiomdat9af…
  ■ UserHomePage ➜ http://www.ucalgary.ca/~kchuang/
• Implicit BitTorrent seeding for all keys
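The key-value point can be illustrated with a toy in-memory bucket. The class below is a stand-in written for this example, not an S3 client. It shows that a key such as "Movies/My Movie.mp4" is a single flat string, and that anything resembling a folder is just a prefix filter over keys.

```python
class Bucket:
    """Toy in-memory model of an S3 bucket: a flat map from key to bytes."""

    def __init__(self, name):
        self.name = name
        self._objects = {}  # key (str) -> value (bytes)

    def put(self, key, value):
        self._objects[key] = value  # overwrites any existing object

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # There are no directories; "folders" are emulated by key prefixes.
        return sorted(k for k in self._objects if k.startswith(prefix))


if __name__ == "__main__":
    bucket = Bucket("bucketname")
    bucket.put("Movies/My Movie.mp4", b"ftypmp42isom")
    bucket.put("UserHomePage", b"http://www.ucalgary.ca/~kchuang/")
    print(bucket.list_keys())
    print(bucket.list_keys(prefix="Movies/"))
```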

Elastic Compute Cloud (EC2)
• Machine images are templates
  ■ Run multiple instances from the same image
  ■ Appliance model
• Data can be passed to instances at startup
  ■ Provide input to an instance
  ■ Removes the need to communicate with an instance after launch
• Storage is independent of the machine
  ■ By default, instance storage is transient
  ■ Elastic Block Store (EBS) volumes can be attached/detached as needed
  ■ Lifecycle is independent of instances
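One way to picture passing data to an instance at startup is to pack the job settings into a small blob at launch and unpack it on the instance. The field names below (queue, bucket, format) are hypothetical; the talk does not say what Cumulus passed. The sketch covers only encoding and decoding, not the launch call or the fetch from the instance metadata service.

```python
import json

# EC2 limits user data to 16 KB.
USER_DATA_LIMIT_BYTES = 16 * 1024


def build_user_data(queue_name: str, bucket_name: str, output_format: str) -> bytes:
    """Serialize startup settings for an instance (field names are made up)."""
    payload = json.dumps(
        {"queue": queue_name, "bucket": bucket_name, "format": output_format},
        sort_keys=True,
    ).encode("utf-8")
    if len(payload) > USER_DATA_LIMIT_BYTES:
        raise ValueError("user data exceeds the 16 KB limit")
    return payload


def parse_user_data(raw: bytes) -> dict:
    """Decode the blob on the instance side."""
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    blob = build_user_data("conversion-jobs", "bucketname", "flv")
    print(parse_user_data(blob))
```

Because each instance learns its queue and bucket at boot, the same machine image can serve any number of workers without further configuration.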

Beyond Raw Utilities
• Additional facilities around traditional infrastructure elements increase their usability
  ■ Fits well with the on-demand, use-only-what-you-need model
  ■ Enables uses that were previously not possible

Speed Bumps

Reality Bites: Simple Storage Service
• S3 objects have a maximum size of 5 GB
  ■ Larger objects, such as machine images, must be split
• Changing one byte means reinserting the entire object
• Renaming (re-keying) an object also means reinserting
  ■ New beta API allows for object moves within a bucket
• Various bugs in third-party apps and S3 itself
  ■ Inserting objects between 2 and 4 GB can be difficult
• Bandwidth can be a significant barrier
  ■ Project goal of inserting 1 TB would have taken 168 days
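The 168-day figure implies a sustained upload rate of roughly 69 kB/s (about 0.55 Mbit/s) if 1 TB is read as 10^12 bytes. The sketch below reproduces that arithmetic and also counts the parts needed to fit an object under the 5 GB limit. Treating 5 GB as 5 × 10^9 bytes is an assumption, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
MAX_OBJECT_BYTES = 5 * 10**9  # assumes the 5 GB limit means 5 * 10^9 bytes


def implied_rate_bytes_per_s(total_bytes: int, days: float) -> float:
    """Average throughput needed to move total_bytes in the given days."""
    return total_bytes / (days * SECONDS_PER_DAY)


def upload_days(total_bytes: int, bytes_per_second: float) -> float:
    """Days needed to upload total_bytes at a sustained rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


def parts_needed(object_bytes: int, max_part_bytes: int = MAX_OBJECT_BYTES) -> int:
    """Number of pieces an object must be split into to respect the size cap."""
    return -(-object_bytes // max_part_bytes)  # integer ceiling division


if __name__ == "__main__":
    rate = implied_rate_bytes_per_s(10**12, 168)
    print(f"implied rate: {rate / 1000:.1f} kB/s")
    print(f"1 TB at 10x that rate: {upload_days(10**12, rate * 10):.1f} days")
    print("parts for a 12 GB image:", parts_needed(12 * 10**9))
```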

Reality Bites: Elastic Compute Cloud
• Making a change means recreating an entire image
• Default storage is transient
  ■ Lasts only as long as the instance it is attached to
  ■ An Elastic Block Store (EBS) volume can only be attached to a single instance at a time
• Instances are billed by the hour
  ■ Constantly starting and stopping short-run instances can be costly
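The cost of short runs can be shown with a little arithmetic. The sketch assumes that every started partial hour is billed as a full hour and uses the low end of the quoted instance price (10¢ per hour); both are labeled assumptions, and the run lengths are invented.

```python
import math


def billed_hours(run_minutes):
    """Total billed hours, assuming each run is rounded up to whole hours."""
    return sum(math.ceil(m / 60) for m in run_minutes if m > 0)


def cost_cents(run_minutes, cents_per_hour=10):
    """Cost in cents at an assumed hourly rate (10 cents by default)."""
    return billed_hours(run_minutes) * cents_per_hour


if __name__ == "__main__":
    six_short_runs = [10] * 6   # six separate 10-minute runs
    one_long_run = [60]         # the same 60 minutes in a single run
    print("six short runs:", cost_cents(six_short_runs), "cents")
    print("one long run:  ", cost_cents(one_long_run), "cents")
```

Under these assumptions the same hour of work costs six times as much when split across six launches, which is why keeping an instance alive between jobs can be cheaper than restarting it.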

All Together, Now!

Cumulus
• Completely implemented in JavaScript *
  ■ Calls Amazon services directly via REST APIs
• Contacts S3 to retrieve stored videos
• Converts video on demand
  ■ Writes jobs to an SQS queue
  ■ Invokes EC2 instances, which read from the job queue and perform the conversion

* time constraints necessitated the use of some existing PHP sample code
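The queue-driven flow described for Cumulus can be sketched with in-memory stand-ins: Python's queue.Queue plays the role of the SQS job queue and a dict plays the role of an S3 bucket. None of this is the actual Cumulus code (which was written in JavaScript), and the convert function is a placeholder that does no real transcoding.

```python
import queue


def submit_job(jobs, source_key, target_format):
    """Front end: enqueue a conversion request (stand-in for an SQS send)."""
    jobs.put({"source": source_key, "format": target_format})


def convert(video, target_format):
    """Placeholder conversion; a real worker would invoke a video encoder."""
    return target_format.encode("ascii") + b":" + video


def run_worker(jobs, storage):
    """Worker: drain the queue, convert each video, store the result."""
    finished = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break  # no more work; a real instance might wait or shut down
        source = storage[job["source"]]                # fetch from "S3"
        out_key = f'{job["source"]}.{job["format"]}'   # hypothetical naming scheme
        storage[out_key] = convert(source, job["format"])
        finished.append(out_key)
    return finished


if __name__ == "__main__":
    storage = {"Movies/My Movie.mp4": b"raw-video-bytes"}
    jobs = queue.Queue()
    submit_job(jobs, "Movies/My Movie.mp4", "flv")
    print(run_worker(jobs, storage))
```

Decoupling the front end from the workers through a queue is what lets more instances be added without changing the submitting side.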

Demo

Project Timeline

August 26 to September 29
• 8/26: project kickoff
• 8/29: project planning
• 9/2: work begins
• 9/5: slow upload speeds
• 9/8: daapd server abandoned
• 9/11: attempting video conversion
• 9/12: visiting clouds
• 9/18: bandwidth increased
• 9/18: first video converted
• 9/22: creating front-end web app
• 9/26: starting presentation
• 9/29: still working on presentation

By the Numbers (as of September 27, 2008)
• 19 working days
• 2 weekends
• 70 GB transferred
• 321 instance hours
• $41.60 spent
• 16324 miles

Lessons Learned

Cloud Computing is more than Raw Utilities

Cloud Computing Enables Tinkering

Bandwidth Can Be Problematic

Centralization Results in Centralization of Risk

Be Aware of Service Level Agreements

More Information
• King Chung Huang, Technical Solutions Analyst, University of Calgary
• Patrick Mann, Chief Technology Officer, Cybera
• Cloud Computing White Paper coming soon: http://www.cybera.ca

Q&A