An Overview of Cloud Computing (Raghu Ramakrishnan)


Presentation Transcript

Slide 1

An Overview of Cloud Computing. Raghu Ramakrishnan, Chief Scientist, Audience and Cloud Computing; Research Fellow, Yahoo! Research. Reflects many discussions with Eric Baldeschwieler, Jay Kistler, Chuck Neerdaels, Shelton Shugar, and Raymie Stata, and joint work with the Sherpa team, in particular Brian Cooper, Utkarsh Srivastava, Adam Silberstein and Nick Puz in Y! Research, and Chuck Neerdaels, P.P. Suryanarayanan and many others in CCDI.

Slide 2

CCDI-Research Collaboration
Yahoo! Research: Raghu Ramakrishnan, Brian Cooper, Utkarsh Srivastava, Adam Silberstein, Nick Puz, Rodrigo Fonseca
CCDI: Chuck Neerdaels, P.P.S. Narayan, Kevin Athey, Toby Negrin, plus Dev/QA teams

Slide 3

SCENARIOS Pie-in-the-sky

Slide 4

Living in the Clouds
We want to start a new website, FredsList.com. Our site will provide listings of items for sale, jobs, and so on.
As time goes on, we'll add more features, and illustrate how more cloud capabilities (and the corresponding infrastructure components) are used as needed.
The list of capabilities/components is illustrative, not exhaustive.
Our cloud provides a "dataset" abstraction. FredsList doesn't worry about the underlying components.

Slide 5

Step 1: Listings
FredsList wants to store listings as (key, category, description).
FredsList.com application:
DECLARE DATASET Listings AS ( ID String PRIMARY KEY, Category String, Description Text )
Example records:
5523442, childcare, Nanny available in San Jose
1234323, transportation, For sale: one bike, barely used
215534, wanted, Looking for issue 1 of Superman comic book
Components behind the Simple Web Service APIs: Database (Sherpa)
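The slide only shows the declaration, not how an application would use the dataset. As a rough illustration of the "dataset" abstraction, here is a minimal in-memory sketch in Python. The class and method names (`Listing`, `ListingsDataset`, `put`, `get`) are hypothetical and are not Sherpa's API; the only thing taken from the slide is the (ID, Category, Description) shape with ID as primary key.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Listing:
    # Mirrors the declared columns: ID (primary key), Category, Description.
    id: str
    category: str
    description: str


class ListingsDataset:
    """Hypothetical in-memory stand-in for the 'Listings' dataset."""

    def __init__(self) -> None:
        self._rows: Dict[str, Listing] = {}

    def put(self, listing: Listing) -> None:
        # ID is the primary key, so a second put with the same ID overwrites.
        self._rows[listing.id] = listing

    def get(self, listing_id: str) -> Optional[Listing]:
        # Returns None when the key is absent.
        return self._rows.get(listing_id)


listings = ListingsDataset()
listings.put(Listing("5523442", "childcare", "Nanny available in San Jose"))
listings.put(Listing("215534", "wanted", "Looking for issue 1 of Superman comic book"))
print(listings.get("215534").category)  # prints "wanted"
```

The point of the abstraction is that the application sees only `put` and `get` on a named dataset; which storage component sits underneath is the cloud's concern.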

Slide 6

Step 2: Search
FredsList's customers quickly ask for keyword search.
FredsList.com application:
ALTER Listings SET Description SEARCHABLE
Example queries: "dvd's" "bike" "caretaker"
Components behind the Simple Web Service APIs: Database (Sherpa), Search (Vespa), Messaging (YMB)
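What making a text column searchable involves can be sketched with a tiny inverted index. This is an assumption-laden illustration in plain Python, not how Vespa works internally; `tokenize` and `SearchIndex` are hypothetical names, and multi-word queries are treated as AND queries purely for simplicity.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    # Lowercase and split into runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


class SearchIndex:
    """Hypothetical inverted index: term -> set of listing IDs."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, listing_id: str, description: str) -> None:
        for term in tokenize(description):
            self._postings[term].add(listing_id)

    def search(self, query: str) -> Set[str]:
        terms = tokenize(query)
        if not terms:
            return set()
        # Intersect the posting sets so every query term must match.
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


index = SearchIndex()
index.add("1234323", "For sale: one bike, barely used")
index.add("5523442", "Nanny available in San Jose")
print(index.search("bike"))  # prints {'1234323'}
```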

Slide 7

Step 3: Photos
FredsList decides to add photos to listings.
FredsList.com application:
ALTER Listings ADD Photo BLOB
Foreign key: photo → listing
Components behind the Simple Web Service APIs: Storage (MObStor), Database (Sherpa), Search (Vespa), Messaging (YMB)

Slide 8

Step 4: Data Analysis
FredsList wants to analyze its listings to get statistics about categories, do geocoding, and so on.
FredsList.com application:
ALTER Listings MAKE ANALYZABLE
Hadoop program to generate fancy pages for listings
Hadoop program to geocode data
Pig query to analyze categories
Foreign key: photo → listing
Batch export
Components behind the Simple Web Service APIs: Storage (MObStor), Compute (Grid), Database (Sherpa), Search (Vespa), Messaging (YMB)
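The slide names a Pig query to analyze categories but does not show it. The kind of group-and-count it implies can be sketched in the map/reduce style in plain Python. This is not Pig or Hadoop code; `map_phase` and `reduce_phase` are hypothetical names, and everything runs in one process just to show the shape of the computation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (ID, Category, Description)


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, int]]:
    # Emit (category, 1) for every listing.
    for _listing_id, category, _description in records:
        yield category, 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    # Sum the counts per category key.
    totals: Dict[str, int] = defaultdict(int)
    for category, count in pairs:
        totals[category] += count
    return dict(totals)


records = [
    ("5523442", "childcare", "Nanny available in San Jose"),
    ("1234323", "transportation", "For sale: one bike, barely used"),
    ("215534", "wanted", "Looking for issue 1 of Superman comic book"),
]
print(reduce_phase(map_phase(records)))
```

In a real batch system the map and reduce steps would run in parallel over partitions of the exported data; the sketch keeps only the logic.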

Slide 9

Step 5: Performance
FredsList wants to reduce its data access latency.
FredsList.com application:
ALTER Listings MAKE CACHEABLE
Foreign key: photo → listing
Batch export
Components behind the Simple Web Service APIs: Storage (MObStor), Compute (Grid), Database (Sherpa), Caching (memcached), Search (Vespa), Messaging (YMB)
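One common way to cut read latency with a cache is the read-through pattern: check the cache first and fall back to the database only on a miss. The sketch below uses a plain dictionary with a time-to-live rather than a memcached client, so it stays self-contained; `ReadThroughCache`, its parameters, and `slow_lookup` are hypothetical names and do not describe what the slide's CACHEABLE setting actually does.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReadThroughCache:
    """Hypothetical read-through cache with a per-entry time-to-live."""

    def __init__(self, load: Callable[[str], Any], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # function that reads from the backing store
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]          # fresh entry: no backing-store read
        value = self._load(key)    # miss or expired: go to the backing store
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        # Call after a write so the next read reloads the record.
        self._entries.pop(key, None)


calls = []

def slow_lookup(key: str) -> str:
    calls.append(key)              # stands in for a database round trip
    return "listing-" + key

cache = ReadThroughCache(slow_lookup)
cache.get("215534")
cache.get("215534")
print(len(calls))  # prints 1: the second read was served from the cache
```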

Slide 10

EYES TO THE SKIES Motherhood-and-Apple-Pie

Slide 11

Why Clouds?
On-demand infrastructure to create a fundamental shift in the OE curve. It lets us:
Do things we can't do today
Reduce time to market
Build more robustly, more efficiently, more globally, more completely, for a given budget
Cloud services should do the heavy lifting of scaling and high availability. Today, this is done at the application level, which is not productive.

Slide 12

Requirements for Cloud Services
Multitenant. A cloud service must support multiple, organizationally distant customers.
Elasticity. Tenants should be able to negotiate and receive resources/QoS on demand.
Resource Sharing. Ideally, spare cloud resources should be transparently applied when a tenant's negotiated QoS is insufficient, e.g., due to spikes.
Horizontal scaling. It should be possible to add cloud capacity in small increments; this should be transparent to the tenants of the service.
Metering. A cloud service must support accounting that reasonably ascribes operational and capital expenditures to each of the tenants of the service.
Security. A cloud service should be secure in that tenants are not made vulnerable because of loopholes in the cloud.
Availability. A cloud service should be highly available.
Operability. A cloud service should be easy to operate, with few operators. Operating costs should scale linearly or better with the capacity of the service.
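The metering requirement can be made concrete with a small example. One simple way to ascribe a shared cost to tenants is in proportion to their measured usage; the talk does not say which allocation rule is used, so the rule, the function name `allocate_costs`, and the usage units are all assumptions for illustration.

```python
from typing import Dict


def allocate_costs(total_cost: float, usage: Dict[str, float]) -> Dict[str, float]:
    """Split total_cost across tenants in proportion to their usage.

    'usage' maps a tenant name to any consistent usage measure
    (for example requests served or GB stored).
    """
    total_usage = sum(usage.values())
    if total_usage <= 0:
        raise ValueError("no usage recorded; cannot apportion cost")
    return {tenant: total_cost * amount / total_usage
            for tenant, amount in usage.items()}


shares = allocate_costs(1000.0, {"tenant_a": 300, "tenant_b": 100})
print(shares)  # prints {'tenant_a': 750.0, 'tenant_b': 250.0}
```

A production scheme would likely meter several resources separately and treat capital and operational costs differently; the sketch only shows the basic proportional idea.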

Slide 13

Types of Cloud Services
Two kinds of cloud services:
Horizontal Cloud Services. Functionality enabling tenants to build applications or new services on top of the cloud.
Functional Cloud Services. Functionality that is useful in and of itself to tenants. E.g., various SaaS instances, such as Salesforce.com; Google Analytics and Yahoo!'s IndexTools; Yahoo! properties aimed at end users and small businesses, e.g., Flickr, Groups, Mail, News, Shopping. These could be built on top of horizontal cloud services or from scratch. Yahoo! has been offering these for a long time (e.g., Mail for SMB, Groups, Flickr, BOSS, Ad exchanges).

Slide 14

Horizontal Cloud Services
Horizontal cloud services are foundations on which tenants build applications or new services. They should be:
Semantics-free. Must be "generic infrastructure," and not tied to specific application logic. May provide the ability to inject application logic through well-defined APIs.
Broadly applicable. Must be broadly applicable (i.e., it can't be intended for just one or two properties).
Fault-tolerant over commodity hardware. Must be built using inexpensive commodity hardware, and should mask component failures.
While each cloud service provides value, the power of the cloud paradigm will depend on a collection of well-chosen, loosely coupled services that collectively make it easy to quickly develop and operate innovative web applications.

Slide 15

What's in the Horizontal Cloud?
Simple Web Service APIs
Horizontal Cloud Services:
Provisioning & Virtualization, e.g., EC2
Batch Storage & Processing, e.g., Hadoop & Pig
Operational Storage, e.g., S3, MObStor, Sherpa
Edge Content Services, e.g., YCS, YCPI
Other Services: Messaging, Workflow, virtual DBs & Webserving
Security; ID & Account Management
Shared Infrastructure: Metering, Billing, Accounting; Monitoring & QoS
Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization

Slide 16

Yahoo! CCDI Thrust Areas
Fast Provisioning and Machine Virtualization: On demand, deliver a set of hosts imaged with the desired software and configured against standard services. Multiple hosts may be multiplexed onto the same physical machine.
Batch Storage and Processing: Scalable data storage optimized for batch processing, together with computational capabilities.
Operational Storage: Persistent storage that supports low-latency updates and flexible retrieval.
Edge Content Services: Support for dealing with network topology, communication protocols, caching, and BCP.
Rest of today's talk

Slide 17

Hadoop: Batch Storage/Analysis
Why is batch processing important? Whether it's response prediction for advertising, machine-learned relevance for Search, or content optimization for audience, data-intensive computing is increasingly central to everything Yahoo! does.
Hadoop is central to addressing this need.
Hadoop is a case study in our cloud vision:
Processes enormous amounts of data
Provides horizontal scaling and fault tolerance for our users
Allows those users to focus on their application logic
Stack, top to bottom: [Workflow], high-level query layer (Pig), Map-Reduce, HDFS

Slide 18

SHERPA To Help You Scale Your Mountains of Data

Slide 19

The Yahoo! Storage Problem
Small records: 100KB or less
Structured records: tens, hundreds or thousands of fields
Extreme data scale: tens of TB
Extreme request scale: tens of thousands of requests/sec
Low latency globally: 20+ datacenters worldwide
High availability: outages cost $millions
Variable usage patterns: as applications and users change

Slide 20

The Sherpa Solution
The next-generation global-scale record store
Record orientation: Routing and data storage optimized for low-latency record access
Scale out: Add machines to scale throughput (while keeping latency low)
Asynchrony: Pub-sub replication to far-flung datacenters to mask propagation delay
Consistency model: Reduce the complexity of asynchrony for the application programmer
Cloud deployment model: Hosted, managed service to reduce application time-to-market and enable on-demand scale and elasticity
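The asynchrony point can be illustrated with a toy model: a write is applied at the local replica and acknowledged immediately, while an update message is queued for remote replicas and delivered later. This is a single-process sketch under that assumption only; `Replica`, `AsyncReplicator`, `write` and `drain` are hypothetical names, and it says nothing about Sherpa's actual message broker or consistency guarantees.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        self.store[key] = value


class AsyncReplicator:
    """Hypothetical publish/subscribe replication between datacenters."""

    def __init__(self, local: Replica, remotes: List[Replica]) -> None:
        self.local = local
        self.remotes = remotes
        self._pending: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> None:
        # Commit locally and return; the caller does not wait for remotes.
        self.local.apply(key, value)
        self._pending.append((key, value))

    def drain(self) -> None:
        # Deliver queued updates to every remote replica, in publish order.
        while self._pending:
            key, value = self._pending.popleft()
            for replica in self.remotes:
                replica.apply(key, value)


west, east = Replica("west"), Replica("east")
replicator = AsyncReplicator(local=west, remotes=[east])
replicator.write("A", "42342")
print("A" in east.store)  # prints False: the update has not propagated yet
replicator.drain()
print(east.store["A"])    # prints 42342
```

The window between `write` and `drain` is exactly where remote readers can see stale data, which is why the slide pairs asynchrony with a consistency model for the application programmer.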

Slide 21

What is Sherpa?
CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … )
Example rows (ID, StockNumber, Status), repeated across the copies of the table drawn in the slide's figure:
A 42342 E
B 42521 W
C 66354 W
D 12352 E
E 75656 C
F 15677 E
Structured, flexible schema
Geographic replication
Parallel database
Hosted, managed infrastructure

Slide 22

What Will Sherpa Become?
CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … )
Example rows (ID, StockNumber, Status), shown as three identical copies in the slide's figure:
A 42342 E
B 42521 W
C 66354 W
D 12352 E
E 75656 C
F 15677 E
Indexes and views
Structured, flexible schema
Geographic replication
Parallel database
Hosted, managed infrastructure

Slide 23

Sherpa Design Go