Cloud Computing for Life Science


Presentation Transcript

Slide 1

Cloud Computing for Life Science. 2009 Bio-IT World Webinar. Chris Dagdigian, chris@bioteam.net, BioTeam Inc.

Slide 2

Fair Warning: Giving me 20 minutes to talk is dangerous. I'm somewhat infamous: I talk fast and typically have a ridiculous number of slides. Latest slides will be here: http://blog.bioteam.net

Slide 3

BioTeam Inc.: Independent consulting shop: vendor/technology agnostic. Distributed entity - no physical office. Staffed by scientists forced to learn High Performance IT to conduct research. Many years of industry & academic experience. Our specialty: bridging the gap between Science & IT.

Slide 4

High Level Topics For Today: What "cloud" means to me: getting our vocabulary straight. Current state report: good, bad & ugly. Mapping informatics onto the cloud: an attempt at some advice. Hard lessons learned. Some real-world examples.

Slide 5

Topics - More Detail: Terminology. Blunt talk: Cloud Computing. Why I drank the Kool-Aid. Amazon AWS Overview. Cloud Sobriety. Cloud Security. AWS: Good, Bad & Ugly. Examples. Recommendations.

Slide 6

Setting The Stage: Burned by "OMG!! Grid Computing" hype. In 2009 I will try hard never to use the word "cloud" in any serious technical conversation. Vocabulary matters. Understand my bias: I speak of "utility computing" because it resonates with infrastructure people. My building blocks are servers or groups of systems, not software stacks, developer APIs or commercial products. Goal: replicate, duplicate, improve or relocate complex systems.

Slide 7

Let's Be Honest: Not rocket science. Fast becoming accepted and mainstream. Easy to understand the pros & cons.

Slide 8

While I'm Being Honest … Amazon Web Services is the cloud. Simple, practical, understandable and usable today by just about anyone. Rollout of features and capabilities continues to be impressive*. AWS Oct. 27th announcement (today!): cheaper EC2 pricing & High Memory options; AWS Relational Database Service. Competitors are years behind … and tend to believe their own marketing materials.

Slide 9

Utility/Cloud Computing: Getting Back On Topic Why I drank the Kool-Aid

Slide 10

Tipping Point: Hype to Reality. 2007: Individual staff experimentation all year, including MPI applications (mpiblast). Q1 2008: Realized that every single BioTeam consultant had independently used AWS to solve a customer-facing problem. No mandate or central planning; it just happened organically.

Slide 11

BioTeam AWS Use Today: Running our business: development, prototyping & CDN; an effective resource for tech-centric firms. Grid training practice: self-organizing Grid Engine clusters in EC2; students get root on their own cluster. Proof-of-concept projects for clients: UnivaUD - UniCluster on EC2; Sun - SDM 'spare pool' servers from EC2. Directed efforts on AWS for ISV and Pharma clients.

Slide 12

Amazon AWS Overview: http://aws.amazon.com/items/ Today's webinar: skip for time reasons (included in slide deck as reference material …)

Slide 13

Amazon Web Services: A collection of integrated infrastructure services available on demand. New products and added features introduced monthly. Recent updates: two-factor authentication & rotating credentials; Virtual Private Cloud ("VPC") product; EC2 auto-scaling & load balancing. http://aws.amazon.com/about-aws/whats-new/

Slide 14

AWS Products/Services: EC2 - Elastic Compute Cloud: scalable on-demand virtual servers. SimpleDB - Simple Database Service: simple queries on structured data. S3 - Simple Storage Service: bucket/object based storage. EBS - Elastic Block Store: persistent block storage (looks like a disk).

Slide 15

AWS Products/Services, cont.: SQS - Simple Queue Service: message passing service. Elastic MapReduce: Hadoop on AWS. VPC - Virtual Private Cloud: connect your infrastructure to AWS via a VPN tunnel (more important than it sounds …)

Slide 16

Elastic Compute Cloud ("EC2"): A set of APIs you can call to control remote VM instances. Easy to launch existing images. Easy to build your own custom server images. Xen instances on demand: starting at $.10/hour for a 32-bit system; 64-bit systems start at $.40/hour. Fire up as many as you want, whenever you need them. Many interfaces/control points: Mozilla plugins, CLI, Java, Perl, etc.

Slide 17

Elastic Compute Cloud: Why it works. Smart pricing: server instance pricing is reasonable; traffic to/from the S3 storage cloud is free. Experimenting is very cheap: 1 week of messing around == invoice for $9 USD; weeklong SGE training on large machines == $79 USD. Easy to use.

Slide 18

Elastic Compute Cloud: Why it works, continued. Rapid rate of enhancements & new features: availability zones; reserved instances; live credential rotation. Clever people can make money: Amazon allows reselling AMI instance images; I can build a specialized workflow engine and charge a small fee on top of the Amazon costs; all financial transactions are handled by Amazon. Limitations are fairly obvious: easy to identify which workflows are or are not EC2 friendly.

Slide 19

Amazon EC2 "Aha! Moment": Consider a generic 100 CPU hour research problem. EC2: 10 large servers @ $.40/hr for 10 hours: work done in 10 HOURS at a cost of $40 USD. EC2: 100 large servers @ $.40/hr for 1 hour: work done in 1 HOUR at a cost of $40 USD. Can you do THAT in your datacenter today?
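
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch under simplifying assumptions that are not on the slide: each server contributes one CPU hour per wall-clock hour, the job parallelizes perfectly, and billing is a flat hourly rate with no rounding or data-transfer charges. The function name `ec2_job` is made up for illustration.

```python
def ec2_job(cpu_hours, servers, price_per_server_hour):
    """Return (wall_clock_hours, total_cost_usd) for a perfectly parallel job.

    Assumes one CPU hour of work per server per hour and flat hourly pricing.
    """
    wall_hours = cpu_hours / servers
    total_cost = servers * wall_hours * price_per_server_hour
    return wall_hours, total_cost


# The two scenarios from the slide: same bill, very different turnaround.
for servers in (10, 100):
    hours, cost = ec2_job(cpu_hours=100, servers=servers, price_per_server_hour=0.40)
    print(f"{servers:>3} servers: done in {hours:g} h for ${cost:.2f}")
```

Under these assumptions the cost depends only on the total CPU hours, so adding servers shortens the wait without raising the bill.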

Slide 20

Amazon S3: Add and remove stuff in "buckets". 1 byte to 5GB per object. Required for storage greater than 1 terabyte. Popular with web 2.0 outfits. Standard REST and SOAP interfaces; BitTorrent interface as well. Required part of EC2 usage: all EC2 AMIs (server images) are stored in S3. Cheap to move data in/out. Reasonable monthly fee for persistent storage. Free to move data within Amazon services. Lots of interfaces.

Slide 21

Amazon S3, cont.: Similar fast rate of enhancements as EC2. Hooks into the Amazon CDN product ('CloudFront'). Interesting access/download APIs, including "downloader pays". Of significant interest to our community: physical ingest/outgest service. Send your USB 2.0 or SATA device to Amazon for rapid loading of large datasets.

Slide 22

Elastic Block Store ("EBS"): Block storage (looks like a disk). 1GB to 1TB in size. Raw block device: put your own filesystem on it; do whatever else you would normally do to disk(s). Persistent & snapshot-able. Mount to any EC2 instance in the same availability zone. Notable enhancements: create EBS volumes from hosted AWS datasets; EBS snapshot sharing, which can be used to clone/create/share volume data.

Slide 23

Simple Queue Service ("SQS"): One of the key "glue" services for workflows. Message passing between AMI instances. Cheap, scalable, reliable. Can add a new message at any time: 8KB size; any format. Messages are locked while being processed. If the read fails, the lock is removed and the message is freed to be re-read.
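
The locking behaviour described on this slide can be illustrated with a small in-memory toy. This is not the SQS API: the class `ToyQueue`, its method names, and the choice to model a "failed read" as a lock that simply expires after `lock_seconds` are all assumptions made for the sketch.

```python
import time


class ToyQueue:
    """In-memory stand-in for the lock-while-processing idea (not real SQS)."""

    MAX_BYTES = 8 * 1024  # the 8KB message limit quoted on the slide

    def __init__(self, lock_seconds=30.0):
        self.lock_seconds = lock_seconds
        self._messages = {}  # message id -> [body, locked_until]
        self._next_id = 0

    def send(self, body: bytes) -> int:
        """Add a message; any format is fine as long as it fits in 8KB."""
        if len(body) > self.MAX_BYTES:
            raise ValueError("message exceeds 8KB")
        self._next_id += 1
        self._messages[self._next_id] = [body, 0.0]
        return self._next_id

    def receive(self, now=None):
        """Return (id, body) of the first unlocked message and lock it."""
        now = time.monotonic() if now is None else now
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.lock_seconds
                return msg_id, entry[0]
        return None

    def delete(self, msg_id: int) -> None:
        """Called by a worker that finished processing successfully."""
        self._messages.pop(msg_id, None)


q = ToyQueue(lock_seconds=30)
q.send(b"blast chunk 1")
first = q.receive(now=0)       # worker A takes the message and locks it
blocked = q.receive(now=10)    # worker B sees nothing while the lock is held
retry = q.receive(now=31)      # worker A never deleted it, so it is readable again
print(first, blocked, retry)
```

A worker that finishes its task calls `delete`; one that crashes simply never does, and the message becomes visible to other workers once the lock lapses.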

Slide 24

Elastic MapReduce: * I have not used this service. Integrated Hadoop processing solution. Has caused some controversy. Designed to make life easier for people who don't want to custom build their own Hadoop systems inside AWS.

Slide 25

Virtual Private Cloud ("VPC"): * I have not used this service yet. Relatively new product offering. Very interesting to me. Solves some nasty problems with cloud-bursting and other hybrid local/cloud setups. Different networks, IP address schemes and subnets can be a problem when "bridging" local and cloud systems. Most people doing this today implement an OpenVPN software overlay network to unify the network space. Amazon VPC essentially makes this a formal, supported product.
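
One concrete form of the addressing problem is a local subnet that overlaps a cloud-side subnet, which makes routing between the two ambiguous. The sketch below uses Python's standard `ipaddress` module to flag such collisions before bridging; the function name `overlapping_subnets` and the CIDR ranges are made-up examples, not values from the talk.

```python
import ipaddress


def overlapping_subnets(local_cidrs, cloud_cidrs):
    """Return (local, cloud) CIDR pairs whose address ranges overlap."""
    local_nets = [ipaddress.ip_network(c) for c in local_cidrs]
    cloud_nets = [ipaddress.ip_network(c) for c in cloud_cidrs]
    return [
        (str(local), str(cloud))
        for local in local_nets
        for cloud in cloud_nets
        if local.overlaps(cloud)
    ]


# Hypothetical address plans for a datacenter and a cloud deployment.
local_plan = ["10.0.0.0/16", "192.168.1.0/24"]
cloud_plan = ["10.0.1.0/24", "172.16.0.0/20"]

for local, cloud in overlapping_subnets(local_plan, cloud_plan):
    print(f"conflict: local {local} overlaps cloud {cloud}")
```

An empty result means the two address plans can be joined without renumbering either side.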

Slide 26

Cloud Sobriety: Important to think in practical terms. Utility computing has just about as many negatives as positives.

Slide 27

Cloud Sobriety: The McKinsey presentation "Clearing the Air on Cloud Computing" is a must-read. It tries to deflate the hype a bit. James Hamilton has a good rebuttal: http://perspectives.mvdirona.com/ Both conclude: IT staff need to understand "the cloud". Critical to quantify your own internal costs. Perform your own due diligence.

Slide 28

Cloud Security … set attitude to "negative"

Slide 29

Cloud Security Pet Peeve: Don't want to belittle concerns but … A whiff of hypocrisy is in the air. Staff genuinely concerned or just protecting turf? Funny to see people demanding security measures that they don't practice internally across their own infrastructure.

Slide 30

Cloud Security & Reality: My personal take: Amazon, Google & Microsoft probably have better internal operating controls than you do. All of them are happy to talk as deeply as you like about all issues relating to security. Do your own due diligence & don't let politics or IT turf issues cloud decision making. Biggest issue for me may be per-country data protection and patient privacy rules. http://aws.amazon.com/security/

Slide 31

Cloud Security & HIPAA: Short and sweet: HIPAA-compliant applications are running today on AWS. Amazon has published a HIPAA whitepaper. Boils down to: Good & Bad: all the really heavy lifting is done by you. AWS is just the base infrastructure; there are no technical obstacles to the security, encryption and audit systems required for you to build your applications.

Slide 32

State of AWS: The good, the bad, the ugly & what it means for HPC types.

Slide 33

State of Amazon AWS: New features are being rolled out fast and furious. But … EC2 nodes are still poor on disk IO operations. The EBS service can use some enhancements. Poor support for latency-sensitive things and workflows that prefer tight network topologies. This matters because: compute power is easy to acquire; life science tends to be IO bound; life science is currently being buried in data.

Slide 34

AWS & HPC Networking: No guarantee that all your EC2 reservation instances will be allocated from the same subnet. Private IP, hostname, NAT and addressi