Resource Management of Large-Scale Applications on a Grid

Sphinx Middleware for Resource Provisioning. Grid Monitoring for better meta-scheduling ... Jang-uk In, Sanjay Ranka et al.

Presentation Transcript

Slide 1

Resource Management of Large-Scale Applications on a Grid
Laukik Chitnis and Sanjay Ranka (with Paul Avery, Jang-uk In and Rick Cavanaugh)
Department of CISE, University of Florida, Gainesville
ranka@cise.ufl.edu, 352 392 6838 (http://www.cise.ufl.edu/~ranka/)

Slide 2

Overview
High End Grid Applications and Infrastructure at University of Florida
Resource Management for Grids
Sphinx Middleware for Resource Provisioning
Grid Monitoring for better meta-scheduling
Provisioning Algorithm Research for multi-core and grid environments

Slide 3

The Evolution of High-End Applications (and their system characteristics)
1980: Mainframe Applications (central mainframes)
1990: Compute Intensive Applications (large clusters, supercomputers)
2000: Data Intensive Applications (geographically distributed datasets, high speed storage, gigabit networks)

Slide 4

Some Representative Applications: HEP, Medicine, Astronomy, Distributed Data Mining

Slide 5

Representative Application: High Energy Physics 1000+ 20+ nations 1-10 petabytes 1-

Slide 6

Representative Application: Tele-Radiation Therapy RCET Center for Radiation Oncology

Slide 7

Representative Application: Distributed Intrusion Detection
NSF ITR Project: Middleware for Distributed Data Mining (PI: Ranka, jointly with Kumar and Grossman)
[Figure labels: Application; Data Mining and Scheduling Services; Data Transport Services; Information Management Services; Data Management Services]

Slide 8

Grid Infrastructure: Florida Lambda Rail and UF

Slide 9

Campus Grid (University of Florida)
NSF Major Research Instrumentation Project (PI: Ranka, Avery et al.)
20 Gigabit/sec network; 20+ terabytes; 2-3 teraflops; 10 scientific and engineering applications
Gigabit Ethernet based cluster; Infiniband based cluster

Slide 10

Grid Services: The software part of the infrastructure!

Slide 11

Services offered in a Grid
Resource Management Services
Monitoring and Information Services
Data Management Services
Security Services
Note that all the other services use the security services.

Slide 12

Resource Management Services
Provide a uniform, standard interface to remote resources including CPU, storage and bandwidth.
The main component is the remote job manager.
Ex: GRAM (Globus Resource Allocation Manager)

Slide 13

Resource Management on a Grid
[Figure labels: User; GRAM; The Grid; Site 1, Site 2, Site 3, ..., Site n; local schedulers Condor, LSF, PBS and fork]
Narration: note the different local schedulers.
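The idea of one uniform interface in front of different local schedulers can be sketched as follows. This is a minimal illustration, not GRAM's actual API: the class names `Job`, `LocalScheduler`, `RecordingScheduler` and `JobManager` are hypothetical, the stand-in scheduler only records submissions, and the pairing of sites with schedulers is arbitrary.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Job:
    """A unit of work; `executable` and `args` are illustrative fields."""
    executable: str
    args: list = field(default_factory=list)


class LocalScheduler(ABC):
    """Hypothetical adapter base class: one subclass per local scheduler."""

    @abstractmethod
    def submit(self, job: Job) -> str:
        """Queue the job locally and return a scheduler-specific job id."""


class RecordingScheduler(LocalScheduler):
    """Stand-in for Condor, PBS, LSF or fork; it only records submissions."""

    def __init__(self, name: str):
        self.name = name
        self.queue = []

    def submit(self, job: Job) -> str:
        self.queue.append(job)
        return f"{self.name}-{len(self.queue)}"


class JobManager:
    """Uniform front end (the role GRAM plays on the slide)."""

    def __init__(self):
        self.sites = {}

    def register(self, site: str, scheduler: LocalScheduler) -> None:
        self.sites[site] = scheduler

    def submit(self, site: str, job: Job) -> str:
        # The caller never needs to know which local scheduler the site runs.
        return self.sites[site].submit(job)


manager = JobManager()
manager.register("site1", RecordingScheduler("condor"))  # arbitrary pairing
manager.register("site2", RecordingScheduler("pbs"))     # arbitrary pairing
first_id = manager.submit("site1", Job("/bin/hostname"))
second_id = manager.submit("site2", Job("/bin/hostname"))
```

The same `submit` call works for every site; only the registered adapter differs.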

Slide 14

Scheduling your Application

Slide 15

Scheduling your Application
An application can be run on a grid site as a job.
The modules in the grid architecture (such as GRAM) allow uniform access to the grid sites for your job.
But... most applications can be "parallelized", and these separate parts can be scheduled to run simultaneously on different sites, thus utilizing the power of the grid.

Slide 16

Modeling an Application Workflow
Many workflows can be modeled as a Directed Acyclic Graph (DAG).
The amount of resource required (in units of time) is known to a degree of certainty.
There is a small probability of failure in execution (in a grid environment this could happen because resources are no longer available).
[Figure: Directed Acyclic Graph]
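A workflow DAG with per-task time estimates might be represented as below. This is a sketch under stated assumptions: the task names and durations are invented, every task is assumed to start as soon as its predecessors finish, and failures are ignored. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: task -> set of tasks it depends on.
dependencies = {
    "stage_in": set(),
    "simulate_a": {"stage_in"},
    "simulate_b": {"stage_in"},
    "merge": {"simulate_a", "simulate_b"},
}
# Assumed run-time estimates in minutes.
duration = {"stage_in": 5, "simulate_a": 30, "simulate_b": 20, "merge": 10}


def earliest_finish_times(dependencies, duration):
    """Earliest finish of each task if every task starts as soon as its
    predecessors are done (unlimited sites, no failures)."""
    finish = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(dependencies).static_order():
        start = max((finish[p] for p in dependencies[task]), default=0)
        finish[task] = start + duration[task]
    return finish


finish = earliest_finish_times(dependencies, duration)
makespan = max(finish.values())  # length of the longest path through the DAG
```

The makespan equals the length of the longest path, which is why the two independent simulation tasks benefit from running on different sites at the same time.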

Slide 17

Workflow Resource Provisioning
Executing multiple workflows over distributed and adaptive (faulty) resources while managing policies.
Applications: large; data intensive; precedence; time constraints
Policies: access control; priority; quota; multiple ownership
Resources: multi-core; heterogeneous; distributed; faulty

Slide 18

A Real Life Example from High Energy Physics
Merge two grids into a single multi-VO "Inter-Grid".
How to ensure that
neither VO is harmed?
both VOs actually benefit?
there are answers to questions like: "With what probability will my job be scheduled and complete before my conference deadline?"
Clear need for a scheduling middleware!
[Map labels: UW, MIT, UI, FNAL, Caltech, UCSD, Rice, UF, BU, UM, UC, BNL, ANL, IU, LBL, OU, UTA, SMU]

Slide 19

Typical scenario
[Figure: one VDT Client and three VDT Servers, with three question marks between them]

Slide 20

Typical scenario: @#^%#%$@#
[Figure: one VDT Client and three VDT Servers, with three question marks between them]

Slide 21

Some Requirements for Effective Grid Scheduling
Information requirements:
Past and future dependencies of the application (persistent storage of workflows)
Resource usage estimation
Policies (expected to vary slowly over time)
Global views of job descriptions
Request tracking and usage statistics (state information important)
Resource properties and status (expected to vary slowly with time)
Grid weather (latency of measurement important)
Replica management
System requirements:
Distributed, fault-tolerant scheduling
Customisability
Interoperability with other scheduling systems
Quality of Service

Slide 22

Incorporate Requirements into a Framework
Assume the GriPhyN Virtual Data Toolkit:
Client (request/job submission): Globus clients; Condor-G/DAGMan; Chimera Virtual Data System
Server (resource gatekeeper): MonALISA Monitoring Service; Globus services; RLS (Replica Location Service)
[Figure: one VDT Client and three VDT Servers, with three question marks between them]

Slide 23

Incorporate Requirements into a Framework
Framework design principles: information driven; flexible client-server model; general, but pragmatic and simple; avoid adding middleware requirements on grid resources.
Assume the Virtual Data Toolkit:
Client (request/job submission): Clarens Web Service; Globus clients; Condor-G/DAGMan; Chimera Virtual Data System
Server (resource gatekeeper): MonALISA Monitoring Service; Globus services; RLS (Replica Location Service)
[Figure: a VDT Client, a Recommendation Engine and three VDT Servers]

Slide 24

Related Provisioning Software

Slide 25

Innovative Workflow Scheduling Middleware
Modular system: automated scheduling procedure based on modular services
Robust and recoverable system: database infrastructure; fault-tolerant and recoverable from internal failure
Platform independent, interoperable system: XML-based communication protocols (SOAP, XML-RPC); supports a heterogeneous service environment
60 Java classes; 24,000 lines of Java code; 50 test scripts, 1500 lines of script code

Slide 26

The Sphinx Workflow Execution Framework
[Figure labels]
VDT Client: Sphinx Client; Chimera Virtual Data System; Clarens WS Backbone; Condor-G/DAGMan
Sphinx Server: Request Processing; Data Warehouse; Data Management; Information Gathering
VDT Server Site: Globus Resource; Replica Location Service; MonALISA Monitoring Service

Slide 27

Sphinx Scheduling Server
Functions as the nerve center.
Data Warehouse: policies, account information, grid weather, resource properties and status, request tracking, workflows, etc.
Control Process: a finite state machine. Different modules modify jobs, graphs, workflows, etc. and change their state.
Flexible and extensible.
[Figure labels: Sphinx Server; Message Interface; Control Process; Graph Reducer; Job Predictor; Graph Predictor; Job Admission Control; Graph Admission Control; Graph Data Planner; Job Execution Planner; Graph Tracker; Data Warehouse; Data Management; Information Gatherer]
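The control process, a finite state machine in which each module picks up entries in one state and moves them to the next, can be sketched like this. The state names, the dictionary-based "warehouse" and the placeholder module bodies are assumptions made for illustration; only the module names echo the slide.

```python
from enum import Enum, auto


class JobState(Enum):
    # State names are illustrative, not taken from Sphinx.
    UNPREDICTED = auto()
    UNACCEPTED = auto()
    UNPLANNED = auto()
    PLANNED = auto()


def job_predictor(job):
    job["estimated_minutes"] = 30  # placeholder estimate
    return JobState.UNACCEPTED


def job_admission_control(job):
    job["admitted"] = True  # placeholder policy check
    return JobState.UNPLANNED


def job_execution_planner(job):
    job["site"] = "site1"  # placeholder site choice
    return JobState.PLANNED


# Each state is owned by one module, which does its work and moves the job on.
HANDLERS = {
    JobState.UNPREDICTED: job_predictor,
    JobState.UNACCEPTED: job_admission_control,
    JobState.UNPLANNED: job_execution_planner,
}


def control_process(warehouse):
    """Repeatedly hand every job to the module that owns its current state."""
    progressed = True
    while progressed:
        progressed = False
        for job in warehouse:
            handler = HANDLERS.get(job["state"])
            if handler is not None:  # PLANNED has no handler: terminal here
                job["state"] = handler(job)
                progressed = True


warehouse = [
    {"id": 1, "state": JobState.UNPREDICTED},
    {"id": 2, "state": JobState.UNPLANNED},
]
control_process(warehouse)
```

Adding a module means adding one state and one handler entry, which is one way to read the "flexible, extensible" claim.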

Slide 28

SPHINX: Scheduling in Parallel for Heterogeneous Independent NetworXs

Slide 29

Policy Based Scheduling
Sphinx provides "soft" QoS through time dependent, global views of:
Submissions (workflows, jobs, allocation, etc.)
Policies
Resources
Uses linear programming methods:
Satisfy constraints (policies, user requirements, etc.)
Optimize an "objective" function
Estimate probabilities to meet deadlines within policy constraints
J. In, P. Avery, R. Cavanaugh, and S. Ranka, "Policy Based Scheduling for Simple Quality of Service in Grid Computing", in Proceedings of the 18th IEEE IPDPS, Santa Fe, New Mexico, April 2004
[Figure: policy space with axes Submissions, Resources and Time]
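One way to read the linear-programming step is as an allocation problem. The formulation below is an illustrative assumption, not the model from the cited paper: x_{ij} is the number of jobs of submission i placed on resource j, c_{ij} is an assumed completion-time estimate for such a job, n_i is the number of jobs in submission i, and q_{ij} is the quota that policy grants submission i on resource j.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i}\sum_{j} c_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{j} x_{ij} = n_i && \text{for every submission } i, \\
& 0 \le x_{ij} \le q_{ij} && \text{for every submission } i \text{ and resource } j.
\end{aligned}
```

The equality constraints express the user requirement that every job is placed, the upper bounds express the policies, and the sum being minimized plays the role of the "objective" function.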

Slide 30

Ability to tolerate task failures
Significant impact of using feedback information
Jang-uk In, Sanjay Ranka et al., "SPHINX: A fault-tolerant system for scheduling in dynamic grid environments", in Proceedings of the 19th IEEE IPDPS, Denver, Colorado, April 2005

Slide 31

Grid Enabled Analysis SC|03

Slide 32

Distributed Services for Grid Enabled Data Analysis
[Figure labels]
Sites: Fermilab; Caltech; Florida; Iowa (each with a File Service and a VDT Resource Service)
Services: Scheduling Service; Replica Location Service; Monitoring Service; Virtual Data Service; Execution Service; Data Analysis Client
Software: Sphinx; RLS; MonALISA; ROOT; Chimera; Sphinx/VDT; Clarens; Globus; GridFTP

Slide 33

Evaluation of Information gathered from grid monitoring systems

Slide 34

Limitation of Existing Monitoring Systems for the Grid
Information aggregated over multiple users is not very useful for effective resource allocation.
An end-to-end parameter such as Average Job Delay (the average queuing delay experienced by a job of a given user at an execution site) is a better estimate for comparing resource availability and response time for a given user.
It is also not very susceptible to monitoring latencies.
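Average Job Delay, as defined above, is a per-user, per-site average of queuing delays. A minimal sketch of computing it from job records follows; the record layout `(user, site, submit_time, start_time)` and the sample values are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical job records: (user, site, submit_time, start_time) in seconds.
records = [
    ("alice", "site1", 0, 60),
    ("alice", "site1", 100, 220),
    ("alice", "site2", 0, 600),
    ("bob", "site1", 0, 900),
]


def average_job_delay(records):
    """Mean queuing delay (start - submit) per (user, site) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # (user, site) -> [sum, count]
    for user, site, submitted, started in records:
        entry = totals[(user, site)]
        entry[0] += started - submitted
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


delays = average_job_delay(records)
```

Here the same site looks very different to the two users, which is the information lost when measurements are aggregated over all users.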

Slide 35

Effective DAG Scheduling
The completion time based algorithm here uses the Average Job Delay parameter for scheduling.
As seen in the adjoining figure, it outperforms the algorithms tested with other monitored parameters.
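A site choice driven by Average Job Delay could look like the sketch below. It assumes, for illustration only, that the estimated completion time at a site is the user's average queuing delay there plus an estimated run time; the slide does not give the actual algorithm, and the numbers are invented.

```python
def pick_site(avg_delay, run_time):
    """Choose the site with the smallest estimated completion time,
    taken here as average queuing delay plus estimated run time."""
    estimate = {site: avg_delay[site] + run_time[site] for site in avg_delay}
    best = min(estimate, key=estimate.get)
    return best, estimate[best]


# Assumed per-user measurements, in seconds.
avg_delay = {"site1": 90.0, "site2": 600.0, "site3": 30.0}
run_time = {"site1": 300.0, "site2": 120.0, "site3": 500.0}
site, completion = pick_site(avg_delay, run_time)
```

Neither the fastest site nor the one with the shortest queue wins in this example; the combined estimate decides.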

Slide 36

Work in Progress: Modeling Workflow Cost and developing efficient provisioning algorithms
1. Developing an objective measure of completion time that integrates the performance and reliability of workflow execution: P(time to complete >= T) <= epsilon
2. Relating this measure to the properties of the longest path of the DAG, based on the mean and uncertainty of the time required for the underlying tasks due to 1) variable time requirements due to different parameter values 2) failure due to change of
[Figure: Directed Acyclic Graph]
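The measure P(time to complete >= T) can be estimated by simulation for a small DAG. The sketch below is one possible approach and not the authors' method: the DAG, the normally distributed task times, the fixed per-attempt failure probability and the rerun-on-failure rule are all assumptions.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical DAG: task -> predecessors, with (mean, spread) run times in minutes.
dependencies = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
run_time = {"a": (5, 1), "b": (30, 5), "c": (20, 10), "d": (10, 2)}
FAILURE_PROBABILITY = 0.05  # assumed chance that one attempt fails and is rerun


def sample_completion(rng):
    """One random completion time: longest path with sampled task times."""
    finish = {}
    for task in TopologicalSorter(dependencies).static_order():
        mean, spread = run_time[task]
        elapsed = max(0.0, rng.gauss(mean, spread))
        while rng.random() < FAILURE_PROBABILITY:
            # A failed attempt costs a full run, then the task is retried.
            elapsed += max(0.0, rng.gauss(mean, spread))
        start = max((finish[p] for p in dependencies[task]), default=0.0)
        finish[task] = start + elapsed
    return max(finish.values())


def probability_late(deadline, trials=10_000, seed=0):
    """Estimate P(completion time >= deadline) by simulation."""
    rng = random.Random(seed)
    late = sum(sample_completion(rng) >= deadline for _ in range(trials))
    return late / trials
```

A provisioning algorithm could then look for the smallest T for which `probability_late(T)` falls below the chosen epsilon.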
