Element Programming: Edit Distance

Dynamic programming edit distance l.jpg
1 / 85
0
0
633 days ago, 196 views
PowerPoint PPT Presentation
Dynamic Programming: Alter Separation. Plot. DNA Grouping Correlation: First Examples of overcoming adversity Change Issue Manhattan Traveler Issue Longest Ways in Charts Succession Arrangement Alter Remove Longest Regular Subsequence Issue Dab Networks. DNA Arrangement Correlation: First Example of overcoming adversity .

Presentation Transcript

Slide 1

Dynamic Programming: Edit Distance

Slide 2

Outline DNA Sequence Comparison: First Success Stories Change Problem Manhattan Tourist Problem Longest Paths in Graphs Sequence Alignment Edit Distance Longest Common Subsequence Problem Dot Matrices

Slide 3

DNA Sequence Comparison: First Success Story Finding grouping similitudes with qualities of known capacity is a typical way to deal with gather a recently sequenced quality's capacity In 1984 Russell Doolittle and associates discovered likenesses between tumor creating quality and ordinary development figure (PDGF) quality

Slide 4

Cystic fibrosis (CF) is a perpetual and oftentimes lethal hereditary infection of the body's bodily fluid organs (anomalous abnormal state of bodily fluid in organs). CF essentially influences the respiratory frameworks in kids. Bodily fluid is a foul material that coats numerous epithelial surfaces and is discharged into liquids, for example, salivation Cystic Fibrosis

Slide 5

In mid 1980s scientists conjectured that CF is an autosomal passive issue created by transformations in a quality that stayed obscure till 1989 Heterozygous bearers are asymptomatic Must be homozygously latent in this quality so as to be determined to have CF Cystic Fibrosis: Inheritance

Slide 6

Cystic Fibrosis: Finding the Gene

Slide 7

Finding Similarities between the Cystic Fibrosis Gene and ATP restricting proteins ATP restricting proteins are available on cell layer and go about as transport direct In 1989 scholars discovered comparability between the cystic fibrosis quality and ATP restricting proteins A conceivable capacity for cystic fibrosis quality, given the way that CF includes sweet emission with unusually high sodium level

Slide 8

Cystic Fibrosis: Mutation Analysis If a high % of cystic fibrosis (CF) patients have a specific change in the quality and the typical patients don't, then that could be a pointer of a change that is identified with CF A specific change was found in 70% of CF patients, persuading proof that it is a dominating hereditary diagnostics marker for CF

Slide 9

Cystic Fibrosis and CFTR Gene :

Slide 10

Cystic Fibrosis and the CFTR Protein CFTR (Cystic Fibrosis Transmembrane conductance Regulator) protein is acting in the cell film of epithelial cells that emit bodily fluid These cells line the aviation routes of the nose, lungs, the stomach divider, and so forth

Slide 11

Mechanism of Cystic Fibrosis The CFTR protein (1480 amino acids) controls a chloride particle channel Adjusts the "wateriness" of liquids emitted by the phone Those with cystic fibrosis are missing one single amino corrosive in their CFTR Mucus winds up being too thick, influencing numerous organs

Slide 12

Bring in the Bioinformaticians Gene similitudes between two qualities with known and obscure capacity ready researcher to a few potential outcomes Computing a closeness score between two qualities tells how likely it is that they have comparative capacities Dynamic writing computer programs is a procedure for uncovering likenesses between qualities The Change Problem is a decent issue to present the possibility of element programming

Slide 13

The Change Problem Goal : Convert some measure of cash M into given categories, utilizing the least conceivable number of coins Input : A measure of cash M , and a variety of d groups c = ( c 1 , c 2 , … , c d ), in a diminishing request of significant worth ( c 1 > c 2 > … > c d ) Output : A rundown of d whole numbers i 1 , i 2 , … , i d to such an extent that c 1 i 1 + c 2 i 2 + … + c d i d = M and i 1 + i 2 + … + i d is insignificant

Slide 14

Value 1 2 3 4 5 6 7 8 9 10 Min # of coins 1 Change Problem: Example Given the divisions 1, 3, and 5, what is the base number of coins expected to roll out improvement for a given esteem? Just a single coin is expected to roll out improvement for the qualities 1, 3, and 5

Slide 15

Change Problem: Example (cont " d) Given the sections 1, 3, and 5, what is the base number of coins expected to roll out improvement for a given esteem? Esteem 1 2 3 4 5 6 7 8 9 10 Min # of coins 1 2 1 2 1 2 However, two coins are expected to roll out improvement for the qualities 2, 4, 6, 8, and 10.

Slide 16

Change Problem: Example (cont " d) Given the groups 1, 3, and 5, what is the base number of coins expected to roll out improvement for a given esteem? Esteem 1 2 3 4 5 6 7 8 9 10 Min # of coins 1 2 1 2 1 2 3 2 3 2 Lastly, three coins are expected to roll out improvement for the qualities 7 and 9

Slide 17

minNumCoins(M-1) + 1 minNumCoins(M-3) + 1 minNumCoins(M-5) + 1 min of minNumCoins(M) = Change Problem: Recurrence This illustration is communicated by the accompanying repeat connection:

Slide 18

minNumCoins(M-c 1 ) + 1 minNumCoins(M-c 2 ) + 1 … minNumCoins(M-c d ) + 1 min of minNumCoins(M) = Change Problem: Recurrence (cont " d) Given the categories c : c 1 , c 2 , … , c d , the repeat connection is:

Slide 19

Change Problem: A Recursive Algorithm RecursiveChange ( M , c , d ) if M = 0 return 0 bestNumCoins  endlessness for i  1 to d if M ≥ c i numCoins  RecursiveChange ( M – c i , c , d ) if numCoins + 1 < bestNumCoins  numCoins + 1 return bestNumCoins

Slide 20

RecursiveChange Is Not Efficient It recalculates the ideal coin blend for a given measure of cash over and over i.e., M = 77, c = (1,3,7): Optimal coin combo for 70 pennies is registered 9 times!

Slide 21

The RecursiveChange Tree 77 76 74 70 75 73 69 73 71 67 69 67 63 74 72 68 66 62 70 68 64 68 66 62 60 56 72 70 66 72 70 66 64 60 66 64 60 . . . . . . 70

Slide 22

We Can Do Better We're re-processing values in our calculation more than once Save consequences of every calculation for 0 to M This way, we can do a reference call to discover an as of now registered esteem, rather than re-figuring each time Running time M * d , where M is the estimation of cash and d is the quantity of categories

Slide 23

The Change Problem: Dynamic Programming DPChange( M , c , d ) bestNumCoins 0  0 for m  1 to M bestNumCoins m  endlessness for i  1 to d if m ≥ c i if bestNumCoins m – c i + 1 < bestNumCoins m bestNumCoins m  bestNumCoins m – c i + 1 return bestNumCoins M

Slide 24

DPChange: Example 0 1 2 3 4 5 6 0 1 2 1 2 3 2 0 1 0 1 2 3 4 5 6 7 0 1 0 1 2 1 2 3 2 1 0 1 2 0 1 2 0 1 2 3 4 5 6 7 8 0 1 2 3 0 1 2 1 2 3 2 1 2 0 1 2 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 0 1 2 1 2 3 2 1 2 3 0 1 2 1 2 c = (1,3,7) M = 9 0 1 2 3 4 5 0 1 2 1 2 3

Slide 25

Manhattan Tourist Problem (MTP) Imagine looking for a way (from source to sink) to travel (just eastbound and southward) with the most number of attractions ( * ) in the Manhattan matrix Source * Sink

Slide 26

Manhattan Tourist Problem (MTP) Imagine looking for a way (from source to sink) to travel (just eastbound and southward) with the most number of attractions ( * ) in the Manhattan framework Source * Sink

Slide 27

Manhattan Tourist Problem: Formulation Goal : Find the longest way in a weighted network. Input : A weighted matrix G with two unmistakable vertices, one named " source " and the other marked " sink " Output : A longest way in G from " source " to " sink "

Slide 28

MTP: An Example 0 1 2 3 4 j arrange source 3 2 4 0 3 5 9 0 1 0 4 3 2 3 2 4 13 1 6 5 4 2 0 7 3 4 15 19 2 i facilitate 4 5 2 4 1 0 2 3 20 3 8 5 6 5 2 sink 1 3 2 23 4

Slide 29

MTP: Greedy Algorithm Is Not Optimal 1 2 5 source 3 10 5 2 5 1 3 5 3 1 4 2 3 promising begin, however prompts to terrible decisions! 5 0 2 0 22 0 sink 18

Slide 30

MTP: Simple Recursive Program MT ( n,m ) if n=0 or m=0 return MT(n,m) x  MT(n-1,m)+ length of the edge from (n-1,m) to (n,m) y  MT(n,m-1)+ length of the edge from (n,m-1) to (n,m) return max{x,y}

Slide 31

MTP: Simple Recursive Program MT ( n,m ) x  MT(n-1,m)+ length of the edge from (n-1,m) to (n,m) y  MT(n,m-1)+ length of the edge from (n,m-1) to (n,m) return min{x,y} What's the matter with this approach?

Slide 32

MTP: Dynamic Programming j 0 1 source 1 0 1 S 0,1 = 1 i 5 1 5 S 1,0 = 5 Calculate ideal way score for every vertex in the diagram Each vertex's score is the greatest of the earlier vertices score in addition to the heaviness of the particular edge in the middle of

Slide 33

MTP: Dynamic Programming (cont " d) j 0 1 2 source 1 2 0 1 3 S 0,2 = 3 i 5 3 - 5 1 5 4 S 1,1 = 4 3 2 8 S 2,0 = 8

Slide 34

MTP: Dynamic Programming (cont " d) j 0 1 2 3 source 1 2 5 0 1 3 8 S 3,0 = 8 i 5 3 10 - 5 1 5 4 13 S 1,2 = 13 5 3 - 5 2 8 9 S 2,1 = 9 0 3 8 S 3,0 = 8

Slide 35

MTP: Dynamic Programming (cont " d) j 0 1 2 3 source 1 2 5 0 1 3 8 i 5 3 10 - 5 - 5 1 - 5 1 5 4 13 8 S 1,3 = 8 5 3 - 3 - 5 2 8 9 12 S 2,2 = 12 0 3 8 9 S 3,1 = 9 covetous alg. fizzles!

Slide 36

MTP: Dynamic Programming (cont " d) j 0 1 2 3 source 1 2 5 0 1 3 8 i 5 3 10 - 5 - 5 1 - 5 1 5 4 13 8 5 3 - 3 2 3 - 5 2 8 9 12 15 S 2,3 = 15 0 - 5 0 3 8 9 S 3,2 = 9

Slide 37

MTP: Dynamic Programming (cont " d) j 0 1 2 3 source 1 2 5 0 1 3 8 Done! i 5 3 10 - 5 - 5 1 - 5 1 5 4 13 8 (demonstrating every single back-follow) 5 3 - 3 2 3 - 5 2 8 9 12 15 0 - 5 1 0 3 8 9 16 S 3,3 = 16

Slide 38

s i-1, j + weight of the edge between ( i-1, j ) and ( i, j ) s i, j-1 + weight of the edge between ( i, j-1 ) and ( i, j ) max s i, j = MTP: Recurrence Computing the score for a point (i,j) by the repeat connection: The running time is n x m for a n by m lattice ( n = # of lines, m = # of sections)

Slide 39

A 2 A 3 A 1 B s A1 + weight of the edge (A 1 , B) s A2 + weight of the edge (A 2 , B) s A3 + weight of the edge (A 3 , B) max of s B = Manhattan Is Not A Perfect Grid What about diagonals? The score at point B is given by:

Slide 40

max of s y + weight of vertex ( y , x ) where y є Predecessors( x) s x = Manhattan Is Not A Perfect Grid (con

SPONSORS