Vidéo pédagogique
Notice
Lieu de réalisation
Grenoble
Sous-titrage
Sous-titre
Langue :
Anglais
Crédits
François Rechenmann (Intervention)
Conditions d'utilisation
Ces ressources de cours sont, sauf mention contraire, diffusées sous Licence Creative Commons. L’utilisateur doit mentionner le nom de l’auteur, il peut exploiter l’œuvre sauf dans un contexte commercial et il ne peut apporter de modifications à l’œuvre originale.
DOI : 10.60527/eb9n-kj40
Citer cette ressource :
François Rechenmann. Inria. (2015, 5 février). 4.6. A path is optimal if all its sub-paths are optimal , in 4. Sequences comparison. [Vidéo]. Canal-U. https://doi.org/10.60527/eb9n-kj40. (Consultée le 5 août 2024)

# 4.6. A path is optimal if all its sub-paths are optimal

Réalisation : 5 février 2015 - Mise en ligne : 9 mai 2017
• document 1 document 2 document 3
• niveau 1 niveau 2 niveau 3
Descriptif

A sequence alignment between two sequences is a path in a grid. So that, an optimal sequence alignmentis an optimal path in the same grid. We'll see now that a property of this optimal path provides us with scanned lines for designing an optimization algorithm. The property is the following. A path which is optimal is made up of optimal sub-paths. To prove that, we can start byproving that if a path of length L is optimal then the path of length L minus one is also optimal. This can be proved quiteeasily ad arburdum. That is, you take the hypothesis that the path of length L is optimal and you say the path oflength L minus one is not optimal. And then, you arrive at a contradiction which means that this is proved. OK. So you can also make the samereasoning with the path of length L minus two, Lminus three and so on. And so, you arrive at the conclusionthat an optimal path is made up of optimal sub-paths. And this will explain how we candesign a schema of computation of the optimal path. The trick in this algorithm is to start with the last unitary path. That is, you remember our grid, thisis the last node of the grid. N and M are the lengths of the sequences. Here is the first sequence.

Intervention
Thème
Documentation

## Dans la même collection

• Vidéo pédagogique
00:06:38

### 4.7. Alignment costs

Rechenmann
François

We have seen how we can compute the cost of the path ending on the last node of our grid if we know the cost of the sub-path ending on the three adjacent nodes. It is time now to see more deeply why

• Vidéo pédagogique
00:04:54

### 4.1. How to predict gene/protein functions?

Rechenmann
François

Last week we have seen that annotating a genome means first locating the genes on the DNA sequences that is the genes, the region coding for proteins. But this is indeed the first step,the next very

• Vidéo pédagogique
00:09:26

### 4.10. How efficient is this algorithm?

Rechenmann
François

We have seen the principle of an iterative algorithm in two paths for aligning and comparing two sequences of characters, here DNA sequences. And we understoodwhy the iterative version is much more

• Vidéo pédagogique
00:04:22

### 4.4. Aligning sequences is an optimization problem

Rechenmann
François

We have seen a nice and a quitesimple solution for measuring the similarity between two sequences. It relied on the so-called hammingdistance that is counting the number of differencesbetween two

• Vidéo pédagogique
00:07:41

### 4.8. A recursive algorithm

Rechenmann
François

We have seen how we can computethe optimal cost, the ending node of our grid if we know the optimal cost of the three adjacent nodes. This is this computation scheme we can see here using the notation

• Vidéo pédagogique
00:04:29

### 4.2. Why gene/protein sequences may be similar?

Rechenmann
François

Before measuring the similaritybetween the sequences, it's interesting to answer the question: why gene or protein sequences may be similar? It is indeed veryinteresting because the answer is related

• Vidéo pédagogique
00:03:50

### 4.5. A sequence alignment as a path

Rechenmann
François

Comparing two sequences and thenmeasuring their similarities is an optimization problem. Why? Because we have seen thatwe have to take into account substitution and deletion. During the alignment, the

• Vidéo pédagogique
00:06:58

### 4.9. Recursion can be avoided: an iterative version

Rechenmann
François

We have written a recursive function to compute the optimal path that is an optimal alignment between two sequences. Here all the examples I gave were onDNA sequences, four letter alphabet. OK. The

• Vidéo pédagogique
00:03:59

### 4.3. Measuring sequence similarity

Rechenmann
François

So we understand why gene orprotein sequences may be similar. It's because they evolve togetherwith the species and they evolve in time, there aremodifications in the sequence and that the sequence

## Avec les mêmes intervenants et intervenantes

• Vidéo pédagogique
00:04:52

### 1.2. At the heart of the cell: the DNA macromolecule

Rechenmann
François

During the last session, we saw how at the heart of the cell there's DNA in the nucleus, sometimes of cells, or directly in the cytoplasm of the bacteria. The DNA is what we call a macromolecule, that

• Vidéo pédagogique
00:07:29

### 1.10. Overlapping sliding window

Rechenmann
François

We have made some drawings along a genomic sequence. And we have seen that although the algorithm is quite simple, even if some points of the algorithmare bit trickier than the others, we were able to

• Vidéo pédagogique
00:05:53

### 2.3. The genetic code

Rechenmann
François

Genes code for proteins. What is the correspondence betweenthe genes, DNA sequences, and the structure of proteins? The correspondence isthe genetic code. Proteins have indeedsequences of amino acids.

• Vidéo pédagogique
00:05:58

### 3.6. Boyer-Moore algorithm

Rechenmann
François

We have seen how we can make gene predictions more reliable through searching for all the patterns,all the occurrences of patterns. We have seen, for example, howif we locate the RBS, Ribosome

• Vidéo pédagogique
00:04:22

### 4.4. Aligning sequences is an optimization problem

Rechenmann
François

We have seen a nice and a quitesimple solution for measuring the similarity between two sequences. It relied on the so-called hammingdistance that is counting the number of differencesbetween two

• Vidéo pédagogique
00:04:45

### 5.2. The tree, an abstract object

Rechenmann
François

When we speak of trees, of species,of phylogenetic trees, of course, it's a metaphoric view of a real tree. Our trees are abstract objects. Here is a tree and the different components of this tree.

• Vidéo pédagogique
00:05:10

### 1.5. Counting nucleotides

Rechenmann
François

In this session, don't panic. We will design our first algorithm. This algorithm is forcounting nucleotides. The idea here is that as an input,you have a sequence of nucleotides, of bases, of letters,

• Vidéo pédagogique
00:06:09

### 2.4. A translation algorithm

Rechenmann
François

We have seen that the genetic codeis a correspondence between the DNA or RNA sequences and aminoacid sequences that is proteins. Our aim here is to design atranslation algorithm, we make the

• Vidéo pédagogique
00:05:41

### 3.1. All genes end on a stop codon

Rechenmann
François

Last week we studied genes and proteins and so how genes, portions of DNA, are translated into proteins. We also saw the very fast evolutionof the sequencing technology which allows for producing

• Vidéo pédagogique
00:05:35

### 3.9. Benchmarking the prediction methods

Rechenmann
François

It is necessary to underline that gene predictors produce predictions. Predictions mean that you have no guarantees that the coding sequences, the coding regions,the genes you get when applying your

• Vidéo pédagogique
00:07:41

### 4.8. A recursive algorithm

Rechenmann
François

We have seen how we can computethe optimal cost, the ending node of our grid if we know the optimal cost of the three adjacent nodes. This is this computation scheme we can see here using the notation

• Vidéo pédagogique
00:08:29

### 5.6. The diversity of bioinformatics algorithms

Rechenmann
François

In this course, we have seen a very little set of bioinformatic algorithms. There exist numerous various algorithms in bioinformatics which deal with a large span of classes of problems. For example,