Vidéo pédagogique

Notice

Lieu de réalisation

Grenoble

Sous-titrage

Sous-titre

Langue :

Anglais

Crédits

François Rechenmann (Intervention)

Conditions d'utilisation

Ces ressources de cours sont, sauf mention contraire, diffusées sous Licence Creative Commons. L’utilisateur doit mentionner le nom de l’auteur, il peut exploiter l’œuvre sauf dans un contexte commercial et il ne peut apporter de modifications à l’œuvre originale.

DOI : 10.60527/q63z-by39

Citer cette ressource :

François Rechenmann. Inria. (2015, 5 février). 4.4. Aligning sequences is an optimization problem , in 4. Sequences comparison. [Vidéo]. Canal-U. https://doi.org/10.60527/q63z-by39. (Consultée le 6 juillet 2025)

4.4. Aligning sequences is an optimization problem

Réalisation : 5 février 2015 - Mise en ligne : 9 mai 2017

document 1 document 2 document 3
niveau 1 niveau 2 niveau 3

Descriptif

We have seen a nice and a quitesimple solution for measuring the similarity between two sequences. It relied on the so-called hammingdistance that is counting the number of differencesbetween two sequences. But the real situation is a bitmore complex as we'll see now, it needs an adequatesolution and algorithm. Why is it a bit more complex? Let's have a look at thispair of two sequences. If we apply the hamming distance,compute the hamming between these two sequences,we find ten differences. OK. But you must remember thatmutation may be substitution, deletion and insertion. So if wetake into account the deletion and insertion, the situation isvery different in the case of these two sequences. Let's see why. We can compare now that it isthe same set of sequences but here and here we made the hypothesisof insertion or deletion. Insertion or deletion depend on thesequence you take into account and one substitution. Instead ofhaving ten differences, the same sequences show only twoinsertions/deletions and one substitution. It's because we have taken into accountthe notion of insertion/deletion by inserting this character,this blank character.

Intervention

Rechenmann

François

Ingénieur. Auteur d'une thèse de docteur-ingénieur en sciences appliquées (Grenoble INPG, 1976). - HDR. Directeur de thèse à Grenoble INPG (1990-1994-) et à l'université de Grenoble 1. Directeur de recherche au centre Inria Grenoble – Rhône-Alpes (2002, 2015)

Thème

Disciplines :

Documentation

Liens

Support de présentation au format PDF

Dans la même collection

Vidéo pédagogique

00:04:29

Favoris
4.2. Why gene/protein sequences may be similar?

Rechenmann

François

Before measuring the similaritybetween the sequences, it's interesting to answer the question: why gene or protein sequences may be similar? It is indeed veryinteresting because the answer is related
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:11

Favoris
4.6. A path is optimal if all its sub-paths are optimal

Rechenmann

François

A sequence alignment between two sequences is a path in a grid. So that, an optimal sequence alignmentis an optimal path in the same grid. We'll see now that a property of this optimal path provides
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:06:58

Favoris
4.9. Recursion can be avoided: an iterative version

Rechenmann

François

We have written a recursive function to compute the optimal path that is an optimal alignment between two sequences. Here all the examples I gave were onDNA sequences, four letter alphabet. OK. The
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:03:59

Favoris
4.3. Measuring sequence similarity

Rechenmann

François

So we understand why gene orprotein sequences may be similar. It's because they evolve togetherwith the species and they evolve in time, there aremodifications in the sequence and that the sequence
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:06:38

Favoris
4.7. Alignment costs

Rechenmann

François

We have seen how we can compute the cost of the path ending on the last node of our grid if we know the cost of the sub-path ending on the three adjacent nodes. It is time now to see more deeply why
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:54

Favoris
4.1. How to predict gene/protein functions?

Rechenmann

François

Last week we have seen that annotating a genome means first locating the genes on the DNA sequences that is the genes, the region coding for proteins. But this is indeed the first step,the next very
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:09:26

Favoris
4.10. How efficient is this algorithm?

Rechenmann

François

We have seen the principle of an iterative algorithm in two paths for aligning and comparing two sequences of characters, here DNA sequences. And we understoodwhy the iterative version is much more
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:03:50

Favoris
4.5. A sequence alignment as a path

Rechenmann

François

Comparing two sequences and thenmeasuring their similarities is an optimization problem. Why? Because we have seen thatwe have to take into account substitution and deletion. During the alignment, the
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:07:41

Favoris
4.8. A recursive algorithm

Rechenmann

François

We have seen how we can computethe optimal cost, the ending node of our grid if we know the optimal cost of the three adjacent nodes. This is this computation scheme we can see here using the notation
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3

Voir tout

Avec les mêmes intervenants et intervenantes

Vidéo pédagogique

00:06:06

Favoris
1.7. DNA walk

Rechenmann

François

We will now design a more graphical algorithm which is called "the DNA walk". We shall see what does it mean "DNA walk". Walk on to DNA. Something like that, yes. But first, just have a look again at
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:05:47

Favoris
2.6. Algorithms + data structures = programs

Rechenmann

François

By writing the Lookup GeneticCode Function, we completed our translation algorithm. So we may ask the question about the algorithm, does it terminate? Andthe answer is yes, obviously. Is it pertinent,
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:45

Favoris
3.3. Searching for start and stop codons

Rechenmann

François

We have written an algorithm for finding genes. But you remember that we arestill to write the two functions for finding the next stop codonand the next start codon. Let's see how we can do that. We
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:54

Favoris
4.1. How to predict gene/protein functions?

Rechenmann

François

Last week we have seen that annotating a genome means first locating the genes on the DNA sequences that is the genes, the region coding for proteins. But this is indeed the first step,the next very
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:06:58

Favoris
4.9. Recursion can be avoided: an iterative version

Rechenmann

François

We have written a recursive function to compute the optimal path that is an optimal alignment between two sequences. Here all the examples I gave were onDNA sequences, four letter alphabet. OK. The
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:52

Favoris
1.2. At the heart of the cell: the DNA macromolecule

Rechenmann

François

During the last session, we saw how at the heart of the cell there's DNA in the nucleus, sometimes of cells, or directly in the cytoplasm of the bacteria. The DNA is what we call a macromolecule, that
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:07:29

Favoris
1.10. Overlapping sliding window

Rechenmann

François

We have made some drawings along a genomic sequence. And we have seen that although the algorithm is quite simple, even if some points of the algorithmare bit trickier than the others, we were able to
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:05:53

Favoris
2.3. The genetic code

Rechenmann

François

Genes code for proteins. What is the correspondence betweenthe genes, DNA sequences, and the structure of proteins? The correspondence isthe genetic code. Proteins have indeedsequences of amino acids.
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:05:58

Favoris
3.6. Boyer-Moore algorithm

Rechenmann

François

We have seen how we can make gene predictions more reliable through searching for all the patterns,all the occurrences of patterns. We have seen, for example, howif we locate the RBS, Ribosome
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:03:50

Favoris
4.5. A sequence alignment as a path

Rechenmann

François

Comparing two sequences and thenmeasuring their similarities is an optimization problem. Why? Because we have seen thatwe have to take into account substitution and deletion. During the alignment, the
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:04:45

Favoris
5.2. The tree, an abstract object

Rechenmann

François

When we speak of trees, of species,of phylogenetic trees, of course, it's a metaphoric view of a real tree. Our trees are abstract objects. Here is a tree and the different components of this tree.
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3
Vidéo pédagogique

00:05:10

Favoris
1.5. Counting nucleotides

Rechenmann

François

In this session, don't panic. We will design our first algorithm. This algorithm is forcounting nucleotides. The idea here is that as an input,you have a sequence of nucleotides, of bases, of letters,
Biologie application informatique
DNA
Genome
Algorithm
Cell
09.05.2017
document 1 document 2 document 3
niveau 1 niveau 2 niveau 3