Educational video
Record
Production location
Grenoble
Language:
English
Credits
François Rechenmann (Speaker)
Terms of use
Unless otherwise stated, these course resources are distributed under a Creative Commons license. Users must credit the author; they may use the work except in a commercial context, and they may not modify the original work.
DOI : 10.60527/a84s-b459
Cite this resource:
François Rechenmann. Inria. (2015, February 5). 1.8. Compressing the DNA walk, in 1. Genomic texts. [Video]. Canal-U. https://doi.org/10.60527/a84s-b459. (Accessed July 21, 2024)

# 1.8. Compressing the DNA walk

Produced: February 5, 2015 - Published online: May 9, 2017
Description

We have written the algorithm for the circle DNA walk. Just a clarification here: the kind of drawing we get has nothing to do with the physical shape of the DNA molecule. It is a symbolic representation, a way of representing the information content of the sequence as a drawing.

Remember that the problem with the algorithm we designed is that it assumes we can draw several million or even billions of segments on the screen. This is not feasible. No screen will be large enough for that. So, how can we deal with this hardware constraint? Compression is the answer. Let's see that in more detail.

Remember, for each position here, we draw a segment according to the direction we defined at the beginning of the first session. And so we get something like that. The idea here is that, instead of drawing all these small segments, we will draw a single segment like that, for example one for every 10 small segments, and so on. So of course we reduce the number of segments needed to draw the DNA walk for a complete sequence.

How can we do that? We will define a window. The window is, at any time, a part of the sequence. It has a certain length, and within this window we will compute the number of A, C, G and T. And we know how to do that because we have done this kind of operation in the previous session.
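The window idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm as written in the course: the function name `compressed_walk`, the non-overlapping windows, and above all the `DIRECTIONS` mapping (A up, T down, C left, G right) are hypothetical choices, since the direction convention is defined in an earlier session and not repeated here. The sketch also assumes an uppercase sequence and simply ignores any character other than A, C, G and T. For each window it counts the four nucleotides and turns the counts into one segment, the net displacement of the window.

```python
from collections import Counter

# Assumed direction for each nucleotide (hypothetical: the course defines
# its own convention in an earlier session).
DIRECTIONS = {"A": (0, 1), "T": (0, -1), "C": (-1, 0), "G": (1, 0)}


def compressed_walk(sequence, window=10):
    """Return the points of a DNA walk drawn with one segment per window."""
    x, y = 0, 0
    points = [(x, y)]
    # Non-overlapping windows; the last one may be shorter than `window`.
    for start in range(0, len(sequence), window):
        counts = Counter(sequence[start:start + window])
        # Net displacement of the window: each count moves the pen that many
        # unit steps in the direction assumed for its nucleotide.
        for base, (dx, dy) in DIRECTIONS.items():
            x += counts[base] * dx
            y += counts[base] * dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Ten nucleotides, window of five: two segments instead of ten.
    print(compressed_walk("AAAGGAATTC", window=5))
```

With a window of 10, a sequence of several million nucleotides needs ten times fewer segments, and the end point of each window's segment is the same point the uncompressed walk would reach at that position.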


## In the same collection

• Educational video
00:06:06

### 1.7. DNA walk

François Rechenmann

We will now design a more graphical algorithm which is called "the DNA walk". We shall see what "DNA walk" means. Walk on to DNA. Something like that, yes. But first, just have a look again at

• Educational video
00:04:52

### 1.2. At the heart of the cell: the DNA macromolecule

François Rechenmann

During the last session, we saw how at the heart of the cell there's DNA in the nucleus, sometimes of cells, or directly in the cytoplasm of the bacteria. The DNA is what we call a macromolecule, that

• Educational video
00:05:10

### 1.5. Counting nucleotides

François Rechenmann

In this session, don't panic. We will design our first algorithm. This algorithm is for counting nucleotides. The idea here is that as an input, you have a sequence of nucleotides, of bases, of letters,

• Educational video
00:09:07

### 1.9. Predicting the origin of DNA replication?

François Rechenmann

We have seen a nice algorithm to draw, let's say, a DNA sequence. We will see that first, we have to correct this algorithm a little bit. And then we will see how such a simple algorithm can provide

• Educational video
00:07:21

### 1.3. DNA codes for genetic information

François Rechenmann

Remember at the heart of any cell, there is this very long molecule which is called a macromolecule for this reason, which is the DNA molecule. Now we will see that DNA molecules support what is called

• Educational video
00:04:28

### 1.6. GC and AT contents of DNA sequence

François Rechenmann

We have designed our first algorithm for counting nucleotides. Remember, what we have written in pseudocode is first the declaration of variables. We have several integer variables that are variables which

• Educational video
00:05:24

### 1.1. The cell, atom of the living world

François Rechenmann

Welcome to this introduction to bioinformatics. We will speak of genomes and algorithms. More specifically, we will see how genetic information can be analysed by algorithms. In these five weeks to

• Educational video
00:07:29

### 1.10. Overlapping sliding window

François Rechenmann

We have made some drawings along a genomic sequence. And we have seen that although the algorithm is quite simple, even if some points of the algorithm are a bit trickier than the others, we were able to

• Educational video
00:05:48

### 1.4. What is an algorithm?

François Rechenmann

We have seen that a genomic text can indeed be a very long sequence of characters. And to interpret this sequence of characters, we will need to use computers. Using computers means writing programs.

## With the same speakers

• Educational video
00:06:09

### 2.4. A translation algorithm

François Rechenmann

We have seen that the genetic code is a correspondence between the DNA or RNA sequences and amino acid sequences, that is, proteins. Our aim here is to design a translation algorithm, we make the

• Educational video
00:05:41

### 3.1. All genes end on a stop codon

François Rechenmann

Last week we studied genes and proteins and so how genes, portions of DNA, are translated into proteins. We also saw the very fast evolution of the sequencing technology which allows for producing

• Educational video
00:05:35

### 3.9. Benchmarking the prediction methods

François Rechenmann

It is necessary to underline that gene predictors produce predictions. Predictions mean that you have no guarantees that the coding sequences, the coding regions, the genes you get when applying your

• Educational video
00:04:29

### 4.2. Why gene/protein sequences may be similar?

François Rechenmann

Before measuring the similarity between the sequences, it's interesting to answer the question: why may gene or protein sequences be similar? It is indeed very interesting because the answer is related

• Educational video
00:04:59

### 5.4. The UPGMA algorithm

François Rechenmann

We know how to fill an array with the values of the distances between sequences, pairs of sequences which are available in the file. This array of distances will be the input of our algorithm for

• Educational video
00:06:57

### 2.7. The algorithm design trade-off

François Rechenmann

We saw how to increase the efficiency of our algorithm through the introduction of a data structure. Now let's see if we can do even better. We had a table of index and we explain how the use of these

• Educational video
00:06:22

### 3.4. Predicting all the genes in a sequence

François Rechenmann

We have written an algorithm which is able to locate potential genes on a sequence but only on one phase because we are looking at triplets after triplets. Now remember that the genes may be located on

• Educational video
00:06:38

### 4.7. Alignment costs

François Rechenmann

We have seen how we can compute the cost of the path ending on the last node of our grid if we know the cost of the sub-path ending on the three adjacent nodes. It is time now to see more deeply why

• Educational video
00:06:58

### 4.9. Recursion can be avoided: an iterative version

François Rechenmann

We have written a recursive function to compute the optimal path, that is, an optimal alignment between two sequences. Here all the examples I gave were on DNA sequences, a four-letter alphabet. OK. The
