Notice
2.6. Algorithms + data structures = programs
- document 1 document 2 document 3
- niveau 1 niveau 2 niveau 3
Descriptif
By writing the Lookup GeneticCode Function, we completed our translation algorithm. So we may ask the question about the algorithm, does it terminate? Andthe answer is yes, obviously. Is it pertinent, that is, doesit return the expected answer? The answer is yes, if you giveas an input a sequence of DNA, you will get as an output asequence of amino acids unless, of course, one of the tripletsis not one of the 64 expected triplets and then you will get, ofcourse, a nonsense protein sequence. Is it efficient? Well, for measuring the efficiency of an algorithm, you can ask the question, how manybasic operations you have to execute. In that case the critical operationis comparison of letters. In the best case if you are lucky,the algorithm for looking up a triplet needs three comparisons,if you're not lucky, in the worst case, it needs 64 comparisons. These are not high numbers, ofcourse, but you have to remember that you have to execute the Lookup Algorithm for every triplet of the sequence so a sequencemay be very very long so it may be a good idea to ask the question: can we do better? And the answer is yes. For doing better, what we can introduce is a data structure. Here is the data structure inthe left part of this slide.
Thème
Documentation
Dans la même collection
-
2.4. A translation algorithm
RechenmannFrançoisWe have seen that the genetic codeis a correspondence between the DNA or RNA sequences and aminoacid sequences that is proteins. Our aim here is to design atranslation algorithm, we make the
-
2.8. DNA sequencing
RechenmannFrançoisDuring the last session, I explained several times how it was important to increase the efficiency of sequences processing algorithm because sequences arevery long and there are large volumes of
-
2.2. Genes: from Mendel to molecular biology
RechenmannFrançoisThe notion of gene emerged withthe works of Gregor Mendel. Mendel studied the inheritance on some traits like the shape of pea plant seeds,through generations. He stated the famous laws of inheritance
-
2.5. Implementing the genetic code
RechenmannFrançoisRemember we were designing our translation algorithm and since we are a bit lazy, we decided to make the hypothesis that there was the adequate function forimplementing the genetic code. It's now time
-
2.9. Whole genome sequencing
RechenmannFrançoisSequencing is anexponential technology. The progresses in this technologyallow now to a sequence whole genome, complete genome. What does it mean? Well let'stake two examples: some twenty years ago,
-
2.3. The genetic code
RechenmannFrançoisGenes code for proteins. What is the correspondence betweenthe genes, DNA sequences, and the structure of proteins? The correspondence isthe genetic code. Proteins have indeedsequences of amino acids.
-
2.7. The algorithm design trade-off
RechenmannFrançoisWe saw how to increase the efficiencyof our algorithm through the introduction of a data structure. Now let's see if we can do even better. We had a table of index and weexplain how the use of these
-
2.1. The sequence as a model of DNA
RechenmannFrançoisWelcome back to our course on genomes and algorithms that is a computer analysis ofgenetic information. Last week we introduced the very basic concept in biology that is cell, DNA, genome, genes
-
2.10. How to find genes?
RechenmannFrançoisGetting the sequence of the genome is only the beginning, as I explained, once you have the sequence what you want to do is to locate the gene, to predict the function of the gene and maybe study the
Avec les mêmes intervenants et intervenantes
-
1.1. The cell, atom of the living world
RechenmannFrançoisWelcome to this introduction to bioinformatics. We will speak of genomes and algorithms. More specifically, we will see how genetic information can be analysed by algorithms. In these five weeks to
-
1.9. Predicting the origin of DNA replication?
RechenmannFrançoisWe have seen a nice algorithm to draw, let's say, a DNA sequence. We will see that first, we have to correct a little bit this algorithm. And then we will see how such as imple algorithm can provide
-
2.3. The genetic code
RechenmannFrançoisGenes code for proteins. What is the correspondence betweenthe genes, DNA sequences, and the structure of proteins? The correspondence isthe genetic code. Proteins have indeedsequences of amino acids.
-
3.6. Boyer-Moore algorithm
RechenmannFrançoisWe have seen how we can make gene predictions more reliable through searching for all the patterns,all the occurrences of patterns. We have seen, for example, howif we locate the RBS, Ribosome
-
4.5. A sequence alignment as a path
RechenmannFrançoisComparing two sequences and thenmeasuring their similarities is an optimization problem. Why? Because we have seen thatwe have to take into account substitution and deletion. During the alignment, the
-
5.5. Differences are not always what they look like
RechenmannFrançoisThe algorithm we have presented works on an array of distance between sequences. These distances are evaluated on the basis of differences between the sequences. The problem is that behind the
-
1.4. What is an algorithm?
RechenmannFrançoisWe have seen that a genomic textcan be indeed a very long sequence of characters. And to interpret this sequence of characters, we will need to use computers. Using computers means writing program.
-
2.2. Genes: from Mendel to molecular biology
RechenmannFrançoisThe notion of gene emerged withthe works of Gregor Mendel. Mendel studied the inheritance on some traits like the shape of pea plant seeds,through generations. He stated the famous laws of inheritance
-
3.1. All genes end on a stop codon
RechenmannFrançoisLast week we studied genes and proteins and so how genes, portions of DNA, are translated into proteins. We also saw the very fast evolutionof the sequencing technology which allows for producing
-
3.9. Benchmarking the prediction methods
RechenmannFrançoisIt is necessary to underline that gene predictors produce predictions. Predictions mean that you have no guarantees that the coding sequences, the coding regions,the genes you get when applying your
-
4.2. Why gene/protein sequences may be similar?
RechenmannFrançoisBefore measuring the similaritybetween the sequences, it's interesting to answer the question: why gene or protein sequences may be similar? It is indeed veryinteresting because the answer is related
-
5.4. The UPGMA algorithm
RechenmannFrançoisWe know how to fill an array with the values of the distances between sequences, pairs of sequences which are available in the file. This array of distances will be the input of our algorithm for