4.1. How to predict gene/protein functions?

Last week we saw that annotating a genome means first locating the genes on the DNA sequence, that is, the regions coding for proteins. But this is only the first step. The next very important step is to be able to predict the functions of the genes or, more correctly, the functions of the proteins coded by the genes. How can we predict the function of a gene or protein? It is essentially based on the fact that we will retrieve genes or proteins whose sequence is similar and whose function we know. So we will see how we can measure and compute the similarity between DNA or protein sequences.

But first, let's come back to this idea of databases. We have seen that people from labs can deposit the DNA sequences they obtained through sequencing in databanks, which can then be used by other people. Such databanks are GenBank, or EMBL in Europe, for DNA sequences, and UniProt for protein sequences. The interest, of course, is to deposit not only a sequence but also the information we have on that sequence, especially, if it is a gene or a protein, what the functions of the gene or protein are. These functions can be described as comments in free text, by keywords, or by more specific descriptive means such as enzyme classification entries.

What does that mean? You remember this organisation from genes to proteins via RNA. OK. Some of the proteins may be enzymes. What are enzymes? Enzymes are able to catalyse biochemical reactions, that is, to accelerate them.
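To make the similarity-based idea from this section concrete, here is a minimal sketch in Python. It assumes a tiny, invented "databank" of annotated protein sequences (both the sequences and the function labels are hypothetical), and it uses the `ratio()` score of Python's `difflib.SequenceMatcher` as a simple stand-in for a sequence similarity measure. It is not a biological alignment score, and the threshold of 0.7 is an arbitrary illustrative choice.

```python
from difflib import SequenceMatcher

# Hypothetical toy databank: sequence -> known function.
# Both the sequences and the labels are invented for illustration.
TOY_DATABANK = {
    "MKTAYIAKQRQISFVKSHFSRQ": "toy function A",
    "GSSHHHHHHSSGLVPRGSHMAS": "toy function B",
}


def similarity(seq_a: str, seq_b: str) -> float:
    """Return a score in [0, 1]; a generic string similarity used as a stand-in."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def predict_function(query: str, databank: dict, threshold: float = 0.7):
    """Transfer the function of the most similar annotated sequence.

    Returns (function, score), or (None, score) when the best score
    is below the threshold (an arbitrary value chosen for this sketch).
    """
    best_seq = max(databank, key=lambda seq: similarity(query, seq))
    score = similarity(query, best_seq)
    if score < threshold:
        return None, score
    return databank[best_seq], score


# A query that differs from the first databank entry by one residue.
query = "MKTAYIAKQREISFVKSHFSRQ"
function, score = predict_function(query, TOY_DATABANK)
print(function, round(score, 3))
```

The design choice to return `None` below a threshold reflects the reasoning in the text: a function is only transferred when a sufficiently similar sequence with a known function can be retrieved.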

    Production date: 5 February 2015
    Author(s): RECHENMANN Francois
    Keywords: DNA, Genome, algorithm, cell, bioinformatics
