3.8. Probabilistic methods

Up to now, to predict our genes, we have relied only on searching for certain strings or patterns. To further improve our gene predictor, the idea is to rely on probabilistic methods. What does that mean? I will first take an example which is not related to genomics, but which I think is good for understanding the idea. Imagine you have a very long text, and you know that only some passages of this text are written in a human-understandable language, maybe English, maybe French, and so on, but you don't know which one. How ...
1.7. DNA walk

We will now design a more graphical algorithm, which is called "the DNA walk". We shall see what "DNA walk" means. A walk on DNA? Something like that, yes. But first, let's have another look at a typical, and quite short, DNA sequence: a long text of four letters, A, C, G, T, T and so on. When the first DNA sequences were obtained, the idea of using computers emerged very quickly, but people didn't know exactly what to do with this sequence of characters. Again, there is a meaning behind the sequence, because it is genetic information. It means it is the information ...
2.3. The genetic code

Genes code for proteins. What is the correspondence between genes, which are DNA sequences, and the structure of proteins? The correspondence is the genetic code. Proteins are indeed sequences of amino acids. There are 20 amino acids in the living world. They can be named by a single letter, by three letters, or by their full name. It means that a protein can be represented by a sequence of letters in a 20-letter alphabet. Let's come back to this correspondence between gene and protein. Genes are regions of DNA. These regions are first transcribed into RNA, and the RNA is then translated into proteins. And proteins’ sequences of amino acids fold into 3D structures, like here, some helices. Translation is the process which goes from RNA to ...
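
To make the gene-to-protein correspondence concrete, here is a minimal Python sketch of the two steps mentioned above, transcription and translation. It is only an illustration, not the course's own implementation: the names `CODON_TABLE`, `transcribe` and `translate` are hypothetical, the table is the standard genetic code written with single-letter amino acid names (`*` marks a stop codon), and the input is assumed to be the coding strand of a gene, read in frame from its first letter.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the order T, C, A, G
# for each of the three positions. '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the RNA text for a coding-strand DNA text (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate RNA into one-letter amino acids, stopping at a stop codon."""
    protein = []
    # Read the text three letters at a time; an incomplete final codon is ignored.
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3].replace("U", "T")  # the table is keyed on DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(transcribe("ATGGCCTAA")))
```

The table has 64 entries (4 × 4 × 4 codons) for only 20 amino acids plus the stop signal, so several codons correspond to the same amino acid.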
