5.4. The UPGMA algorithm

We know how to fill an array with the distances between the pairs of sequences available in the file. This array of distances will be the input of our algorithm for reconstructing phylogenetic trees. The name of this algorithm is rather complicated, but the method itself is rather simple, too simple indeed, as we will see. The name stands for Unweighted Pair Group Method with Arithmetic Mean; we will understand these terms as the algorithm is presented. The algorithm starts with an array of distances. Let's take this very simple example: it involves seven species, and here we have the values of the distances between the sequences associated with these species. As you ...
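To make the idea concrete, here is a minimal sketch of UPGMA in Python. It is an illustration under stated assumptions, not the implementation used in the course: the function name `upgma`, the dictionary-based distance table and the four-species data are all hypothetical (the seven-species values of the lecture example are not reproduced here). At each step the two closest groups are merged, and the distance from the new group to every other group is the arithmetic mean of the old distances, weighted by group size so that every original sequence counts equally.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree from pairwise distances, returned as a nested string.

    names: list of sequence labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}  # group label -> number of sequences
    d = dict(dist)                       # working copy, extended as groups merge
    while len(sizes) > 1:
        # Pick the pair of groups with the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = f"({a},{b})"
        # Distance to every remaining group: size-weighted arithmetic mean.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical distances between four species (illustration only).
example_names = ["A", "B", "C", "D"]
example_dist = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6, frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 10,
}
print(upgma(example_names, example_dist))
```

The sketch only recovers the branching order; it does not compute branch lengths, and ties between equal distances are broken by iteration order.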
5.6. The diversity of bioinformatics algorithms

In this course, we have seen only a very small set of bioinformatics algorithms. There are many other algorithms in bioinformatics, dealing with a wide range of problem classes. Take read assembly, for example. We have seen how NGS sequencers produce large sets of reads, small sequences which overlap. The problem of assembly is to use these overlaps to order the reads and reconstruct the whole genomic sequence. This is the overlap, and you see that you can use it to get a longer sequence. Of course, the example here is simple: you have to imagine a set of millions of reads to be assembled into genomic sequences of millions or ...
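The overlap idea for two reads can be sketched in a few lines of Python. This is only an illustration: the function names `overlap` and `merge`, the `min_len` parameter and the two reads are hypothetical, and the sketch assumes exact matches, whereas real assemblers must cope with sequencing errors and with millions of reads at once.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that is also a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge(left, right, min_len=3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    k = overlap(left, right, min_len)
    return left + right[k:] if k else None


# Two hypothetical reads sharing the four letters "TGCA".
print(merge("ACGTTGCA", "TGCAGGT"))
```

Requiring a minimum overlap length is a design choice in this sketch: very short overlaps occur by chance and would join reads that are not adjacent in the genome.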
5.7. The application domains in microbiology

Bioinformatics relies on many domains of mathematics and computer science. Of course, algorithms on character strings are important in bioinformatics; we have seen them. There are also algorithms on trees, for example for reconstructing phylogenetic trees, and algorithms on networks, to reconstruct gene interaction networks and metabolic networks and perhaps to simulate their dynamics over time. We have also seen the role of probability and statistics, and the role of optimization methods, for example for the computation of the optimal alignment of a pair of sequences. Constraint satisfaction is used for predicting molecule structure. Automata and formal grammars, which are rather exotic parts of computer science, are also useful in bioinformatics, and the same goes for signal processing. Other domains could be listed here as well. We also ...
