Computational biology is a relatively new field, an amalgam of mathematics and computer science applied to biological problems. One major problem in this field is phylogenetic tree reconstruction. Given a set of species (each represented by a long string of characters signifying DNA sites), the goal is to reconstruct the evolutionary tree that shows the order of speciation.
Although the results are of great interest to biologists, this problem can be (and often is) studied purely from a mathematical standpoint; the researcher need not know anything of biology to be adept at solving this algorithmic problem. Using techniques from computer science, reconstruction methods are developed and analyzed for accuracy and speed.
One standard reconstruction method is neighbor-joining, a distance-based method; its input is the matrix of species-to-species distances. It is often useful to "correct" the distance matrix before it is used for reconstruction; this modification converts each entry from the observed probability that a site has changed into the expected number of changes the site went through. However, if the species considered are sufficiently diverse, the sequences appear to be uncorrelated, and the correction does not yield a finite number.
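As an illustration of how a correction can fail to be finite, here is a minimal sketch that assumes the Jukes-Cantor model of site evolution. The text above does not name a particular model, so this choice, along with the function names `p_distance` and `jukes_cantor`, is an assumption made only for the example. Under Jukes-Cantor the corrected distance is -(3/4) ln(1 - 4p/3), where p is the observed fraction of differing sites, and the logarithm is undefined once p reaches 3/4.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance (assumed model, not from the text).

    Returns math.inf when p >= 3/4, the point at which the two sequences
    look uncorrelated and the formula has no finite value.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For closely related sequences the corrected value is only slightly larger than the observed one; as p approaches 3/4 it grows without bound.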
Standard methods to cope with large distances include 1) using the uncorrected distances and 2) replacing values of infinity by very large finite constants. Although the latter is the preferred standard method (because as the sequence length approaches infinity, it approaches absolute accuracy), we found that the uncorrected method usually outperforms it on realistic sequence lengths. Both standard methods were then compared to a method we developed. Our method is a version of the large-constant method in which the constant selected is surprisingly small and depends upon the input data. This method outperforms the large-constant method and often outperforms the uncorrected-distance method.
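The constant-replacement idea can be sketched as follows. The text does not say how the data-dependent constant is computed, so the rule used in `data_dependent_constant` (a multiple of the largest finite corrected distance in the matrix) is a hypothetical stand-in chosen only to show the shape of such a method; it should not be read as the method described above. Both function names and the `factor` parameter are likewise assumptions.

```python
import math


def replace_infinite(matrix, constant):
    """Return a copy of a distance matrix with infinite entries set to constant."""
    return [[constant if math.isinf(d) else d for d in row] for row in matrix]


def data_dependent_constant(matrix, factor=1.0):
    """Hypothetical rule: factor times the largest finite entry in the matrix."""
    finite = [d for row in matrix for d in row if not math.isinf(d)]
    if not finite:
        raise ValueError("matrix has no finite entries")
    return factor * max(finite)


# Corrected distances for three species; one pair is saturated.
corrected = [
    [0.0, 0.4, math.inf],
    [0.4, 0.0, 1.2],
    [math.inf, 1.2, 0.0],
]

# Standard approach: a very large fixed constant.
large = replace_infinite(corrected, 1e6)

# Illustrative small, input-dependent constant.
small = replace_infinite(corrected, data_dependent_constant(corrected))
```

Either resulting matrix is finite and can be passed to a distance-based method such as neighbor-joining; the two differ only in how far apart the saturated pairs are assumed to be.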