Computational molecular biology
- Speeding up dynamic programming.
D. Eppstein, Z. Galil, and R. Giancarlo.
29th IEEE Symp. Foundations of Comp. Sci., White Plains, New York, 1988, pp. 488–496.
Worksh. Algorithms for Molecular Genetics, Bethesda, Maryland, 1988.
Tech. Rep. CUCS-379-88, Computer Science Dept., Columbia University, 1988.
Appeared as "Efficient algorithms with applications to molecular biology" in Int. Advanced Workshop on Sequences, Positano, Italy, 1988.
Sequences: Combinatorics, Compression, Security, Transmission, R. M. Capocelli, ed., Springer, 1990, pp. 59–74.The FOCS and Positano versions of this paper merged my results on a dynamic program used for RNA secondary structure prediction, with Raffaele's on sequence comparison. The Bethesda talk and the TR version both used the longer title "Speeding up dynamic programming with application to the computation of RNA structure", and included only the RNA result, which used a mild convexity assumption on certain costs to save two orders of magnitude in total time. This work incited a boom in computational biology within the theory community that is still going strong. But the RNA results were quickly improved by a log factor [Aggarwal et al. at the same FOCS] and never made it into a journal paper.
- Efficient algorithms for sequence analysis with concave and convex gap costs.
D. Eppstein.
Ph.D. thesis, Columbia University, 1989.Includes results from "Speeding up dynamic programming", "Sequence comparison with mixed convex and concave costs", and "Sparse dynamic programming".
- Sequence comparison with mixed convex and concave costs.
D. Eppstein.
Tech. Rep. CUCS-382-88, Computer Science Dept., Columbia University, 1988.
J. Algorithms 11 (1): 85–101, 1990.Gives an algorithm for finding the minimum number of mutations needed to transform one input string into another, in a general model in which substrings may be inserted or deleted at a cost depending nonlinearly on the substring length. The time bound depends on the number of times the second derivative of the cost function changes sign.
- Sparse dynamic programming.
D. Eppstein, Z. Galil, R. Giancarlo, and G.F. Italiano.
1st ACM-SIAM Symp. Discrete Algorithms, San Francisco, 1990, pp. 513–522.
"Sparse dynamic programming I: linear cost functions", J. ACM 39: 519–545, 1992.
"Sparse dynamic programming II: convex and concave cost functions", J. ACM 39: 546–567, 1992.Considers sequence alignment and RNA structure problems in which the solution is constructed by piecing together some initial set of fragments (e.g. short sequences that match exactly). The method is to consider a planar point set formed by the fragment positions in the two input sequences, and use plane sweep to construct a cellular decomposition of the plane similar to the rectilinear Voronoi diagram.
- Efficient algorithms for sequence analysis.
D. Eppstein, Z. Galil, R. Giancarlo, and G.F. Italiano.
International Advanced Workshop on Sequences, Positano, Italy, 1991.
Sequences II: Methods in Communication, Security, and Computer Science, R.M. Capocelli, A. De Santis, and U. Vaccaro, eds., Springer, 1993, pp. 225–244.
Surveys results on speeding up certain dynamic programs used for sequence comparison and RNA structure prediction.