Citation Aldinucci M, Bracciali A, Marschall T, Patterson M, Pisanti N & Torquati M (2015) High-Performance Haplotype Assembly. In: Serio C, Lio P, Nonis A & Tagliaferri R (eds.) Computational Intelligence Methods for Bioinformatics and Biostatistics: 11th International Meeting, CIBB 2014, Cambridge, UK, June 26-28, 2014, Revised Selected Papers. Lecture Notes in Computer Science, 8623. 11th International Meeting, CIBB 2014, Cambridge, 26.06.2014-28.06.2014. Cham, Switzerland: Springer, pp. 245-258. http://link.springer.com/chapter/10.1007/978-3-319-24462-4_21; https://doi.org/10.1007/978-3-319-24462-4_21
Abstract The problem of Haplotype Assembly is an essential step in human genome analysis. It is typically formalised as the Minimum Error Correction (MEC) problem, which is NP-hard. MEC has been approached using heuristics, integer linear programming, and fixed-parameter tractability (FPT), including approaches whose runtime is exponential in the length of the DNA fragments obtained by the sequencing process. Technological improvements are currently increasing fragment length, which drastically elevates computational costs for such methods. We present pWhatsHap, a multi-core parallelisation of WhatsHap, a recent FPT optimal approach to MEC. WhatsHap moves complexity from fragment length to fragment overlap and is hence of particular interest when considering sequencing technology’s current trends. pWhatsHap further improves the efficiency in solving the MEC problem, as shown by experiments performed on datasets with high coverage.