Fuzzy k-mers and their application to comparative genome assembly
dc.contributor.advisor | Chambers, Desmond | |
dc.contributor.author | Healy, John | |
dc.date.accessioned | 2014-02-26T09:59:27Z | |
dc.date.available | 2014-02-26T09:59:27Z | |
dc.date.issued | 2013-10-22 | |
dc.identifier.uri | http://hdl.handle.net/10379/4218 | |
dc.description.abstract | The application of k-mer matching to problems in the field of bioinformatics is long established, with k-mer techniques underpinning standard heuristic approaches to sequence alignment and genome assembly. Despite their broad application, conventional k-mer matching techniques lack a native mechanism for accommodating sequence variability, requiring an exact match at pre-defined indices in a k-mer seed. This thesis presents a fuzzy approach for approximate k-mer matching and investigates its application to sequence alignment and comparative assembly. By combining the speed of hashing with the sensitivity of dynamic programming, fuzzy k-mers unify the two phases of the 'seed and extend' strategy into a single operation that executes in average constant time. In contrast with existing methods of k-mer matching, fuzzy k-mers provide native support for string variability. The fuzzy approach has been implemented in a prototype sequence aligner and genome assembler called Ferox. In addition to exploiting fuzzy k-mers for sequence alignment, the prototype directly integrates fuzzy k-mer alignments into the contig construction process by combining models of de novo and comparative genome assembly. | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Ireland | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/3.0/ie/ | |
dc.subject | Fuzzy hash maps | en_US |
dc.subject | Comparative assembly | en_US |
dc.subject | Fuzzy k-mers | en_US |
dc.subject | Engineering & Informatics | en_US |
dc.title | Fuzzy k-mers and their application to comparative genome assembly | en_US |
dc.type | Thesis | en_US |
dc.local.note | This thesis describes how fuzzy string matching can be applied to the problems of sequence alignment and genome assembly, combining the sensitivity of approximate string matching with the execution speed of an exact search. | en_US |
dc.local.final | Yes | en_US |
nui.item.downloads | 1751 |