Methods and systems for data analysis using the Burrows-Wheeler Transform.
Markus J. Bauer, Anthony James Cox, Giovanna Rosone, Dirk Evers
Illumina Cambridge Limited - Nr Saffron Walden, GB
US20120330567 A1: published application
WO2012175245 A2: submitted
US application serial no.: 13/459,968
PCT (EP) application serial no.: PCT/2012/057943
US 8,798,936 B2: granted on August 5, 2014
The present disclosure provides computer-implemented methods and systems for analyzing datasets, such as the large data sets output by nucleic acid sequencing technologies. In particular, the present disclosure provides for data analysis comprising computing the BWT of a collection of strings incrementally, character by character. The present disclosure also provides compression-boosting strategies that reorder the collection so that its BWT is more compressible by second-stage compression methods than the BWT of the non-reordered collection.
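As a rough illustration of the two ideas in the abstract, the following Python sketch builds the BWT of a small string collection one character per string per round, checks the result against a naive suffix sort, and counts runs before and after reordering the collection. It is a minimal in-memory sketch under stated assumptions, not the patented method or the BEETL implementation: all strings are assumed to have equal length, each string is assumed to end in an implicit marker `$` with ties between markers broken by string index, and the names `bwt_incremental`, `bwt_naive`, and `run_count` are hypothetical. Sorting by reversed string is just one possible reordering, chosen here for illustration.

```python
from collections import Counter
from itertools import groupby

END = "$"  # end marker; assumed absent from the input strings


def bwt_incremental(strings):
    """BWT of equal-length strings, built one character per string per round."""
    m, k = len(strings), len(strings[0])
    assert k > 0 and all(len(s) == k and END not in s for s in strings)
    # Round 0: the suffixes are the bare end markers, ordered by string index.
    bwt = [s[-1] for s in strings]
    pos = list(range(m))  # pos[i]: index in bwt of string i's latest symbol
    for j in range(1, k + 1):
        # smaller[c]: how many symbols of the partial BWT are less than c
        counts, smaller, total = Counter(bwt), {}, 0
        for ch in sorted(counts):
            smaller[ch] = total
            total += counts[ch]
        inserts, new_pos = {}, []
        for i, s in enumerate(strings):
            c = bwt[pos[i]]  # first character of the suffix being inserted
            # The m end-marker suffixes come first, then an LF-style mapping.
            p = m + smaller[c] + bwt[:pos[i]].count(c)
            new_pos.append(p)
            inserts[p] = s[k - j - 1] if j < k else END
        old = iter(bwt)
        bwt = [inserts[p] if p in inserts else next(old)
               for p in range(len(bwt) + m)]
        pos = new_pos
    return "".join(bwt)


def bwt_naive(strings):
    """Reference result: sort every suffix of every string explicitly."""
    rows = []
    for i, s in enumerate(strings):
        for start in range(len(s) + 1):
            # Real characters sort after any end marker; marker i sorts by i.
            key = [(1, ord(ch)) for ch in s[start:]] + [(0, i)]
            rows.append((key, s[start - 1] if start else END))
    rows.sort(key=lambda r: r[0])
    return "".join(sym for _, sym in rows)


def run_count(text):
    """Number of maximal runs of equal symbols, a proxy for compressibility."""
    return sum(1 for _ in groupby(text))


reads = ["AC", "GA", "TC", "AA"]
reordered = sorted(reads, key=lambda s: s[::-1])  # illustrative reordering

for collection in (reads, reordered):
    b = bwt_incremental(collection)
    assert b == bwt_naive(collection)
    print(collection, b, run_count(b))
# prints: ['AC', 'GA', 'TC', 'AA'] CACAGA$$AT$$ 10
# prints: ['AA', 'GA', 'AC', 'TC'] AACCAG$$AT$$ 8
```

On this toy input the reordered collection yields a BWT with fewer runs (8 instead of 10), which is the kind of effect a run-length or similar second-stage compressor can exploit. The sketch keeps the whole partial BWT in a Python list and recomputes ranks by scanning, so it only illustrates the order of operations, not the memory behaviour of a lightweight construction.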
Based on the following publications:
- Markus J. Bauer, Anthony J. Cox, Giovanna Rosone: Lightweight BWT Construction for Very Large String Collections. CPM 2011, LNCS 6661, Springer 2011: 219-231. doi: 10.1007/978-3-642-21458-5_20
- Markus J. Bauer, Anthony J. Cox, Giovanna Rosone: Lightweight algorithms for constructing and inverting the BWT of string collections. Theor. Comput. Sci. 483: 134-148 (2013). doi: 10.1016/j.tcs.2012.02.002
BEETL: Burrows-Wheeler Extended Tool Library
If you have any questions or suggestions about BEETL, you can also use our Yahoo group: link