PADS: Practical Algorithms and Data structures on Strings
We develop fast and practical algorithms and data structures for fundamental problems arising in sequence analysis. The research is based on thorough understanding of both the combinatorial properties of the problems and the properties of modern computers. The goal is not only to obtain better algorithms but to understand why they are better.
Members
- Juha Kärkkäinen (contact person)
- Simon J. Puglisi
Former Members
- Dominik Kempa
- Pekka Mikkola
- Tommi Rantala
News
- 2017-06-04: New versions (0.2.0) of EM-SparsePhi and EM-SuccinctIrreducible external-memory LCP array construction algorithms available: link
- 2016-08-23: New external-memory (+parallel) algorithms for constructing the LCP array: link
- 2016-01-01: New version of LCPscan (0.2.0) available
Software
Suffix/(P)LCP array construction:- Better external memory (P)LCP array construction
- Parallel external memory suffix array construction
- LCP array construction in external memory
- Lightweight external memory suffix array construction algorithm
- Hybrid compression of bitvectors for the FM-Index
- Data compressor based on Burrows-Wheeler transform
- String sorting algorithms
Publications
-
Héctor Ferrada, Dominik Kempa, Simon J. Puglisi.
Hybrid Indexing Revisited.
In Proc. 2018 Algorithm Engineering and Experimentation (ALENEX 2018), SIAM, 2018, pp. 1-8.
[SIAM] -
Juha Kärkkäinen, Dominik Kempa.
Engineering External Memory LCP Array Construction: Parallel, In-Place and Large Alphabet.
In Proc. 16th International Symposium on Experimental Algorithms (SEA 2017), LIPIcs, 2017, pp. 17:1-17:14.
[LIPIcs] [code] -
Juha Kärkkäinen, Dominik Kempa.
Engineering a Lightweight External Memory Suffix Array Construction Algorithm.
In Mathematics in Computer Science, Volume 11(2), 2017, pp. 137-149.
[Springer] [code] -
Dominik Kempa, Dmitry Kosolobov.
LZ-End Parsing in Compressed Space.
In Proc. 2017 Data Compression Conference (DCC 2017), IEEE Computer Society, 2017, pp. 350-359.
[IEEE] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi, Bella Zhukova.
Engineering External Memory Induced Suffix Sorting.
In Proc. 2017 Algorithm Engineering and Experimentation (ALENEX 2017), SIAM, 2017, pp. 98-108.
[SIAM] -
Juha Kärkkäinen, Dominik Kempa.
Faster External Memory LCP Array Construction.
In Proc. 24th European Symposium on Algorithms (ESA 2016), LIPIcs, 2016, pp. 61:1-61:16.
[LIPIcs] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Lazy Lempel-Ziv Factorization Algorithms.
ACM Journal of Experimental Algorithmics (JEA), Volume 21(1), 2016, Article 2.4.
[ACM] [code] -
Juha Kärkkäinen, Dominik Kempa.
LCP Array Construction in External Memory.
ACM Journal of Experimental Algorithmics (JEA), Volume 21(1), 2016, Article 1.7.
[ACM] [code] -
Juha Kärkkäinen, Dominik Kempa.
LCP Array Construction Using O(sort(n)) (or Less) I/Os.
In Proc. 23rd Symposium on String Processing and Information Retrieval (SPIRE 2016), Springer, 2016, pp. 204-217.
[Springer] -
Djamal Belazzougui, Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Lempel-Ziv Decoding in External Memory.
In Proc. 15th International Symposium on Experimental Algorithms (SEA 2016), Springer, 2016, pp. 63-74.
[Springer] -
Simon Gog, Juha Kärkkäinen, Dominik Kempa, Matthias Petri, Simon J. Puglisi.
Faster, Minuter.
In Proc. 2016 Data Compression Conference (DCC 2016), IEEE Computer Society, 2016, pp. 53-62.
[IEEE] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Parallel External Memory Suffix Sorting.
In Proc. 26th Symposium on Combinatorial Pattern Matching (CPM 2015), Springer, 2015, pp. 329-342.
[Springer] [code] -
Juha Kärkkäinen, Dominik Kempa.
LCP Array Construction in External Memory.
In Proc. 13th International Symposium on Experimental Algorithms (SEA 2014), Springer, 2014, pp. 412-423.
[Springer] [code] -
Juha Kärkkäinen, Dominik Kempa.
Engineering a Lightweight External Memory Suffix Array Construction Algorithm.
In Proc. 2nd International Conference on Algorithms for Big Data (ICABD 2014), CEUR, 2014, pp. 53-60.
[CEUR] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Hybrid Compression of Bitvectors for the FM-Index.
In Proc. 2014 Data Compression Conference (DCC 2014), IEEE Computer Society, 2014, pp. 302-311.
[IEEE] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Lempel-Ziv Parsing in External Memory.
In Proc. 2014 Data Compression Conference (DCC 2014), IEEE Computer Society, 2014, pp. 153-162.
[IEEE] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Lightweight Lempel-Ziv Parsing.
In Proc. 12th International Symposium on Experimental Algorithms (SEA 2013), Springer, 2013, pp. 139-150.
[Springer] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Linear Time Lempel-Ziv Factorization: Simple, Fast, Small.
In Proc. 24th Symposium on Combinatorial Pattern Matching (CPM 2013), Springer, 2013, pp. 189-200.
[Springer] [code] -
Dominik Kempa, Simon J. Puglisi.
Lempel-Ziv Factorization: Simple, Fast, Practical.
In Proc. 2013 Algorithm Engineering and Experimentation (ALENEX 2013), SIAM, 2013, pp. 103-112.
[SIAM] [code] -
Juha Kärkkäinen, Pekka Mikkola, Dominik Kempa.
Grammar Precompression Speeds Up Burrows-Wheeler Compression.
In Proc. 18th Symposium on String Processing and Information Retrieval (SPIRE 2012), Springer, 2012, pp. 330-335.
[Springer] [code] -
Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi.
Slashing the Time for BWT Inversion.
In Proc. 2012 Data Compression Conference (DCC 2012), IEEE Computer Society, 2012, pp. 99-108.
[IEEE] -
Juha Kärkkäinen, Simon J. Puglisi.
Fixed Block Compression Boosting in FM Indexes.
In Proc. 18th Symposium on String Processing and Information Retrieval (SPIRE 2011), Springer, 2011, pp. 174-184.
[Springer] -
Juha Kärkkäinen, Simon J. Puglisi.
Cache-Friendly Burrows-Wheeler Inversion.
In Proc. 1st International Conference on Data Compression, Communication and Processing (CCP 2011), IEEE Computer Society, 2011, pp. 38-42.
[IEEE] [pdf (preliminary)] -
Juha Kärkkäinen, Simon J. Puglisi.
Medium-Space Algorithms for Inverse BWT.
In Proc. 18th European Symposium on Algorithms (ESA 2010), Springer, 2010, pp. 451-462.
[Springer] -
Juha Kärkkäinen, Giovanni Manzini, Simon J. Puglisi.
Permuted Longest-Common-Prefix Array.
In Proc. 20th Symposium on Combinatorial Pattern Matching (CPM 2009), Springer, 2009, pp. 181-192.
[Springer] -
Juha Kärkkäinen, Tommi Rantala.
Engineering Radix Sort for Strings.
In Proc. 15th String Processing and Information Retrieval Symposium (SPIRE 2008), Springer, 2008, pp. 3-14.
[Springer] [code]