Lightweight Distributed Suffix Array Construction
Abstract
We present two new distributed suffix array construction algorithms. One of our algorithms requires only half as much memory as its competitor (PSAC) [Flick & Aluru, SC 2015], while achieving similar speed. In practice, we can compute on the same hardware suffix arrays for text twice as large as PSAC. The other algorithm still requires less memory than PSAC but is faster on some instances. As a by-product, we also engineered the first distributed string sorting algorithm. All of our algorithms are tested on text collections of up to 115 GB and running on 1280 cores.
BibTeX
@inproceedings{Kurpicz2019ALENEX,
author = {Johannes Fischer and Florian Kurpicz},
title = {Lightweight Distributed Suffix Array Construction},
booktitle = {ALENEX},
pages = {27–38},
publisher = {SIAM},
doi = {10.1137/1.9781611975499.3},
year = {2019}
}