Skip to main content Link Search Menu Expand Document (external link)

Lightweight Distributed Suffix Array Construction

DOI slides pdf code

Abstract

We present two new distributed suffix array construction algorithms. One of our algorithms requires only half as much memory as its competitor (PSAC) [Flick & Aluru, SC 2015], while achieving similar speed. In practice, we can compute on the same hardware suffix arrays for text twice as large as PSAC. The other algorithm still requires less memory than PSAC but is faster on some instances. As a by-product, we also engineered the first distributed string sorting algorithm. All of our algorithms are tested on text collections of up to 115 GB and running on 1280 cores.

BibTeX

@inproceedings{Kurpicz2019ALENEX,
  author    = {Johannes Fischer and Florian Kurpicz},
  title     = {Lightweight Distributed Suffix Array Construction},
  booktitle = {ALENEX},
  pages     = {27–38},
  publisher = {SIAM},
  doi       = {10.1137/1.9781611975499.3},
  year      = {2019}
}

PDF