Bit-Parallel (Compressed) Wavelet Tree Construction
Abstract
The wavelet tree is a data structure that indexes a text over an integer alphabet for efficient rank and select queries. Using the Huffman encoding, it can be stored in zero-order entropy-compressed space. We present a highly engineered open source implementation of an efficient sequential construction algorithm that makes use of bit parallelism via vector instructions. On hardware featuring ultrawide registers of up to 512 bits, it outperforms the currently fastest known practical sequential construction algorithms.
BibTeX
@inproceedings{Kurpicz2023DCC,
author = {Patrick Dinklage and Florian Kurpicz and Johannes Fischer and Jan-Philipp Tarnowski},
title = {Bit-Parallel (Compressed) Wavelet Tree Construction},
booktitle = {DCC},
pages = {81--90},
publisher = {IEEE},
doi = {10.1109/DCC55655.2023.00016},
year = {2023}
}