# On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

## Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n)$ time for $p \leq m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p} \max\{k, \lg\lg n\} + (1+m/p)\lg p \cdot \lg\lg n + occ)$ time, where $\sigma$ is the size of the alphabet and $p \leq \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1 + \lg_p\lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).

## BibTeX

@inproceedings{Kurpicz2016CPM,
author    = {Johannes Fischer and Dominik Köppl and Florian Kurpicz},
title     = {On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching},
booktitle = {CPM},
series    = {LIPIcs},
volume    = {54},
pages     = {26:1–26:11},
publisher = {Schloss Dagstuhl - Leibniz-Zentrum für Informatik},
doi       = {10.4230/LIPIcs.CPM.2016.26},
year      = {2016}
}