Skip to main content Link Search Menu Expand Document (external link)

Slides for our paper "On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching" presented at CPM 2016

DOI arXiv slides pdf

Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with pp processors. Given a static text of length nn, we first show how to compute the suffix array interval of a given pattern of length mm in O(mp+lgp+lglgplglgn)O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n) time for pmp \leq m. For approximate pattern matching with kk differences or mismatches, we show how to compute all occurrences of a given pattern in O(mkσkpmax{k,lglgn}+(1+m/p)lgplglgn+occ)O(\frac{m^k\sigma^k}{p} \max\{k, \lg\lg n\} + (1+m/p)\lg p \cdot \lg\lg n + occ) time, where σ\sigma is the size of the alphabet and pσkmkp \leq \sigma^k m^k. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns PP and PP', we present a data structure for computing the interval of PPPP' in O(lglgn)O(\lg\lg n) sequential time, or in O(1+lgplgn)O(1 + \lg_p\lg n) parallel time. All our data structures are of size O(n)O(n) bits (in addition to the suffix array).

Preview

Preview of the slides.

Download