Presentation at CPM 2016

## On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

Today, I presented our work on maximum common subgraph problems in series-parallel graphs at CPM 2016. We presented a new approach to parallelize pattern matching based on merging intervals in the suffix array that contain the results of subqueries. Our algorithm can also answer pattern matching with *k* errors. The slides are available as handout and heavily animated. You can finde the paper here (DOI) and here (arXiv).

### Slides in Handout Mode

### Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with p processors. Given a static text of length n, we first show how to compute the suffix array interval of a given pattern of length m in O(m/p + lg p + lg lg p * lg lg n) time for p <= m. For approximate pattern matching with k differences or mismatches, we show how to compute all occurrences of a given pattern in O((m^k sigma^k)/p max (k, lg lg n) + (1+m/p) lg p * lg lg n + occ} time, where sigma is the size of the alphabet and p <= sigma^k m^k. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns P and P', we present a data structure for computing the interval of PP' in O(lg lg n) sequential time, or in O(1 + lg_p lg n) parallel time. All our data structures are of size O(n) bits (in addition to the suffix array).