CS 269S, Fall 2021: Theory Seminar

October 8, 2021, 1:00 – 1:50pm: DBH 1427

Low-Span Parallel Algorithms for the Binary-Forking Model

Ryuto Kitagawa

Abstract: The binary-forking model is a parallel computation model, formally defined by Blelloch et al., in which a thread can fork a concurrent child thread, recursively and asynchronously. The model incurs a cost of Θ(log 𝑛) to spawn or synchronize 𝑛 tasks or threads. The binary-forking model realistically captures the performance of parallel algorithms implemented using modern multithreaded programming languages on multicore shared-memory machines. In contrast, the widely studied theoretical PRAM model does not consider the cost of spawning and synchronizing threads, and as a result, algorithms achieving optimal performance bounds in the PRAM model may not be optimal in the binary-forking model. Often, algorithms need to be redesigned to achieve optimal performance bounds in the binary-forking model and the non-constant synchronization cost makes the task challenging. In this paper, we show that in the binary-forking model we can achieve optimal or near-optimal span with negligible or no asymptotic blowup in work for comparison-based sorting, Strassen’s matrix multiplication (MM), and the Fast Fourier Transform (FFT). Our major results are as follows: (1) A randomized comparison- based sorting algorithm with optimal 𝑂 (log 𝑛) span and 𝑂 (𝑛 log 𝑛) work, both w.h.p. in 𝑛. (2) An optimal 𝑂(log𝑛) span algorithm for Strassen’s matrix multiplication (MM) with only a Θ (log log 𝑛)- factor blow-up in work as well as a near-optimal 𝑂 (log 𝑛 log log log 𝑛) span algorithm with no asymptotic blow-up in work. (3) A near- optimal 𝑂 (log 𝑛 log log log 𝑛) span Fast Fourier Transform (FFT) algorithm with less than a log 𝑛-factor blow-up in work for all prac- tical values of 𝑛 (i.e., 𝑛 ≤ 10^10,000).

(Paper by Zafar Ahmad, Rezaul Chowdhury, Rathish Das, Pramod Ganapathi, Aaron Gregory, and Mohammad Mahdi Javanmard)

Zoom: https://uci.zoom.us/j/96130514614

Video of the talk