Low-Synch Gram-Schmidt with Delayed Reorthogonalization for Krylov Solvers: Article No. 102940

Daniel Bielich, Julien Langou, Stephen Thomas, Kasia Swirydowicz, Ichitaro Yamazaki, Erik Boman

Research output: Contribution to journal › Article › peer-review

5 Scopus Citations


The parallel strong scaling of iterative methods is often determined by the number of global reductions at each iteration. Low-synch Gram-Schmidt algorithms are applied here to the Arnoldi algorithm to reduce the number of global reductions and thereby improve the parallel strong scaling of iterative solvers for nonsymmetric matrices, such as the GMRES and Krylov-Schur iterative methods. In the Arnoldi context, the factorization is "left-looking" and processes one column at a time. Among the methods for generating an orthogonal basis for the Arnoldi algorithm, the classical Gram-Schmidt algorithm with reorthogonalization (CGS2) requires three global reductions per iteration. A new variant of CGS2 that requires only one reduction per iteration is presented and applied to the Arnoldi algorithm. Delayed CGS2 (DCGS2) employs the minimum number of global reductions per iteration (one) for a one-column-at-a-time algorithm. The main idea behind the new algorithm is to group global reductions by rearranging the order of operations. DCGS2 must be carefully integrated into an Arnoldi expansion or a GMRES solver. Numerical stability experiments assess robustness for Krylov-Schur eigenvalue computations. Performance experiments on the ORNL Summit supercomputer then establish the superiority of DCGS2 over CGS2.
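The three-reduction count quoted for CGS2 can be made concrete with a minimal serial sketch. This is textbook CGS2 written with NumPy, not the authors' code and not DCGS2. The names `cgs2_step`, `cgs2_qr`, and the `stats` counter are hypothetical, and the counter only marks the places where a distributed implementation (rows of `Q` and `w` spread across ranks) would need a global reduction.

```python
import numpy as np

def cgs2_step(Q, w, stats):
    """Orthogonalize w against the orthonormal columns of Q using CGS2.

    stats["reductions"] is incremented wherever a distributed version
    would need a global reduction (an assumption of this sketch; the
    code itself runs serially).
    """
    # First projection: inner products of w with every basis vector.
    h1 = Q.T @ w
    stats["reductions"] += 1          # reduction 1
    w = w - Q @ h1

    # Second projection (the reorthogonalization pass).
    h2 = Q.T @ w
    stats["reductions"] += 1          # reduction 2
    w = w - Q @ h2

    # Normalization needs the norm of the updated vector.
    beta = np.linalg.norm(w)
    stats["reductions"] += 1          # reduction 3
    return w / beta, h1 + h2, beta

def cgs2_qr(A):
    """Left-looking QR of A, one column at a time, via cgs2_step."""
    n, m = A.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    stats = {"reductions": 0}
    for j in range(m):
        # For j == 0 the projections are empty; they are still counted
        # here to keep the sketch simple.
        q, h, beta = cgs2_step(Q[:, :j], A[:, j], stats)
        Q[:, j] = q
        R[:j, j] = h
        R[j, j] = beta
    return Q, R, stats["reductions"]
```

Each of the three marked points depends on the vector update that precedes it, which is why the reductions cannot simply be merged without rearranging the order of operations, as the abstract describes for DCGS2.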
Original language: American English
Number of pages: 12
Journal: Parallel Computing
State: Published - 2022

NREL Publication Number

  • NREL/JA-2C00-80326


Keywords

  • exascale
  • global reduction
  • GPU
  • Gram-Schmidt
  • Krylov methods
  • low synchronization
  • many-core architecture
  • massively parallel
  • nonsymmetric
  • orthogonalization
  • scalable solvers

