Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms

Christopher Chang, Scott Sides, Hai Long, Wesley Jones, Deepthi Vaidhyanathan

Research output: NREL Technical Report


Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples and compared with a current Ivy Bridge production node on NREL's Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance: limitations to application parallelism, or resource contention among concurrently running but independent tasks, limit effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance for the procurement of future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen at NREL and that were tested here. Finally, perhaps the most impactful enhancement to productivity might come from enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once than by doing fast things in order.
Original language: American English
Number of pages: 31
State: Published - 2015

NREL Publication Number

  • NREL/TP-2C00-64268


Keywords

  • amber
  • benchmarking
  • Gaussian
  • Haswell
  • multiply
  • Peregrine
  • stream
  • Vienna Ab Initio Simulation Package (VASP)

