PIDX: Efficient Parallel I/O for Multi-Resolution Multi-Dimensional Scientific Datasets

Sidharth Kumar, Venkatram Vishwanath, Philip Carns, Brian Summa, Giorgio Scorzelli, Valerio Pascucci, Robert Ross, Jacqueline Chen, Hemanth Kolla, Ray Grout

Research output: Contribution to conference, Paper, peer-review


Abstract

The IDX data format provides efficient, cache-oblivious, and progressive access to large-scale scientific datasets by storing the data in a hierarchical Z (HZ) order. Data stored in IDX format can be visualized in an interactive environment, allowing for meaningful exploration with minimal resources. This technology enables real-time, interactive visualization and analysis of large datasets on a variety of systems, ranging from desktop and laptop computers to portable devices such as iPhones/iPads, and over the web. While the existing ViSUS API for writing IDX data is serial, there are clear advantages to applying the IDX format to the output of large-scale scientific simulations. We have therefore developed PIDX, a parallel API for writing data in the IDX format. With PIDX it is now possible to generate IDX datasets directly from large-scale scientific simulations, with the added advantage of real-time monitoring and visualization of the generated data. In this paper, we provide an overview of the IDX file format and how it is generated using PIDX. We then present a data model description and a novel aggregation strategy to enhance the scalability of the PIDX library. The S3D combustion application is used as an example to demonstrate the efficacy of PIDX for a real-world scientific simulation. S3D is used for fundamental studies of turbulent combustion requiring exceptionally high-fidelity simulations. PIDX achieves up to 18 GiB/s I/O throughput at 8,192 processes when writing S3D data in the IDX format. This allows for interactive analysis and visualization of S3D data, thus enabling in situ analysis of S3D simulations.
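
The hierarchical Z (HZ) order named in the abstract can be illustrated with a small sketch. It assumes a 2D grid with power-of-two side length and uses one commonly described formulation: compute the Z-order (Morton) index by bit interleaving, then regroup indices by resolution level so that coarser samples come first. The function names `morton2d`, `hz_from_z`, and `hz_level` are hypothetical and chosen for illustration; this is not the PIDX or ViSUS API, and the actual libraries may differ in detail.

```python
def morton2d(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y into a Z-order index."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # x occupies the even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # y occupies the odd bit positions
    return z


def hz_from_z(z: int, total_bits: int) -> int:
    """Map a Z-order index of `total_bits` bits to an HZ-order index.

    Assumed formulation: set a guard bit above the index, strip the
    trailing zeros, then drop the lowest set bit. Indices with more
    trailing zeros (coarser samples) land earlier in the output order.
    """
    marked = z | (1 << total_bits)  # guard bit so that z == 0 is handled
    marked //= marked & -marked     # divide by the lowest set bit
    return marked >> 1


def hz_level(hz: int) -> int:
    """Resolution level of an HZ index: 0 is coarsest, higher is finer."""
    return hz.bit_length()


if __name__ == "__main__":
    bits = 2  # 4 x 4 grid, so Z indices use 2 * bits = 4 bits
    for y in range(1 << bits):
        for x in range(1 << bits):
            z = morton2d(x, y, bits)
            hz = hz_from_z(z, 2 * bits)
            print(f"(x={x}, y={y}) z={z:2d} hz={hz:2d} level={hz_level(hz)}")
```

Under this formulation, reading a prefix of the HZ-ordered data yields a complete coarse version of the grid, which is the property that makes progressive, multi-resolution access possible.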

Original language: American English
Pages: 103-111
Number of pages: 9
State: Published - 2011
Event: 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011 - Austin, TX, United States
Duration: 26 Sep 2011 to 30 Sep 2011

Conference

Conference: 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011
Country/Territory: United States
City: Austin, TX
Period: 26/09/11 to 30/09/11

NREL Publication Number

  • NREL/CP-2C00-53704

Keywords

  • High performance computing
  • In situ visualization
  • Interactive visualization
  • Parallel file systems
  • Parallel I/O
