Abstract
The Department of Energy's (DOE) Geothermal Data Repository (GDR) team has implemented, or is currently implementing, data standards and automated data pipelines for three geothermal data types: 1) drilling data, 2) geospatial datasets, and 3) Distributed Acoustic Sensing (DAS) data. These standards and pipelines are intended to improve the real-world applicability of geothermal machine learning outputs by improving the quality of the underlying data. More specifically, by standardizing high-value datasets, the GDR reduces project-specific data curation requirements, leaving more time for research itself. Automating this process removes the burden of standardization from the user and increases the overall availability of standardized data. This paper provides an update on the GDR's transition toward data standardization through automated data pipelines and calls for feedback from the community on how the GDR team can improve this process.
Original language | American English |
---|---|
Pages | 2361-2375 |
Number of pages | 15 |
State | Published - 2023 |
Event | Geothermal Rising Conference 2023 - Reno, NV; Duration: 1 Oct 2023 → 4 Oct 2023 |
Conference
Conference | Geothermal Rising Conference 2023 |
---|---|
City | Reno, NV |
Period | 1 Oct 2023 → 4 Oct 2023 |
Bibliographical note
See NREL/CP-6A20-86935 for preprint
NREL Publication Number
- NREL/CP-6A20-88752
Keywords
- cloud-optimized
- DAS data
- data curation
- data lake
- data pipeline
- data science
- data standard
- geospatial data
- geothermal data
- Geothermal Data Repository
- machine learning