Accurate Prediction of Algal Biomass Lipid, Protein, and Carbohydrate Composition with Machine Learning Regression Modelling of Near-IR Spectra

Seth Steichen, Aga Pinowska, Joshua Brown Clay, David Hazlebeck, Lieve Laurens

Research output: NREL Poster


During large-scale algal biomass cultivation, it is difficult to reliably control relative composition to target levels. Rapid determination of chemical composition is feasible using near-infrared (NIR) spectral data. We sought to build and improve on a reliable high-throughput screening prediction method based on partial least squares regression (PLSR) by applying artificial neural networks (ANN) and associated optimization strategies. The algal biomass sample set was designed and created in an iterative process of culturing under physiologically diverse conditions at the GAI field site, followed by compositional analyses at NREL. The workflow allowed us to identify gaps in compositional space to inform subsequent cultivation and sampling efforts, and it generated a high-quality set of 210 unique samples with chemical analysis results, spectral scanning data, and cultivation metadata. We observed a significant improvement in the performance of carbohydrate content predictions using an optimized ANN model compared to PLSR, with a >16% reduction in mean absolute percent error (MAPE) when tested on the same set of reserved data. The optimized ANN models for FAME and protein prediction performed exceptionally well, with 5.99% and 5.09% MAPE, respectively. Application of these methods to the detection and quantification of minor biomass constituents that are relevant to certain product streams has shown positive preliminary results, opening the possibility of extending the outputs obtainable from this data type. All models are accompanied by prediction uncertainties and unsupervised spectral outlier detection to alert an operator to unreliable spectral data. These tools can be deployed to rapidly determine algal culture status and to improve cultivation and biomass quality.
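The model comparison in the abstract is reported as mean absolute percent error (MAPE) on a reserved test set. As a minimal sketch of how that metric is commonly computed, the Python below defines MAPE and two ways of expressing a reduction between models. The function names and all numeric values are hypothetical and chosen only for illustration; they are not data from the poster, and the abstract does not state whether its ">16% reduction" is relative or in percentage points, so both are shown.

```python
def mape(actual, predicted):
    """Mean absolute percent error, returned in percent.

    Assumes every actual value is non-zero (true for mass-fraction data
    where each constituent is present).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline_mape, candidate_mape):
    """Reduction expressed as a percentage of the baseline MAPE."""
    return 100.0 * (baseline_mape - candidate_mape) / baseline_mape


def absolute_reduction(baseline_mape, candidate_mape):
    """Reduction expressed in percentage points."""
    return baseline_mape - candidate_mape


# Hypothetical reference values (percent of dry weight) and two sets of
# hypothetical model predictions for the same reserved samples.
actual = [20.0, 25.0, 40.0, 50.0]
baseline_pred = [22.0, 24.0, 38.0, 55.0]   # stand-in for a baseline model
candidate_pred = [21.0, 25.0, 39.0, 52.0]  # stand-in for an improved model

baseline = mape(actual, baseline_pred)
candidate = mape(actual, candidate_pred)

print(f"baseline MAPE:  {baseline:.2f}%")
print(f"candidate MAPE: {candidate:.2f}%")
print(f"relative reduction: {relative_reduction(baseline, candidate):.1f}%")
print(f"absolute reduction: {absolute_reduction(baseline, candidate):.2f} points")
```

Evaluating both models on the same reserved samples, as the abstract describes, is what makes the two MAPE values directly comparable.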
Original language: American English
State: Published - 2023

Publication series

Name: Presented at the International Conference on Algal Biomass, Biofuels and Bioproducts (AlgalBBB 2023), 12-14 June 2023, Waikoloa Beach, Hawaii

NREL Publication Number

  • NREL/PO-2700-86488

Keywords

  • algal biofuels
  • biomass composition
  • carbon allocation
  • machine learning
  • spectral analyses

