AmeriFlux BASE data pipeline to support network growth and data sharing.

Housen Chu, Danielle S Christianson, You-Wei Cheah, Gilberto Z Pastorello, Fianna O'Brien, Joshua Geden, Sy-Toan Ngo, Rachel Hollowgrass, Karla Leibowitz, Norman F Beekwilder, Megha Sandesh, Sigrid Dengel, Stephen W Chan, André Santos, Kyle Delwiche, Koong Yi, Christin Buechner, Dennis Baldocchi, Dario Papale, Trevor F Keenan, Sébastien C Biraud, Deborah A Agarwal, Margaret S Torn
Published in: Scientific Data (2023)
AmeriFlux is a network of research sites that measure carbon, water, and energy fluxes between ecosystems and the atmosphere using the eddy covariance technique to study a variety of Earth science questions. AmeriFlux's diversity of ecosystems, instruments, and data-processing routines creates challenges for data standardization, quality assurance, and sharing across the network. To address these challenges, the AmeriFlux Management Project (AMP) designed and implemented the BASE data-processing pipeline. The pipeline begins with data uploaded by the site teams, followed by the AMP team's quality assurance and quality control (QA/QC), ingestion of site metadata, and publication of the BASE data product. The semi-automated pipeline enables us to keep pace with the rapid growth of the network. As of 2022, the AmeriFlux BASE data product contains 3,130 site years of data from 444 sites, with standardized units and variable names for more than 60 common variables, representing the largest long-term data repository for flux-met data in the world. The standardized, quality-assured data product facilitates multisite comparisons, model evaluations, and data syntheses.
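
The upload, QA/QC, and publication stages described above can be illustrated with a minimal Python sketch. It is not the AMP implementation: the column mapping `NAME_MAP`, the thresholds in `PLAUSIBLE_RANGE`, the `MISSING` sentinel, and every function name are hypothetical, chosen only to show how standardizing variable names and gating publication on automated checks might fit together.

```python
from dataclasses import dataclass, field

# Hypothetical plausible ranges per standardized variable (illustrative only,
# not the thresholds used by the actual BASE pipeline).
PLAUSIBLE_RANGE = {"TA": (-60.0, 60.0), "RH": (0.0, 100.0)}

# Hypothetical mapping from site-specific column names to standardized names.
NAME_MAP = {"air_temp_C": "TA", "rel_hum_pct": "RH"}

MISSING = -9999.0  # assumed missing-value sentinel for this sketch


@dataclass
class QAReport:
    unknown_columns: list = field(default_factory=list)
    out_of_range: dict = field(default_factory=dict)

    @property
    def passed(self):
        # An upload passes only if no check raised an issue.
        return not self.unknown_columns and not self.out_of_range


def standardize(upload):
    """Rename site columns to standardized names; collect unrecognized ones."""
    renamed, unknown = {}, []
    for name, values in upload.items():
        if name in NAME_MAP:
            renamed[NAME_MAP[name]] = values
        elif name in PLAUSIBLE_RANGE:
            renamed[name] = values  # already a standardized name
        else:
            unknown.append(name)
    return renamed, unknown


def run_qaqc(upload):
    """Standardize names, then flag indices of values outside the plausible range."""
    data, unknown = standardize(upload)
    report = QAReport(unknown_columns=unknown)
    for var, values in data.items():
        lo, hi = PLAUSIBLE_RANGE[var]
        bad = [i for i, v in enumerate(values)
               if v != MISSING and not lo <= v <= hi]
        if bad:
            report.out_of_range[var] = bad
    return data, report


def process_upload(upload):
    """Return standardized data if QA/QC passes, otherwise None, plus the report."""
    data, report = run_qaqc(upload)
    return (data if report.passed else None), report
```

In this sketch an upload that fails any check is held back with a report rather than published, which loosely mirrors the semi-automated character of the pipeline: automated checks surface the issues, and people resolve them before release.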