Pipeline Overview & Stages¶
The cross-sensor-cal pipeline transforms NEON HDF5 directional reflectance into physically corrected and sensor-harmonized reflectance products. Each stage is restart-safe and produces structured, auditable outputs.
This page describes every stage of the pipeline, what it consumes, what it produces, and what can go wrong.
Pipeline summary¶
- Data acquisition (download NEON HDF5 tiles)
- HDF5 → ENVI export
- Topographic correction
- BRDF correction
- Sensor harmonization (spectral convolution)
- Parquet extraction + merging
- Quality assurance (QA PNG, PDF, JSON)
Each stage can be run independently using the --start-at and --end-at flags.
1. Data acquisition¶
Inputs: - NEON API paths or local HDF5 files
Outputs:
- cached HDF5 tiles stored under the selected --base-folder
The pipeline fetches only the tiles required for the selected flight line.
Common issues: - missing HDF5 files in NEON storage - interrupted downloads in cloud environments - insufficient space in temporary directories
2. HDF5 → ENVI export¶
Inputs:
- *_directional_reflectance.h5
- per-pixel geometry and metadata
Outputs:
- *_directional_reflectance_envi.img/.hdr
- sidecar JSON documenting extracted wavelengths, masks, and scaling
This stage produces an ENVI image that mirrors the HDF5 directional reflectance dataset.
What the ENVI file contains: - reflectance (scaled NEON values) - wavelength metadata - per-pixel masks (cloud, cloud shadow, water, snow, invalid)
Common issues: - mismatch between HDF5 metadata and ENVI header - extremely large tile sizes causing I/O delays - NaN bands due to malformed HDF5 datasets
3. Topographic correction¶
Inputs: - directional reflectance ENVI - DEM-derived slope and aspect - solar geometry
Outputs:
- *_topocorrected_envi.img
Topographic correction reduces slope- and aspect-driven variation in illumination.
The method assumes surface reflectance behaves consistently with simple terrain-adjustment models.
Common issues: - DEM resolution mismatch - strong terrain shadows that remain after correction - negative reflectance in deep shadows (masked)
4. BRDF correction¶
Inputs: - topographically corrected ENVI reflectance - view geometry (sensor zenith / azimuth) - solar geometry
Outputs:
- *_brdfandtopo_corrected_envi.img
- BRDF coefficient tables in the QA JSON
BRDF correction adjusts reflectance to a consistent view/illumination angle, making spectra across the flight line more comparable.
Common issues: - instabilities in BRDF coefficient fitting - extreme reflectance values that must be masked - spatial artifacts in low-SNR bands
5. Sensor harmonization (spectral convolution)¶
Inputs: - BRDF+topo corrected ENVI - sensor spectral response functions (SRFs)
Outputs:
- *_landsat_convolved_envi.img or other sensor-equivalent ENVI products
- bandpass-harmonized Parquet files
This stage integrates the corrected spectrum against the target sensor's SRFs. Supported sensors include Landsat OLI/OLI-2; others can be added.
Common issues: - wavelength misalignment - missing SRF tables - sensor bands with near-zero response across NEON wavelengths
6. Parquet extraction & merging¶
Inputs: - any ENVI cube produced by earlier stages
Outputs: - a Parquet file per cube (one row per pixel) - a merged pixel extraction table for the whole flight line
This step makes downstream analysis easy in Python, R, or DuckDB.
Common issues: - extremely large tables (billions of rows) - insufficient memory for merges - incorrect CRS metadata in ENVI headers
7. Quality assurance (QA)¶
Inputs: - all previous outputs
Outputs:
- *_qa.png
- *_qa.pdf
- *_qa.json
QA artifacts summarize reflectance distributions, masks, wavelength metadata, and BRDF/brightness statistics. See the QA page for details.
Running a partial pipeline¶
You can run only part of the pipeline:
cscal-pipeline \
--start-at brdf \
--end-at convolution
Or run a single stage manually if needed.