Outputs & File Structure
Purpose of this page
- Outputs on disk are the primary interface of SpectralBridge.
- Downstream analyses should rely on these artefacts, not return values from the Python API.
Canonical per-flightline outputs
Naming stems come from spectralbridge.paths.FlightlinePaths and spectralbridge.utils.naming.get_flightline_products; sensor-specific stems come from SensorProductPaths.
| Output type | Canonical filename pattern | Description | Notes / guarantees |
|---|---|---|---|
| Raw ENVI (when available) | `<flight_id>_envi.(img\|hdr\|parquet)` | Direct export of the NEON HDF5 reflectance cube. | Used when present to seed later stages; parquet sidecar is written when exported. |
| BRDF + topographic corrected ENVI | `<flight_id>_brdfandtopo_corrected_envi.(img\|hdr\|json\|parquet)` | Physics-informed normalization and correction JSON produced before sensor resampling. | One corrected set per flightline; parquet mirrors the corrected ENVI cube. |
| Sensor-resampled ENVI + Parquet | `<flight_id>_<sensor>_envi.(img\|hdr\|parquet)` where `<sensor>` is one of `landsat_tm`, `landsat_etm+`, `landsat_oli`, `landsat_oli2`, `micasense`, `micasense_to_match_tm_etm+`, `micasense_to_match_oli_oli2` | Reflectance cubes resampled into the Landsat-referenced frame and MicaSense variants. | Each sensor has its own ENVI/Parquet trio; stems are consistent with FlightlinePaths.sensor_products. |
| Merged Parquet | `<flight_id>_merged_pixel_extraction.parquet` | Master table that merges Parquet sidecars across stages into one analysis-ready spectral library. | Exactly one per flightline; treated as the primary success signal. |
| QA artefacts | `<flight_id>_qa.png`, `<flight_id>_qa.json`, optional `<flight_id>_qa.pdf` | Visual and numeric QA summaries aligned to the merged outputs. | PNG and JSON are expected for every completed run; PDF is produced when rendering is enabled. |
| QA metrics parquet | `<flight_id>_qa_metrics.parquet` | Structured QA metrics by band and sensor. | Emitted alongside QA JSON/PNG when QA calculation runs. |
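The filename patterns in the table can be expanded into concrete paths with a few lines of Python. The sketch below is illustrative only: `expected_outputs` is a hypothetical helper, not part of SpectralBridge, and it assumes that all products for a flightline sit in a single directory. In real code, prefer the stems provided by FlightlinePaths.

```python
from pathlib import Path

# Sensor names copied from the table; the helper below is a hypothetical
# illustration and is not part of the SpectralBridge API.
SENSORS = (
    "landsat_tm", "landsat_etm+", "landsat_oli", "landsat_oli2",
    "micasense", "micasense_to_match_tm_etm+", "micasense_to_match_oli_oli2",
)


def expected_outputs(flightline_dir, flight_id, sensors=SENSORS):
    """Map each output type to the list of paths the table describes.

    Assumes (not confirmed by the docs) that every product lives directly
    inside `flightline_dir`.
    """
    base = Path(flightline_dir)

    def named(stem, extensions):
        # One path per extension for a shared filename stem.
        return [base / f"{stem}.{ext}" for ext in extensions]

    outputs = {
        "raw_envi": named(f"{flight_id}_envi", ("img", "hdr", "parquet")),
        "corrected_envi": named(
            f"{flight_id}_brdfandtopo_corrected_envi",
            ("img", "hdr", "json", "parquet"),
        ),
        "merged": [base / f"{flight_id}_merged_pixel_extraction.parquet"],
        "qa": named(f"{flight_id}_qa", ("png", "json")),
        "qa_metrics": [base / f"{flight_id}_qa_metrics.parquet"],
    }
    for sensor in sensors:
        outputs[f"sensor:{sensor}"] = named(
            f"{flight_id}_{sensor}_envi", ("img", "hdr", "parquet")
        )
    return outputs
```

Calling `expected_outputs("some/dir", "FLIGHT_A")["merged"]` yields a single path ending in `FLIGHT_A_merged_pixel_extraction.parquet`; `FLIGHT_A` is a placeholder flight ID.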
What “success” means
- The merged parquet exists and is readable: `<flight_id>_merged_pixel_extraction.parquet`.
- The QA PNG renders: `<flight_id>_qa.png` (with matching `<flight_id>_qa.json`).
- Sensor-specific ENVI/Parquet products exist as configured; absence may reflect configuration rather than failure.
- If the merged parquet and QA PNG are present, the pipeline completed successfully for that flightline.
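The success criteria above can be approximated with a standard-library check. This is a minimal sketch, not the pipeline's own validation: `flightline_succeeded` is a hypothetical name, the single-directory layout is an assumption, and the magic-byte tests are only cheap proxies for "readable" and "renders" (Parquet files start and end with `PAR1`; PNG files start with a fixed 8-byte signature).

```python
import json
from pathlib import Path

PARQUET_MAGIC = b"PAR1"                # Parquet files begin and end with this
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG file signature


def _starts_with(path, prefix):
    with open(path, "rb") as handle:
        return handle.read(len(prefix)) == prefix


def _ends_with(path, suffix):
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to end of file to learn its size
        if handle.tell() < len(suffix):
            return False
        handle.seek(-len(suffix), 2)
        return handle.read() == suffix


def flightline_succeeded(flightline_dir, flight_id):
    """Hypothetical check: merged parquet, QA PNG and QA JSON look intact.

    Assumes all three files live directly inside `flightline_dir`.
    """
    base = Path(flightline_dir)
    merged = base / f"{flight_id}_merged_pixel_extraction.parquet"
    qa_png = base / f"{flight_id}_qa.png"
    qa_json = base / f"{flight_id}_qa.json"

    if not (merged.is_file() and qa_png.is_file() and qa_json.is_file()):
        return False
    if not (_starts_with(merged, PARQUET_MAGIC) and _ends_with(merged, PARQUET_MAGIC)):
        return False
    if not _starts_with(qa_png, PNG_SIGNATURE):
        return False
    try:
        # JSON and Unicode decoding errors are both ValueError subclasses.
        json.loads(qa_json.read_text(encoding="utf-8"))
    except ValueError:
        return False
    return True
```

A stricter check would load the parquet with a real reader and decode the PNG; the byte-level tests here only catch missing, empty, or truncated files.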
Idempotence and restart-safety
process_one_flightline and go_forth_and_multiply skip stages whose outputs already exist and validate, so re-running the pipeline will not recompute completed products. This skip-if-valid behavior makes restarts safe and is relied upon in notebook workflows and batch processing alike.
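The skip-if-valid pattern described above can be sketched generically. This is not how `process_one_flightline` is implemented; `run_stage`, `compute`, and `is_valid` are hypothetical names, and the default validity test (a non-empty file) is an assumption standing in for whatever validation the pipeline actually performs.

```python
from pathlib import Path


def _non_empty_file(path):
    # Assumed default validity test: the file exists and has content.
    return path.is_file() and path.stat().st_size > 0


def run_stage(output_path, compute, is_valid=_non_empty_file):
    """Run `compute(output_path)` unless a valid output already exists.

    Returns True if the stage ran and False if it was skipped.
    Hypothetical sketch of skip-if-valid; not the SpectralBridge code.
    """
    output_path = Path(output_path)
    if is_valid(output_path):
        return False  # restart-safe: the existing product is reused
    compute(output_path)
    if not is_valid(output_path):
        raise RuntimeError(f"Stage did not produce a valid output: {output_path}")
    return True
```

Because the check runs before any work, calling `run_stage` twice with the same arguments computes the product once and skips it the second time, which is the property that makes re-runs cheap.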
CI and documentation guarantees
- Documentation drift checks (`tools/doc_drift_audit.py`) assert that `_merged_pixel_extraction.parquet` and `_qa.png` remain part of the documented contract.
- QA expectations and output stems are shared between user-facing docs and automated validation to protect reproducibility.
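A drift check of this kind can be as simple as confirming that the contract suffixes still appear in the documentation text. The sketch below only illustrates the idea; it does not reproduce `tools/doc_drift_audit.py`, and `missing_contract_tokens` is a hypothetical name.

```python
# Suffixes the docs say must stay in the documented contract.
REQUIRED_TOKENS = ("_merged_pixel_extraction.parquet", "_qa.png")


def missing_contract_tokens(doc_text, required=REQUIRED_TOKENS):
    """Return the required suffixes that no longer appear in `doc_text`.

    Hypothetical illustration; the real audit script may check more.
    """
    return [token for token in required if token not in doc_text]
```

In CI, a non-empty return value would fail the build, for example with `assert not missing_contract_tokens(text)`.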
How to rely on these outputs
- Load Parquet products directly (especially the merged parquet) for analysis; they are the authoritative API.
- Inspect QA PNG/JSON before downstream modeling to confirm spectral health and calibration quality.
- Treat intermediate ENVI products as optional diagnostics; most workflows can ignore them once Parquet and QA artefacts are present.
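Loading the merged parquet for analysis might look like the sketch below. It assumes pandas with a Parquet engine (such as pyarrow) is installed and that the file sits directly in the flightline directory; `load_merged_table` is a hypothetical helper, and no column names are assumed because the schema is not described on this page.

```python
from pathlib import Path


def load_merged_table(flightline_dir, flight_id, columns=None):
    """Load `<flight_id>_merged_pixel_extraction.parquet` as a DataFrame.

    Hypothetical helper; assumes the file lives in `flightline_dir`.
    Pass `columns` to read only a subset and reduce memory use.
    """
    path = Path(flightline_dir) / f"{flight_id}_merged_pixel_extraction.parquet"
    if not path.is_file():
        raise FileNotFoundError(f"Merged parquet not found: {path}")
    # Imported lazily so the path check works even without pandas installed.
    import pandas as pd
    return pd.read_parquet(path, columns=columns)
```

After loading, inspecting `df.columns` and `df.shape` is a reasonable first step before any modeling, alongside a look at the QA PNG and JSON.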