Cosecha
Tools for harvesting earth observation data for use in flood forecasting.
Cosecha provides a flexible pipeline for collecting geospatial data from multiple sources and writing it to several output formats, with optional transformations applied along the way.
Features
- Time-series data collection (USGS NWIS streamflow, stage, precipitation)
- Gridded data support (HRRR, RRFS, RTMA via herbie; MRMS via S3)
- Multiple output formats: Parquet, NetCDF, Zarr, Iceberg, IceChunk
- Data transformations: unit conversion, spatial subsetting, variable selection/rename
- Cross-platform support (ecCodes C library required for GRIB2/MRMS)
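To make the transformation types listed above concrete, here is a minimal plain-Python sketch of unit conversion, spatial subsetting, and variable renaming. It does not use Cosecha's API: the function names, the record layout, and the sample values are all hypothetical and exist only for illustration.

```python
# Illustrative only: plain-Python versions of the three transformation types.
# None of these names come from Cosecha; they are assumptions for this sketch.

# 1 ft^3 = 0.3048^3 m^3 exactly, so this factor converts ft^3/s to m^3/s.
CFS_TO_CMS = 0.028316846592


def convert_cfs_to_cms(records):
    """Return copies of the records with 'value' converted from ft^3/s to m^3/s."""
    return [{**r, "value": r["value"] * CFS_TO_CMS} for r in records]


def subset_bbox(records, west, south, east, north):
    """Keep only records whose lon/lat fall inside the bounding box (inclusive)."""
    return [
        r for r in records
        if west <= r["lon"] <= east and south <= r["lat"] <= north
    ]


def rename_variable(records, old, new):
    """Return copies of the records with the key `old` renamed to `new`."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in records]


# Made-up sample records (not real observations).
records = [
    {"site": "A", "lon": -77.0, "lat": 39.0, "value": 100.0},
    {"site": "B", "lon": -90.0, "lat": 30.0, "value": 250.0},
]

subset = subset_bbox(records, west=-80.0, south=35.0, east=-75.0, north=40.0)
converted = convert_cfs_to_cms(subset)
result = rename_variable(converted, "value", "discharge_cms")
print(result)
```

Each step returns new records rather than mutating its input, so the steps can be chained in any order. See the full documentation for how Cosecha itself configures transformations.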
Installation
With optional dependencies for NWP (HRRR, RRFS) support:
Note: Cosecha depends on the ecCodes C library for reading GRIB2 data (used by MRMS). When installing with pip, ecCodes must already be available on your system. The easiest cross-platform approach is to install it from conda-forge (conda install -c conda-forge eccodes).
Or use pixi, which handles this automatically:
Quick Start
from cosecha import USGSNWISReaper

# Fetch USGS streamflow data
reaper = USGSNWISReaper(
    site_ids=["01650000"],
    start_date="2026-01-01",
    end_date="2026-01-31",
    parameter_code="00060",
)

# Execute
data = reaper.reap()

# Write to Parquet
path = reaper.sow_to_parquet(file_path="./data/streamflow.pq")
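The parameter_code value in the example above is a USGS parameter code; 00060 denotes discharge (streamflow) in cubic feet per second. As a quick reference, the sketch below maps the three time-series variables named under Features to their commonly used USGS codes. The lookup table and the describe_parameter helper are hypothetical and not part of Cosecha.

```python
# USGS parameter codes for the time-series variables named under Features.
# The table and helper are illustrative assumptions, not Cosecha's API.
USGS_PARAMETER_CODES = {
    "00060": ("streamflow (discharge)", "ft^3/s"),
    "00065": ("stage (gage height)", "ft"),
    "00045": ("precipitation", "in"),
}


def describe_parameter(code):
    """Return a readable label for a parameter code, or raise ValueError."""
    try:
        name, units = USGS_PARAMETER_CODES[code]
    except KeyError:
        raise ValueError(f"Unknown parameter code: {code!r}") from None
    return f"{code}: {name} [{units}]"


print(describe_parameter("00060"))
```

Codes are kept as strings because the leading zeros are significant.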
Documentation
Full documentation at https://dewberry.github.io/cosecha/
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for details.
License
MIT License. See LICENSE for details.