Speaker
Description
Satellite hyperspectral imaging (HSI) provides rich spectral information for monitoring vegetation over large areas. However, many agricultural applications are limited by a lack of labelled benchmark data and by weak semantic links between spectra and plant health. We introduce PlantHyper, a large-scale captioned hyperspectral dataset derived from the German EnMAP satellite mission and designed for plant and pathogen analysis in agricultural landscapes. It supports satellite-based tasks such as crop monitoring, stress and disease detection, weed discrimination, and yield-related condition assessment.

PlantHyper uses EnMAP measurements in the visible–near infrared (VNIR) and shortwave infrared (SWIR) ranges, covering wavelengths from 420 nm to 2450 nm. EnMAP records 224–228 contiguous spectral bands with an average spectral sampling distance of 6.5 nm in the VNIR and 10 nm in the SWIR. Each pixel in the dataset has plant-level labels and a textual caption. The captions, generated with guidance from large language models, describe crop type, visible stress or disease cues, and scene context, creating an explicit link between satellite spectra and high-level descriptions of plant status.

We use PlantHyper to study satellite hyperspectral image classification for pathogen detection under realistic orbital acquisition conditions. Our experiments show that adding caption-based semantic embeddings improves spectral representation learning and boosts classification performance compared with standard HSI baselines. By coupling spaceborne EnMAP hyperspectral data with LLM-enhanced semantic supervision, PlantHyper offers a reusable benchmark for satellite-based plant health monitoring and a methodological framework that connects representation learning, agricultural remote sensing, and disease surveillance from orbit.
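
As a rough illustration of what combining spectra with caption-based embeddings can look like, the sketch below fuses a per-pixel spectrum with a caption embedding by normalising and concatenating them, then classifies with a nearest-centroid rule. This is not the PlantHyper pipeline: the function names (`l2_normalize`, `fuse`, `nearest_centroid`), the `caption_weight` parameter, the four-band spectra, and the two-dimensional caption vectors are all hypothetical, toy-sized assumptions chosen only to keep the example self-contained.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(spectrum, caption_embedding, caption_weight=1.0):
    """Concatenate a normalised spectrum with a weighted, normalised caption embedding.

    Hypothetical fusion scheme for illustration; the abstract does not specify one.
    """
    spec = l2_normalize(spectrum)
    cap = [caption_weight * x for x in l2_normalize(caption_embedding)]
    return spec + cap


def nearest_centroid(sample, centroids):
    """Return the label whose centroid has the smallest squared Euclidean distance to the sample."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(sample, centroids[label]))


# Made-up reflectance values for 4 "bands" and 2-D caption embeddings (assumptions).
centroids = {
    "healthy": fuse([0.05, 0.08, 0.45, 0.30], [1.0, 0.0]),
    "diseased": fuse([0.07, 0.12, 0.30, 0.28], [0.0, 1.0]),
}

query = fuse([0.06, 0.09, 0.43, 0.31], [0.9, 0.1])
predicted = nearest_centroid(query, centroids)
print(predicted)
```

In this toy setup the caption part of the fused vector separates the two classes far more strongly than the spectra alone, which is the intuition behind using captions as semantic supervision. A real system would use learned encoders for both modalities and the full set of EnMAP bands.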
| Stream | Science or Engineering |
|---|---|