Evidence-Based Supplement Research

Hyperspectral inversion model of ginkgo leaf yield prediction based on machine learning.

  • 2025-11-28
  • Frontiers in plant science 16
    • Zheng Zuo
    • Maocheng Zhao
    • Liang Qi
    • Bin Wu
    • Hongyan Zou
    • Weijun Xie
    • Qiaolin Ye
    • Chi Zhou
    • Kai Zhang

Study Design

Population
Ginkgo biloba leaves
Methods
airborne hyperspectral imaging, five preprocessing methods (MSC, SNV, SG, FD, SS) with PLSR models, feature band selection (PSO, SPA, PCA, LASSO, CARS, PSAMA), vegetation indices (SAVI, MSAVI, NDRE, SIPI) and machine learning models (PLSR, RF, KNNR, LSTM, SVR, BiLSTM, BiLSTM-GS) for yield prediction
Funding
Unclear

Introduction

The yield of Ginkgo biloba leaves is a critical indicator of the trees' growth and health status. However, current assessment methods rely primarily on manual harvesting and weighing, which are time-consuming, labor-intensive, and costly.

Methods

To address these limitations, this study designed an algorithm-based yield estimation approach. Airborne hyperspectral imaging at a research base replaced traditional manual measurements and was used to construct a canopy hyperspectral dataset and Region of Interest Pixel (ROP) sets. Five preprocessing methods, Multiplicative Scatter Correction (MSC), Standard Normal Variate (SNV), Savitzky-Golay (SG), First Derivative (FD), and Standard Scaling (SS), were each paired with a Partial Least Squares Regression (PLSR) model to identify the optimal hyperspectral preprocessing approach. The best-performing preprocessing was then combined with six feature band selection methods: Particle Swarm Optimization (PSO), Successive Projections Algorithm (SPA), Principal Component Analysis (PCA), Least Absolute Shrinkage and Selection Operator (LASSO), Competitive Adaptive Reweighted Sampling (CARS), and Particle Swarm Attention Mechanism Algorithm (PSAMA). Traditional spectral vegetation indices were screened through random forest stepwise regression and spectral index correlation analysis, which left four final indices: Soil-Adjusted Vegetation Index (SAVI), Modified Soil-Adjusted Vegetation Index (MSAVI), Normalized Difference Red Edge Index (NDRE), and Structure Insensitive Pigment Index (SIPI). The selected spectral bands and vegetation indices were then fed to seven machine learning models for yield prediction: PLSR, Random Forest (RF), K-Nearest Neighbors Regression (KNNR), Long Short-Term Memory (LSTM), Support Vector Regression (SVR), Bidirectional LSTM (BiLSTM), and BiLSTM tuned with GridSearchCV (BiLSTM-GS).
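As a rough illustration of two steps named above, the sketch below applies SNV to a single spectrum and computes the four final vegetation indices from band reflectances. It is not the authors' code. The index formulas are the commonly published ones (SAVI with a soil factor of 0.5, the MSAVI2 form of MSAVI, SIPI from blue, red, and near-infrared reflectance); the abstract does not state which wavelengths or constants the study used, so the band choices, the use of the sample standard deviation in SNV, and all reflectance values here are assumptions.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own mean and std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (an assumption of this sketch)
    return [(value - mu) / sigma for value in spectrum]


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor=0.5 is the usual default, assumed here."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)


def msavi(nir, red):
    """Modified SAVI in its closed form (often called MSAVI2), assumed here."""
    term = 2.0 * nir + 1.0
    return (term - math.sqrt(term ** 2 - 8.0 * (nir - red))) / 2.0


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return (nir - red_edge) / (nir + red_edge)


def sipi(nir, blue, red):
    """Structure Insensitive Pigment Index."""
    return (nir - blue) / (nir - red)


# Hypothetical canopy reflectances for one pixel (not data from the study).
pixel = {"blue": 0.04, "red": 0.05, "red_edge": 0.30, "nir": 0.50}

indices = {
    "SAVI": savi(pixel["nir"], pixel["red"]),
    "MSAVI": msavi(pixel["nir"], pixel["red"]),
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
    "SIPI": sipi(pixel["nir"], pixel["blue"], pixel["red"]),
}

# Hypothetical five-band spectrum, scaled so its mean is 0 and its sample std is 1.
scaled_spectrum = snv([0.04, 0.05, 0.30, 0.45, 0.50])
```

SNV is applied per spectrum rather than per band, so each pixel is normalized against its own brightness, which is why it is often used to reduce scatter and illumination differences before regression.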

Results

Among the five preprocessing methods, the SNV-PLSR model performed best, with a prediction-set coefficient of determination Rp² = 0.7831 and a root mean square error of prediction RMSEP = 0.0325. The optimal SNV-(SAVI-MSAVI-NDRE-SIPI-ROP)-(BiLSTM-GS) model, which combines PSAMA-selected feature bands with the vegetation indices and ROP, achieved the highest prediction accuracy (Rp² = 0.8795, RMSEP = 0.1021).
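For readers unfamiliar with the two reported metrics, the following minimal sketch shows how a prediction-set coefficient of determination and RMSEP are commonly computed from measured and predicted values. It assumes the standard definition of the coefficient of determination as one minus the ratio of residual to total sum of squares; the abstract does not say which variant the authors used, and the yield values below are hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out prediction set."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2_prediction(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted yields (not data from the study).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

score = r2_prediction(measured, predicted)
error = rmsep(measured, predicted)
```

RMSEP is expressed in the units of the target variable, so values are only comparable between models that predict the same quantity on the same scale.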

Discussion

This airborne hyperspectral canopy-based estimation technology provides an accurate, non-destructive solution for monitoring ginkgo leaf yield in field cultivation.
