Discrimination accuracy of haploid and diploid maize seeds using NIR spectroscopy coupled with different machine learning algorithms and data pretreatment methods

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

Taylor & Francis Inc

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Spectral data collected at the single-seed level allow both the biochemical content of a seed sample to be determined and its seed class to be identified. NIR (near-infrared) spectroscopy offers a more precise way to distinguish haploid from diploid maize seeds than traditional visual examination. In this study, classification models for separating haploid and diploid maize seeds were developed from spectra collected between 900 and 1700 nm on single seeds. The sample set comprised 427 diploid and 311 haploid seeds obtained by crossing 10 donor materials with 3 inducer lines and sorted by eye according to the Navajo marker. Robust PCA (Principal Component Analysis) was used to detect spectral outliers. Spectra were modeled either raw or after FD (First Derivative), SD (Second Derivative), or SNV (Standard Normal Variate) pretreatment, and after pairwise combinations of these pretreatments. Four machine learning methods were applied: Logistic Regression, Support Vector Machine with a linear kernel (SVM-C), Random Forest, and XGBoost. Model performance was assessed with Sensitivity, Specificity, Recall, F1-Score, and Accuracy. The boosting model (XGBoost) performed best, reaching 94.9% accuracy, 95.1% sensitivity, 94% specificity, and an F1-Score of 96% on raw reflectance data. This result shows that high classification accuracy can be achieved without additional preprocessing steps. SD pretreatment proved unsuitable for intact-seed spectra, whereas SNV and FD improved the classification success of the other modeling techniques. Overall, the Boosting-Raw combination is a powerful and feasible method for classifying haploid and diploid samples.
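
The pretreatments and evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the abstract does not say how the derivatives were computed, so plain finite differences are assumed here, and treating the haploid class as the positive label (`1`) is likewise an assumption. All function names are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own
    mean and sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def first_derivative(spectrum):
    """Assumed FD: difference between adjacent points (one point shorter)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def second_derivative(spectrum):
    """Assumed SD: the finite difference applied twice (two points shorter)."""
    return first_derivative(first_derivative(spectrum))


def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity (recall), specificity, F1 and accuracy for two classes.
    Assumes both classes occur and at least one positive is predicted,
    so no denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
    }


# A pairwise ("binary") combination, e.g. SNV followed by the first derivative.
toy_spectrum = [0.10, 0.20, 0.40, 0.70]  # made-up reflectance values
snv_then_fd = first_derivative(snv(toy_spectrum))
```

In practice, derivatives of NIR spectra are often computed with a smoothing filter rather than raw differences, because differencing amplifies noise; the sketch only shows what each transformation does to a single spectrum.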

Description

Keywords

Classification, machine learning algorithm, Zea mays

Source

Spectroscopy Letters

WoS Q Value

Q3

Scopus Q Value

Q3

Volume

Issue

Citation