Discrimination accuracy of haploid and diploid maize seeds using NIR spectroscopy coupled with different machine learning algorithms and data pretreatment methods

dc.authoridKahrıman, Fatih / 0000-0001-6944-0512
dc.authoridPolat, Adem / 0000-0002-5662-4141
dc.authoridTiryaki, Ali Murat / 0000-0001-8224-6319
dc.authoridEskizeybek, Volkan / 0000-0002-5373-0379
dc.authoridFidan, Sertuğ / 0000-0002-3458-7618
dc.authoridSongur, Umut / 0000-0001-7035-9607
dc.contributor.authorKahrıman, Fatih
dc.contributor.authorPolat, Adem
dc.contributor.authorTiryaki, Ali Murat
dc.contributor.authorEskizeybek, Volkan
dc.contributor.authorFidan, Sertuğ
dc.contributor.authorSongur, Umut
dc.date.accessioned2025-05-29T02:57:47Z
dc.date.available2025-05-29T02:57:47Z
dc.date.issued2025
dc.departmentÇanakkale Onsekiz Mart Üniversitesi
dc.description.abstractSpectral data collected at the single-seed level allow both determination of the biochemical content of a seed sample and identification of its seed class. NIR (near-infrared) spectroscopy provides a more precise method for differentiating haploid and diploid maize seeds than traditional visual examination. In this study, classification models for separating haploid and diploid maize seeds were developed using spectra collected over the 900 to 1700 nm wavelength range from single maize seeds. A total of 427 diploid and 311 haploid samples were used; they were obtained by crossing 10 donor materials with 3 inducer lines and were sorted by eye according to the Navajo marker. The robust PCA (Principal Component Analysis) method was used to detect spectral outliers. Spectral data were used either raw (no pretreatment) or after FD (First Derivative), SD (Second Derivative), SNV (Standard Normal Variate), or pairwise combinations of these pretreatments. Logistic Regression, Support Vector Machine with a linear kernel (SVM-C), Random Forest, and XGBoost were employed as machine learning techniques. Model performance was assessed using Sensitivity, Specificity, Recall, F1-Score, and Accuracy. The boosting method (XGBoost) performed best, with 94.9% accuracy, 95.1% sensitivity, 94% specificity, and an F1-Score of 96%, particularly when using raw reflectance data. These results show that high classification accuracy can be achieved from raw data without additional preprocessing steps. SD preprocessing was found to be unsuitable for intact seed spectra, whereas SNV and FD improved the classification success of the other modeling techniques. The study shows that the combination of boosting and raw spectra is a powerful and feasible method for classifying haploid and diploid samples.
dc.description.sponsorshipTUBITAK [221N269, 221N418]
dc.description.sponsorshipThis study was supported by TUBITAK within the scope of 1071 program [Project number: 221N269 (221N418)].
dc.identifier.doi10.1080/00387010.2025.2501766
dc.identifier.issn0038-7010
dc.identifier.issn1532-2289
dc.identifier.scopus2-s2.0-105005519188
dc.identifier.scopusqualityQ3
dc.identifier.urihttps://doi.org/10.1080/00387010.2025.2501766
dc.identifier.urihttps://hdl.handle.net/20.500.12428/30178
dc.identifier.wosWOS:001488670900001
dc.identifier.wosqualityQ3
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherTaylor & Francis Inc
dc.relation.ispartofSpectroscopy Letters
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Academic Staff
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.snmzKA_WOS_20250529
dc.subjectClassification
dc.subjectmachine learning algorithm
dc.subjectZea mays
dc.titleDiscrimination accuracy of haploid and diploid maize seeds using NIR spectroscopy coupled with different machine learning algorithms and data pretreatment methods
dc.typeArticle
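
The pretreatments and metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the first derivative is a plain finite difference (the study may well use a smoothed derivative), and treating one class as "positive" is an assumption made only for illustration.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation; fails on a constant spectrum
    return [(x - m) / s for x in spectrum]


def first_derivative(spectrum):
    """Finite difference between neighbouring wavelengths (assumed form of FD)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, F1 and accuracy from paired class labels.

    Assumes both classes occur in y_true and that at least one sample is
    predicted positive; otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # identical to recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / len(pairs)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }
```

A pairwise combination such as SNV followed by FD would then be `first_derivative(snv(spectrum))`, applied to each seed spectrum before it is passed to a classifier.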

Files

Original bundle
Name:
Fatih Kahriman_Makale.pdf
Size:
1.61 MB
Format:
Adobe Portable Document Format