A novel filter feature selection method for text classification: Extensive Feature Selector
Date
Early Access
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
SAGE Publications
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Because the high dimensionality of textual data limits classification accuracy, it is essential to apply feature selection (FS) methods as a
dimension reduction step in the text classification (TC) domain. Most FS methods for TC are built from several probabilities.
In this study, we propose a new FS method named Extensive Feature Selector (EFS), which uses both corpus-based and class-based probabilities in its calculations. The performance of EFS is compared with that of nine well-known FS methods, namely Chi-Squared
(CHI2), Class Discriminating Measure (CDM), Discriminative Power Measure (DPM), Odds Ratio (OR), Distinguishing Feature
Selector (DFS), Comprehensively Measure Feature Selection (CMFS), Discriminative Feature Selection (DFSS), Normalised Difference
Measure (NDM) and Max–Min Ratio (MMR), using Multinomial Naive Bayes (MNB), Support Vector Machine (SVM) and k-Nearest
Neighbour (KNN) classifiers on four benchmark data sets: Reuters-21578, 20-Newsgroup, Mini 20-Newsgroup
and Polarity. The experiments were carried out for six feature sizes: 10, 30, 50, 100, 300 and 500. Experimental
results show that EFS outperforms the other nine methods in most cases according to micro-F1 and macro-F1 scores.
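The abstract does not state the EFS formula, so the following is only a minimal sketch of the general filter-FS workflow it describes, using Chi-Squared (CHI2), one of the nine baselines, as the scoring function. The function names (`chi2_scores`, `select_top_k`, `micro_macro_f1`), the toy corpus, and the choice to aggregate per-class scores by taking the maximum are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter, defaultdict


def chi2_scores(docs, labels):
    """Score each term by its maximum chi-squared statistic over the classes.

    docs: list of token lists; labels: one class label per document.
    Max-over-classes aggregation is an assumption of this sketch.
    """
    n = len(docs)
    class_counts = Counter(labels)
    # df[term][cls] = number of documents of class cls that contain term
    df = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df[term][label] += 1

    scores = {}
    for term, per_class in df.items():
        term_total = sum(per_class.values())
        best = 0.0
        for cls, n_cls in class_counts.items():
            a = per_class[cls]       # term present, document in cls
            b = term_total - a       # term present, document not in cls
            c = n_cls - a            # term absent, document in cls
            d = n - n_cls - b        # term absent, document not in cls
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def micro_macro_f1(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro-F1 averages per-class F1; micro-F1 pools the counts first.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Hypothetical toy corpus: "wheat" and "oil" separate the classes perfectly,
# while "price", "export", "rise" and "fall" occur equally in both.
docs = [
    ["wheat", "price", "rise"],
    ["wheat", "export", "fall"],
    ["oil", "price", "rise"],
    ["oil", "export", "fall"],
]
labels = ["grain", "grain", "crude", "crude"]

top_terms = select_top_k(chi2_scores(docs, labels), k=2)
print(top_terms)
```

In a setup like the one reported, the selected terms would define the reduced vocabulary passed to a classifier such as MNB, SVM or KNN, with `k` taking the values 10, 30, 50, 100, 300 and 500. Macro-F1 weights every class equally, whereas micro-F1 is dominated by the frequent classes, which is why both are commonly reported on skewed collections.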
Description
Keywords
Dimension Reduction, Feature Selection, Text Classification
Source
Journal of Information Science
WoS Q Value
Q1
Scopus Q Value
Volume
Early Access
Issue
Published Online: 04.2021
Citation
Parlak, B., & Uysal, A. K. (2021). A novel filter feature selection method for text classification: Extensive feature selector. Journal of Information Science, doi:10.1177/0165551521991037