HYBRID TEXT CLASSIFICATION BASED ON TF-IDF AND ADAPTIVE ALSHE ENSEMBLE

Authors

  • Rakhmanov Askar, Department of "System and Applied Programming", Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
  • Abduvalieva Zebiniso, Department of "System and Applied Programming", Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
  • Murodov D.D., Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

Keywords:

Text classification, TF-IDF, machine learning, NLP, LinearSVC, Naive Bayes, ensemble, ALSHE, Macro F1.

Abstract

This article investigates multi-class classification of technical texts. The experiments used the LocalDocs-10 corpus, compiled from software package descriptions and partitioned into 10 thematic classes. Texts were represented by TF-IDF features over word-level and character-level n-grams, alongside compact SVD-derived features. Classical machine learning algorithms were compared with several hybrid approaches, with special emphasis on the adaptive ALSHE-Gated model, which combines Complement Naive Bayes and LinearSVC through a confidence-driven switching mechanism. Among the baseline models, the Passive-Aggressive Classifier achieved the highest Accuracy and Macro F1-score. These findings indicate that lightweight TF-IDF models are a viable alternative to computationally intensive neural networks for small- to medium-sized corpora.
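The confidence-driven switching between Complement Naive Bayes and LinearSVC can be illustrated with a minimal sketch. The abstract does not specify the gating rule, so everything here is an assumption: the function name `gated_predict`, the threshold value of 0.8, and the choice to trust Naive Bayes first and fall back to the SVM are hypothetical and may differ from the ALSHE-Gated model itself. The inputs are assumed to be per-class probabilities and per-class decision scores for a single document, such as those scikit-learn exposes through `ComplementNB.predict_proba` and `LinearSVC.decision_function`.

```python
from typing import Sequence


def gated_predict(
    nb_proba: Sequence[float],
    svm_scores: Sequence[float],
    threshold: float = 0.8,
) -> int:
    """Return a class index using a confidence gate (hypothetical rule).

    nb_proba   -- class probabilities from a Complement Naive Bayes model
    svm_scores -- per-class decision scores from a LinearSVC model
    threshold  -- assumed confidence cut-off, not taken from the article
    """
    if len(nb_proba) != len(svm_scores):
        raise ValueError("both models must score the same set of classes")

    # Index of the class that Naive Bayes considers most likely.
    nb_best = max(range(len(nb_proba)), key=nb_proba.__getitem__)

    # Accept the Naive Bayes answer only when it is confident enough.
    if nb_proba[nb_best] >= threshold:
        return nb_best

    # Otherwise switch to the class with the highest SVM decision score.
    return max(range(len(svm_scores)), key=svm_scores.__getitem__)
```

In this sketch a document with Naive Bayes probabilities `[0.9, 0.05, 0.05]` keeps the Naive Bayes label, while one with `[0.5, 0.3, 0.2]` is routed to the SVM. In practice the threshold would be tuned on a validation split rather than fixed in advance.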

Published

2026-05-05

How to Cite

HYBRID TEXT CLASSIFICATION BASED ON TF-IDF AND ADAPTIVE ALSHE ENSEMBLE. (2026). Eurasian Journal of Academic Research, 6(5), 23-27. https://www.in-academy.uz/index.php/EJAR/article/view/39714
Innovative Academy RSC