What is Text Classification
Automatic text categorization
Text classification is the machine learning task of automatically assigning categories or labels to texts based on their content.
Classification Types
- Binary: two classes (for example, spam / not spam)
- Multi-class: three or more mutually exclusive classes; each text receives exactly one
- Multi-label: each text can receive several labels at once
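The three types differ mainly in how the target is represented. The sketch below shows one common encoding for each; the topic names and helper functions are hypothetical and exist only for illustration.

```python
# Hypothetical label set, used only for illustration.
TOPICS = ["billing", "shipping", "returns"]


def encode_multiclass(label, classes):
    """Multi-class: exactly one class per text, stored as a single index."""
    return classes.index(label)


def encode_multilabel(labels, classes):
    """Multi-label: any subset of classes, stored as a multi-hot vector."""
    return [1 if c in labels else 0 for c in classes]


# Binary: a single 0/1 target (here 1 = spam, 0 = not spam).
is_spam = 1

print(encode_multiclass("shipping", TOPICS))              # 1
print(encode_multilabel({"billing", "returns"}, TOPICS))  # [1, 0, 1]
```

A multi-class model typically picks the single highest-scoring class, while a multi-label model usually makes an independent yes/no decision per label.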
Methods
- Traditional ML: Naive Bayes, SVM, Random Forest
- Deep Learning: LSTM and CNN models applied to text
- Transformers: BERT, RoBERTa, GPT
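As an illustration of the traditional approach, here is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. It assumes whitespace tokenization and add-one (Laplace) smoothing; the training texts, labels, and function names are invented for the example, and a production system would more likely rely on an established library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenization; an assumption made to keep the sketch short.
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log prior plus summed log likelihoods."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the score.
            score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data for illustration only.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not_spam"),
    ("please review the project report", "not_spam"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "free prize money"))           # spam
print(predict(model, "project meeting on monday"))  # not_spam
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.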
Business Applications
- Spam and unwanted content filtering
- Support ticket routing
- Document categorization
- Sentiment analysis of reviews
- News topic detection
Quality Metrics
- Accuracy, Precision, Recall
- F1-score (harmonic mean of precision and recall)
- AUC-ROC for binary classification
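The metrics listed above can be computed directly from predictions. The following is a small sketch for the binary case in plain Python; the function names and sample data are assumptions made for the example. AUC-ROC is computed here through its pairwise interpretation: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels, model scores, and predictions thresholded at 0.5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
print(metrics["accuracy"])      # 0.625
print(roc_auc(y_true, scores))  # 0.8125
```

The pairwise AUC is quadratic in the number of examples, which is fine for a demonstration but slow on large datasets; rank-based implementations are the usual choice there.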