A state-of-the-art spam message classification model built on the RoBERTa-base transformer architecture, achieving 99.41% accuracy and a 0.9782 F1-score for the spam class on the held-out test set. Developed as the core spam detection component for Amy, an intelligent Discord moderation bot.
This model is a fine-tuned version of FacebookAI/roberta-base for binary spam classification in messaging applications. The classifier accurately distinguishes between legitimate messages (ham) and spam/phishing content, making it suitable for real-world deployment in messaging platforms and content moderation systems.
- Developed by: roshana1s
- Model type: Binary Sequence Classification
- Language: English
- License: Apache-2.0
- Base Model: FacebookAI/roberta-base
- Primary Use Case: Discord bot moderation and real-time spam detection
- Transformer-based Architecture: Built on RoBERTa-base for superior text understanding
- High Performance: 0.9782 F1-score for spam detection, 99.41% overall accuracy
- Hyperparameter Optimization: Automated tuning using the Optuna framework (25 trials)
- Class Imbalance Handling: Successfully addressed through a weighted loss function
- URL Bias Mitigation: Enhanced with real-world ham messages containing links
- Comprehensive Evaluation: Evaluated on a completely unseen test set
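For quick experimentation, the classifier can be loaded with the standard `transformers` pipeline API. The snippet below is a minimal sketch; the repository ID is a placeholder and should be replaced with this model's actual Hub ID.

```python
from transformers import pipeline

# Placeholder model ID; replace with the actual Hub repository for this card.
classifier = pipeline("text-classification", model="roshana1s/spam-detection-roberta")

print(classifier("Congratulations! You have won a free prize, click the link now!"))
# Output labels depend on the model's id2label config (e.g. spam vs. ham, or LABEL_1 vs. LABEL_0).
```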
Discord helper (optional): include this helper when targeting Discord; it normalizes invites, mentions, and custom emoji before tokenization to improve robustness in chat contexts.
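A minimal sketch of such a helper is shown below, assuming regex-based normalization; the function name `normalize_discord`, the placeholder tokens, and the patterns are illustrative rather than the exact preprocessing used during training.

```python
import re

def normalize_discord(text: str) -> str:
    """Illustrative Discord-specific normalization applied before tokenization."""
    text = re.sub(r"(?:https?://)?discord(?:\.gg|(?:app)?\.com/invite)/\S+", "<invite>", text)
    text = re.sub(r"<@!?\d+>", "<user>", text)       # user mentions
    text = re.sub(r"<@&\d+>", "<role>", text)        # role mentions
    text = re.sub(r"<#\d+>", "<channel>", text)      # channel mentions
    text = re.sub(r"<a?:\w+:\d+>", "<emoji>", text)  # custom (and animated) emoji
    return re.sub(r"\s+", " ", text).strip()

# Example: normalize_discord("Hey <@123>, join https://discord.gg/abc <:pog:456>")
# -> "Hey <user>, join <invite> <emoji>"
```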
Messaging Platforms
- Discord bot moderation (Primary use case)
- SMS filtering systems
- Chat application content filtering
- Non-English language spam detection (trained exclusively on English data)
- Sentiment analysis or other NLP tasks beyond binary spam classification
The model was trained on a combination of two datasets:
1. SMS Spam Collection Dataset - UCI Machine Learning Repository
2. Discord Text Messages - a manually collected dataset of real Discord messages containing both ham and spam samples (created to mitigate URL bias)
Preprocessing Steps:
1. Label encoding (ham → 0, spam → 1)
2. Text cleaning and normalization with Discord-specific preprocessing
3. Train/validation/test split (70/15/15)
4. Tokenization with the RoBERTa tokenizer
5. Dynamic padding and truncation
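A sketch of these steps using the `datasets` library and the RoBERTa tokenizer is shown below; the toy examples and split seed are placeholders, while the 70/15/15 proportions, 128-token truncation, and dynamic padding follow the list above.

```python
from datasets import Dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")

# 1-2. Label-encoded (ham -> 0, spam -> 1), cleaned messages as a toy in-memory dataset.
data = Dataset.from_dict({
    "text": [
        "ok see you at 5",
        "WIN a FREE iPhone now, click http://spam.example",
        "can you send the notes from class?",
        "URGENT: your account is suspended, verify here",
        "lunch tomorrow?",
        "thanks, that worked",
    ],
    "label": [0, 1, 0, 1, 0, 0],
})

# 3. 70/15/15 split: hold out 30%, then split the held-out part in half.
splits = data.train_test_split(test_size=0.3, seed=42)
held_out = splits["test"].train_test_split(test_size=0.5, seed=42)
train_ds, val_ds, test_ds = splits["train"], held_out["train"], held_out["test"]

# 4-5. Tokenize with truncation at 128 tokens; padding is applied per batch.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_ds = train_ds.map(tokenize, batched=True)
val_ds = val_ds.map(tokenize, batched=True)
test_ds = test_ds.map(tokenize, batched=True)
collator = DataCollatorWithPadding(tokenizer)  # dynamic padding at collation time
```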
Automated hyperparameter search using the Optuna framework (25 trials):
Search Space:
- Dropout rates: Hidden dropout (0.1-0.3), Attention dropout (0.1-0.2)
- Learning rate: 1e-5 to 5e-5
- Weight decay: 0.0 to 0.1
- Batch size: 8, 16, or 32
- Gradient accumulation steps: 1 to 4
- Training epochs: 2 to 5
- Warmup ratio: 0.05 to 0.1 (for learning rate scheduling)
Best Parameters Found (Trial 6/25):
- Hidden dropout: 0.10069482002001506
- Attention dropout: 0.12460257350587067
- Learning rate: 4.976184540342024e-05
- Weight decay: 0.04490021845024478
- Batch size: 16
- Gradient accumulation steps: 4
- Epochs: 4
- Warmup ratio: 0.07622459860163384
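A sketch of how such a search can be wired up with `Trainer.hyperparameter_search` and the Optuna backend appears below; only the ranges mirror the search space listed above, while `train_ds`, `val_ds`, and `compute_metrics` are assumed to be defined as in the other sketches on this card.

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # Hidden/attention dropout (also part of the search space) would be set on the
    # model config here from trial parameters; omitted for brevity.
    return AutoModelForSequenceClassification.from_pretrained(
        "FacebookAI/roberta-base", num_labels=2
    )

def optuna_hp_space(trial):
    # Mirrors the search space listed above.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
        "gradient_accumulation_steps": trial.suggest_int("gradient_accumulation_steps", 1, 4),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 2, 5),
        "warmup_ratio": trial.suggest_float("warmup_ratio", 0.05, 0.1),
    }

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hp-search", eval_strategy="epoch"),
    train_dataset=train_ds,
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
)

best_run = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=optuna_hp_space,
    n_trials=25,
)
print(best_run.hyperparameters)
```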
1. Data Preprocessing: SMS text cleaning and label encoding
2. Tokenization: Dynamic padding with a maximum sequence length of 128 tokens
3. Class Balancing: Weighted loss function to handle the imbalanced dataset
4. Hyperparameter Optimization: Optuna-based automated tuning
5. Evaluation: Comprehensive metrics on the held-out test set
- Optimizer: AdamW
- Loss Function: Weighted cross-entropy (handles class imbalance)
- Label Smoothing: 0.1 (prevents overconfidence)
- Learning Rate Schedule: Linear warmup followed by linear decay
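One way to realize this configuration is to override `Trainer.compute_loss` so that class weights and label smoothing are applied together, as in the sketch below; the subclass name and example weight values are illustrative (AdamW and the linear warmup/decay schedule are `Trainer` defaults when `warmup_ratio` is set).

```python
import torch
from torch import nn
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    """Trainer variant applying class-weighted cross-entropy with label smoothing (illustrative)."""

    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        # e.g. torch.tensor([0.58, 3.73]); actual values depend on the label distribution
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        weight = (
            self.class_weights.to(outputs.logits.device)
            if self.class_weights is not None
            else None
        )
        # Label smoothing of 0.1 discourages overconfident predictions.
        loss_fct = nn.CrossEntropyLoss(weight=weight, label_smoothing=0.1)
        loss = loss_fct(outputs.logits.view(-1, outputs.logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```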
| Metric | Score |
|--------|-------|
| Overall Accuracy | 99.41% |
| Weighted F1-Score | 0.9941 |
| Spam F1-Score | 0.9782 |
| Spam Precision | 96.55% |
| Spam Recall | 99.12% |
| Ham Precision | 99.86% |
| Ham Recall | 99.45% |
|             | Predicted Ham | Predicted Spam |
|-------------|---------------|----------------|
| Actual Ham  | 725           | 4              |
| Actual Spam | 1             | 112            |
- True Positives: 112 spam messages correctly identified
- True Negatives: 725 ham messages correctly identified
- False Positives: 4 ham messages incorrectly flagged as spam
- False Negatives: 1 spam message missed
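The per-class figures above can be reproduced from raw predictions with `scikit-learn`; the `compute_metrics` function below is a sketch of one way to do so and is the form referenced by the other sketches on this card.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Per-class precision/recall/F1 for ham (0) and spam (1).
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, labels=[0, 1], zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "weighted_f1": precision_recall_fscore_support(labels, preds, average="weighted")[2],
        "ham_precision": precision[0], "ham_recall": recall[0], "ham_f1": f1[0],
        "spam_precision": precision[1], "spam_recall": recall[1], "spam_f1": f1[1],
    }

# Confusion matrix with rows = actual (ham, spam) and columns = predicted (ham, spam):
# confusion_matrix(labels, preds, labels=[0, 1])
```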
> Strong Generalization: All performance metrics are computed on a completely unseen test set (15% of the data) that was never used during training or hyperparameter tuning, so the reported numbers reflect real-world generalization rather than overfitting.
Challenge: During initial training, the model became overconfident and labeled almost all messages containing URLs as spam, even when they were legitimate ham.
Solution: Augmented the training data with additional real ham messages containing links, collected from Discord servers. This helps the model learn that URLs can appear in non-spam messages and improves generalization at inference time, which is particularly important for Discord bot deployment, where legitimate messages often contain links.
Class Imbalance Handling (SUCCESSFULLY ADDRESSED)
Challenge: The combined dataset is naturally imbalanced, with ham messages substantially outnumbering spam.
Solution: Implemented a weighted loss function during training to handle the imbalance effectively, resulting in strong performance on both classes.
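The class weights themselves can be derived from the training label distribution, for example with scikit-learn's balanced heuristic as sketched below; this particular heuristic is an assumption, not necessarily the exact weighting scheme used.

```python
import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight

# `train_ds` is assumed to be the training split from the preprocessing sketch.
train_labels = np.array(train_ds["label"])
weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=train_labels)
class_weights = torch.tensor(weights, dtype=torch.float)  # passed to WeightedLossTrainer above
```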
Challenge: Ensuring the model generalizes well to unseen data.
Solution: Comprehensive evaluation on a completely held-out test set (15% of the data) that was never used during training or hyperparameter tuning, demonstrating strong generalization (99.41% accuracy on unseen data).
- Language Limitation: Model performance is optimized for English text only
- SMS Format: Trained on SMS-style messages; may require adaptation for other formats (e.g., formal business emails)
- Python: 3.8+
- Framework: PyTorch, Hugging Face Transformers
If you use this model in your research or application, please cite: