# Daksh0505
## sentiment-model-imdb
This repository contains two deep learning models for sentiment classification of IMDB movie reviews, each trained with a different vocabulary size and parameter count.

### Training Data

- The models were trained on approximately 150,000 IMDB movie reviews, manually scraped from the web.
- The reviews were pseudo-labeled using soft probability outputs from the `cardiffnlp/twitter-roberta-base-sentiment` model.
- This provided probabilistic sentiment labels (Negative / Neutral / Positive) for training, allowing the models to learn from soft targets rather than hard class labels.

### 🔹 Model A

- Filename: `sentimentmodelimdb6.6M.keras`
- Trainable Parameters: ~6.6 million
- Total Parameters: ~13.06 million
- Vocabulary Size: 50,000 tokens
- Description: Lightweight and efficient; optimized for speed.

### 🔹 Model B

- Filename: `sentimentmodelimdb34M.keras`
- Trainable Parameters: ~34 million
- Total Parameters: ~99.43 million
- Vocabulary Size: 256,000 tokens
- Description: Larger and more expressive; higher accuracy on nuanced reviews.

### Tokenizers

Each model uses its own tokenizer in Keras JSON format:

- `tokenizer50k.json` → used with Model A
- `tokenizer256k.json` → used with Model B

### 🧠 Load Models & Tokenizers (from Hugging Face Hub)

Test both models live in your browser: [sentiment-model-comparison](https://huggingface.co/spaces/Daksh0505/sentiment-model-comparison)
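Training on soft pseudo-labels means the loss compares the model's predicted distribution against the teacher's full probability vector rather than a one-hot class. A minimal NumPy sketch of that soft-target cross-entropy (the function name and example probabilities are illustrative, not taken from the repository):

```python
import numpy as np

def soft_cross_entropy(pred_probs, target_probs, eps=1e-9):
    """Cross-entropy against soft (probabilistic) targets, averaged over the batch."""
    pred = np.clip(pred_probs, eps, 1.0)  # guard against log(0)
    return float(-np.mean(np.sum(target_probs * np.log(pred), axis=-1)))

# Soft pseudo-labels (Negative / Neutral / Positive) from the teacher model
teacher = np.array([[0.05, 0.15, 0.80],
                    [0.70, 0.20, 0.10]])
# Student model's predicted probabilities for the same two reviews
student = np.array([[0.10, 0.20, 0.70],
                    [0.60, 0.25, 0.15]])

loss = soft_cross_entropy(student, teacher)
```

Unlike hard labels, the soft targets preserve the teacher's uncertainty (e.g. a review that is 80% positive but 15% neutral), which the student can exploit during training.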
## Seq2Seq-LSTM-MultiHeadAttention
### Seq2Seq LSTM with Multi-Head Attention for English → Hindi Translation

This model performs English-to-Hindi translation using a Seq2Seq architecture with an LSTM-based encoder-decoder and multi-head cross-attention. The attention mechanism helps the decoder focus on relevant parts of the input sentence during translation.

- Architecture: BiLSTM Encoder + LSTM Decoder + Multi-Head Cross-Attention
- Task: Language Translation (English → Hindi)
- License: Open for research and demonstration purposes (educational use)

| Model   | Parameters | Vocab Size | Training Data                     | Repository                           |
|---------|------------|------------|-----------------------------------|--------------------------------------|
| Model A | 12M        | 50k        | 20k English-Hindi sentence pairs  | seq2seq-lstm-multiheadattention-12.3 |
| Model B | 42M        | 256k       | 100k English-Hindi sentence pairs | seq2seq-lstm-multiheadattention-42   |

- Model A is smaller and performs well on the dataset it was trained on.
- Model B has higher capacity but needs more data for robust generalization.

### Intended Uses

- Demonstration and educational purposes
- Understanding Seq2Seq + Attention mechanisms
- Translating English sentences to Hindi
- Feature extraction: encoder outputs can serve as contextual embedding vectors that capture sentence-level semantics for downstream NLP tasks

### Out-of-Scope Uses

- High-stakes or production translation systems without further fine-tuning
- Handling very large or domain-specific datasets without retraining

### Evaluation

- Evaluated qualitatively on selected test sentences
- Model A: good accuracy on short, simple sentences
- Model B: may require larger datasets for generalization

BLEU or other quantitative metrics can be added if evaluation is performed.
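The cross-attention step described above lets each decoder state compute a weighted mixture of the encoder outputs, split across several heads. A self-contained NumPy sketch of multi-head cross-attention; the dimensions and random projection matrices stand in for the model's learned weights and are assumptions, not the repository's actual shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(queries, keys_values, num_heads, rng):
    """Decoder states (queries) attend over encoder outputs (keys/values)."""
    t_dec, d = queries.shape
    t_enc, _ = keys_values.shape
    head_dim = d // num_heads
    # Random projections stand in for the learned Q/K/V weight matrices.
    wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    q = (queries @ wq).reshape(t_dec, num_heads, head_dim).transpose(1, 0, 2)
    k = (keys_values @ wk).reshape(t_enc, num_heads, head_dim).transpose(1, 0, 2)
    v = (keys_values @ wv).reshape(t_enc, num_heads, head_dim).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, t_dec, t_enc)
    context = softmax(scores) @ v                          # (heads, t_dec, head_dim)
    # Concatenate the heads back into one vector per decoder step.
    return context.transpose(1, 0, 2).reshape(t_dec, d)

rng = np.random.default_rng(0)
enc_out = rng.standard_normal((7, 64))    # encoder outputs: 7 source tokens
dec_states = rng.standard_normal((5, 64)) # decoder states: 5 target steps
ctx = multi_head_cross_attention(dec_states, enc_out, num_heads=4, rng=rng)
```

Each row of `ctx` is the context vector the decoder would combine with its hidden state before predicting the next Hindi token.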
### Training Dataset

- Source: Collected English-Hindi parallel sentences
- Size:
  - Model A: 20k sentence pairs
  - Model B: 100k sentence pairs
- Preprocessing: tokenization, padding, and start/end-of-sequence tokens
- Dataset: for further fine-tuning, the training dataset is available in this model card

### Limitations

- The larger model may underperform if trained on small datasets
- Handles only sentence-level translation; not optimized for paragraphs
- May produce incorrect translations for rare or out-of-vocabulary words
- The larger model was trained for only 1 epoch, so do not use it without fine-tuning on your own dataset

### Step-by-Step Prediction Example

For encoder-decoder inference, visit Daksh0505/Seq2Seq-LSTM-MultiHeadAttention-Translation
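Step-by-step prediction in a Seq2Seq model is a loop: start from the start-of-sequence token, run one decoder step, feed the predicted token back in, and stop at the end-of-sequence token. A greedy-decoding sketch with a stubbed decoder step; the token ids and the `fake_step` stub are illustrative assumptions, not the repository's real inference code:

```python
import numpy as np

SOS, EOS = 1, 2  # assumed ids for the start/end-of-sequence tokens

def greedy_decode(decoder_step, enc_out, max_len=20):
    """Feed each predicted token back into the decoder until EOS (or max_len)."""
    token, state, out = SOS, None, []
    for _ in range(max_len):
        logits, state = decoder_step(token, state, enc_out)
        token = int(np.argmax(logits))  # greedy: pick the most likely token
        if token == EOS:
            break
        out.append(token)
    return out

# Toy stub standing in for the real LSTM + attention decoder step:
# it emits token 5, then 7, then EOS.
script = iter([5, 7, EOS])
def fake_step(token, state, enc_out):
    logits = np.zeros(10)
    logits[next(script)] = 1.0
    return logits, state

result = greedy_decode(fake_step, enc_out=None)
print(result)  # → [5, 7]
```

With the real models, `decoder_step` would run one step of the LSTM decoder with cross-attention over the encoder outputs, and the returned ids would be mapped back to Hindi tokens through the tokenizer.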