fc63
Gender Prediction Model From Text
toxic-classification-model
This model is a fine-tuned version of `dbmdz/bert-base-turkish-cased` for toxicity detection in Turkish text. It has been trained on labeled datasets of online comments categorized by their toxicity levels. The model uses the Hugging Face transformers library and is suitable for sequence classification tasks. This work was completed as a project assignment for the Natural Language Processing (CENG493) course at Çankaya University.

- Model Type: Sequence Classification
- Language(s): Turkish
- License: GNU General Public License
- Fine-tuned from: `dbmdz/bert-base-turkish-cased`

This model can be used directly to analyze the toxicity of Turkish text. For example:

- Content moderation in online forums and social media platforms
- Filtering harmful language in customer reviews or feedback
- Monitoring and preventing cyberbullying in messaging applications
- Integrating toxic language filtering into chatbots or virtual assistants
- Using it as part of a sentiment analysis pipeline

Limitations:

- Not suitable for analyzing languages other than Turkish
- Should not be used for sensitive decision-making without human oversight

The model may inherit biases from the training data, including over- or underrepresentation of certain demographics or topics. It may also misclassify non-toxic content as toxic or fail to detect subtler forms of toxicity.

Recommendations:

- Avoid deploying the model in high-stakes scenarios without additional validation.
- Regularly monitor performance and update the model if new biases are detected.

The model was evaluated on a held-out test set containing a balanced mix of toxic and non-toxic examples.
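As a usage sketch, the sequence-classification head described above produces one logit per class; a minimal post-processing sketch, assuming a binary head with label order `[non-toxic, toxic]` and hypothetical logits (actual inference would load the fine-tuned checkpoint via the Hugging Face transformers library):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, threshold=0.5):
    """Return ('toxic' | 'non-toxic', toxic-class probability).

    Label order (index 0 = non-toxic, index 1 = toxic) is an assumption,
    not taken from the model card.
    """
    probs = softmax(logits)
    label = "toxic" if probs[1] >= threshold else "non-toxic"
    return label, probs[1]

# Example with hypothetical logits where the toxic class dominates.
label, p_toxic = classify([-1.2, 2.3])
print(label, round(p_toxic, 3))  # -> toxic 0.971
```

The threshold can be raised above 0.5 when false positives (non-toxic comments flagged as toxic) are costlier than misses, as in the human-oversight setting the card recommends.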
deepfake-detection-cnn_v2
This project is part of the CENG 481 - Artificial Neural Networks course. It addresses the task of detecting deepfake content using image-based CNN classification and transfer learning techniques.

Dataset:

- Source: DFDC Part-34 on Kaggle
- Metadata: `metadata34.csv`
- Each video is represented by 10 frames: `0.jpg`, `30.jpg`, ..., `270.jpg`
- Fake videos are linked to their originals via metadata

Model:

- Base: `EfficientNetB0`, pretrained on ImageNet
- Frozen base trained with a custom head; then the base is unfrozen and fine-tuned
- Architecture: GlobalAveragePooling2D → Dropout(0.4) → Dense(1, sigmoid)
- Input size: 224×224×3
- Optimizer: Adam (`lr=1e-4` frozen, `lr=1e-5` unfrozen)
- Loss: Binary Crossentropy
- Metrics: AUC, Accuracy, Precision, Recall, F1

Training:

- Balanced dataset of 6784 images (REAL + FAKE)
- Train/Test split: 79% / 21% (stratified)
- Batch size: 32
- Epochs: max 100 (early stopping with patience=8)
- Model checkpointing enabled (`.keras` format)
- TensorBoard used for experiment tracking
- Platform: Google Colab (GPU)

Results:

- Accuracy: 0.80
- AUC-ROC: 0.88
- Precision: 0.78
- Recall: 0.82
- F1-Score: 0.80

Links:

- 🤗 Model: https://huggingface.co/fc63/deepfake-detection-cnnv2
- 💻 Codebase: https://github.com/fc63/Deep-Fake-Video-Detection

Deepfake technology poses threats to media trust, privacy, and security. This project aims to mitigate misuse by improving detection accuracy while acknowledging dataset limitations and the risk of bias. It was completed as part of the CENG 481 - Artificial Neural Networks course at Çankaya University under the supervision of Dr. Nurdan Saran.
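The architecture and training setup described above can be sketched in Keras as follows. This is a minimal sketch, not the project's actual code: `weights=None` is used so it builds offline, whereas the project used ImageNet-pretrained weights, and the two-phase frozen/unfrozen schedule is reduced to a `frozen` flag.

```python
import tensorflow as tf

def build_model(input_shape=(224, 224, 3), frozen=True):
    """EfficientNetB0 base + GlobalAveragePooling2D -> Dropout(0.4)
    -> Dense(1, sigmoid) head, as listed in the card."""
    # The project used weights="imagenet"; None keeps this sketch offline.
    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=None, input_shape=input_shape)
    base.trainable = not frozen  # phase 1: frozen base; phase 2: unfrozen
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.4),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Lower learning rate when fine-tuning the unfrozen base (1e-4 vs 1e-5).
    lr = 1e-4 if frozen else 1e-5
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc"), "accuracy",
                 tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
    return model

model = build_model(frozen=True)
```

Training would then call `model.fit` with an `EarlyStopping(patience=8)` callback and a `ModelCheckpoint` writing `.keras` files, matching the settings above.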
turkish-toxic-language-detection
toxic-category-model
Class labels:

- Class 0: Insult
- Class 1: Other
- Class 2: PROFANITY
- Class 3: Racist
- Class 4: Sexist

This model is a fine-tuned version of `dbmdz/bert-base-turkish-cased` for toxicity detection in Turkish text. It has been trained on labeled datasets of online comments categorized by their toxicity levels. The model uses the Hugging Face transformers library and is suitable for sequence classification tasks. This work was completed as a project assignment for the Natural Language Processing (CENG493) course at Çankaya University.

- Model Type: Sequence Classification
- Language(s): Turkish
- License: GNU General Public License
- Fine-tuned from: `dbmdz/bert-base-turkish-cased`

This model can be used directly to analyze the toxicity of Turkish text. For example:

- Content moderation in online forums and social media platforms
- Filtering harmful language in customer reviews or feedback
- Monitoring and preventing cyberbullying in messaging applications
- Integrating toxic language filtering into chatbots or virtual assistants
- Using it as part of a sentiment analysis pipeline

Limitations:

- Not suitable for analyzing languages other than Turkish
- Should not be used for sensitive decision-making without human oversight

The model may inherit biases from the training data, including over- or underrepresentation of certain demographics or topics. It may also misclassify non-toxic content as toxic or fail to detect subtler forms of toxicity.

Recommendations:

- Avoid deploying the model in high-stakes scenarios without additional validation.
- Regularly monitor performance and update the model if new biases are detected.

The model was evaluated on a held-out test set containing a balanced mix of toxic and non-toxic examples.
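Given the five-class head above, a minimal post-processing sketch for mapping model logits to category names (the label order is taken from the class list; the logits in the example are hypothetical, not real model output):

```python
# Label order taken from the class list in the card.
ID2LABEL = {0: "Insult", 1: "Other", 2: "PROFANITY", 3: "Racist", 4: "Sexist"}

def predict_category(logits):
    """Map 5-way sequence-classification logits to the most likely category."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per class")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]

# Example with hypothetical logits favoring class 2.
print(predict_category([0.1, -0.4, 3.2, 0.0, -1.1]))  # -> PROFANITY
```

In practice the same mapping is obtained by setting `id2label` in the model's config so that transformers pipelines return these names directly.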