t5-base-summarization-claim-extractor

1 language
License: cc-by-nc-sa-4.0
By Babelscape
Language Model
712 downloads
Quick Summary

Model Name: T5-base-summarization-claim-extractor
Authors: Alessandro Scirè, Karim Ghonim, and Roberto Navigli
Contact: scire@diag.

Training Data Analysis

🔵 Good (6.0/10)

Training datasets used by t5-base-summarization-claim-extractor, with a quality assessment of each

Specialized For

general
multilingual

Training Datasets (1)

c4
🔵 6/10
general
multilingual
Key Strengths
  • Scale and Accessibility: 750GB of publicly available, filtered text
  • Systematic Filtering: Documented heuristics enable reproducibility
  • Stylistic Diversity: Although English-only, the corpus captures diverse writing styles
Considerations
  • English-Only: Limits multilingual applications
  • Filtering Limitations: Offensive content and low-quality text remain despite filtering


Code Examples

Intended Use (Python, transformers)
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("Babelscape/t5-base-summarization-claim-extractor")
model = T5ForConditionalGeneration.from_pretrained("Babelscape/t5-base-summarization-claim-extractor")
summary = 'Simone Biles made a triumphant return to the Olympic stage at the Paris 2024 Games, competing in the women’s gymnastics qualifications. Overcoming a previous struggle with the “twisties” that led to her withdrawal from events at the Tokyo 2020 Olympics, Biles dazzled with strong performances on all apparatus, helping the U.S. team secure a commanding lead in the qualifications. Her routines showcased her resilience and skill, drawing enthusiastic support from a star-studded audience'

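# Tokenize the summary as a batch of one and return PyTorch tensors.
tok_input = tokenizer([summary], return_tensors="pt", padding=True)
# Generate the claims, then decode the generated token IDs back into text.
claims = model.generate(**tok_input)
claims = tokenizer.batch_decode(claims, skip_special_tokens=True)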
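
After decoding, `claims` is a list with one string per input summary. If downstream code needs one claim per item, a post-processing step along these lines may help. This is a minimal sketch, not part of the original model card: it assumes each decoded string holds the claims as consecutive sentences, and `split_claims` is a hypothetical helper name. The example string is written by hand for illustration and is not real model output.

```python
import re


def split_claims(decoded: str) -> list[str]:
    """Split one decoded output string into individual claim sentences.

    Assumption: claims appear as consecutive sentences ending in '.', '!' or '?'.
    """
    # Split at whitespace that directly follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", decoded.strip())
    # Drop empty fragments (for example, when the input is an empty string).
    return [part for part in parts if part]


# Hand-written stand-in for one decoded output string (not real model output).
example = (
    "Simone Biles competed at the Paris 2024 Games. "
    "Biles withdrew from events at the Tokyo 2020 Olympics."
)
for claim in split_claims(example):
    print(claim)
```

A punctuation-based split is deliberately simple and will also break after abbreviations such as "U.S."; a dedicated sentence segmenter is likely a better fit if that matters for your data.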
Citation (BibTeX)
@inproceedings{scire-etal-2024-fenice,
    title = "{FENICE}: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction",
    author = "Scir{\`e}, Alessandro and Ghonim, Karim and Navigli, Roberto",
    editor = "Ku, Lun-Wei  and Martins, Andre and Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.841",
    pages = "14148--14161",
}
