MoritzLaurer
DeBERTa-v3-base-mnli-fever-anli
---
language:
- en
license: mit
tags:
- text-classification
- zero-shot-classification
datasets:
- multi_nli
- facebook/anli
- fever
metrics:
- accuracy
pipeline_tag: zero-shot-classification
model-index:
- name: MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli
  results:
  - task:
      type: natural-language-inference
      name: Natural Language Inference
    dataset:
      name: anli
      type: anli
      config: plain_text
      split: test_r3
    metrics:
    - type: accuracy
      value: 0.495
      name: Accuracy
      verified: true
      verifyToken: eyJhbGciOiJF
deberta-v3-large-zeroshot-v2.0
---
language:
- en
tags:
- text-classification
- zero-shot-classification
base_model: microsoft/deberta-v3-large
pipeline_tag: zero-shot-classification
library_name: transformers
license: mit
---
mDeBERTa-v3-base-mnli-xnli
---
language:
- multilingual
- en
- ar
- bg
- de
- el
- es
- fr
- hi
- ru
- sw
- th
- tr
- ur
- vi
- zh
license: mit
tags:
- zero-shot-classification
- text-classification
- nli
- pytorch
metrics:
- accuracy
datasets:
- multi_nli
- xnli
pipeline_tag: zero-shot-classification
widget:
- text: "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
  candidate_labels: "politics, economy, entertainment, environment"
---
deberta-v3-xsmall-zeroshot-v1.1-all-33
---
base_model: microsoft/deberta-v3-xsmall
language:
- en
tags:
- text-classification
- zero-shot-classification
pipeline_tag: zero-shot-classification
library_name: transformers
license: mit
---
DeBERTa-v3-large-mnli-fever-anli-ling-wanli
---
language:
- en
license: mit
tags:
- text-classification
- zero-shot-classification
datasets:
- multi_nli
- facebook/anli
- fever
- lingnli
- alisawuffles/WANLI
metrics:
- accuracy
pipeline_tag: zero-shot-classification
model-index:
- name: DeBERTa-v3-large-mnli-fever-anli-ling-wanli
  results:
  - task:
      type: text-classification
      name: Natural Language Inference
    dataset:
      name: MultiNLI-matched
      type: multi_nli
      split: validation_matched
    metrics:
    - type: accuracy
      value: 0.912
      verified: false
  - task:
mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
---
language:
- multilingual
- zh
- ja
- ar
- ko
- de
- fr
- es
- pt
- hi
- id
- it
- tr
- ru
- bn
- ur
- mr
- ta
- vi
- fa
- pl
- uk
- nl
- sv
- he
- sw
- ps
license: mit
tags:
- zero-shot-classification
- text-classification
- nli
- pytorch
datasets:
- MoritzLaurer/multilingual-NLI-26lang-2mil7
- xnli
- multi_nli
- facebook/anli
- fever
- lingnli
- alisawuffles/WANLI
metrics:
- accuracy
pipeline_tag: zero-shot-classification
widget:
- text: Angela Merkel ist eine Politikerin in Deutschland und
DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary
bge-m3-zeroshot-v2.0
zeroshot-v2.0 series of models

Models in this series are designed for efficient zero-shot classification with the Hugging Face pipeline. These models can do classification without training data and run on both GPUs and CPUs. An overview of the latest zero-shot classifiers is available in my Zeroshot Classifier Collection.

The main update of this `zeroshot-v2.0` series is that several models are trained on fully commercially-friendly data, for users with strict license requirements.

These models can do one universal classification task: determine whether a hypothesis is "true" or "not true" given a text (`entailment` vs. `not_entailment`). This task format is based on the Natural Language Inference task (NLI). The task is so universal that any classification task can be reformulated into it by the Hugging Face pipeline.

Training data

Models with a "`-c`" in the name are trained on two types of fully commercially-friendly data:

1. Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a list of 500+ diverse text classification tasks for 25 professions in conversations with Mistral-large. The data was manually curated. I then used this as seed data to generate several hundred thousand texts for these tasks with Mixtral-8x7B-Instruct-v0.1. The final dataset is available in the `synthetic_zeroshot_mixtral_v0.1` dataset, in the subset `mixtral_written_text_for_tasks_v4`. Data curation was done in multiple iterations and will be improved in future iterations.
2. Two commercially-friendly NLI datasets (MNLI, FEVER-NLI). These datasets were added to increase generalization.

Models without a "`-c`" in the name also included a broader mix of training data with a broader mix of licenses: ANLI, WANLI, LingNLI, and all datasets in this list where `used_in_v1.1==True`.

`multi_label=False` forces the model to decide on only one class. `multi_label=True` enables the model to choose multiple classes.
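To make the `multi_label` distinction concrete, here is a minimal sketch with made-up logits of how a zero-shot pipeline can aggregate per-label NLI scores: with `multi_label=False`, the entailment logits are normalized across all candidate labels, while with `multi_label=True`, each label is scored independently from its own entailment-vs-not-entailment pair. The label names and logit values are purely illustrative.

```python
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative (made-up) raw logits for one text and three candidate labels.
# Each pair is [not_entailment_logit, entailment_logit] for the hypothesis
# built from that candidate label.
logits = {
    "politics": [-2.1, 3.0],
    "economy": [0.5, 0.8],
    "sports": [2.4, -1.9],
}

# multi_label=False: softmax the entailment logits across all labels,
# so the scores sum to 1 and exactly one label wins.
entail_logits = [pair[1] for pair in logits.values()]
single_label_scores = dict(zip(logits, softmax(entail_logits)))

# multi_label=True: score each label independently as P(entailment)
# from a softmax over that label's own logit pair.
multi_label_scores = {
    label: softmax(pair)[1] for label, pair in logits.items()
}
```

With `multi_label=True` several labels (or none) can score highly at the same time, which is why it is the right setting for tasks where classes are not mutually exclusive.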
The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is `facebook/bart-large-mnli`, which is, at the time of writing (03.04.24), the most used commercially-friendly 0-shot classifier.

| | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | 0.59 | (0.803) |
| amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | 0.942 | (0.951) |
| imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | 0.873 | (0.917) |
| appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | 0.932 | (0.954) |
| yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | 0.973 | (0.978) |
| rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | 0.813 | (0.866) |
| emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | 0.453 | (0.697) |
| emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | 0.61 | (0.798) |
| empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | 0.387 | (0.455) |
| financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | 0.504 | (0.895) |
| banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | 0.387 | (0.715) |
| massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | 0.414 | (0.692) |
| wikitoxictoxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | 0.736 | (0.9) |
| wikitoxicobscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | 0.783 | (0.914) |
| wikitoxicthreat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | 0.68 | (0.947) |
| wikitoxicinsult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | 0.783 | (0.915) |
| wikitoxicidentityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | 0.761 | (0.931) |
| hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | 0.291 | (0.823) |
| hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | 0.29 | (0.729) |
| biasframesoffensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | 0.541 | (0.855) |
| biasframessex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | 0.748 | (0.905) |
| biasframesintent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | 0.467 | (0.868) |
| agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | 0.687 | (0.892) |
| yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | 0.587 | (0.711) |
| trueteacher (2) | 0.491 | 0.469 | 0.402 | 0.431 | 0.479 (0.82) | 0.459 | 0.538 (0.846) | 0.471 | (0.518) |
| spam (2) | 0.505 | 0.528 | 0.504 | 0.507 | 0.464 (0.973) | 0.74 | 0.597 (0.983) | 0.441 | (0.978) |
| wellformedquery (2) | 0.407 | 0.333 | 0.333 | 0.335 | 0.491 (0.769) | 0.334 | 0.429 (0.815) | 0.361 | (0.718) |
| manifesto (56) | 0.084 | 0.102 | 0.182 | 0.17 | 0.187 (0.376) | 0.258 | 0.256 (0.408) | 0.147 | (0.331) |
| capsotu (21) | 0.34 | 0.479 | 0.523 | 0.502 | 0.477 (0.664) | 0.603 | 0.502 (0.686) | 0.472 | (0.644) |

These numbers indicate zeroshot performance, as no data from these datasets was added in the training mix. Note that models without a "`-c`" in the title were evaluated twice: one run without any data from these 28 datasets to test pure zeroshot performance (the first number in the respective column), and a final run including up to 500 training data points per class from each of the 28 datasets (the second number, in brackets, "fewshot"). No model was trained on test data.

Details on the different datasets are available here: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/v1humandata/datasetsoverview.csv

- deberta-v3-zeroshot vs. roberta-zeroshot: deberta-v3 performs clearly better than roberta, but it is a bit slower. roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention. These containers are a good choice for production use-cases. tl;dr: for accuracy, use a deberta-v3 model. If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container with HF Inference Endpoints).
- Commercial use-cases: models with "`-c`" in the title are guaranteed to be trained on only commercially-friendly data. Models without a "`-c`" were trained on more data and perform better, but include data with non-commercial licenses. Legal opinions diverge on whether this training data affects the license of the trained model. For users with strict legal requirements, the models with "`-c`" in the title are recommended.
- Multilingual/non-English use-cases: use bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c. Note that multilingual models perform worse than English-only models. You can therefore also first machine translate your texts to English with libraries like EasyNMT and then apply any English-only model to the translated data.
Machine translation also facilitates validation in case your team does not speak all languages in the data.

- Context window: the `bge-m3` models can process up to 8192 tokens. The other models can process up to 512. Note that longer text inputs both make the model slower and decrease performance, so if you're only working with texts of up to ~400 words / 1 page, use e.g. a deberta model for better performance.
- The latest updates on new models are always available in the Zeroshot Classifier Collection. Reproduction code is available in the `v2_synthetic_data` directory here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main

Limitations and bias

The model can only do text classification tasks. Biases can come from the underlying foundation model, the human NLI training data, and the synthetic data generated by Mixtral.

License

The foundation model was published under the MIT license. The licenses of the training data vary depending on the model, see above. This model is an extension of the research described in this paper.

Ideas for cooperation or questions?

If you have questions or ideas for cooperation, contact me at moritz{at}huggingface{dot}co or LinkedIn.

Flexible usage and "prompting"

You can formulate your own hypotheses by changing the `hypothesis_template` of the zero-shot pipeline. Similar to "prompt engineering" for LLMs, you can test different formulations of your `hypothesis_template` and verbalized classes to improve performance.
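As a sketch of this "prompting" mechanism: each candidate label is inserted into the `hypothesis_template` to form one NLI hypothesis per label ("This example is {}." is the pipeline's default template). The template and labels below are illustrative; the commented-out classifier call shows the corresponding real pipeline invocation, which requires downloading a model.

```python
# Sketch of how the zero-shot pipeline turns candidate labels into NLI
# hypotheses. The default template is "This example is {}."; any other
# formulation can be tested the same way.
hypothesis_template = "The topic of this text is {}."
candidate_labels = ["politics", "economy", "entertainment", "environment"]

hypotheses = [hypothesis_template.format(label) for label in candidate_labels]

# Each (text, hypothesis) pair is then scored by the NLI model as
# entailment vs. not_entailment, e.g.:
#
#   from transformers import pipeline
#   classifier = pipeline("zero-shot-classification",
#                         model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")
#   classifier(text, candidate_labels,
#              hypothesis_template=hypothesis_template, multi_label=False)
for h in hypotheses:
    print(h)
```

Small wording changes in the template or the verbalized class names can shift scores noticeably, so it is worth testing a few variants on held-out examples.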
DeBERTa-v3-base-mnli
deberta-v3-base-zeroshot-v2.0
deberta-v3-base-zeroshot-v1.1-all-33
Model description: deberta-v3-base-zeroshot-v1.1-all-33

The model is designed for zero-shot classification with the Hugging Face pipeline. It can do one universal classification task: determine whether a hypothesis is "true" or "not true" given a text (`entailment` vs. `not_entailment`). This task format is based on the Natural Language Inference task (NLI). The task is so universal that any classification task can be reformulated into it. A detailed description of how the model was trained and how it can be used is available in this paper.

Training data

The model was trained on a mixture of 33 datasets and 387 classes that were reformatted into this universal format:

1. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling".
2. 28 classification tasks reformatted into the universal NLI format. ~51k cleaned texts were used to avoid overfitting: 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes', 'emotiondair', 'emocontext', 'empathetic', 'financialphrasebank', 'banking77', 'massive', 'wikitoxictoxicaggregated', 'wikitoxicobscene', 'wikitoxicthreat', 'wikitoxicinsult', 'wikitoxicidentityhate', 'hateoffensive', 'hatexplain', 'biasframesoffensive', 'biasframessex', 'biasframesintent', 'agnews', 'yahootopics', 'trueteacher', 'spam', 'wellformedquery', 'manifesto', 'capsotu'. See details on each dataset here: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasetsoverview.csv

Note that, compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`) as opposed to three (entailment/neutral/contradiction).

The model was only trained on English data. For multilingual use-cases, I recommend machine translating texts to English with libraries like EasyNMT. English-only models tend to perform better than multilingual models, and validation with English data can be easier if you don't speak all languages in your corpus.
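The universal reformatting described above can be sketched as follows: each (text, class) training example becomes one entailed premise-hypothesis pair plus `not_entailment` pairs for the remaining classes. The helper function, label names, and example text here are hypothetical illustrations, not the author's actual preprocessing code.

```python
# Hypothetical sketch of reformatting a classification example into the
# universal 2-class NLI format used by this model family.
def to_nli_pairs(text, true_label, all_labels, template):
    pairs = []
    for label in all_labels:
        pairs.append({
            "premise": text,
            "hypothesis": template.format(label),
            # The hypothesis built from the correct class is entailed;
            # hypotheses built from every other class are not.
            "label": "entailment" if label == true_label else "not_entailment",
        })
    return pairs

labels = ["negative", "positive"]
pairs = to_nli_pairs(
    "Great battery life and a sharp screen.",
    "positive",
    labels,
    "The sentiment in this example product review is {}.",
)
```

Because every task collapses into the same binary decision, a single model trained on pairs like these can serve all 33 datasets at once.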
How to use the model

Simple zero-shot classification pipeline

Details on data and training

The code for preparing the data and training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main

Hyperparameters and other details are available in this Weights & Biases repo: https://wandb.ai/moritzlaurer/deberta-v3-base-zeroshot-v1-1-all-33/table?workspace=user-

Balanced accuracy is reported for all datasets. `deberta-v3-base-zeroshot-v1.1-all-33` was trained on all datasets, with a maximum of only 500 texts per class to avoid overfitting. The metrics on these datasets are therefore not strictly zeroshot, as the model has seen some data for each task during training. `deberta-v3-base-zeroshot-v1.1-heldout` indicates zeroshot performance on the respective dataset. To calculate these zeroshot metrics, the pipeline was run 28 times, each time with one dataset held out from training to simulate a zeroshot setup.

| | deberta-v3-base-mnli-fever-anli-ling-wanli-binary | deberta-v3-base-zeroshot-v1.1-heldout | deberta-v3-base-zeroshot-v1.1-all-33 |
|:---|---:|---:|---:|
| datasets mean (w/o nli) | 62 | 70.7 | 84 |
| amazonpolarity (2) | 91.7 | 95.7 | 96 |
| imdb (2) | 87.3 | 93.6 | 94.5 |
| appreviews (2) | 91.3 | 92.2 | 94.4 |
| yelpreviews (2) | 95.1 | 97.4 | 98.3 |
| rottentomatoes (2) | 83 | 88.7 | 90.8 |
| emotiondair (6) | 46.5 | 42.6 | 74.5 |
| emocontext (4) | 58.5 | 57.4 | 81.2 |
| empathetic (32) | 31.3 | 37.3 | 52.7 |
| financialphrasebank (3) | 78.3 | 68.9 | 91.2 |
| banking77 (72) | 18.9 | 46 | 73.7 |
| massive (59) | 44 | 56.6 | 78.9 |
| wikitoxictoxicaggreg (2) | 73.7 | 82.5 | 90.5 |
| wikitoxicobscene (2) | 77.3 | 91.6 | 92.6 |
| wikitoxicthreat (2) | 83.5 | 95.2 | 96.7 |
| wikitoxicinsult (2) | 79.6 | 91 | 91.6 |
| wikitoxicidentityhate (2) | 83.9 | 88 | 94.4 |
| hateoffensive (3) | 55.2 | 66.1 | 86 |
| hatexplain (3) | 44.1 | 57.6 | 76.9 |
| biasframesoffensive (2) | 56.8 | 85.4 | 87 |
| biasframessex (2) | 85.4 | 87 | 91.8 |
| biasframesintent (2) | 56.3 | 85.2 | 87.8 |
| agnews (4) | 77.3 | 80 | 90.5 |
| yahootopics (10) | 53.6 | 57.7 | 72.8 |
| trueteacher (2) | 51.4 | 49.5 | 82.4 |
| spam (2) | 51.8 | 50 | 97.2 |
| wellformedquery (2) | 49.9 | 52.5 | 77.2 |
| manifesto (56) | 5.8 | 18.9 | 39.1 |
| capsotu (21) | 25.2 | 64 | 72.5 |
| mnlim (2) | 92.4 | nan | 92.7 |
| mnlimm (2) | 92.4 | nan | 92.5 |
| fevernli (2) | 89 | nan | 89.1 |
| anlir1 (2) | 79.4 | nan | 80 |
| anlir2 (2) | 68.4 | nan | 68.4 |
| anlir3 (2) | 66.2 | nan | 68 |
| wanli (2) | 81.6 | nan | 81.8 |
| lingnli (2) | 88.4 | nan | 88.4 |

Limitations and bias

The model can only do text classification tasks. Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.

License

The base model (DeBERTa-v3) is published under the MIT license. The datasets the model was fine-tuned on are published under a diverse set of licenses. The following table provides an overview of the non-NLI datasets used for fine-tuning, with information on licenses, the underlying papers etc.: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasetsoverview.csv

Citation

If you use this model academically, please cite:

Ideas for cooperation or questions?

If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or LinkedIn.

Debugging and issues

Note that DeBERTa-v3 was released on 06.12.21, and older versions of HF Transformers can have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers>=4.13 might solve some issues. Also make sure to install sentencepiece to avoid tokenizer errors. Run: `pip install transformers[sentencepiece]` or `pip install sentencepiece`.

Hypotheses used for classification

The hypotheses in the tables below were used to fine-tune the model.
Inspecting them can help users get a feeling for which type of hypotheses and tasks the model was trained on. You can formulate your own hypotheses by changing the `hypothesis_template` of the zero-shot pipeline.

Note that a few rows in the `massive` and `banking77` datasets contain `nan` because some classes were so ambiguous/unclear that I excluded them from the data.

wellformedquery

| label | hypothesis |
|:---|:---|
| notwellformed | This example is not a well formed Google query |
| wellformed | This example is a well formed Google query. |

biasframessex

| label | hypothesis |
|:---|:---|
| notsex | This example does not contain allusions to sexual content. |
| sex | This example contains allusions to sexual content. |

biasframesintent

| label | hypothesis |
|:---|:---|
| intent | The intent of this example is to be offensive/disrespectful. |
| notintent | The intent of this example is not to be offensive/disrespectful. |

biasframesoffensive

| label | hypothesis |
|:---|:---|
| notoffensive | This example could not be considered offensive, disrespectful, or toxic. |
| offensive | This example could be considered offensive, disrespectful, or toxic. |

financialphrasebank

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example is negative from an investor's perspective. |
| neutral | The sentiment in this example is neutral from an investor's perspective. |
| positive | The sentiment in this example is positive from an investor's perspective. |

rottentomatoes

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example rotten tomatoes movie review is negative |
| positive | The sentiment in this example rotten tomatoes movie review is positive |

amazonpolarity

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example amazon product review is negative |
| positive | The sentiment in this example amazon product review is positive |

imdb

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example imdb movie review is negative |
| positive | The sentiment in this example imdb movie review is positive |

appreviews

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example app review is negative. |
| positive | The sentiment in this example app review is positive. |

yelpreviews

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example yelp review is negative. |
| positive | The sentiment in this example yelp review is positive. |

wikitoxictoxicaggregated

| label | hypothesis |
|:---|:---|
| nottoxicaggregated | This example wikipedia comment does not contain toxic language. |
| toxicaggregated | This example wikipedia comment contains toxic language. |

wikitoxicobscene

| label | hypothesis |
|:---|:---|
| notobscene | This example wikipedia comment does not contain obscene language. |
| obscene | This example wikipedia comment contains obscene language. |

wikitoxicthreat

| label | hypothesis |
|:---|:---|
| notthreat | This example wikipedia comment does not contain a threat. |
| threat | This example wikipedia comment contains a threat. |

wikitoxicinsult

| label | hypothesis |
|:---|:---|
| insult | This example wikipedia comment contains an insult. |
| notinsult | This example wikipedia comment does not contain an insult. |

wikitoxicidentityhate

| label | hypothesis |
|:---|:---|
| identityhate | This example wikipedia comment contains identity hate. |
| notidentityhate | This example wikipedia comment does not contain identity hate. |

hateoffensive

| label | hypothesis |
|:---|:---|
| hatespeech | This example tweet contains hate speech. |
| neither | This example tweet contains neither offensive language nor hate speech. |
| offensive | This example tweet contains offensive language without hate speech. |

hatexplain

| label | hypothesis |
|:---|:---|
| hatespeech | This example text from twitter or gab contains hate speech. |
| neither | This example text from twitter or gab contains neither offensive language nor hate speech. |
| offensive | This example text from twitter or gab contains offensive language without hate speech. |

spam

| label | hypothesis |
|:---|:---|
| notspam | This example sms is not spam. |
| spam | This example sms is spam. |
emotiondair

| label | hypothesis |
|:---|:---|
| anger | This example tweet expresses the emotion: anger |
| fear | This example tweet expresses the emotion: fear |
| joy | This example tweet expresses the emotion: joy |
| love | This example tweet expresses the emotion: love |
| sadness | This example tweet expresses the emotion: sadness |
| surprise | This example tweet expresses the emotion: surprise |

emocontext

| label | hypothesis |
|:---|:---|
| angry | This example tweet expresses the emotion: anger |
| happy | This example tweet expresses the emotion: happiness |
| others | This example tweet does not express any of the emotions: anger, sadness, or happiness |
| sad | This example tweet expresses the emotion: sadness |

empathetic

| label | hypothesis |
|:---|:---|
| afraid | The main emotion of this example dialogue is: afraid |
| angry | The main emotion of this example dialogue is: angry |
| annoyed | The main emotion of this example dialogue is: annoyed |
| anticipating | The main emotion of this example dialogue is: anticipating |
| anxious | The main emotion of this example dialogue is: anxious |
| apprehensive | The main emotion of this example dialogue is: apprehensive |
| ashamed | The main emotion of this example dialogue is: ashamed |
| caring | The main emotion of this example dialogue is: caring |
| confident | The main emotion of this example dialogue is: confident |
| content | The main emotion of this example dialogue is: content |
| devastated | The main emotion of this example dialogue is: devastated |
| disappointed | The main emotion of this example dialogue is: disappointed |
| disgusted | The main emotion of this example dialogue is: disgusted |
| embarrassed | The main emotion of this example dialogue is: embarrassed |
| excited | The main emotion of this example dialogue is: excited |
| faithful | The main emotion of this example dialogue is: faithful |
| furious | The main emotion of this example dialogue is: furious |
| grateful | The main emotion of this example dialogue is: grateful |
| guilty | The main emotion of this example dialogue is: guilty |
| hopeful | The main emotion of this example dialogue is: hopeful |
| impressed | The main emotion of this example dialogue is: impressed |
| jealous | The main emotion of this example dialogue is: jealous |
| joyful | The main emotion of this example dialogue is: joyful |
| lonely | The main emotion of this example dialogue is: lonely |
| nostalgic | The main emotion of this example dialogue is: nostalgic |
| prepared | The main emotion of this example dialogue is: prepared |
| proud | The main emotion of this example dialogue is: proud |
| sad | The main emotion of this example dialogue is: sad |
| sentimental | The main emotion of this example dialogue is: sentimental |
| surprised | The main emotion of this example dialogue is: surprised |
| terrified | The main emotion of this example dialogue is: terrified |
| trusting | The main emotion of this example dialogue is: trusting |

agnews

| label | hypothesis |
|:---|:---|
| Business | This example news text is about business news |
| Sci/Tech | This example news text is about science and technology |
| Sports | This example news text is about sports |
| World | This example news text is about world news |

yahootopics

| label | hypothesis |
|:---|:---|
| Business & Finance | This example question from the Yahoo Q&A forum is categorized in the topic: Business & Finance |
| Computers & Internet | This example question from the Yahoo Q&A forum is categorized in the topic: Computers & Internet |
| Education & Reference | This example question from the Yahoo Q&A forum is categorized in the topic: Education & Reference |
| Entertainment & Music | This example question from the Yahoo Q&A forum is categorized in the topic: Entertainment & Music |
| Family & Relationships | This example question from the Yahoo Q&A forum is categorized in the topic: Family & Relationships |
| Health | This example question from the Yahoo Q&A forum is categorized in the topic: Health |
| Politics & Government | This example question from the Yahoo Q&A forum is categorized in the topic: Politics & Government |
| Science & Mathematics | This example question from the Yahoo Q&A forum is categorized in the topic: Science & Mathematics |
| Society & Culture | This example question from the Yahoo Q&A forum is categorized in the topic: Society & Culture |
| Sports | This example question from the Yahoo Q&A forum is categorized in the topic: Sports |

massive

| label | hypothesis |
|:---|:---|
| alarmquery | The example utterance is a query about alarms. |
| alarmremove | The intent of this example utterance is to remove an alarm. |
| alarmset | The intent of the example utterance is to set an alarm. |
| audiovolumedown | The intent of the example utterance is to lower the volume. |
| audiovolumemute | The intent of this example utterance is to mute the volume. |
| audiovolumeother | The example utterance is related to audio volume. |
| audiovolumeup | The intent of this example utterance is turning the audio volume up. |
| calendarquery | The example utterance is a query about a calendar. |
| calendarremove | The intent of the example utterance is to remove something from a calendar. |
| calendarset | The intent of this example utterance is to set something in a calendar. |
| cookingquery | The example utterance is a query about cooking. |
| cookingrecipe | This example utterance is about cooking recipes. |
| datetimeconvert | The example utterance is related to date time changes or conversion. |
| datetimequery | The intent of this example utterance is a datetime query. |
| emailaddcontact | The intent of this example utterance is adding an email address to contacts. |
| emailquery | The example utterance is a query about emails. |
| emailquerycontact | The intent of this example utterance is to query contact details. |
| emailsendemail | The intent of the example utterance is to send an email. |
| generalgreet | This example utterance is a general greet. |
| generaljoke | The intent of the example utterance is to hear a joke. |
| generalquirky | nan |
| iotcleaning | The intent of the example utterance is for an IoT device to start cleaning. |
| iotcoffee | The intent of this example utterance is for an IoT device to make coffee. |
| iothuelightchange | The intent of this example utterance is changing the light. |
| iothuelightdim | The intent of the example utterance is to dim the lights. |
| iothuelightoff | The example utterance is related to turning the lights off. |
| iothuelighton | The example utterance is related to turning the lights on. |
| iothuelightup | The intent of this example utterance is to brighten lights. |
| iotwemooff | The intent of this example utterance is turning an IoT device off. |
| iotwemoon | The intent of the example utterance is to turn an IoT device on. |
| listscreateoradd | The example utterance is related to creating or adding to lists. |
| listsquery | The example utterance is a query about a list. |
| listsremove | The intent of this example utterance is to remove a list or remove something from a list. |
| musicdislikeness | The intent of this example utterance is signalling music dislike. |
| musiclikeness | The example utterance is related to liking music. |
| musicquery | The example utterance is a query about music. |
| musicsettings | The intent of the example utterance is to change music settings. |
| newsquery | The example utterance is a query about the news. |
| playaudiobook | The example utterance is related to playing audiobooks. |
| playgame | The intent of this example utterance is to start playing a game. |
| playmusic | The intent of this example utterance is for an IoT device to play music. |
| playpodcasts | The example utterance is related to playing podcasts. |
| playradio | The intent of the example utterance is to play something on the radio. |
| qacurrency | This example utterance is about currencies. |
| qadefinition | The example utterance is a query about a definition. |
| qafactoid | The example utterance is a factoid question. |
| qamaths | The example utterance is a question about maths. |
| qastock | This example utterance is about stocks. |
| recommendationevents | This example utterance is about event recommendations. |
| recommendationlocations | The intent of this example utterance is receiving recommendations for good locations. |
| recommendationmovies | This example utterance is about movie recommendations. |
| socialpost | The example utterance is about social media posts. |
| socialquery | The example utterance is a query about a social network. |
| takeawayorder | The intent of this example utterance is to order takeaway food. |
| takeawayquery | This example utterance is about takeaway food. |
| transportquery | The example utterance is a query about transport or travels. |
| transporttaxi | The intent of this example utterance is to get a taxi. |
| transportticket | This example utterance is about transport tickets. |
| transporttraffic | This example utterance is about transport or traffic. |
| weatherquery | This example utterance is a query about the weather. |
banking77

| label | hypothesis |
|:---|:---|
| Refundnotshowingup | This customer example message is about a refund not showing up. |
| activatemycard | This banking customer example message is about activating a card. |
| agelimit | This banking customer example message is related to age limits. |
| applepayorgooglepay | This banking customer example message is about apple pay or google pay. |
| atmsupport | This banking customer example message requests ATM support. |
| automatictopup | This banking customer example message is about automatic top up. |
| balancenotupdatedafterbanktransfer | This banking customer example message is about a balance not updated after a transfer. |
| balancenotupdatedafterchequeorcashdeposit | This banking customer example message is about a balance not updated after a cheque or cash deposit. |
| beneficiarynotallowed | This banking customer example message is related to a beneficiary not being allowed or a failed transfer. |
| canceltransfer | This banking customer example message is related to the cancellation of a transfer. |
| cardabouttoexpire | This banking customer example message is related to the expiration of a card. |
| cardacceptance | This banking customer example message is related to the scope of acceptance of a card. |
| cardarrival | This banking customer example message is about the arrival of a card. |
| carddeliveryestimate | This banking customer example message is about a card delivery estimate or timing. |
| cardlinking | nan |
| cardnotworking | This banking customer example message is about a card not working. |
| cardpaymentfeecharged | This banking customer example message is about a card payment fee. |
| cardpaymentnotrecognised | This banking customer example message is about a payment the customer does not recognise. |
| cardpaymentwrongexchangerate | This banking customer example message is about a wrong exchange rate. |
| cardswallowed | This banking customer example message is about a card swallowed by a machine. |
| cashwithdrawalcharge | This banking customer example message is about a cash withdrawal charge. |
| cashwithdrawalnotrecognised | This banking customer example message is about an unrecognised cash withdrawal. |
| changepin | This banking customer example message is about changing a pin code. |
| compromisedcard | This banking customer example message is about a compromised card. |
| contactlessnotworking | This banking customer example message is about contactless not working. |
| countrysupport | This banking customer example message is about country-specific support. |
| declinedcardpayment | This banking customer example message is about a declined card payment. |
| declinedcashwithdrawal | This banking customer example message is about a declined cash withdrawal. |
| declinedtransfer | This banking customer example message is about a declined transfer. |
| directdebitpaymentnotrecognised | This banking customer example message is about an unrecognised direct debit payment. |
| disposablecardlimits | This banking customer example message is about the limits of disposable cards. |
| editpersonaldetails | This banking customer example message is about editing personal details. |
| exchangecharge | This banking customer example message is about exchange rate charges. |
| exchangerate | This banking customer example message is about exchange rates. |
| exchangeviaapp | nan |
| extrachargeonstatement | This banking customer example message is about an extra charge. |
| failedtransfer | This banking customer example message is about a failed transfer. |
| fiatcurrencysupport | This banking customer example message is about fiat currency support. |
| getdisposablevirtualcard | This banking customer example message is about getting a disposable virtual card. |
| getphysicalcard | nan |
| gettingsparecard | This banking customer example message is about getting a spare card. |
| gettingvirtualcard | This banking customer example message is about getting a virtual card. |
| lostorstolencard | This banking customer example message is about a lost or stolen card. |
| lostorstolenphone | This banking customer example message is about a lost or stolen phone. |
| orderphysicalcard | This banking customer example message is about ordering a card. |
| passcodeforgotten | This banking customer example message is about a forgotten passcode. |
| pendingcardpayment | This banking customer example message is about a pending card payment. |
| pendingcashwithdrawal | This banking customer example message is about a pending cash withdrawal. |
| pendingtopup | This banking customer example message is about a pending top up. |
| pendingtransfer | This banking customer example message is about a pending transfer. |
| pinblocked | This banking customer example message is about a blocked pin. |
| receivingmoney | This banking customer example message is about receiving money. |
| requestrefund | This banking customer example message is about a refund request. |
| revertedcardpayment? | This banking customer example message is about reverting a card payment. |
| supportedcardsandcurrencies | nan |
| terminateaccount | This banking customer example message is about terminating an account. |
| topupbybanktransfercharge | nan |
| topupbycardcharge | This banking customer example message is about the charge for topping up by card. |
| topupbycashorcheque | This banking customer example message is about topping up by cash or cheque. |
| topupfailed | This banking customer example message is about top up issues or failures. |
| topuplimits | This banking customer example message is about top up limitations. |
| topupreverted | This banking customer example message is about issues with topping up. |
| toppingupbycard | This banking customer example message is about topping up by card. |
| transactionchargedtwice | This banking customer example message is about a transaction charged twice. |
| transferfeecharged | This banking customer example message is about an issue with a transfer fee charge. |
| transferintoaccount | This banking customer example message is about transfers into the customer's own account. |
| transfernotreceivedbyrecipient | This banking customer example message is about a transfer that has not arrived yet. |
| transfertiming | This banking customer example message is about transfer timing. |
| unabletoverifyidentity | This banking customer example message is about an issue with identity verification. |
| verifymyidentity | This banking customer example message is about identity verification. |
| verifysourceoffunds | This banking customer example message is about the source of funds. |
| verifytopup | This banking customer example message is about verification and top ups. |
| virtualcardnotworking | This banking customer example message is about a virtual card not working. |
| visaormastercard | This banking customer example message is about types of bank cards. |
| whyverifyidentity | This banking customer example message questions why identity verification is necessary. |
| wrongamountofcashreceived | This banking customer example message is about a wrong amount of cash received. |
| wrongexchangerateforcashwithdrawal | This banking customer example message is about a wrong exchange rate for a cash withdrawal. |

trueteacher

| label | hypothesis |
|:---|:---|
| factuallyconsistent | The example summary is factually consistent with the full article. |
| factuallyinconsistent | The example summary is factually inconsistent with the full article. |
capsotu

| label | hypothesis |
|:---|:---|
| Agriculture | This example text from a US presidential speech is about agriculture |
| Civil Rights | This example text from a US presidential speech is about civil rights or minorities or civil liberties |
| Culture | This example text from a US presidential speech is about cultural policy |
| Defense | This example text from a US presidential speech is about defense or military |
| Domestic Commerce | This example text from a US presidential speech is about banking or finance or commerce |
| Education | This example text from a US presidential speech is about education |
| Energy | This example text from a US presidential speech is about energy or electricity or fossil fuels |
| Environment | This example text from a US presidential speech is about the environment or water or waste or pollution |
| Foreign Trade | This example text from a US presidential speech is about foreign trade |
| Government Operations | This example text from a US presidential speech is about government operations or administration |
| Health | This example text from a US presidential speech is about health |
| Housing | This example text from a US presidential speech is about community development or housing issues |
| Immigration | This example text from a US presidential speech is about migration |
| International Affairs | This example text from a US presidential speech is about international affairs or foreign aid |
| Labor | This example text from a US presidential speech is about employment or labour |
| Law and Crime | This example text from a US presidential speech is about law, crime or family issues |
| Macroeconomics | This example text from a US presidential speech is about macroeconomics |
| Public Lands | This example text from a US presidential speech is about public lands or water management |
| Social Welfare | This example text from a US presidential speech is about social welfare |
| Technology | This example text from a US presidential speech is about space or science or technology or communications |
| Transportation | This example text from a US presidential speech is about transportation |

manifesto

| label | hypothesis |
|:---|:---|
| Agriculture and Farmers: Positive | This example text from a political party manifesto is positive towards policies for agriculture and farmers |
| Anti-Growth Economy: Positive | This example text from a political party manifesto is in favour of anti-growth politics |
| Anti-Imperialism | This example text from a political party manifesto is anti-imperialistic, for example against controlling other countries and for greater self-government of colonies |
| Centralisation | This example text from a political party manifesto is in favour of political centralisation |
| Civic Mindedness: Positive | This example text from a political party manifesto is positive towards national solidarity, civil society or appeals for public spiritedness or against anti-social attitudes |
| Constitutionalism: Negative | This example text from a political party manifesto is negative towards constitutionalism |
| Constitutionalism: Positive | This example text from a political party manifesto is positive towards constitutionalism and the status quo of the constitution |
| Controlled Economy | This example text from a political party manifesto is supportive of direct government control of the economy, e.g. price control or minimum wages |
| Corporatism/Mixed Economy | This example text from a political party manifesto is positive towards cooperation of government, employers, and trade unions simultaneously |
| Culture: Positive | This example text from a political party manifesto is in favour of cultural policies or leisure facilities, for example museums, libraries or public sport clubs |
| Decentralization | This example text from a political party manifesto is for decentralisation or federalism |
| Democracy | This example text from a political party manifesto favourably mentions democracy or democratic procedures or institutions |
| Economic Goals | This example text from a political party manifesto is a broad/general statement on economic goals without specifics |
| Economic Growth: Positive | This example text from a political party manifesto is supportive of economic growth, for example facilitation of more production or government aid for growth |
| Economic Orthodoxy | This example text from a political party manifesto is for economic orthodoxy, for example reduction of budget deficits, thrift or a strong currency |
| Economic Planning | This example text from a political party manifesto is positive towards government economic planning, e.g. policy plans or strategies |
| Education Expansion | This example text from a political party manifesto is about the need to expand/improve policy on education |
| Education Limitation | This example text from a political party manifesto is sceptical towards state expenditure on education, for example in favour of study fees or private schools |
| Environmental Protection | This example text from a political party manifesto is in favour of environmental protection, e.g. fighting climate change or 'green' policies or preservation of natural resources or animal rights |
| Equality: Positive | This example text from a political party manifesto is positive towards equality or social justice, e.g. protection of underprivileged groups or fair distribution of resources |
| European Community/Union: Negative | This example text from a political party manifesto negatively mentions the EU or European Community |
| European Community/Union: Positive | This example text from a political party manifesto is positive towards the EU or European Community, for example EU expansion and integration |
| Foreign Special Relationships: Negative | This example text from a political party manifesto is negative towards particular countries |
| Foreign Special Relationships: Positive | This example text from a political party manifesto is positive towards particular countries |
| Free Market Economy | This example text from a political party manifesto is in favour of a free market economy and capitalism |
| Freedom and Human Rights | This example text from a political party manifesto is in favour of freedom and human rights, for example freedom of speech, assembly or against state coercion or for individualism |
| Governmental and Administrative Efficiency | This example text from a political party manifesto is in favour of efficiency in government/administration, for example by restructuring civil service or improving bureaucracy |
| Incentives: Positive | This example text from a political party manifesto is favourable towards supply side economic policies supporting businesses, for example for incentives like subsidies or tax breaks |
| Internationalism: Negative | This example text from a political party manifesto is sceptical of internationalism, for example negative towards international cooperation, in favour of national sovereignty and unilateralism |
| Internationalism: Positive | This example text from a political party manifesto is in favour of international cooperation with other countries, for example mentions the need for aid to developing countries, or global governance |
| Keynesian Demand Management | This example text from a political party manifesto is for Keynesian demand management and demand side economic policies |
| Labour Groups: Negative | This example text from a political party manifesto is negative towards labour groups and unions |
| Labour Groups: Positive | This example text from a political party manifesto is positive towards labour groups, for example for good working conditions, fair wages or unions |
| Law and Order: Positive | This example text from a political party manifesto is positive towards law and order and strict law enforcement |
| Market Regulation | This example text from a political party manifesto supports market regulation for a fair and open market, for example for consumer protection or for increased competition or for social market economy |
| Marxist Analysis | This example text from a political party manifesto is positive towards Marxist-Leninist ideas or uses specific Marxist terminology |
| Middle Class and Professional Groups | This example text from a political party manifesto favourably references the middle class, e.g. white collar groups or the service sector |
| Military: Negative | This example text from a political party manifesto is negative towards the military, for example for decreasing military spending or disarmament |
| Military: Positive | This example text from a political party manifesto is positive towards the military, for example for military spending or rearmament or military treaty obligations |
| Multiculturalism: Negative | This example text from a political party manifesto is sceptical towards multiculturalism, or for cultural integration or appeals to cultural homogeneity in society |
| Multiculturalism: Positive | This example text from a political party manifesto favourably mentions cultural diversity, for example for freedom of religion or linguistic heritages |
| National Way of Life: Negative | This example text from a political party manifesto unfavourably mentions a country's nation and history, for example sceptical towards patriotism or national pride |
| National Way of Life: Positive | This example text from a political party manifesto is positive towards the national way of life and history, for example pride of citizenship or appeals to patriotism |
| Nationalisation | This example text from a political party manifesto is positive towards government ownership of industries or land or for economic nationalisation |
| Non-economic Demographic Groups | This example text from a political party manifesto favourably mentions non-economic demographic groups like women, students or specific age groups |
| Peace | This example text from a political party manifesto is positive towards peace and peaceful means of solving crises, for example in favour of negotiations and ending wars |
| Political Authority | This example text from a political party manifesto mentions the speaker's competence to govern or other party's lack of such competence, or favourably mentions a strong/stable government |
| Political Corruption | This example text from a political party manifesto is negative towards political corruption or abuse of political/bureaucratic power |
| Protectionism: Negative | This example text from a political party manifesto is negative towards protectionism, in favour of free trade |
| Protectionism: Positive | This example text from a political party manifesto is in favour of protectionism, for example tariffs, export subsidies |
| Technology and Infrastructure: Positive | This example text from a political party manifesto is about technology and infrastructure, e.g. the importance of modernisation of industry, or supportive of public spending on infrastructure/tech |
| Traditional Morality: Negative | This example text from a political party manifesto is negative towards traditional morality, for example against religious moral values, for divorce or abortion, for modern families or separation of church and state |
| Traditional Morality: Positive | This example text from a political party manifesto is favourable towards traditional or religious values, for example for censorship of immoral behaviour, for traditional family values or religious institutions |
| Underprivileged Minority Groups | This example text from a political party manifesto favourably mentions underprivileged minorities, for example handicapped, homosexuals or immigrants |
| Welfare State Expansion | This example text from a political party manifesto is positive towards the welfare state, e.g. health care, pensions or social housing |
| Welfare State Limitation | This example text from a political party manifesto is for limiting the welfare state, for example limiting public funding for social services or social security, e.g. private care before state care |
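The templated hypotheses in the tables above can be generated programmatically; this is the same substitution the zero-shot pipeline performs when it combines candidate labels with a hypothesis template. A minimal sketch (the helper name and labels are illustrative, the template mirrors the banking77 hypotheses above):

```python
def build_hypotheses(labels, template):
    """Format one NLI hypothesis per candidate label, as the zero-shot pipeline does."""
    return [template.format(label) for label in labels]

template = "This banking customer example message is about {}."
hypotheses = build_hypotheses(["activating a card", "exchange rates"], template)
# hypotheses[0] == "This banking customer example message is about activating a card."
```

Each resulting hypothesis is then paired with the input text and scored by the NLI model.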
deberta-v3-large-zeroshot-v1.1-all-33
Model description: deberta-v3-large-zeroshot-v1.1-all-33

The model is designed for zero-shot classification with the Hugging Face pipeline. It performs one universal classification task: determining whether a hypothesis is "true" or "not true" given a text (`entailment` vs. `not_entailment`). This task format is based on the Natural Language Inference (NLI) task, which is universal enough that any classification task can be reformulated into it. A detailed description of how the model was trained and how it can be used is available in this paper.

Training data

The model was trained on a mixture of 33 datasets and 387 classes that were reformatted into this universal format.
1. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling".
2. 28 classification tasks reformatted into the universal NLI format. ~51k cleaned texts were used to avoid overfitting: 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes', 'emotiondair', 'emocontext', 'empathetic', 'financialphrasebank', 'banking77', 'massive', 'wikitoxictoxicaggregated', 'wikitoxicobscene', 'wikitoxicthreat', 'wikitoxicinsult', 'wikitoxicidentityhate', 'hateoffensive', 'hatexplain', 'biasframesoffensive', 'biasframessex', 'biasframesintent', 'agnews', 'yahootopics', 'trueteacher', 'spam', 'wellformedquery', 'manifesto', 'capsotu'. See details on each dataset here: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasetsoverview.csv

Note that, compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`) rather than three (entailment/neutral/contradiction).

The model was trained only on English data. For multilingual use cases, I recommend machine-translating texts to English with libraries like EasyNMT. English-only models tend to perform better than multilingual models, and validation with English data is easier if you don't speak all the languages in your corpus.
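Because the model predicts only two classes, the probability that a hypothesis is true is just the softmax over the (`entailment`, `not_entailment`) logits. A minimal sketch with dummy logits (these numbers are illustrative, not real model outputs):

```python
import math

def entailment_probability(logits):
    """Softmax over (entailment, not_entailment) logits.

    Returns the probability that the hypothesis is "true" given the text.
    """
    exps = [math.exp(x) for x in logits]
    return exps[0] / sum(exps)

# Dummy logits for illustration; real values come from the fine-tuned model.
p = entailment_probability([2.0, -1.0])  # high entailment logit -> p close to 1
```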
How to use the model

Simple zero-shot classification pipeline

Details on data and training

The code for preparing the data and for training and evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main

Hyperparameters and other details are available in this Weights & Biases repo: https://wandb.ai/moritzlaurer/deberta-v3-large-zeroshot-v1-1-all-33/table?workspace=user-

Balanced accuracy is reported for all datasets. `deberta-v3-large-zeroshot-v1.1-all-33` was trained on all datasets, with a maximum of 500 texts per class to avoid overfitting. The metrics on these datasets are therefore not strictly zero-shot, as the model has seen some data for each task during training. `deberta-v3-large-zeroshot-v1.1-heldout` indicates zero-shot performance on the respective dataset: to calculate these metrics, the pipeline was run 28 times, each time with one dataset held out from training to simulate a zero-shot setup.

| | deberta-v3-large-mnli-fever-anli-ling-wanli-binary | deberta-v3-large-zeroshot-v1.1-heldout | deberta-v3-large-zeroshot-v1.1-all-33 |
|:---|---:|---:|---:|
| datasets mean (w/o nli) | 64.1 | 73.4 | 85.2 |
| amazonpolarity (2) | 94.7 | 96.6 | 96.8 |
| imdb (2) | 90.3 | 95.2 | 95.5 |
| appreviews (2) | 93.6 | 94.3 | 94.7 |
| yelpreviews (2) | 98.5 | 98.4 | 98.9 |
| rottentomatoes (2) | 83.9 | 90.5 | 90.8 |
| emotiondair (6) | 49.2 | 42.1 | 72.1 |
| emocontext (4) | 57 | 69.3 | 82.4 |
| empathetic (32) | 42 | 34.4 | 58 |
| financialphrasebank (3) | 77.4 | 77.5 | 91.9 |
| banking77 (72) | 29.1 | 52.8 | 72.2 |
| massive (59) | 47.3 | 64.7 | 77.3 |
| wikitoxictoxicaggreg (2) | 81.6 | 86.6 | 91 |
| wikitoxicobscene (2) | 85.9 | 91.9 | 93.1 |
| wikitoxicthreat (2) | 77.9 | 93.7 | 97.6 |
| wikitoxicinsult (2) | 77.8 | 91.1 | 92.3 |
| wikitoxicidentityhate (2) | 86.4 | 89.8 | 95.7 |
| hateoffensive (3) | 62.8 | 66.5 | 88.4 |
| hatexplain (3) | 46.9 | 61 | 76.9 |
| biasframesoffensive (2) | 62.5 | 86.6 | 89 |
| biasframessex (2) | 87.6 | 89.6 | 92.6 |
| biasframesintent (2) | 54.8 | 88.6 | 89.9 |
| agnews (4) | 81.9 | 82.8 | 90.9 |
| yahootopics (10) | 37.7 | 65.6 | 74.3 |
| trueteacher (2) | 51.2 | 54.9 | 86.6 |
| spam (2) | 52.6 | 51.8 | 97.1 |
| wellformedquery (2) | 49.9 | 40.4 | 82.7 |
| manifesto (56) | 10.6 | 29.4 | 44.1 |
| capsotu (21) | 23.2 | 69.4 | 74 |
| mnlim (2) | 93.1 | nan | 93.1 |
| mnlimm (2) | 93.2 | nan | 93.2 |
| fevernli (2) | 89.3 | nan | 89.5 |
| anlir1 (2) | 87.9 | nan | 87.3 |
| anlir2 (2) | 76.3 | nan | 78 |
| anlir3 (2) | 73.6 | nan | 74.1 |
| wanli (2) | 82.8 | nan | 82.7 |
| lingnli (2) | 90.2 | nan | 89.6 |

Limitations and bias

The model can only do text classification tasks. Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.

License

The base model (DeBERTa-v3) is published under the MIT license. The datasets the model was fine-tuned on are published under a diverse set of licenses. The following table provides an overview of the non-NLI datasets used for fine-tuning, with information on licenses, the underlying papers, etc.: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasetsoverview.csv

Citation

If you use this model academically, please cite:

Ideas for cooperation or questions?

If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or LinkedIn.

Debugging and issues

Note that DeBERTa-v3 was released on 06.12.21, and older versions of HF Transformers can have issues running the model (e.g. issues with the tokenizer). Using Transformers>=4.13 might solve some issues.

Hypotheses used for classification

The hypotheses in the tables below were used to fine-tune the model. Inspecting them can help users get a feeling for which type of hypotheses and tasks the model was trained on.
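Custom hypotheses are passed to the zero-shot pipeline via its `hypothesis_template` argument. A hedged sketch (the helper names are illustrative; the `transformers` import is placed inside the function so the sketch can be read without the heavy dependency, and loading the pipeline downloads the model weights on first use):

```python
def classify(text, labels, hypothesis_template="This example is about {}."):
    """Run zero-shot classification with a custom hypothesis template."""
    from transformers import pipeline  # requires Transformers>=4.13 (see Debugging notes)
    clf = pipeline(
        "zero-shot-classification",
        model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33",
    )
    return clf(text, labels, hypothesis_template=hypothesis_template)

def top_label(output):
    # The pipeline returns candidate labels sorted by descending score.
    return output["labels"][0]
```

For example, `classify(message, banking_labels, "This banking customer example message is about {}.")` would reuse the banking77 template from the tables below.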
You can formulate your own hypotheses by changing the `hypothesis_template` of the zero-shot pipeline.

Note that a few rows in the `massive` and `banking77` datasets contain `nan` because some classes were so ambiguous/unclear that I excluded them from the data.

wellformedquery

| label | hypothesis |
|:---|:---|
| notwellformed | This example is not a well formed Google query. |
| wellformed | This example is a well formed Google query. |

biasframessex

| label | hypothesis |
|:---|:---|
| notsex | This example does not contain allusions to sexual content. |
| sex | This example contains allusions to sexual content. |

biasframesintent

| label | hypothesis |
|:---|:---|
| intent | The intent of this example is to be offensive/disrespectful. |
| notintent | The intent of this example is not to be offensive/disrespectful. |

biasframesoffensive

| label | hypothesis |
|:---|:---|
| notoffensive | This example could not be considered offensive, disrespectful, or toxic. |
| offensive | This example could be considered offensive, disrespectful, or toxic. |

financialphrasebank

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example is negative from an investor's perspective. |
| neutral | The sentiment in this example is neutral from an investor's perspective. |
| positive | The sentiment in this example is positive from an investor's perspective. |

rottentomatoes

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example rotten tomatoes movie review is negative |
| positive | The sentiment in this example rotten tomatoes movie review is positive |

amazonpolarity

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example amazon product review is negative |
| positive | The sentiment in this example amazon product review is positive |

imdb

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example imdb movie review is negative |
| positive | The sentiment in this example imdb movie review is positive |

appreviews

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example app review is negative. |
| positive | The sentiment in this example app review is positive. |

yelpreviews

| label | hypothesis |
|:---|:---|
| negative | The sentiment in this example yelp review is negative. |
| positive | The sentiment in this example yelp review is positive. |

wikitoxictoxicaggregated

| label | hypothesis |
|:---|:---|
| nottoxicaggregated | This example wikipedia comment does not contain toxic language. |
| toxicaggregated | This example wikipedia comment contains toxic language. |

wikitoxicobscene

| label | hypothesis |
|:---|:---|
| notobscene | This example wikipedia comment does not contain obscene language. |
| obscene | This example wikipedia comment contains obscene language. |

wikitoxicthreat

| label | hypothesis |
|:---|:---|
| notthreat | This example wikipedia comment does not contain a threat. |
| threat | This example wikipedia comment contains a threat. |

wikitoxicinsult

| label | hypothesis |
|:---|:---|
| insult | This example wikipedia comment contains an insult. |
| notinsult | This example wikipedia comment does not contain an insult. |

wikitoxicidentityhate

| label | hypothesis |
|:---|:---|
| identityhate | This example wikipedia comment contains identity hate. |
| notidentityhate | This example wikipedia comment does not contain identity hate. |

hateoffensive

| label | hypothesis |
|:---|:---|
| hatespeech | This example tweet contains hate speech. |
| neither | This example tweet contains neither offensive language nor hate speech. |
| offensive | This example tweet contains offensive language without hate speech. |

hatexplain

| label | hypothesis |
|:---|:---|
| hatespeech | This example text from twitter or gab contains hate speech. |
| neither | This example text from twitter or gab contains neither offensive language nor hate speech. |
| offensive | This example text from twitter or gab contains offensive language without hate speech. |

spam

| label | hypothesis |
|:---|:---|
| notspam | This example sms is not spam. |
| spam | This example sms is spam. |

emotiondair

| label | hypothesis |
|:---|:---|
| anger | This example tweet expresses the emotion: anger |
| fear | This example tweet expresses the emotion: fear |
| joy | This example tweet expresses the emotion: joy |
| love | This example tweet expresses the emotion: love |
| sadness | This example tweet expresses the emotion: sadness |
| surprise | This example tweet expresses the emotion: surprise |

emocontext

| label | hypothesis |
|:---|:---|
| angry | This example tweet expresses the emotion: anger |
| happy | This example tweet expresses the emotion: happiness |
| others | This example tweet does not express any of the emotions: anger, sadness, or happiness |
| sad | This example tweet expresses the emotion: sadness |

empathetic

| label | hypothesis |
|:---|:---|
| afraid | The main emotion of this example dialogue is: afraid |
| angry | The main emotion of this example dialogue is: angry |
| annoyed | The main emotion of this example dialogue is: annoyed |
| anticipating | The main emotion of this example dialogue is: anticipating |
| anxious | The main emotion of this example dialogue is: anxious |
| apprehensive | The main emotion of this example dialogue is: apprehensive |
| ashamed | The main emotion of this example dialogue is: ashamed |
| caring | The main emotion of this example dialogue is: caring |
| confident | The main emotion of this example dialogue is: confident |
| content | The main emotion of this example dialogue is: content |
| devastated | The main emotion of this example dialogue is: devastated |
| disappointed | The main emotion of this example dialogue is: disappointed |
| disgusted | The main emotion of this example dialogue is: disgusted |
| embarrassed | The main emotion of this example dialogue is: embarrassed |
| excited | The main emotion of this example dialogue is: excited |
| faithful | The main emotion of this example dialogue is: faithful |
| furious | The main emotion of this example dialogue is: furious |
| grateful | The main emotion of this example dialogue is: grateful |
| guilty | The main emotion of this example dialogue is: guilty |
| hopeful | The main emotion of this example dialogue is: hopeful |
| impressed | The main emotion of this example dialogue is: impressed |
| jealous | The main emotion of this example dialogue is: jealous |
| joyful | The main emotion of this example dialogue is: joyful |
| lonely | The main emotion of this example dialogue is: lonely |
| nostalgic | The main emotion of this example dialogue is: nostalgic |
| prepared | The main emotion of this example dialogue is: prepared |
| proud | The main emotion of this example dialogue is: proud |
| sad | The main emotion of this example dialogue is: sad |
| sentimental | The main emotion of this example dialogue is: sentimental |
| surprised | The main emotion of this example dialogue is: surprised |
| terrified | The main emotion of this example dialogue is: terrified |
| trusting | The main emotion of this example dialogue is: trusting |

agnews

| label | hypothesis |
|:---|:---|
| Business | This example news text is about business news |
| Sci/Tech | This example news text is about science and technology |
| Sports | This example news text is about sports |
| World | This example news text is about world news |

yahootopics

| label | hypothesis |
|:---|:---|
| Business & Finance | This example question from the Yahoo Q&A forum is categorized in the topic: Business & Finance |
| Computers & Internet | This example question from the Yahoo Q&A forum is categorized in the topic: Computers & Internet |
| Education & Reference |
This example question from the Yahoo Q&A forum is categorized in the topic: Education & Reference | | Entertainment & Music | This example question from the Yahoo Q&A forum is categorized in the topic: Entertainment & Music | | Family & Relationships | This example question from the Yahoo Q&A forum is categorized in the topic: Family & Relationships | | Health | This example question from the Yahoo Q&A forum is categorized in the topic: Health | | Politics & Government | This example question from the Yahoo Q&A forum is categorized in the topic: Politics & Government | | Science & Mathematics | This example question from the Yahoo Q&A forum is categorized in the topic: Science & Mathematics | | Society & Culture | This example question from the Yahoo Q&A forum is categorized in the topic: Society & Culture | | Sports | This example question from the Yahoo Q&A forum is categorized in the topic: Sports | massive | label | hypothesis | |:-------------------------|:------------------------------------------------------------------------------------------| | alarmquery | The example utterance is a query about alarms. | | alarmremove | The intent of this example utterance is to remove an alarm. | | alarmset | The intent of the example utterance is to set an alarm. | | audiovolumedown | The intent of the example utterance is to lower the volume. | | audiovolumemute | The intent of this example utterance is to mute the volume. | | audiovolumeother | The example utterance is related to audio volume. | | audiovolumeup | The intent of this example utterance is turning the audio volume up. | | calendarquery | The example utterance is a query about a calendar. | | calendarremove | The intent of the example utterance is to remove something from a calendar. | | calendarset | The intent of this example utterance is to set something in a calendar. | | cookingquery | The example utterance is a query about cooking. | | cookingrecipe | This example utterance is about cooking recipies. 
| | datetimeconvert | The example utterance is related to date time changes or conversion. | | datetimequery | The intent of this example utterance is a datetime query. | | emailaddcontact | The intent of this example utterance is adding an email address to contacts. | | emailquery | The example utterance is a query about emails. | | emailquerycontact | The intent of this example utterance is to query contact details. | | emailsendemail | The intent of the example utterance is to send an email. | | generalgreet | This example utterance is a general greet. | | generaljoke | The intent of the example utterance is to hear a joke. | | generalquirky | nan | | iotcleaning | The intent of the example utterance is for an IoT device to start cleaning. | | iotcoffee | The intent of this example utterance is for an IoT device to make coffee. | | iothuelightchange | The intent of this example utterance is changing the light. | | iothuelightdim | The intent of the example utterance is to dim the lights. | | iothuelightoff | The example utterance is related to turning the lights off. | | iothuelighton | The example utterance is related to turning the lights on. | | iothuelightup | The intent of this example utterance is to brighten lights. | | iotwemooff | The intent of this example utterance is turning an IoT device off. | | iotwemoon | The intent of the example utterance is to turn an IoT device on. | | listscreateoradd | The example utterance is related to creating or adding to lists. | | listsquery | The example utterance is a query about a list. | | listsremove | The intent of this example utterance is to remove a list or remove something from a list. | | musicdislikeness | The intent of this example utterance is signalling music dislike. | | musiclikeness | The example utterance is related to liking music. | | musicquery | The example utterance is a query about music. | | musicsettings | The intent of the example utterance is to change music settings. 
| | newsquery | The example utterance is a query about the news. | | playaudiobook | The example utterance is related to playing audiobooks. | | playgame | The intent of this example utterance is to start playing a game. | | playmusic | The intent of this example utterance is for an IoT device to play music. | | playpodcasts | The example utterance is related to playing podcasts. | | playradio | The intent of the example utterance is to play something on the radio. | | qacurrency | This example utteranceis about currencies. | | qadefinition | The example utterance is a query about a definition. | | qafactoid | The example utterance is a factoid question. | | qamaths | The example utterance is a question about maths. | | qastock | This example utterance is about stocks. | | recommendationevents | This example utterance is about event recommendations. | | recommendationlocations | The intent of this example utterance is receiving recommendations for good locations. | | recommendationmovies | This example utterance is about movie recommendations. | | socialpost | The example utterance is about social media posts. | | socialquery | The example utterance is a query about a social network. | | takeawayorder | The intent of this example utterance is to order takeaway food. | | takeawayquery | This example utterance is about takeaway food. | | transportquery | The example utterance is a query about transport or travels. | | transporttaxi | The intent of this example utterance is to get a taxi. | | transportticket | This example utterance is about transport tickets. | | transporttraffic | This example utterance is about transport or traffic. | | weatherquery | This example utterance is a query about the wheather. 
banking77

| label | hypothesis |
|:---|:---|
| Refundnotshowingup | This customer example message is about a refund not showing up. |
| activatemycard | This banking customer example message is about activating a card. |
| agelimit | This banking customer example message is related to age limits. |
| applepayorgooglepay | This banking customer example message is about apple pay or google pay. |
| atmsupport | This banking customer example message requests ATM support. |
| automatictopup | This banking customer example message is about automatic top up. |
| balancenotupdatedafterbanktransfer | This banking customer example message is about a balance not updated after a transfer. |
| balancenotupdatedafterchequeorcashdeposit | This banking customer example message is about a balance not updated after a cheque or cash deposit. |
| beneficiarynotallowed | This banking customer example message is related to a beneficiary not being allowed or a failed transfer. |
| canceltransfer | This banking customer example message is related to the cancellation of a transfer. |
| cardabouttoexpire | This banking customer example message is related to the expiration of a card. |
| cardacceptance | This banking customer example message is related to the scope of acceptance of a card. |
| cardarrival | This banking customer example message is about the arrival of a card. |
| carddeliveryestimate | This banking customer example message is about a card delivery estimate or timing. |
| cardlinking | nan |
| cardnotworking | This banking customer example message is about a card not working. |
| cardpaymentfeecharged | This banking customer example message is about a card payment fee. |
| cardpaymentnotrecognised | This banking customer example message is about a payment the customer does not recognise. |
| cardpaymentwrongexchangerate | This banking customer example message is about a wrong exchange rate. |
| cardswallowed | This banking customer example message is about a card swallowed by a machine. |
| cashwithdrawalcharge | This banking customer example message is about a cash withdrawal charge. |
| cashwithdrawalnotrecognised | This banking customer example message is about an unrecognised cash withdrawal. |
| changepin | This banking customer example message is about changing a pin code. |
| compromisedcard | This banking customer example message is about a compromised card. |
| contactlessnotworking | This banking customer example message is about contactless not working. |
| countrysupport | This banking customer example message is about country-specific support. |
| declinedcardpayment | This banking customer example message is about a declined card payment. |
| declinedcashwithdrawal | This banking customer example message is about a declined cash withdrawal. |
| declinedtransfer | This banking customer example message is about a declined transfer. |
| directdebitpaymentnotrecognised | This banking customer example message is about an unrecognised direct debit payment. |
| disposablecardlimits | This banking customer example message is about the limits of disposable cards. |
| editpersonaldetails | This banking customer example message is about editing personal details. |
| exchangecharge | This banking customer example message is about exchange rate charges. |
| exchangerate | This banking customer example message is about exchange rates. |
| exchangeviaapp | nan |
| extrachargeonstatement | This banking customer example message is about an extra charge. |
| failedtransfer | This banking customer example message is about a failed transfer. |
| fiatcurrencysupport | This banking customer example message is about fiat currency support. |
| getdisposablevirtualcard | This banking customer example message is about getting a disposable virtual card. |
| getphysicalcard | nan |
| gettingsparecard | This banking customer example message is about getting a spare card. |
| gettingvirtualcard | This banking customer example message is about getting a virtual card. |
| lostorstolencard | This banking customer example message is about a lost or stolen card. |
| lostorstolenphone | This banking customer example message is about a lost or stolen phone. |
| orderphysicalcard | This banking customer example message is about ordering a card. |
| passcodeforgotten | This banking customer example message is about a forgotten passcode. |
| pendingcardpayment | This banking customer example message is about a pending card payment. |
| pendingcashwithdrawal | This banking customer example message is about a pending cash withdrawal. |
| pendingtopup | This banking customer example message is about a pending top up. |
| pendingtransfer | This banking customer example message is about a pending transfer. |
| pinblocked | This banking customer example message is about a blocked pin. |
| receivingmoney | This banking customer example message is about receiving money. |
| requestrefund | This banking customer example message is about a refund request. |
| revertedcardpayment? | This banking customer example message is about reverting a card payment. |
| supportedcardsandcurrencies | nan |
| terminateaccount | This banking customer example message is about terminating an account. |
| topupbybanktransfercharge | nan |
| topupbycardcharge | This banking customer example message is about the charge for topping up by card. |
| topupbycashorcheque | This banking customer example message is about topping up by cash or cheque. |
| topupfailed | This banking customer example message is about top up issues or failures. |
| topuplimits | This banking customer example message is about top up limitations. |
| topupreverted | This banking customer example message is about issues with topping up. |
| toppingupbycard | This banking customer example message is about topping up by card. |
| transactionchargedtwice | This banking customer example message is about a transaction charged twice. |
| transferfeecharged | This banking customer example message is about an issue with a transfer fee charge. |
| transferintoaccount | This banking customer example message is about transfers into the customer's own account. |
| transfernotreceivedbyrecipient | This banking customer example message is about a transfer that has not arrived yet. |
| transfertiming | This banking customer example message is about transfer timing. |
| unabletoverifyidentity | This banking customer example message is about an issue with identity verification. |
| verifymyidentity | This banking customer example message is about identity verification. |
| verifysourceoffunds | This banking customer example message is about the source of funds. |
| verifytopup | This banking customer example message is about verification and top ups. |
| virtualcardnotworking | This banking customer example message is about a virtual card not working. |
| visaormastercard | This banking customer example message is about types of bank cards. |
| whyverifyidentity | This banking customer example message questions why identity verification is necessary. |
| wrongamountofcashreceived | This banking customer example message is about a wrong amount of cash received. |
| wrongexchangerateforcashwithdrawal | This banking customer example message is about a wrong exchange rate for a cash withdrawal. |

trueteacher

| label | hypothesis |
|:---|:---|
| factuallyconsistent | The example summary is factually consistent with the full article. |
| factuallyinconsistent | The example summary is factually inconsistent with the full article. |
capsotu

| label | hypothesis |
|:---|:---|
| Agriculture | This example text from a US presidential speech is about agriculture |
| Civil Rights | This example text from a US presidential speech is about civil rights or minorities or civil liberties |
| Culture | This example text from a US presidential speech is about cultural policy |
| Defense | This example text from a US presidential speech is about defense or military |
| Domestic Commerce | This example text from a US presidential speech is about banking or finance or commerce |
| Education | This example text from a US presidential speech is about education |
| Energy | This example text from a US presidential speech is about energy or electricity or fossil fuels |
| Environment | This example text from a US presidential speech is about the environment or water or waste or pollution |
| Foreign Trade | This example text from a US presidential speech is about foreign trade |
| Government Operations | This example text from a US presidential speech is about government operations or administration |
| Health | This example text from a US presidential speech is about health |
| Housing | This example text from a US presidential speech is about community development or housing issues |
| Immigration | This example text from a US presidential speech is about migration |
| International Affairs | This example text from a US presidential speech is about international affairs or foreign aid |
| Labor | This example text from a US presidential speech is about employment or labour |
| Law and Crime | This example text from a US presidential speech is about law, crime or family issues |
| Macroeconomics | This example text from a US presidential speech is about macroeconomics |
| Public Lands | This example text from a US presidential speech is about public lands or water management |
| Social Welfare | This example text from a US presidential speech is about social welfare |
| Technology | This example text from a US presidential speech is about space or science or technology or communications |
| Transportation | This example text from a US presidential speech is about transportation |

manifesto

| label | hypothesis |
|:---|:---|
| Agriculture and Farmers: Positive | This example text from a political party manifesto is positive towards policies for agriculture and farmers |
| Anti-Growth Economy: Positive | This example text from a political party manifesto is in favour of anti-growth politics |
| Anti-Imperialism | This example text from a political party manifesto is anti-imperialistic, for example against controlling other countries and for greater self-government of colonies |
| Centralisation | This example text from a political party manifesto is in favour of political centralisation |
| Civic Mindedness: Positive | This example text from a political party manifesto is positive towards national solidarity, civil society or appeals for public spiritedness or against anti-social attitudes |
| Constitutionalism: Negative | This example text from a political party manifesto is negative towards constitutionalism |
| Constitutionalism: Positive | This example text from a political party manifesto is positive towards constitutionalism and the status quo of the constitution |
| Controlled Economy | This example text from a political party manifesto is supportive of direct government control of the economy, e.g. price control or minimum wages |
| Corporatism/Mixed Economy | This example text from a political party manifesto is positive towards cooperation of government, employers, and trade unions simultaneously |
| Culture: Positive | This example text from a political party manifesto is in favour of cultural policies or leisure facilities, for example museums, libraries or public sport clubs |
| Decentralization | This example text from a political party manifesto is for decentralisation or federalism |
| Democracy | This example text from a political party manifesto favourably mentions democracy or democratic procedures or institutions |
| Economic Goals | This example text from a political party manifesto is a broad/general statement on economic goals without specifics |
| Economic Growth: Positive | This example text from a political party manifesto is supportive of economic growth, for example facilitation of more production or government aid for growth |
| Economic Orthodoxy | This example text from a political party manifesto is for economic orthodoxy, for example reduction of budget deficits, thrift or a strong currency |
| Economic Planning | This example text from a political party manifesto is positive towards government economic planning, e.g. policy plans or strategies |
| Education Expansion | This example text from a political party manifesto is about the need to expand/improve policy on education |
| Education Limitation | This example text from a political party manifesto is sceptical towards state expenditure on education, for example in favour of study fees or private schools |
| Environmental Protection | This example text from a political party manifesto is in favour of environmental protection, e.g. fighting climate change or 'green' policies or preservation of natural resources or animal rights |
| Equality: Positive | This example text from a political party manifesto is positive towards equality or social justice, e.g. protection of underprivileged groups or fair distribution of resources |
| European Community/Union: Negative | This example text from a political party manifesto negatively mentions the EU or European Community |
| European Community/Union: Positive | This example text from a political party manifesto is positive towards the EU or European Community, for example EU expansion and integration |
| Foreign Special Relationships: Negative | This example text from a political party manifesto is negative towards particular countries |
| Foreign Special Relationships: Positive | This example text from a political party manifesto is positive towards particular countries |
| Free Market Economy | This example text from a political party manifesto is in favour of a free market economy and capitalism |
| Freedom and Human Rights | This example text from a political party manifesto is in favour of freedom and human rights, for example freedom of speech, assembly or against state coercion or for individualism |
| Governmental and Administrative Efficiency | This example text from a political party manifesto is in favour of efficiency in government/administration, for example by restructuring civil service or improving bureaucracy |
| Incentives: Positive | This example text from a political party manifesto is favourable towards supply side economic policies supporting businesses, for example for incentives like subsidies or tax breaks |
| Internationalism: Negative | This example text from a political party manifesto is sceptical of internationalism, for example negative towards international cooperation, in favour of national sovereignty and unilateralism |
| Internationalism: Positive | This example text from a political party manifesto is in favour of international cooperation with other countries, for example mentions the need for aid to developing countries, or global governance |
| Keynesian Demand Management | This example text from a political party manifesto is for Keynesian demand management and demand side economic policies |
| Labour Groups: Negative | This example text from a political party manifesto is negative towards labour groups and unions |
| Labour Groups: Positive | This example text from a political party manifesto is positive towards labour groups, for example for good working conditions, fair wages or unions |
| Law and Order: Positive | This example text from a political party manifesto is positive towards law and order and strict law enforcement |
| Market Regulation | This example text from a political party manifesto supports market regulation for a fair and open market, for example for consumer protection or for increased competition or for social market economy |
| Marxist Analysis | This example text from a political party manifesto is positive towards Marxist-Leninist ideas or uses specific Marxist terminology |
| Middle Class and Professional Groups | This example text from a political party manifesto favourably references the middle class, e.g. white collar groups or the service sector |
| Military: Negative | This example text from a political party manifesto is negative towards the military, for example for decreasing military spending or disarmament |
| Military: Positive | This example text from a political party manifesto is positive towards the military, for example for military spending or rearmament or military treaty obligations |
| Multiculturalism: Negative | This example text from a political party manifesto is sceptical towards multiculturalism, or for cultural integration or appeals to cultural homogeneity in society |
| Multiculturalism: Positive | This example text from a political party manifesto favourably mentions cultural diversity, for example for freedom of religion or linguistic heritages |
| National Way of Life: Negative | This example text from a political party manifesto unfavourably mentions a country's nation and history, for example sceptical towards patriotism or national pride |
| National Way of Life: Positive | This example text from a political party manifesto is positive towards the national way of life and history, for example pride of citizenship or appeals to patriotism |
| Nationalisation | This example text from a political party manifesto is positive towards government ownership of industries or land or for economic nationalisation |
| Non-economic Demographic Groups | This example text from a political party manifesto favourably mentions non-economic demographic groups like women, students or specific age groups |
| Peace | This example text from a political party manifesto is positive towards peace and peaceful means of solving crises, for example in favour of negotiations and ending wars |
| Political Authority | This example text from a political party manifesto mentions the speaker's competence to govern or other parties' lack of such competence, or favourably mentions a strong/stable government |
| Political Corruption | This example text from a political party manifesto is negative towards political corruption or abuse of political/bureaucratic power |
| Protectionism: Negative | This example text from a political party manifesto is negative towards protectionism, in favour of free trade |
| Protectionism: Positive | This example text from a political party manifesto is in favour of protectionism, for example tariffs, export subsidies |
| Technology and Infrastructure: Positive | This example text from a political party manifesto is about technology and infrastructure, e.g. the importance of modernisation of industry, or supportive of public spending on infrastructure/tech |
| Traditional Morality: Negative | This example text from a political party manifesto is negative towards traditional morality, for example against religious moral values, for divorce or abortion, for modern families or separation of church and state |
| Traditional Morality: Positive | This example text from a political party manifesto is favourable towards traditional or religious values, for example for censorship of immoral behaviour, for traditional family values or religious institutions |
| Underprivileged Minority Groups | This example text from a political party manifesto favourably mentions underprivileged minorities, for example handicapped, homosexuals or immigrants |
| Welfare State Expansion | This example text from a political party manifesto is positive towards the welfare state, e.g. health care, pensions or social housing |
| Welfare State Limitation | This example text from a political party manifesto is for limiting the welfare state, for example limiting public funding for social services or social security, e.g. private care before state care |
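The label→hypothesis mappings in the tables above can be reproduced at inference time with the pipeline's `hypothesis_template` parameter, which wraps each candidate label into an NLI hypothesis. A minimal sketch using the emotiondair labels from the table above; the pipeline call itself is commented out because it downloads model weights:

```python
# Build one NLI hypothesis per candidate label, mirroring the
# emotiondair table above.
def build_hypotheses(labels, template="This example tweet expresses the emotion: {}"):
    """Format each candidate label into a full hypothesis sentence."""
    return [template.format(label) for label in labels]

emotion_labels = ["anger", "fear", "joy", "love", "sadness", "surprise"]
hypotheses = build_hypotheses(emotion_labels)
# hypotheses[0] == "This example tweet expresses the emotion: anger"

# The Hugging Face zero-shot pipeline applies the same formatting internally:
# from transformers import pipeline
# classifier = pipeline("zero-shot-classification",
#                       model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
# classifier("I can't stop smiling today!", emotion_labels,
#            hypothesis_template="This example tweet expresses the emotion: {}")
```

The input text in the commented call is illustrative; any tweet-like text works.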
bge-m3-zeroshot-v2.0-c
roberta-base-zeroshot-v2.0-c
deberta-v3-large-zeroshot-v1
Model description

The model is designed for zero-shot classification with the Hugging Face pipeline. It should be substantially better at zero-shot classification than my other zero-shot models on the Hugging Face hub: https://huggingface.co/MoritzLaurer.

The model can do one universal task: determine whether a hypothesis is `true` or `nottrue` given a text (also called `entailment` vs. `notentailment`). This task format is based on the Natural Language Inference (NLI) task, which is so universal that any classification task can be reformulated into it.

Training data

The model was trained on a mixture of 27 tasks and 310 classes that were reformatted into this universal format:

1. 26 classification tasks with ~400k texts: 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes', 'emotiondair', 'emocontext', 'empathetic', 'financialphrasebank', 'banking77', 'massive', 'wikitoxictoxicaggregated', 'wikitoxicobscene', 'wikitoxicthreat', 'wikitoxicinsult', 'wikitoxicidentityhate', 'hateoffensive', 'hatexplain', 'biasframesoffensive', 'biasframessex', 'biasframesintent', 'agnews', 'yahootopics', 'trueteacher', 'spam', 'wellformedquery'. See details on each dataset here: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiILH4IXpr78wdnmNd5FaE/edit?usp=sharing
2. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling".

Note that, unlike other NLI models, this model predicts two classes (`entailment` vs. `notentailment`) rather than three (entailment/neutral/contradiction).

How to use the model

Simple zero-shot classification pipeline

Details on data and training

The code for preparing the data and for training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main

Limitations and bias

The model can only do text classification tasks. Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.

License

The base model (DeBERTa-v3) is published under the MIT license. The datasets the model was fine-tuned on are published under a diverse set of licenses. The following spreadsheet provides an overview of the non-NLI datasets used for fine-tuning, including information on licenses, the underlying papers, etc.: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiILH4IXpr78wdnmNd5FaE/edit?usp=sharing In addition, the model was also trained on the following NLI datasets: MNLI, ANLI, WANLI, LING-NLI, FEVER-NLI.

Ideas for cooperation or questions?

If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or on LinkedIn.

Debugging and issues

Note that DeBERTa-v3 was released on 06.12.21, and older versions of HF Transformers have issues running the model (e.g. errors with the tokenizer). Using Transformers>=4.13 might solve some issues.
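A minimal sketch of the simple zero-shot classification pipeline mentioned above. The example text and candidate labels are illustrative (borrowed from the multilingual card's widget example); the pipeline call is commented out because it downloads the model weights:

```python
# Zero-shot classification with this model via the Hugging Face pipeline.
text = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]

# from transformers import pipeline
# classifier = pipeline("zero-shot-classification",
#                       model="MoritzLaurer/deberta-v3-large-zeroshot-v1")
# output = classifier(text, candidate_labels, multi_label=False)
# output["labels"][0] is the highest-scoring label
```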
deberta-v3-base-zeroshot-v1
deberta-v3-base-zeroshot-v1 Model description The model is designed for zero-shot classification with the Hugging Face pipeline. The model should be substantially better at zero-shot classification than my other zero-shot models on the Hugging Face hub: https://huggingface.co/MoritzLaurer. The model can do one universal task: determine whether a hypothesis is `true` or `nottrue` given a text (also called `entailment` vs. `notentailment`). This task format is based on the Natural Language Inference task (NLI). The task is so universal that any classification task can be reformulated into the task. Training data The model was trained on a mixture of 27 tasks and 310 classes that have been reformatted into this universal format. 1. 26 classification tasks with ~400k texts: 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes', 'emotiondair', 'emocontext', 'empathetic', 'financialphrasebank', 'banking77', 'massive', 'wikitoxictoxicaggregated', 'wikitoxicobscene', 'wikitoxicthreat', 'wikitoxicinsult', 'wikitoxicidentityhate', 'hateoffensive', 'hatexplain', 'biasframesoffensive', 'biasframessex', 'biasframesintent', 'agnews', 'yahootopics', 'trueteacher', 'spam', 'wellformedquery'. See details on each dataset here: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiILH4IXpr78wdnmNd5FaE/edit?usp=sharing 3. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling" Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `notentailment`) as opposed to three classes (entailment/neutral/contradiction) How to use the model Simple zero-shot classification pipeline Details on data and training The code for preparing the data and training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main Limitations and bias The model can only do text classification tasks. 
Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.

License
The base model (DeBERTa-v3) is published under the MIT license. The datasets the model was fine-tuned on are published under a diverse set of licenses. The following spreadsheet provides an overview of the non-NLI datasets used for fine-tuning. The spreadsheet contains information on licenses, the underlying papers, etc.: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiILH4IXpr78wdnmNd5FaE/edit?usp=sharing In addition, the model was also trained on the following NLI datasets: MNLI, ANLI, WANLI, LING-NLI, FEVER-NLI.

Ideas for cooperation or questions?
If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or on LinkedIn.

Debugging and issues
Note that DeBERTa-v3 was released on 06.12.21 and older versions of HF Transformers seem to have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers>=4.13 might solve some issues.
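The "simple zero-shot classification pipeline" mentioned in the card can be sketched as follows. This is a minimal illustration, not the card's exact snippet; the input text and candidate labels are made up:

```python
from transformers import pipeline

# load the model into the Hugging Face zero-shot classification pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-base-zeroshot-v1",
)

text = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]

# the pipeline reformulates each label into an NLI hypothesis and
# returns a dict with "labels" sorted by descending "scores"
output = classifier(text, candidate_labels, multi_label=False)
print(output["labels"][0], output["scores"][0])
```

With `multi_label=False`, scores are normalized across the candidate labels, so the top label is the single best class.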
multilingual-MiniLMv2-L6-mnli-xnli
xtremedistil-l6-h256-zeroshot-v1.1-all-33
DeBERTa-v3-small-mnli-fever-docnli-ling-2c
roberta-large-zeroshot-v2.0-c
deberta-v3-large-zeroshot-v2.0-c
Model description: deberta-v3-large-zeroshot-v2.0-c

zeroshot-v2.0 series of models
Models in this series are designed for efficient zero-shot classification with the Hugging Face pipeline. These models can do classification without training data and run on both GPUs and CPUs. An overview of the latest zero-shot classifiers is available in my Zeroshot Classifier Collection. The main update of this `zeroshot-v2.0` series of models is that several models are trained on fully commercially-friendly data for users with strict license requirements. These models can do one universal classification task: determine whether a hypothesis is "true" or "not true" given a text (`entailment` vs. `not_entailment`). This task format is based on the Natural Language Inference (NLI) task. The task is so universal that any classification task can be reformulated into this task by the Hugging Face pipeline.

Training data
Models with a "`-c`" in the name are trained on two types of fully commercially-friendly data:
1. Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a list of 500+ diverse text classification tasks for 25 professions in conversations with Mistral-large. The data was manually curated. I then used this as seed data to generate several hundred thousand texts for these tasks with Mixtral-8x7B-Instruct-v0.1. The final dataset used is available in the synthetic_zeroshot_mixtral_v0.1 dataset in the subset `mixtral_written_text_for_tasks_v4`. Data curation was done in multiple iterations and will be improved in future iterations.
2. Two commercially-friendly NLI datasets (MNLI, FEVER-NLI). These datasets were added to increase generalization.
3. Models without a "`-c`" in the name also included a broader mix of training data with a broader mix of licenses: ANLI, WANLI, LingNLI, and all datasets in this list where `used_in_v1.1==True`.

`multi_label=False` forces the model to decide on only one class. `multi_label=True` enables the model to choose multiple classes.
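The difference between the two modes can be sketched like this with the Hugging Face pipeline (the text and labels are illustrative):

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0-c",
)

text = "The ECB raised interest rates to fight inflation"
labels = ["economy", "politics", "sports"]

# multi_label=False: scores are normalized across labels and sum to 1,
# so exactly one class "wins"
single = classifier(text, labels, multi_label=False)

# multi_label=True: each label is scored independently against the text,
# so several labels can score high at the same time
multi = classifier(text, labels, multi_label=True)
```

Use `multi_label=True` when classes are not mutually exclusive (e.g. topic tagging), and `multi_label=False` for single-class decisions.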
The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is `facebook/bart-large-mnli`, which is, at the time of writing (03.04.24), the most used commercially-friendly zero-shot classifier.

| | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | 0.59 | (0.803) |
| amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | 0.942 | (0.951) |
| imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | 0.873 | (0.917) |
| appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | 0.932 | (0.954) |
| yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | 0.973 | (0.978) |
| rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | 0.813 | (0.866) |
| emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | 0.453 | (0.697) |
| emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | 0.61 | (0.798) |
| empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | 0.387 | (0.455) |
| financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | 0.504 | (0.895) |
| banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | 0.387 | (0.715) |
| massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | 0.414 | (0.692) |
| wikitoxictoxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | 0.736 | (0.9) |
| wikitoxicobscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | 0.783 | (0.914) |
| wikitoxicthreat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | 0.68 | (0.947) |
| wikitoxicinsult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | 0.783 | (0.915) |
| wikitoxicidentityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | 0.761 | (0.931) |
| hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | 0.291 | (0.823) |
| hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | 0.29 | (0.729) |
| biasframesoffensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | 0.541 | (0.855) |
| biasframessex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | 0.748 | (0.905) |
| biasframesintent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | 0.467 | (0.868) |
| agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | 0.687 | (0.892) |
| yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | 0.587 | (0.711) |
| trueteacher (2) | 0.491 | 0.469 | 0.402 | 0.431 | 0.479 (0.82) | 0.459 | 0.538 (0.846) | 0.471 | (0.518) |
| spam (2) | 0.505 | 0.528 | 0.504 | 0.507 | 0.464 (0.973) | 0.74 | 0.597 (0.983) | 0.441 | (0.978) |
| wellformedquery (2) | 0.407 | 0.333 | 0.333 | 0.335 | 0.491 (0.769) | 0.334 | 0.429 (0.815) | 0.361 | (0.718) |
| manifesto (56) | 0.084 | 0.102 | 0.182 | 0.17 | 0.187 (0.376) | 0.258 | 0.256 (0.408) | 0.147 | (0.331) |
| capsotu (21) | 0.34 | 0.479 | 0.523 | 0.502 | 0.477 (0.664) | 0.603 | 0.502 (0.686) | 0.472 | (0.644) |

These numbers indicate zero-shot performance, as no data from these datasets was added to the training mix. Note that models without a "`-c`" in the title were evaluated twice: one run without any data from these 28 datasets to test pure zero-shot performance (the first number in the respective column) and a final run including up to 500 training data points per class from each of the 28 datasets (the second number, in brackets, "fewshot"). No model was trained on test data. Details on the different datasets are available here: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/v1humandata/datasetsoverview.csv

- deberta-v3-zeroshot vs. roberta-zeroshot: deberta-v3 performs clearly better than roberta, but it is a bit slower. roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention. These containers are a good choice for production use-cases. tl;dr: for accuracy, use a deberta-v3 model. If production inference speed is a concern, consider a roberta model (e.g. in a TEI container and HF Inference Endpoints).
- Commercial use-cases: models with "`-c`" in the title are guaranteed to be trained on only commercially-friendly data. Models without a "`-c`" were trained on more data and perform better, but include data with non-commercial licenses. Legal opinions diverge on whether this training data affects the license of the trained model. For users with strict legal requirements, the models with "`-c`" in the title are recommended.
- Multilingual/non-English use-cases: use bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c. Note that multilingual models perform worse than English-only models. You can therefore also first machine translate your texts to English with libraries like EasyNMT and then apply any English-only model to the translated data.
Machine translation also facilitates validation in case your team does not speak all languages in the data.
- Context window: the `bge-m3` models can process up to 8192 tokens. The other models can process up to 512. Note that longer text inputs both make the model slower and decrease performance, so if you're only working with texts of up to ~400 words / 1 page, use e.g. a deberta model for better performance.
- The latest updates on new models are always available in the Zeroshot Classifier Collection. Reproduction code is available in the `v2syntheticdata` directory here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main

Limitations and bias
The model can only do text classification tasks. Biases can come from the underlying foundation model, the human NLI training data, and the synthetic data generated by Mixtral.

License
The foundation model was published under the MIT license. The licenses of the training data vary depending on the model; see above. This model is an extension of the research described in this paper.

Ideas for cooperation or questions?
If you have questions or ideas for cooperation, contact me at moritz{at}huggingface{dot}co or on LinkedIn.

Flexible usage and "prompting"
You can formulate your own hypotheses by changing the `hypothesis_template` of the zero-shot pipeline. Similar to "prompt engineering" for LLMs, you can test different formulations of your `hypothesis_template` and verbalized classes to improve performance.
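Customizing the hypothesis template might look like this (the template wording and inputs are illustrative, not the card's recommended formulation):

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0-c",
)

text = "I am very disappointed with the customer service"
labels = ["positive", "negative"]

# the pipeline's default template is "This example is {}."; a more
# task-specific formulation, inserted for each candidate label via {},
# can improve performance, similar to prompt engineering for LLMs
output = classifier(
    text,
    labels,
    hypothesis_template="The sentiment of this customer review is {}.",
)
```

The same mechanism lets you verbalize classes differently, e.g. using "angry customer" instead of "negative", without retraining anything.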
multilingual-MiniLMv2-L12-mnli-xnli
ernie-m-large-mnli-xnli
ModernBERT-base-zeroshot-v2.0
This model is answerdotai/ModernBERT-base fine-tuned on the same dataset mix as the `zeroshot-v2.0` models in the Zeroshot Classifiers Collection.

General takeaways:
- The model is very fast and memory-efficient. It is multiple times faster and consumes multiple times less memory than DeBERTa-v3. The memory efficiency enables larger batch sizes. I got a ~2x speed increase by enabling bf16 (instead of fp16).
- It performs slightly worse than DeBERTa-v3 on average on the tasks tested below.
- I'm in the process of preparing a newer version trained on better synthetic data to make full use of the 8k context window and to update the training mix of the older `zeroshot-v2.0` models.

|Datasets|Mean|Mean w/o NLI|mnlim|mnlimm|fevernli|anlir1|anlir2|anlir3|wanli|lingnli|wellformedquery|rottentomatoes|amazonpolarity|imdb|yelpreviews|hatexplain|massive|banking77|emotiondair|emocontext|empathetic|agnews|yahootopics|biasframessex|biasframesoffensive|biasframesintent|financialphrasebank|appreviews|hateoffensive|trueteacher|spam|wikitoxictoxicaggregated|wikitoxicobscene|wikitoxicidentityhate|wikitoxicthreat|wikitoxicinsult|manifesto|capsotu|
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|Accuracy|0.831|0.835|0.932|0.936|0.884|0.763|0.647|0.657|0.823|0.889|0.753|0.864|0.949|0.935|0.974|0.798|0.788|0.727|0.789|0.793|0.489|0.893|0.717|0.927|0.851|0.859|0.907|0.952|0.926|0.726|0.978|0.912|0.914|0.93|0.951|0.906|0.476|0.708|
|F1 macro|0.813|0.818|0.925|0.93|0.872|0.74|0.61|0.611|0.81|0.874|0.751|0.864|0.949|0.935|0.974|0.751|0.738|0.746|0.733|0.798|0.475|0.893|0.712|0.919|0.851|0.859|0.892|0.952|0.847|0.721|0.966|0.912|0.914|0.93|0.942|0.906|0.329|0.637|
|Inference text/sec (A100 40GB GPU, batch=128)|3472.0|3474.0|2338.0|4416.0|2993.0|2959.0|2904.0|3003.0|4647.0|4486.0|5032.0|4354.0|2466.0|1140.0|1582.0|4392.0|5446.0|5296.0|4904.0|4787.0|2251.0|4042.0|1884.0|4048.0|4032.0|4121.0|4275.0|3746.0|4485.0|1114.0|4322.0|2260.0|2274.0|2189.0|2085.0|2410.0|3933.0|4388.0|

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 128
- seed: 42
- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 2

Framework versions:
- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
deberta-v3-base-zeroshot-v2.0-c
xlm-v-base-mnli-xnli
DeBERTa-v3-base-mnli-fever-docnli-ling-2c
policy-distilbert-7d
xtremedistil-l6-h256-mnli-fever-anli-ling-binary
deberta-v3-large-mnli-fever-anli-ling-wanli-binary
deberta-v3-base-mnli-fever-anli-ling-wanli-binary
MiniLM-L6-mnli-fever-docnli-ling-2c
MiniLM-L6-mnli
parler-tts-mini-v1.1
Parler-TTS Mini v1.1 is a lightweight text-to-speech (TTS) model, trained on 45K hours of audio data, that can generate high-quality, natural-sounding speech with features that can be controlled using a simple text prompt (e.g. gender, background noise, speaking rate, pitch and reverberation).

Parler-TTS Mini v1.1 is the exact same model as Mini v1. It was trained on the same datasets and with the same training configuration. The only change is the use of a better prompt tokenizer. This tokenizer has a larger vocabulary and handles byte fallback, which simplifies multilingual training. It's based on the unsloth/llama-2-7b tokenizer. Thanks to the AI4Bharat team, who provided advice and assistance in improving tokenization.

Quick Index
- Installation
- Using a random voice
- Using a specific speaker
- Motivation
- Optimizing inference

Note: unlike previous versions of Parler-TTS, here we use two tokenizers - one for the prompt and one for the description.

Using Parler-TTS is as simple as "bonjour". Simply install the library once: Parler-TTS has been trained to generate speech with features that can be controlled with a simple text prompt, for example: To ensure speaker consistency across generations, this checkpoint was also trained on 34 speakers, characterized by name (e.g. Jon, Lea, Gary, Jenna, Mike, Laura). To take advantage of this, simply adapt your text description to specify which speaker to use: `Jon's voice is monotone yet slightly fast in delivery, with a very close recording that almost has no background noise.`

Tips:
- We've set up an inference guide to make generation faster. Think SDPA, torch.compile, batching and streaming!
- Include the term "very clear audio" to generate the highest-quality audio, and "very noisy audio" for high levels of background noise.
- Punctuation can be used to control the prosody of the generations, e.g.
use commas to add small breaks in speech.
- The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt.

Parler-TTS is a reproduction of work from the paper Natural language guidance of high-fidelity text-to-speech with synthetic annotations by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively. Unlike other TTS models, Parler-TTS is a fully open-source release. All of the datasets, pre-processing, training code and weights are released publicly under a permissive license, enabling the community to build on our work and develop their own powerful TTS models.

Parler-TTS was released alongside:
- The Parler-TTS repository - you can train and fine-tune your own version of the model.
- The Data-Speech repository - a suite of utility scripts designed to annotate speech datasets.
- The Parler-TTS organization - where you can find the annotated datasets as well as the future checkpoints.

If you found this repository useful, please consider citing this work and also the original Stability AI paper. This model is permissively licensed under the Apache 2.0 license.