# SamLowe

4 models

## roberta-base-go_emotions

```yaml
---
language: en
tags:
  - text-classification
  - pytorch
  - roberta
  - emotions
  - multi-class-classification
  - multi-label-classification
datasets:
  - go_emotions
license: mit
widget:
  - text: "I am not having a great day."
---
```

license: mit · 470,272 downloads · 627 likes
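The frontmatter tags this as a multi-label emotion classifier trained on go_emotions. A minimal usage sketch, assuming the standard `transformers` pipeline API (the parameter choice is mine, not stated in this listing):

```python
def load_emotion_classifier():
    """Multi-label emotion classification pipeline for roberta-base-go_emotions.

    Assumes `transformers` and a backend such as PyTorch are installed;
    the import is deferred so this sketch stands alone.
    """
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="SamLowe/roberta-base-go_emotions",
        top_k=None,  # multi-label: return a score for every emotion label
    )

# Usage (downloads the model on first call):
# clf = load_emotion_classifier()
# clf(["I am not having a great day."])
```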

## roberta-base-go_emotions-onnx

This model is the ONNX version of https://huggingface.co/SamLowe/roberta-base-go_emotions.

### Full precision ONNX version

`onnx/model.onnx` is the full precision ONNX version:

- it has identical accuracy/metrics to the original Transformers model, and the same model size (499 MB)
- it is faster in inference than normal Transformers, particularly for smaller batch sizes: in my tests, about 2x to 3x as fast for a batch size of 1 on an 8-core 11th-gen i7 CPU using ONNXRuntime

Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label:

- Accuracy: 0.474
- Precision: 0.575
- Recall: 0.396
- F1: 0.450

See the SamLowe/roberta-base-go_emotions model card for details of the increases possible by selecting label-specific thresholds to maximise F1 scores, or another metric.

### Quantized ONNX version

`onnx/model_quantized.onnx` is the int8 quantized version:

- it is one quarter the size (125 MB) of the full precision model above, but delivers almost all of its accuracy
- it is faster in inference than both the full precision ONNX above and the normal Transformers model: about 2x as fast as the full precision ONNX model for a batch size of 1 on an 8-core 11th-gen i7 CPU using ONNXRuntime, which makes it circa 5x as fast as the full precision normal Transformers model (on the same CPU, for a batch of 1)

Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label:

- Accuracy: 0.475
- Precision: 0.582
- Recall: 0.398
- F1: 0.447

Note how these metrics are almost identical to the full precision metrics above. Again, see the SamLowe/roberta-base-go_emotions model card for the increases possible through label-specific thresholds.

### Usage with Optimum

The Optimum library has equivalents (prefixed `ORT`) of the main Transformers classes, so these models can be used with the familiar constructs. The only extra property needed is `file_name` on model creation, which in the below example specifies the quantized (INT8) model.
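The example the card refers to is not reproduced in this extract. A sketch of what such usage could look like, assuming Optimum's `ORTModelForSequenceClassification` class and its `file_name` parameter (assumptions on my part, not confirmed by this extract):

```python
def load_quantized_classifier():
    """Build a text-classification pipeline backed by the INT8 ONNX model.

    Assumes `optimum[onnxruntime]` and `transformers` are installed;
    imports are deferred so this sketch stands alone.
    """
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    model_id = "SamLowe/roberta-base-go_emotions-onnx"
    model = ORTModelForSequenceClassification.from_pretrained(
        model_id, file_name="onnx/model_quantized.onnx"  # select the INT8 file
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return pipeline(
        "text-classification",
        model=model,
        tokenizer=tokenizer,
        top_k=None,  # multi-label: return scores for all labels
    )
```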
Alternatively, ONNXRuntime can be used directly:

- Tokenization can be done beforehand with the `tokenizers` library,
- the tokens are then fed into ONNXRuntime as the dict type it expects,
- and afterwards only the sigmoid postprocessing is needed on the model output (which comes as a numpy array) to produce the label scores.

Example notebook: showing usage, accuracy & performance
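The sigmoid postprocessing in the last step above can be sketched with plain numpy. The logits here are hypothetical stand-ins for real ONNXRuntime output, and the fixed 0.5 threshold matches the one used for the metrics quoted above:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    # Map raw logits to independent per-label probabilities (multi-label case).
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits, as they would come back from ONNXRuntime,
# for a batch of one input and three of the labels.
logits = np.array([[2.1, -1.3, 0.4]])

scores = sigmoid(logits)
preds = (scores >= 0.5).astype(int)  # fixed 0.5 threshold -> binary predictions
```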

license: mit · 51,381 downloads · 25 likes

## universal-sentence-encoder-multilingual-3-onnx

license: apache-2.0 · 0 downloads · 4 likes

## universal-sentence-encoder-large-5-onnx

license: apache-2.0 · 0 downloads · 2 likes