gte-multilingual-base
1.4M
329
8K
GPT-3 class
277M
76 languages
license:apache-2.0
by Alibaba-NLP
Embedding Model
OTHER
High
1.4M downloads
Battle-tested
Edge AI:
Mobile
Laptop
Server
1GB+ RAM
Quick Summary
Tags: mteb, sentence-transformers, transformers, multilingual, sentence-similarity, text-embeddings-inference
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 1GB+ RAM
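The 1GB+ RAM minimum is roughly in line with a back-of-the-envelope estimate of the memory needed just to hold the model weights. The sketch below assumes that the 277M figure listed near the top of this page is the parameter count (the page does not label it), and the helper name `weights_memory_gb` is made up for illustration. It ignores activations, the tokenizer, and framework overhead, so real usage will be higher.

```python
def weights_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: 277M is the parameter count; the page does not say so explicitly.
PARAMS = 277_000_000

for dtype_name, nbytes in (("float32", 4), ("float16", 2), ("int8", 1)):
    print(f"{dtype_name}: {weights_memory_gb(PARAMS, nbytes):.2f} GB")
```

Under that assumption, full-precision weights alone come to a little over 1 GB, while half precision roughly halves that.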
Code Examples
Get Dense Embeddings with Transformers
# Requires transformers>=4.36.0
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

input_texts = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
    "北京",  # "Beijing"
    "快排算法介绍"  # "Introduction to the quicksort algorithm"
]

model_name_or_path = 'Alibaba-NLP/gte-multilingual-base'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True)

# Tokenize the input texts
batch_dict = tokenizer(input_texts, max_length=8192, padding=True, truncation=True, return_tensors='pt')
outputs = model(**batch_dict)

dimension = 768  # Output embedding dimension; should be in [128, 768]
# Take the first-token ([CLS]) vector of each text and keep its first `dimension` features
embeddings = outputs.last_hidden_state[:, 0, :dimension]
embeddings = F.normalize(embeddings, p=2, dim=1)

# Similarity of the first text to the other three, scaled by 100.
# The sample output listed after this code is on the unscaled cosine range (without the * 100 factor).
scores = (embeddings[:1] @ embeddings[1:].T) * 100
print(scores.tolist())
# [[0.3016996383666992, 0.7503870129585266, 0.3203084468841553]]
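The post-processing steps in the Transformers example (keep the first `dimension` features, L2-normalize, then take dot products) can be tried without downloading the model. The sketch below is a pure-Python illustration that uses made-up 4-dimensional vectors in place of real [CLS] outputs. The function names `truncate_and_normalize` and `cosine_scores` are hypothetical and not part of any library.

```python
import math


def truncate_and_normalize(vectors, dimension):
    """Keep the first `dimension` features of each vector, then L2-normalize."""
    result = []
    for vec in vectors:
        head = vec[:dimension]
        norm = math.sqrt(sum(x * x for x in head))
        result.append([x / norm for x in head])
    return result


def cosine_scores(query, candidates):
    """Dot products of a unit-length query with unit-length candidates."""
    return [sum(q * c for q, c in zip(query, cand)) for cand in candidates]


# Toy stand-ins for per-text [CLS] vectors (not real model outputs).
toy_vectors = [
    [3.0, 4.0, 1.0, 2.0],   # query
    [6.0, 8.0, -5.0, 0.0],  # parallel to the query in the first two features
    [4.0, -3.0, 2.0, 2.0],  # orthogonal to the query in the first two features
]

embeddings = truncate_and_normalize(toy_vectors, dimension=2)
scores = cosine_scores(embeddings[0], embeddings[1:])
print([round(s, 6) for s in scores])
```

Because truncation happens along the feature axis of each vector, every text keeps its own embedding and only the trailing features are dropped. After normalization the dot product equals the cosine similarity, so the parallel candidate scores close to 1 and the orthogonal one close to 0.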