colbert-xm
18.5K
67
80 languages
license:mit
by
antoinelouis
Embedding Model
OTHER
Fair
18K downloads
Community-tested
Quick Summary
This is a ColBERT model that can be used for semantic search in many languages.
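To make "semantic search with a ColBERT model" concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that ColBERT-style retrievers use: each query token vector is matched to its most similar document token vector, and those maxima are summed. It is a pure-Python illustration, not the ColBERT library's API. The function names `normalize`, `maxsim_score`, and `rank` are hypothetical, and the small hand-written vectors stand in for the token embeddings the model would produce.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best match in the document."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    score = 0.0
    for qv in q:
        # Best cosine similarity between this query token and any document token.
        score += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return score


def rank(query_vecs, docs):
    """Return (doc_id, score) pairs sorted from most to least relevant."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for token embeddings (assumed values, not model output).
toy_query = [[1.0, 0.0], [0.0, 1.0]]
toy_docs = {
    "a": [[1.0, 0.0], [0.0, 1.0]],  # covers both query tokens
    "b": [[1.0, 0.0], [1.0, 0.0]],  # covers only the first query token
}

for doc_id, score in rank(toy_query, toy_docs):
    print(doc_id, round(score, 3))
```

In this toy setup, document `a` ranks above `b` because it has a close match for every query token, which is the behaviour the per-token maximum is meant to reward.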
Code Examples
Usage (bash)
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torchtorch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9Usagebash
pip install git+https://github.com/stanford-futuredata/ColBERT.git@main torch==2.1.2 faiss-gpu==1.7.2 langdetect==1.0.9
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig

n_gpu: int = 1  # Set your number of available GPUs
experiment: str = "colbert"  # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index"  # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."]  # Corpus

# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
    indexer.index(name=index_name, collection=documents)

# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    searcher = CustomSearcher(index=index_name)  # You don't need to specify checkpoint again, the model name is stored in the index.
    results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
    # results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)
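To make the returned triples easier to read, the sketch below maps each one back to its passage text. It assumes `results` has the shape described in the comment above, i.e. an iterable of `(passage_id, passage_rank, passage_score)` triples, and that `passage_id` is the position of the passage in the `documents` list. The helper name `format_results` and the example scores are hypothetical and are not produced by the model. If your searcher instead returns three parallel sequences (ids, ranks, scores), `zip(*results)` should give the same triples.

```python
from typing import Iterable, List, Sequence, Tuple


def format_results(
    results: Iterable[Tuple[int, int, float]],
    documents: Sequence[str],
) -> List[str]:
    """Render (passage_id, passage_rank, passage_score) triples as readable lines."""
    lines = []
    for passage_id, passage_rank, passage_score in results:
        # Assumption: passage_id indexes directly into the indexed corpus list.
        lines.append(f"[{passage_rank}] {passage_score:.2f} {documents[passage_id]}")
    return lines


documents = ["Ceci est un premier document.", "Voici un second document.", "etc."]

# Hand-written placeholder triples for illustration only (not real model scores).
example_results = ((1, 1, 21.5), (0, 2, 19.25))

for line in format_results(example_results, documents):
    print(line)
```

In a real run, you would pass the `results` returned by `searcher.search(...)` in place of `example_results`.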
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
    indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
    indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
    searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
    results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
    # results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)
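The search step returns ids, ranks and scores rather than passage text, so a small post-processing step is usually needed to show readable hits. The following is a minimal sketch, not part of the model's own code: `format_hits` is a hypothetical helper, the hit values are made up for illustration, and it assumes that each `passage_id` is the position of the passage in the `documents` list and that hits arrive as `(passage_id, passage_rank, passage_score)` triples as described in the comment above.

```python
from typing import Iterable, List, Tuple

# One hit, in the shape described by the usage comment (assumed, not verified here).
Hit = Tuple[int, int, float]  # (passage_id, passage_rank, passage_score)


def format_hits(hits: Iterable[Hit], documents: List[str]) -> List[str]:
    """Render each hit as 'rank. [score] passage text', ordered by rank."""
    lines = []
    for passage_id, passage_rank, passage_score in sorted(hits, key=lambda h: h[1]):
        # Assumption: passage_id indexes directly into the indexed corpus list.
        lines.append(f"{passage_rank}. [{passage_score:.2f}] {documents[passage_id]}")
    return lines


# Hypothetical values standing in for the output of searcher.search(...).
documents = ["Ceci est un premier document.", "Voici un second document.", "etc."]
example_hits = [(1, 1, 21.5), (0, 2, 18.25), (2, 3, 3.0)]
for line in format_hits(example_hits, documents):
    print(line)
```

If your installed version returns three parallel sequences (ids, ranks, scores) instead of a sequence of triples, `zip(*results)` yields the same triples and can be passed to the helper unchanged.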
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly.
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
    indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
    results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)
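The search step returns identifiers rather than text, so a small helper is usually needed to show the retrieved passages. The following is a minimal sketch under two assumptions that the page does not confirm: that `results` has the shape described in the comment above (one `(passage_id, passage_rank, passage_score)` triple per hit), and that `passage_id` is the position of the passage in the `documents` list. The helper name `attach_text` and the sample triples are hypothetical and only illustrate the shape; they are not real model output.

```python
from typing import Iterable, List, Sequence, Tuple

Hit = Tuple[int, int, float]  # (passage_id, passage_rank, passage_score)


def attach_text(results: Iterable[Hit], documents: Sequence[str]) -> List[Tuple[int, float, str]]:
    """Pair each hit with its passage text and order the hits by rank.

    Assumes passage_id indexes into `documents` (an assumption, see above).
    """
    rows = [(rank, score, documents[passage_id]) for passage_id, rank, score in results]
    return sorted(rows, key=lambda row: row[0])


documents = ["Ceci est un premier document.", "Voici un second document.", "etc."]

# Made-up triples for illustration only, not real search output.
example_results = ((1, 1, 21.5), (0, 2, 18.0))

for rank, score, text in attach_text(example_results, documents):
    print(f"{rank}. [{score:.1f}] {text}")
```

If the identifiers in your index do not match list positions, replace the `documents[passage_id]` lookup with whatever mapping your collection uses.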
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig

n_gpu: int = 1  # Set your number of available GPUs
experiment: str = "colbert"  # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index"  # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."]  # Corpus

# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
    indexer.index(name=index_name, collection=documents)

# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu, experiment=experiment)):
    searcher = CustomSearcher(index=index_name)  # You don't need to specify the checkpoint again; the model name is stored in the index.
    results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)
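The search results only carry identifiers, ranks, and scores, so a common next step is to map them back to the passage text. The following is a minimal sketch, not part of the model card: it assumes `results` has the shape described in the comment above, and that each `passage_id` is the position of the passage in the `documents` list. The `Hit` type, the `attach_passages` helper, and the mock scores are hypothetical and used for illustration only.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Hit(NamedTuple):
    """One retrieved passage (hypothetical helper type)."""
    passage_id: int
    rank: int
    score: float
    text: str


def attach_passages(
    results: Sequence[Tuple[int, int, float]],
    documents: Sequence[str],
) -> List[Hit]:
    """Pair each (passage_id, passage_rank, passage_score) triple with its text.

    Assumes passage_id is the index of the passage in `documents`.
    """
    hits = [Hit(pid, rank, score, documents[pid]) for pid, rank, score in results]
    # Sort by rank so the best match comes first, whatever the input order.
    return sorted(hits, key=lambda hit: hit.rank)


documents = ["Ceci est un premier document.", "Voici un second document.", "etc."]

# Mock results standing in for the output of searcher.search(...); the scores are made up.
mock_results = ((0, 2, 18.2), (1, 1, 21.5))

hits = attach_passages(mock_results, documents)
for hit in hits:
    print(f"{hit.rank}. [{hit.score:.1f}] {hit.text}")
```

If the searcher returns a different structure in your installed version, only the unpacking inside `attach_passages` should need to change.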
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython
# Use of custom modules that automatically detect the language of the passages to index and activate the language-specific adapters accordingly
from .custom import CustomIndexer, CustomSearcher
from colbert.infra import Run, RunConfig
n_gpu: int = 1 # Set your number of available GPUs
experiment: str = "colbert" # Name of the folder where the logs and created indices will be stored
index_name: str = "my_index" # The name of your index, i.e. the name of your vector database
documents: list = ["Ceci est un premier document.", "Voici un second document.", "etc."] # Corpus
# Step 1: Indexing. This step encodes all passages into matrices, stores them on disk, and builds data structures for efficient search.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
indexer = CustomIndexer(checkpoint="antoinelouis/colbert-xm")
indexer.index(name=index_name, collection=documents)
# Step 2: Searching. Given the model and index, you can issue queries over the collection to retrieve the top-k passages for each query.
with Run().context(RunConfig(nranks=n_gpu,experiment=experiment)):
searcher = CustomSearcher(index=index_name) # You don't need to specify checkpoint again, the model name is stored in the index.
results = searcher.search(query="Comment effectuer une recherche avec ColBERT ?", k=10)
# results: tuple of tuples of length k containing ((passage_id, passage_rank, passage_score), ...)Use of custom modules that automatically detect the language of the passages to index and activate tpython