BindPred: Gradient Boosted Trees on ESM2 Embeddings
Model Overview
The BindPred model is a Gradient Boosted Trees (GBT) regressor trained on embeddings from Meta’s ESM2 protein language model. It is designed for binding affinity prediction tasks.
Pretrained Colab Notebook: https://colab.research.google.com/drive/1ndzICxVBUUBHffmi0KDtUXaKaMtqTz55
Predicts binding affinity between ACE2 (human and animal) and RBD proteins.
General-purpose GBT model trained on ESM2 embeddings.
• Architecture: Gradient Boosted Trees (CatBoostRegressor)
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id="hbp5181/BindPred", filename="ESM2BindPred.cbm")
• Feature Extraction: ESM2 embeddings (33-layer transformer, 650M params)
ACE2 RBD: https://github.com/jbloomlab/SARSr-CoV_homolog_survey
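The pieces above (a CatBoost checkpoint on the Hub, ESM2 embeddings as features) could be wired together roughly as follows. This is a minimal sketch, not the authors' pipeline: the card does not say how per-residue embeddings are turned into model inputs, so mean pooling and the concatenated ACE2 + RBD layout are assumptions, and `mean_pool`, `build_features`, and `load_bindpred` are hypothetical helper names. The checkpoint file name is copied from the card as written. The Colab notebook is the place to check the actual feature layout.

```python
from statistics import fmean

# Hidden size of the 33-layer, 650M-parameter ESM2 checkpoint.
ESM2_650M_DIM = 1280


def mean_pool(residue_embeddings):
    """Average per-residue vectors (L x D) into one fixed-length vector (D).

    Assumption: the card does not state which pooling BindPred uses.
    """
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    return [fmean(column) for column in zip(*residue_embeddings)]


def build_features(ace2_residues, rbd_residues):
    """Hypothetical layout: pooled ACE2 vector followed by pooled RBD vector."""
    return mean_pool(ace2_residues) + mean_pool(rbd_residues)


def load_bindpred(repo_id="hbp5181/BindPred", filename="ESM2BindPred.cbm"):
    """Download the checkpoint from the Hub and load it into CatBoost.

    Imports are deferred so the pure-Python helpers above work even when
    catboost and huggingface_hub are not installed.
    """
    from catboost import CatBoostRegressor
    from huggingface_hub import hf_hub_download

    model = CatBoostRegressor()
    model.load_model(hf_hub_download(repo_id=repo_id, filename=filename))
    return model


# Toy check with 2-dimensional "embeddings" instead of real 1280-dim ones.
toy_ace2 = [[1.0, 2.0], [3.0, 4.0]]  # two residues
toy_rbd = [[0.0, 10.0]]              # one residue
features = build_features(toy_ace2, toy_rbd)  # [2.0, 3.0, 0.0, 10.0]
```

With real ESM2 embeddings in place of the toy lists, a prediction would look like `load_bindpred().predict([features])`, provided the feature layout matches the one used in training.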
• The model is trained on ESM2 embeddings and is limited by the quality of those embeddings.
• Performance depends on the training dataset used.
• The predictor itself is not a deep-learning model; it applies GBTs to ESM2 embeddings for fast, interpretable predictions.