knowledgator

96 models

SMILES2IUPAC-canonical-base

SMILES2IUPAC-canonical-base was designed to accurately translate SMILES chemical representations into IUPAC names. It is based on the mT5 model, with optimizations that include separate tokenizers for the encoder and the decoder.

- Developed by: Knowledgator Engineering
- Model type: Encoder-decoder with attention mechanism
- Language(s) (NLP): SMILES, IUPAC (English)
- License: Apache License 2.0

Model Sources
- Paper: coming soon
- Demo: ChemicalConverters SMILES to IUPAC

Preferred IUPAC style
To choose the preferred IUPAC style, place a style token before your SMILES sequence.

| Style Token | Description |
|-------------|-------------|
| ` ` | The most widely known name of the substance; sometimes a mixture of traditional and systematic styles |
| ` ` | A fully systematic style without trivial names |
| ` ` | A style based on trivial names of the parts of the substance |

Validating SMILES-to-IUPAC translations
Translations can be validated by reverse translation back into SMILES and calculating the Tanimoto similarity of the two molecules' fingerprints. The higher the Tanimoto similarity, the higher the probability that the prediction is correct.

This model has limited accuracy on large molecules and currently does not support isomeric or isotopic SMILES. The model was trained on 100M SMILES-IUPAC pairs with lr=0.00001 and batch size 512 for 2 epochs.

| Model | Accuracy | BLEU-4 score | Size (MB) |
|-------------------------------------|---------|------------------|----------|
| SMILES2IUPAC-canonical-small | 75% | 0.93 | 23 |
| SMILES2IUPAC-canonical-base | 86.9% | 0.964 | 180 |
| STOUT V2.0* | 66.65% | 0.92 | 128 |
| STOUT V2.0 (according to our tests) | | 0.89 | 128 |

*According to the original paper: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
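The validation idea above — compare fingerprints of the original molecule and the molecule recovered by reverse translation — reduces to computing a Tanimoto coefficient. A minimal sketch, representing fingerprints as sets of "on" bit indices (real pipelines would typically derive these with a cheminformatics toolkit such as RDKit; the bit values below are made up for illustration):

```python
def tanimoto_similarity(fp1: set, fp2: set) -> float:
    """Tanimoto similarity between two molecular fingerprints,
    each represented as a set of 'on' bit indices."""
    if not fp1 and not fp2:
        return 1.0  # two empty fingerprints are trivially identical
    intersection = len(fp1 & fp2)
    return intersection / (len(fp1) + len(fp2) - intersection)

# Fingerprint of the input molecule vs. the one recovered by
# reverse (IUPAC -> SMILES) translation; values are illustrative.
original = {1, 4, 7, 9, 15}
roundtrip = {1, 4, 7, 9, 21}
score = tanimoto_similarity(original, roundtrip)
```

A score close to 1.0 suggests the round-trip preserved the molecular structure, so the IUPAC prediction was likely correct.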

license:apache-2.0
10,625
9

gliner-multitask-v1.0

license:apache-2.0
6,280
34

modern-gliner-bi-large-v1.0

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entities, and to Large Language Models (LLMs), which, despite their flexibility, are costly and too large for resource-constrained scenarios.

This version uses a bi-encoder architecture: the textual encoder is ModernBERT-large and the entity label encoder is a sentence transformer, BGE-base-en. This architecture brings several advantages over uni-encoder GLiNER:
- An unlimited number of entity types can be recognized at once;
- Faster inference when entity embeddings are precomputed;
- Better generalization to unseen entities.

Utilizing ModernBERT delivers up to 4x better efficiency compared to DeBERTa-based models and a context length of up to 8,192 tokens, while demonstrating comparable results. However, the bi-encoder architecture also has drawbacks, such as a lack of inter-label interactions, which makes it harder for the model to disambiguate semantically similar but contextually different entities.

Installation & Usage
Install or update the gliner package. You also need the latest version of transformers to use this model. Once you've installed the GLiNER library, you can import the GLiNER class, load this model using `GLiNER.from_pretrained`, and predict entities with `predict_entities`.
If you want to use flash attention or increase the sequence length, please check the following code. First, install the Flash Attention and Triton packages. If you have a large number of entities and want to pre-embed them, please refer to the following code snippet.

Below is a table with benchmarking results on various named entity recognition datasets:

| Dataset | Score |
|-------------------------|--------|
| ACE 2004 | 30.5% |
| ACE 2005 | 26.7% |
| AnatEM | 37.2% |
| Broad Tweet Corpus | 72.1% |
| CoNLL 2003 | 69.3% |
| FabNER | 22.0% |
| FindVehicle | 40.3% |
| GENIA_NER | 55.6% |
| HarveyNER | 16.1% |
| MultiNERD | 73.8% |
| Ontonotes | 39.2% |
| PolyglotNER | 49.1% |
| TweetNER7 | 39.6% |
| WikiANN en | 54.7% |
| WikiNeural | 83.7% |
| bc2gm | 53.7% |
| bc4chemd | 52.1% |
| bc5cdr | 67.0% |
| ncbi | 61.7% |
| Average | 49.7% |
| | |
| CrossNER_AI | 58.1% |
| CrossNER_literature | 60.0% |
| CrossNER_music | 73.0% |
| CrossNER_politics | 72.8% |
| CrossNER_science | 66.5% |
| mit-movie | 47.6% |
| mit-restaurant | 40.6% |
| Average (zero-shot benchmark) | 59.8% |

Connect with our community on Discord for news, support, and discussion about our models. Join Discord.
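The pre-embedding advantage mentioned above can be illustrated with a toy NumPy sketch: label embeddings are computed once offline, and at inference time every text span is scored against all labels with a single matrix product instead of re-encoding the labels. Dimensions and vectors below are made up; the real model produces these embeddings with ModernBERT-large and BGE-base-en.

```python
import numpy as np

rng = np.random.default_rng(0)

# Precomputed once, offline: one unit-norm embedding per entity label.
label_embeddings = rng.normal(size=(1000, 64))
label_embeddings /= np.linalg.norm(label_embeddings, axis=1, keepdims=True)

# Produced at inference time by the text encoder for one candidate span.
span_embedding = rng.normal(size=64)
span_embedding /= np.linalg.norm(span_embedding)

# Cosine scores against all 1000 labels in a single matrix product.
scores = label_embeddings @ span_embedding
best_label = int(np.argmax(scores))
```

This is why an effectively unlimited number of labels can be handled at once: adding labels only grows the precomputed matrix, not the per-text encoding cost.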

license:apache-2.0
5,424
59

comprehend_it-base

license:apache-2.0
3,339
86

gliclass-edge-v3.0

license:apache-2.0
2,067
16

gliclass-base-v3.0

license:apache-2.0
1,729
8

gliclass-large-v3.0

license:apache-2.0
1,139
6

gliner-pii-base-v1.0

license:apache-2.0
909
7

gliclass-small-v1.0

license:apache-2.0
900
2

Gliner Multitask Large V0.5

🚀 Meet the first multi-task prompt-tunable GLiNER model 🚀

GLiNER-Multitask is a model designed to extract various pieces of information from plain text based on a user-provided custom prompt. This versatile model leverages a bidirectional transformer encoder, similar to BERT, which ensures both high generalization and compute efficiency despite its compact size. The `gliner-multitask-large` variant achieves state-of-the-art performance on NER zero-shot benchmarks, demonstrating its robustness and flexibility. It excels not only in named entity recognition but also in various other information extraction tasks, making it a powerful tool for diverse natural language processing applications.

Supported tasks:
- Named Entity Recognition (NER): identifies and categorizes entities such as names, organizations, dates, and other specific items in the text.
- Relation Extraction: detects and classifies relationships between entities within the text.
- Summarization: extracts the most important sentences that summarize the input text, capturing the essential information.
- Sentiment Extraction: identifies parts of the text that signal a positive, negative, or neutral sentiment.
- Key-Phrase Extraction: identifies and extracts important phrases and keywords from the text.
- Question Answering: finds an answer in the text given a question.
- Open Information Extraction: extracts pieces of text given an open prompt from a user, for example, product description extraction.

Installation
To use this model, install the GLiNER Python library. Once you've installed it, you can import the GLiNER class, load this model using `GLiNER.from_pretrained`, and predict entities with `predict_entities`.
Constructing a relation extraction pipeline with utca
First, we import the necessary components of the library, initialize the predictor (a GLiNER model), and construct a pipeline that combines NER and relation extraction. To run the pipeline, we specify the entity types and the relations with their parameters. With the threshold parameters, you can control how much information you want to extract.

Our multitask model demonstrates performance on different zero-shot benchmarks comparable to models dedicated to the NER task (all labels were lowercased in this testing):

| Model | Dataset | Precision | Recall | F1 Score | F1 Score (Decimal) |
|------------------------------------|--------------------|-----------|--------|----------|--------------------|
| numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
| | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
| | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
| | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
| | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
| | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
| | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
| | Average | | | | 0.6196 |
| knowledgator/gliner-multitask-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
| | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
| | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
| | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
| | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
| | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
| | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
| | Average | | | | 0.6276 |
| urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
| | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
| | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
| | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
| | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
| | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
| | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
| | Average | | | | 0.5754 |
| EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
| | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
| | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
| | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
| | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
| | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
| | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
| | Average | | | | 0.5876 |

Connect with our community on Discord for news, support, and discussion about our models. Join Discord.

license:apache-2.0
692
134

SMILES2IUPAC-canonical-small

license:apache-2.0
628
7

Llama-encoder-1.0B

llama
567
3

Gliner Pii Large V1.0

license:apache-2.0
496
26

gliner-pii-edge-v1.0

license:apache-2.0
473
10

gliclass-modern-large-v3.0

license:apache-2.0
353
13

gliner-pii-small-v1.0

license:apache-2.0
302
4

gliclass-base-v2.0-rac-init

license:apache-2.0
260
10

gliclass-modern-base-v3.0

license:apache-2.0
232
3

gliner-x-large

license:apache-2.0
217
31

gliclass-large-v1.0

license:apache-2.0
151
5

gliner-poly-small-v1.0

license:apache-2.0
123
15

comprehend_it-multilingual-t5-base

license:apache-2.0
116
26

Qwen-encoder-0.5B

license:apache-2.0
101
9

IUPAC2SMILES-canonical-base

license:apache-2.0
95
6

gliclass-modern-base-v2.0-init

license:apache-2.0
93
24

t5-for-ie

license:apache-2.0
92
4

gliner-bi-edge-v2.0

license:apache-2.0
79
4

gliclass-base-v1.0-lw

license:apache-2.0
77
2

UTC DeBERTa Large V2

license:apache-2.0
76
24

gliner-linker-large-v1.0

license:apache-2.0
76
7

Qwen-encoder-1.5B

license:apache-2.0
69
2

gliner-relex-large-v1.0

license:apache-2.0
69
0

UTC-DeBERTa-small-v2

license:apache-2.0
67
1

gliclass-modern-large-v2.0

license:apache-2.0
64
3

gliclass-modern-large-v2.0-init

license:apache-2.0
63
8

gliner-x-small-v0.5

license:cc-by-nc-sa-4.0
63
4

flan-t5-small-for-classification

license:apache-2.0
63
0

flan-t5-large-for-classification

license:apache-2.0
59
1

gliner-relex-large-v0.5

license:apache-2.0
58
20

gliner-linker-base-v1.0

license:apache-2.0
55
5

gliclass-base-v1.0

license:apache-2.0
53
3

gliclass-qwen-1.5B-v1.0

license:apache-2.0
51
2

gliclass-large-v1.0-lw

license:apache-2.0
49
3

gliner-x-base

license:apache-2.0
45
8

gliclass-x-base

license:apache-2.0
42
5

gliner-bi-large-v1.0

license:apache-2.0
41
24

gliclass-large-v1.0-init

license:apache-2.0
35
14

gliner-linker-rerank-v1.0

license:apache-2.0
34
5

gliclass-base-v1.0-init

license:apache-2.0
33
2

Qwen2-0.5Bchp-test1

33
0

retrico-lm-2b-sft-gemma

31
0

gliner-decoder-large-v1.0

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type in a zero-shot manner. This architecture combines:
- An encoder for representing entity spans
- A decoder for generating label names

This hybrid approach enables new use cases such as entity linking and expands GLiNER's capabilities. By integrating large modern decoders, trained on vast datasets, GLiNER can leverage their richer knowledge capacity while maintaining competitive inference speed.

Key features:
- Open ontology: works when the label set is unknown
- Multi-label entity recognition: assigns multiple labels to a single entity
- Entity linking: handles large label sets via constrained generation
- Knowledge expansion: gains from large decoder models
- Efficient: minimal speed reduction on GPU compared to single-encoder GLiNER

Usage
If you need open-ontology entity extraction, use the tag `label` in the list of labels; please check the example below. If you need to run the model on many texts and/or set label constraints, please check the example below: you can limit the decoder to generate labels only from a predefined set. Two label trie implementations are available; for a faster, memory-efficient C++ version, install Cython. This can significantly improve performance and reduce memory usage, especially with millions of labels.
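The constrained-generation idea above relies on a label trie: at each decoding step, only tokens that keep the output inside the predefined label set are allowed. This is a conceptual pure-Python sketch, not the library's actual implementation, and it uses whitespace splitting as a stand-in tokenizer:

```python
class LabelTrie:
    """Toy label trie for constrained decoding: restricts the decoder
    to token sequences that spell out a label from a predefined set."""

    _END = None  # sentinel key marking a complete label

    def __init__(self, labels):
        self.root = {}
        for label in labels:
            node = self.root
            for token in label.split():  # stand-in for a real tokenizer
                node = node.setdefault(token, {})
            node[self._END] = {}

    def allowed_next(self, prefix_tokens):
        """Tokens the decoder may generate after `prefix_tokens`."""
        node = self.root
        for token in prefix_tokens:
            if token not in node:
                return []  # prefix has left the label set entirely
            node = node[token]
        return [t for t in node if t is not self._END]


trie = LabelTrie(["chemical compound", "chemical element", "person"])
```

At each step the decoder's logits would be masked so that only `allowed_next(...)` tokens can be sampled; with millions of labels, the trie keeps this lookup fast, which is where the optional C++/Cython version pays off.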

license:apache-2.0
30
17

modern-gliner-bi-base-v1.0

license:apache-2.0
25
26

gliner-bi-small-v1.0

license:apache-2.0
22
10

gliner-bi-small-v2.0

license:apache-2.0
19
3

retrico-lm-gemma4-grpo-2b

19
0

gliner-x-small

license:apache-2.0
18
15

gliner-decoder-small-v1.0

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type in a zero-shot manner. This architecture combines:
- An encoder for representing entity spans
- A decoder for generating label names

This hybrid approach enables new use cases such as entity linking and expands GLiNER's capabilities. By integrating large modern decoders, trained on vast datasets, GLiNER can leverage their richer knowledge capacity while maintaining competitive inference speed.

Key features:
- Open ontology: works when the label set is unknown
- Multi-label entity recognition: assigns multiple labels to a single entity
- Entity linking: handles large label sets via constrained generation
- Knowledge expansion: gains from large decoder models
- Efficient: minimal speed reduction on GPU compared to single-encoder GLiNER

Usage
If you need open-ontology entity extraction, use the tag `label` in the list of labels; please check the example below. If you need to run the model on many texts and/or set label constraints, please check the example below: you can limit the decoder to generate labels only from a predefined set. Two label trie implementations are available; for a faster, memory-efficient C++ version, install Cython. This can significantly improve performance and reduce memory usage, especially with millions of labels.

license:apache-2.0
17
4

gliclass-small-v1.0-init

license:apache-2.0
16
5

gliner-x-base-v0.5

license:cc-by-nc-sa-4.0
14
3

gliclass-qwen-0.5B-v1.0

license:apache-2.0
14
1

UTC-DeBERTa-base-v2

license:apache-2.0
12
0

SMILES-DeBERTa-small

license:apache-2.0
11
3

gliner-llama-1.3B-v1.0

license:apache-2.0
10
1

gliner-qwen-0.5B-v1.0

license:apache-2.0
9
2

gliclass_msmarco_merged

9
0

gliner-qwen-1.5B-v1.0

license:apache-2.0
8
5

gliclass-modern-base-v2.0

license:apache-2.0
8
2

gliclass-llama-1.3B-v1.0

license:apache-2.0
8
1

gliclass-small-v1.0-lw

license:apache-2.0
8
0

gliner-decoder-base-v1.0

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type in a zero-shot manner. This architecture combines:
- An encoder for representing entity spans
- A decoder for generating label names

This hybrid approach enables new use cases such as entity linking and expands GLiNER's capabilities. By integrating large modern decoders, trained on vast datasets, GLiNER can leverage their richer knowledge capacity while maintaining competitive inference speed.

Key features:
- Open ontology: works when the label set is unknown
- Multi-label entity recognition: assigns multiple labels to a single entity
- Entity linking: handles large label sets via constrained generation
- Knowledge expansion: gains from large decoder models
- Efficient: minimal speed reduction on GPU compared to single-encoder GLiNER

Usage
If you need open-ontology entity extraction, use the tag `label` in the list of labels; please check the example below. If you need to run the model on many texts and/or set label constraints, please check the example below: you can limit the decoder to generate labels only from a predefined set. Two label trie implementations are available; for a faster, memory-efficient C++ version, install Cython. This can significantly improve performance and reduce memory usage, especially with millions of labels.

license:apache-2.0
7
11

SMILES-DeBERTa-base

license:apache-2.0
7
4

gliner-bi-llama-v1.0

license:apache-2.0
7
0

UTC-DeBERTa-large

license:apache-2.0
6
14

UTC-DeBERTa-small

license:apache-2.0
6
12

UTC-DeBERTA-base

license:apache-2.0
6
8

gliner-llama-1B-v1.0

license:apache-2.0
6
6

IUPAC2SMILES-canonical-small

license:apache-2.0
6
5

gliner-bi-base-v1.0

license:apache-2.0
6
4

SMILES-DeBERTa-large

license:apache-2.0
6
3

Sheared-LLaMA-encoder-1.3B

llama
5
2

gliner-x-large-v0.5

license:cc-by-nc-sa-4.0
4
9

flan-t5-base-for-classification

license:apache-2.0
3
2

UTC-T5-large

license:apache-2.0
2
5

gliner-llama-multitask-1B-v1.0

license:apache-2.0
1
1

SMILES2IUPAC-isomeric-small

license:apache-2.0
1
0

UTC-DeBERTA-large-fusing

license:apache-2.0
1
0

Qwen2-0.5Bchp-570k

1
0

Qwen2-0.5Bchp-690-updated-MultiBio-1

1
0

gliclass-bi-fused-small

1
0

SMILES-FAST-TOKENIZER

0
3

IUPAC-FAST-TOKENIZER

0
3

gliner-poly-base-v1.0

license:apache-2.0
0
3

gliclass-instruct-edge-v1.0

license:apache-2.0
0
1

gliclass-instruct-large-v1.0

license:apache-2.0
0
1

gliclass-instruct-base-v1.0

license:apache-2.0
0
1