9pinus
Macbert Base Chinese Medicine Recognition
Model description This model is a fine-tuned version of bert-base-chinese for the purpose of medicine name recognition. We fine-tuned bert-base-chinese on a 500M dataset including 100K+ authorized medical articles on which we labeled all the medicine names. The model achieves 92% accuracy on our test dataset. - Transformers 4.15.0 - Pytorch 1.10.1+cu113 - Datasets 1.17.0 - Tokenizers 0.10.3
Macbert Base Chinese Medical Collation
This model is a fine-tuned version of macbert for the purpose of spell checking in medical application scenarios. We fine-tuned macbert Chinese base version on a 300M dataset including 60K+ authorized medical articles. We proposed to randomly confuse 30% sentences of these articles by adding noise with a either visually or phonologically resembled characters. Consequently, the fine-tuned model can achieve 96% accuracy on our test dataset. You can use this model directly with a pipeline for token classification: - Transformers 4.15.0 - Pytorch 1.10.1+cu113 - Datasets 1.17.0 - Tokenizers 0.10.3