DermLIP_ViT-B-16
DermLIP is a vision-language model for dermatology, trained on the Derm1M dataset, the largest dermatological image-text corpus to date.

- Model Type: Pretrained Vision-Language Model (CLIP-style)
- ...
DermLIP PanDerm Base W PubMed 256
DermLIP is a vision-language model for dermatology, trained on the Derm1M dataset, the largest dermatological image-text corpus to date. This variant (`PanDerm-base-w-PubMed-256`) uses domain-specific pretraining for both encoders and delivers superior performance compared to the other DermLIP variants.

- Model Type: Pretrained Vision-Language Model (CLIP-style)
- Vision encoder (PanDerm-base): https://github.com/SiyuanYan1/PanDerm
- Text encoder (PubMedBERT-256): https://huggingface.co/NeuML/pubmedbert-base-embeddings
- Training data: 403,563 skin image-text pairs from the Derm1M dataset, covering both dermoscopic and clinical images
- Training objective: image-text contrastive loss
- Hardware: 1 x NVIDIA H200 (~90 GB memory usage)
- Hours used: ~9.5

Intended uses:

- Zero-shot classification (see the sketch below)
- Few-shot learning
- Cross-modal retrieval
- Concept annotation/explanation

To get started, install the package following the instructions in the repository. For any additional questions or comments, contact Siyuan Yan (`[email protected]`).
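As an illustration of the zero-shot classification use case, below is a minimal sketch assuming the model can be loaded through `open_clip` from the Hugging Face Hub. The hub id `hf-hub:redlessone/DermLIP_ViT-B-16`, the prompt template, the label list, and the file `lesion.jpg` are assumptions for illustration only; the PanDerm-based variant may require the custom loading code from the repository instead.

```python
# Hedged sketch: zero-shot skin-condition classification with a CLIP-style model.
# The hub id, labels, prompt template, and image path below are assumptions;
# consult the repository for the officially supported loading path.
import torch
import open_clip
from PIL import Image

HUB_ID = "hf-hub:redlessone/DermLIP_ViT-B-16"  # assumed hub id
model, _, preprocess = open_clip.create_model_and_transforms(HUB_ID)
tokenizer = open_clip.get_tokenizer(HUB_ID)
model.eval()

labels = ["melanoma", "basal cell carcinoma", "benign nevus"]  # example classes
image = preprocess(Image.open("lesion.jpg")).unsqueeze(0)      # hypothetical input
text = tokenizer([f"a dermoscopic image of {c}" for c in labels])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    # Contrastively trained embeddings: cosine similarity ranks the labels.
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(dict(zip(labels, probs.squeeze(0).tolist())))
```

Cross-modal retrieval follows the same pattern: embed a text query once, then rank a gallery of precomputed image embeddings by cosine similarity (or vice versa for image-to-text retrieval).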