# X-ALMA
X-ALMA builds upon ALMA-R by expanding support from 6 to 50 languages. It uses a plug-and-play architecture with language-specific modules, complemented by a carefully designed training recipe. This release includes the complete X-ALMA model, which contains the X-ALMA pre-trained base model and all of its language-specific modules.

X-ALMA supports 50 languages: en, da, nl, de, is, no, sv, af, ca, ro, gl, it, pt, es, bg, mk, sr, uk, ru, id, ms, th, vi, mg, fr, hu, el, cs, pl, lt, lv, ka, zh, ja, ko, fi, et, gu, hi, mr, ne, ur, az, kk, ky, tr, uz, ar, he, fa — ensuring high translation performance regardless of their resource level.

All X-ALMA checkpoints are released on Hugging Face:

| Models | Model Link | Description |
|:-------------:|:---------------:|:---------------:|
| X-ALMA | haoranxu/X-ALMA | X-ALMA model with all its modules |
| X-ALMA-13B-Pretrain | haoranxu/X-ALMA-13B-Pretrain | X-ALMA 13B multilingual pre-trained base model |
| X-ALMA-Group1 | haoranxu/X-ALMA-13B-Group1 | X-ALMA group1-specific module and the merged model |
| X-ALMA-Group2 | haoranxu/X-ALMA-13B-Group2 | X-ALMA group2-specific module and the merged model |
| X-ALMA-Group3 | haoranxu/X-ALMA-13B-Group3 | X-ALMA group3-specific module and the merged model |
| X-ALMA-Group4 | haoranxu/X-ALMA-13B-Group4 | X-ALMA group4-specific module and the merged model |
| X-ALMA-Group5 | haoranxu/X-ALMA-13B-Group5 | X-ALMA group5-specific module and the merged model |
| X-ALMA-Group6 | haoranxu/X-ALMA-13B-Group6 | X-ALMA group6-specific module and the merged model |
| X-ALMA-Group7 | haoranxu/X-ALMA-13B-Group7 | X-ALMA group7-specific module and the merged model |
| X-ALMA-Group8 | haoranxu/X-ALMA-13B-Group8 | X-ALMA group8-specific module and the merged model |

A quick start: there are three ways to load X-ALMA for translation. The examples below translate "我爱机器翻译。" ("I love machine translation.") into English. (X-ALMA should also be able to do multilingual open-ended QA.)
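Each non-English language is served by exactly one of the eight group modules, so a first step in any loading path is resolving which group checkpoint a target language belongs to. The mapping below is an illustrative sketch only — consult the individual group model cards for the authoritative assignment:

```python
# Illustrative language-to-group mapping for resolving which X-ALMA group
# checkpoint serves a given language. The exact assignment is an assumption
# here; the authoritative mapping lives in the official group model cards.
GROUP2LANG = {
    1: ["da", "nl", "de", "is", "no", "sv", "af"],
    2: ["ca", "ro", "gl", "it", "pt", "es"],
    3: ["bg", "mk", "sr", "uk", "ru"],
    4: ["id", "ms", "th", "vi", "mg", "fr"],
    5: ["hu", "el", "cs", "pl", "lt", "lv"],
    6: ["ka", "zh", "ja", "ko", "fi", "et"],
    7: ["gu", "hi", "mr", "ne", "ur"],
    8: ["az", "kk", "ky", "tr", "uz", "ar", "he", "fa"],
}

# Invert the mapping to look up the group for a language code.
LANG2GROUP = {lang: group for group, langs in GROUP2LANG.items() for lang in langs}

def group_checkpoint(lang: str) -> str:
    """Return the Hugging Face repo id of the group checkpoint serving `lang`."""
    return f"haoranxu/X-ALMA-13B-Group{LANG2GROUP[lang]}"
```

Under this assumed grouping, `group_checkpoint("zh")` would resolve to `haoranxu/X-ALMA-13B-Group6`.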
1. **The first way (recommended):** load a merged group model, where the language-specific module has already been merged into the base model.
2. **The second way (recommended):** load the base model and then attach the language-specific module.
3. **The third way:** load the base model with all language-specific modules at once, MoE-style (requires large GPU memory).
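A minimal sketch of the first way, loading a merged group checkpoint with `transformers`. The prompt template and generation settings below are illustrative assumptions, not official settings from this model card:

```python
# Sketch of the first way: load a merged group checkpoint directly.
# Group 6 covers Chinese under the illustrative grouping; pick the group
# checkpoint that serves your language pair.
MERGED_REPO = "haoranxu/X-ALMA-13B-Group6"

def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """ALMA-style translation prompt (the exact template is an assumption)."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"

def translate_merged(text: str = "我爱机器翻译。") -> str:
    """Translate `text` from Chinese to English with the merged group model."""
    # Heavy dependencies are imported lazily so the sketch can be read and
    # type-checked without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        MERGED_REPO, torch_dtype=torch.float16, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(MERGED_REPO, padding_side="left")

    # Wrap the translation instruction in the model's chat template.
    chat = [{"role": "user", "content": build_prompt("Chinese", "English", text)}]
    prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        generated = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=40)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```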
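The second way can be sketched with `peft`, attaching one group's language-specific module to the pre-trained base. The repo ids come from the table above; treating the group checkpoints as PEFT adapters over the base model is an assumption of this sketch:

```python
# Sketch of the second way: base model + one language-specific module.
BASE_REPO = "haoranxu/X-ALMA-13B-Pretrain"

def module_repo(group: int) -> str:
    """Repo id of a group's language-specific module."""
    return f"haoranxu/X-ALMA-13B-Group{group}"

def load_base_plus_module(group: int):
    """Load the multilingual base model, then attach one group's module."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        BASE_REPO, torch_dtype=torch.float16, device_map="auto"
    )
    # Assumption: the group repos also ship the module as a PEFT adapter.
    model = PeftModel.from_pretrained(model, module_repo(group))
    tokenizer = AutoTokenizer.from_pretrained(BASE_REPO, padding_side="left")
    return model, tokenizer
```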
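For the third way, the complete `haoranxu/X-ALMA` repository bundles the base model with all eight modules. The sketch below assumes the repo ships custom modeling code that routes each request to the right module via a `lang` argument — verify the exact loading call and keyword against the model card before relying on it:

```python
# Sketch of the third way: one model holding all eight modules (MoE-like).
# Loading every language-specific module at once requires large GPU memory.
FULL_REPO = "haoranxu/X-ALMA"

def translate_full(text: str, lang: str = "zh") -> str:
    """Translate with the complete X-ALMA model (all modules loaded)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        FULL_REPO,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,  # assumption: custom per-language routing code
    )
    tokenizer = AutoTokenizer.from_pretrained(FULL_REPO, padding_side="left")

    chat = [{"role": "user",
             "content": f"Translate this from Chinese to English:\nChinese: {text}\nEnglish:"}]
    prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        # `lang` selecting the module is an assumed keyword; check the model card.
        generated = model.generate(input_ids=input_ids, max_new_tokens=40, lang=lang)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```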