mesolitica

192 models

llama2-embedding-1b-8k

Language: ms (Malay)

llama
202,579
2

wav2vec2-xls-r-300m-mixed

49,195
5

translation-t5-small-standard-bahasa-cased-v2

Trained on 1536 context length; able to translate Malay, pasar Malay (social-media texts or local context), English, Manglish, Javanese, Banjarese and Indonesian to the target language. It is also able to maintain the text structure as-is and translate only the necessary text, e.g., leaving programming code untouched. Added more coding translation data, noisy b.cari.com.my translation, noisy ChatGPT4 translation and heavy post-filtering.
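A minimal usage sketch with the standard transformers seq2seq API; the task prefix shown is an assumption, so check the model card for the exact prompt format:

```python
# Minimal sketch using the standard transformers seq2seq API.
# The task prefix ("terjemah ke Melayu: ") is an assumption; the model
# card documents the exact prompt format.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "mesolitica/translation-t5-small-standard-bahasa-cased-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("terjemah ke Melayu: How are you today?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```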

33,102
1

sentiment-analysis-nanot5-small-malaysian-cased

16,364
0

Malaysian-whisper-large-v3-turbo-v3

3,356
9

malay-parler-tts-mini-v1

1,285
1

Malaysian-Qwen2.5-7B-Reasoning-SFT

776
1

malaysian-whisper-small-v2

401
0

emotion-analysis-nanot5-small-malaysian-cased

379
0

bert-base-standard-bahasa-cased

355
0

llama2-embedding-2b-8k-contrastive

llama
319
2

malaysian-whisper-small-v3

315
2

mallam-3b-20k-instructions

264
0

Malaysian-Podcast-Dia-1.6B

Full parameter finetuning of nari-labs/Dia-1.6B on the Malaysian Podcast subset of mesolitica/Malaysian-Emilia, where the permutation for voice conversion only selects 80%-similar pairs. A complete tutorial on how to use it is at mesolitica/malaya-speech/Dia-TTS.

1. Finetuning done in FP32-BF16 mixed precision training.
2. Multipacking encoder-decoder.
3. WandB at https://wandb.ai/huseinzol05/dia-tts-malaysian-emilia-full-mixed-precision-podcast

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/dia-tts

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!

240
1

Malaysian-Llama-3.1-8B-Instruct

llama
199
0

nanot5-base-malaysian-translation-v2

148
1

Qwen2.5-72B-Instruct-FP8

This is FP8 Dynamic Quantization (A8W8) of https://huggingface.co/Qwen/Qwen2.5-72B-Instruct; we use it with vLLM==0.8.5.post1 and above.
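A minimal serving sketch, assuming vLLM >= 0.8.5.post1 as stated above:

```python
# Minimal offline-inference sketch with vLLM; assumes enough GPU memory
# for a 72B FP8 checkpoint (tensor parallelism may be needed in practice).
from vllm import LLM, SamplingParams

llm = LLM(model="mesolitica/Qwen2.5-72B-Instruct-FP8")
params = SamplingParams(max_tokens=128, temperature=0.7)
print(llm.generate(["Apa khabar?"], params)[0].outputs[0].text)
```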

124
0

finetune-dependency-t5-tiny-standard-bahasa-cased

114
0

Malaysian-TTS-4B-v0.1

Continued pretraining of Qwen/Qwen3-4B-Base on mesolitica/Malaysian-TTS-v2.

Features:
1. Uses DistilCodec as the speech detokenizer; output at 24k sample rate.
2. Optional controllable pitch and speed for each word.
3. Supports context switching between Malay and English.
4. Supports streamable text segments.
5. Supports `husein` and `idayu` speakers only.

Training:
1. Dataset purely synthetic, generated using mesolitica/Malaysian-Podcast-Dia-1.6B.
2. Multipacking with proper document masking on 4096 context length.
3. FP32-BF16 mixed precision training.
4. Full parameter finetuning.
5. WandB at https://wandb.ai/huseinzol05/Qwen-Qwen3-4B-Base-4k-TTS-distilcodec

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/qwen-tts

Special thanks to https://www.sns.com.my and Nvidia for 1x H100!

105
0

translation-t5-small-standard-bahasa-cased

103
0

mallam-1.1B-4096

101
10

pos-t5-small-standard-bahasa-cased

82
0

malaysian-whisper-medium

81
5

Malaysian-Llama-3.2-1B-Instruct

llama
67
0

ner-t5-small-standard-bahasa-cased

59
0

Malaysian-Dia-1.6B

Full parameter finetuning of nari-labs/Dia-1.6B on mesolitica/Malaysian-Emilia. A complete tutorial on how to use it is at mesolitica/malaya-speech/Dia-TTS.

1. Finetuning done in FP32-BF16 mixed precision training.
2. Multipacking encoder-decoder.
3. WandB at https://wandb.ai/huseinzol05/dia-tts-malaysian-emilia-full-mixed-precision-multipacking-v2

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/dia-tts

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!
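A minimal generation sketch, assuming the upstream nari-labs `dia` package API carries over to this finetune; the `[S1]` speaker-tag format and the 44.1k output rate come from the upstream README:

```python
# Sketch assuming the upstream nari-labs `dia` package API applies to
# this finetuned checkpoint; see the linked tutorial for the real flow.
import soundfile as sf
from dia.model import Dia

model = Dia.from_pretrained("mesolitica/Malaysian-Dia-1.6B")
audio = model.generate("[S1] Selamat pagi, apa khabar semua?")
sf.write("output.wav", audio, 44100)
```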

58
0

translation-t5-base-standard-bahasa-cased

51
0

jawi-nanot5-small-malaysian-cased

49
0

malaysian-whisper-tiny

47
1

VITS-female-singlish

46
0

nanot5-small-malaysian-translation-v2

44
1

mistral-embedding-191m-8k-contrastive

39
0

roberta-base-bahasa-cased

38
0

VITS-yasmin

37
0

roberta-tiny-bahasa-cased

37
0

nanot5-small-malaysian-cased

36
0

nanot5-small-malaysian-translation-v2.1

35
0

malaysian-whisper-base

33
3

nanot5-base-malaysian-translation-v2.1

33
0

MeloTTS-MS

MeloTTS continued training on MS (Malay), forked at https://github.com/malaysia-ai/MeloTTS-MS. We uploaded full checkpoints with optimizer states at checkpoints.

32
1

finetune-mnli-nanot5-small

27
1

mallam-3B-4096

25
2

sentiment-analysis-nanot5-tiny-malaysian-cased

22
0

embedding-malaysian-mistral-64M-32k

22
0

malaysian-llama2-7b-32k-instructions

llama
21
2

mallam-1.1b-20k-instructions-v2

20
0

finetune-mnli-t5-super-tiny-standard-bahasa-cased

18
0

VITS-osman

17
1

finetune-qa-t5-small-standard-bahasa-cased

17
0

nanot5-base-malaysian-cased

17
0

malaysian-debertav2-base

17
0

Malaysian-TTS-1.7B-v0.1

Continued pretraining of Qwen/Qwen3-1.7B-Base on mesolitica/Malaysian-TTS-v2.

Features:
1. Uses DistilCodec as the speech detokenizer; output at 24k sample rate.
2. Optional controllable pitch and speed for each word.
3. Supports context switching between Malay and English.
4. Supports streamable text segments.
5. Supports `husein` and `idayu` speakers only.

Training:
1. Dataset purely synthetic, generated using mesolitica/Malaysian-Podcast-Dia-1.6B.
2. Multipacking with proper document masking on 4096 context length.
3. FP32-BF16 mixed precision training.
4. Full parameter finetuning.
5. WandB at https://wandb.ai/huseinzol05/Qwen-Qwen3-1.7B-Base-4k-TTS-distilcodec

Sample outputs:
1. output-idayu-chunk.mp3
2. output-husein-chunk.mp3

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/qwen-tts

Special thanks to https://www.sns.com.my and Nvidia for 1x H100!

15
1

Malaysian-TTS-0.6B-v1

Continued pretraining of mesolitica/Malaysian-TTS-0.6B-v0.1 on a more consistent dataset.

Features:
1. Uses DistilCodec as the speech detokenizer; output at 24k sample rate.
2. Supports context switching between Malay and English.
3. Better pronunciation for letters.
4. Better repetition tolerance.

Speakers:
1. husein
2. idayu
3. singaporean
4. DisfluencySpeech
5. singlish-speaker2050
6. singlish-speaker2202
7. haqkiem, a private dataset.

Training:
1. Multipacking with proper document masking on 4096 context length.
2. FP32-BF16 mixed precision training.
3. Full parameter finetuning.
4. WandB at https://wandb.ai/huseinzol05/Malaysian-TTS-0.6B-v1

Sample outputs:
1. husein-0.6b.mp3
2. idayu-0.6b.mp3
3. singaporean-0.6b.mp3
4. DisfluencySpeech-0.6b.mp3
5. singlish-speaker2050-0.6b.mp3
6. singlish-speaker2202-0.6b.mp3
7. haqkiem-0.6b.mp3

Notes:
1. This model was trained on normalized text, so text such as `123` must be normalized first into `one two three`, `one hundred twenty three`, `satu dua tiga` or `seratus dua puluh tiga` (see the sketch after this entry). Feel free to use Malaya for normalization; Malaya supports Malay and English normalization, read more at https://github.com/mesolitica/malaya/issues/247#issuecomment-3030313021
2. The repetitive-pronunciation dataset does not consistently use commas for pauses. For example, `A, A, A, A, B, B` in our recordings is spoken as `A A A A B B`. We have no intention to improve this due to cost, but continued finetuning on a proper dataset should be able to solve it.

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/qwen-tts

Special thanks to https://www.sns.com.my and Nvidia for 1x H100!
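A toy normalization sketch for the requirement above. This is not Malaya's API (use the linked issue for that); it only illustrates the idea of spelling out digits before sending text to the TTS model:

```python
# Toy digit normalizer: replaces every run of digits with a
# digit-by-digit Malay reading, e.g. "123" -> "satu dua tiga".
# NOT Malaya's API; purely illustrative.
import re

MS_DIGITS = {
    "0": "kosong", "1": "satu", "2": "dua", "3": "tiga", "4": "empat",
    "5": "lima", "6": "enam", "7": "tujuh", "8": "lapan", "9": "sembilan",
}

def normalize_digits_ms(text: str) -> str:
    return re.sub(r"\d+", lambda m: " ".join(MS_DIGITS[d] for d in m.group()), text)

print(normalize_digits_ms("Saya ada 123 epal"))
# -> Saya ada satu dua tiga epal
```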

15
0

Malaysian-Qwen2.5-7B-Instruct

14
1

t5-tiny-standard-bahasa-cased

14
0

mallam-5B-4096

13
2

Malaysian-Qwen2.5-14B-Reasoning-GRPO

Online reinforcement learning using GRPO, full parameter, on top of the warmup reasoning SFT https://huggingface.co/mesolitica/Malaysian-Qwen2.5-14B-Reasoning-SFT, on a highly curated Malaysian Reasoning dataset.

1. Multitask reasoning; each datapoint is replicated into 4 generations.
2. Actual online reinforcement learning.

To get better performance, use the system prompt `You are going to enter reasoning mode. First, you try to think step-by-step in Malay. After that, put your final answer within $\\boxed{}$.` (see the sketch after this entry).

Finetuned on combine/combined-malaysian-reasoning.jsonl, the train set from mesolitica/Malaysian-Reasoning.

1. GRPO, full parameters.
2. WandB at https://wandb.ai/huseinzol05/fpf-Malaysian-Qwen2.5-14B-Reasoning-SFT-GRPO

Checkpoints:
1. Epoch 1.0, revision cc1032dfe961a56a3e33e36f03c37ed09b33c7fe
2. Epoch 2.0, revision 90896edeb1eb18cb48ac682ad606d4ec51172941

Source code at https://github.com/mesolitica/malaya/blob/master/session/qwen2.5/14b-grpo-fsdp.sh

All benchmarks are generated using vLLM; evaluation is based on sacrebleu CHRF max@5. Source code for dialect evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-dialect and for MalayMMLU evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-malaymmlu

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!
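A sketch of wiring in the recommended system prompt with the standard transformers chat-template API; any OpenAI-compatible client pointed at a vLLM server would pass the same `system` message:

```python
# Builds a prompt with the recommended reasoning system prompt from the
# model card, using the standard Qwen2.5 chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mesolitica/Malaysian-Qwen2.5-14B-Reasoning-GRPO")
messages = [
    {"role": "system", "content": (
        "You are going to enter reasoning mode. First, you try to think "
        "step-by-step in Malay. After that, put your final answer within "
        "$\\boxed{}$."
    )},
    {"role": "user", "content": "Berapakah 12 kali 13?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```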

13
1

pos-t5-tiny-standard-bahasa-cased

13
0

Malaysian-TTS-1.7B-v1

Continued pretraining of mesolitica/Malaysian-TTS-1.7B-v0.1 on a more consistent dataset.

Features:
1. Uses DistilCodec as the speech detokenizer; output at 24k sample rate.
2. Supports context switching between Malay and English.
3. Better pronunciation for letters.
4. Better repetition tolerance.

Speakers:
1. husein
2. idayu
3. singaporean
4. DisfluencySpeech
5. singlish-speaker2050
6. singlish-speaker2202
7. haqkiem, a private dataset.

Training:
1. Multipacking with proper document masking on 4096 context length.
2. FP32-BF16 mixed precision training.
3. Full parameter finetuning.
4. WandB at https://wandb.ai/huseinzol05/Malaysian-TTS-1.7B-v1

Sample outputs:
1. husein-v1.mp3
2. idayu-v1.mp3
3. singaporean-v1.mp3
4. DisfluencySpeech-v1.mp3
5. singlish-speaker2050-v1.mp3
6. singlish-speaker2202-v1.mp3
7. haqkiem-v1.mp3

Only `singlish-speaker2202` and `haqkiem` had to be generated twice to get output that follows the exact text input.

Notes:
1. This model was trained on normalized text, so text such as `123` must be normalized first into `one two three`, `one hundred twenty three`, `satu dua tiga` or `seratus dua puluh tiga`. Feel free to use Malaya for normalization; Malaya supports Malay and English normalization, read more at https://github.com/mesolitica/malaya/issues/247#issuecomment-3030313021
2. The repetitive-pronunciation dataset does not consistently use commas for pauses. For example, `A, A, A, A, B, B` in our recordings is spoken as `A A A A B B`. We have no intention to improve this due to cost, but continued finetuning on a proper dataset should be able to solve it.

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/qwen-tts

Special thanks to https://www.sns.com.my and Nvidia for 1x H100!

13
0

Malaysian-Qwen2.5-14B-Instruct

12
1

roberta-base-standard-bahasa-cased

12
0

Malaysian-Qwen2.5-3B-Instruct

12
0

malaysian-mistral-191M-4096

11
0

malaysian-mistral-7b-32k-instructions-v4

10
1

finetune-tatabahasa-t5-small-standard-bahasa-cased

10
0

malay-VITS-multispeaker

10
0

malaysian-mistral-7b-32k-instructions

9
3

VITS-multispeaker-clean

9
0

finetune-mnli-nanot5-base

9
0

Malaysian-TTS-0.6B-v0.1

9
0

malaysian-parler-tts-tiny-v1

8
0

gemma-3n-e4b-it-audio-encoder

7
2

malaysian-llama2-13b-32k-instructions

llama
7
0

malaysian-tinyllama-1.1b-16k-instructions

llama
7
0

malaysian-tinyllama-1.1b-16k-instructions-v2

llama
7
0

Malaysian-Qwen2.5-14B-Reasoning-SFT

Continued finetuning of https://huggingface.co/mesolitica/Malaysian-Qwen2.5-14B-Instruct on a highly curated Malaysian Reasoning dataset.

1. Reasoning on math, science, translation, dialects, multiple choice, coding and Maktabah Al Bakri.
2. Warmup reasoning.

Finetuned on mesolitica/Malaysian-Reasoning to improve the model's reasoning in a Malaysian context.

1. Full parameters on 12k context length.
2. WandB at https://wandb.ai/huseinzol05/fpf-qwen2.5-14b-malaysian-12k-reasoning

Source code at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5

All benchmarks are generated using vLLM; evaluation is based on sacrebleu CHRF max@5. Source code for dialect evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-dialect and for MalayMMLU evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-malaymmlu

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!

7
0

malaysian-mistral-7b-32k-instructions-v2

6
2

malaysian-distil-whisper-large-v3

6
2

Malaysian-Qwen2.5-7B-Speech-Instruct

Speech model on top of mesolitica/Malaysian-Qwen2.5-7B-Audio-Instruct. It is designed for voice-assistant general question answering: speech instructions and actual conversations related to coding, politics, chat assistance and general QA.

- We use a frozen Whisper Large V3 Encoder without any pooling, meaning 30 seconds of audio consumes 1500 tokens, i.e. 1 token equals 0.02 seconds.
- Projection, Embedding and LM Head layers are trained with full parameter finetuning.
- LoRA for the other linear layers with rank 64 and alpha 128.
- Training done with multipacking at 10240 context length.
- WandB at https://wandb.ai/huseinzol05/lora-embedding-64-audio-qwen2.5-7b-malaysian-10k-stage2
- Revision 513a900f40d372e8d7eb774e0561af043c704449

Datasets:
1. mesolitica/Malaysian-UltraChat-Speech-Multiturn-Instructions, 1 epoch.
2. mesolitica/Malaysian-Multiturn-Chat-Assistant, 1 epoch.
3. mesolitica/Malaysian-Speech-Instructions, 1 epoch.
4. mesolitica/Malaysian-Reasoning-Speech-Instructions, 1 epoch.
5. mesolitica/Malaysian-Speech-Description-Timestamp-Instructions, random sampling, 0.2 epoch.
6. mesolitica/Cantonese-Radio-Description-Instructions, random sampling, 0.2 epoch.
7. mesolitica/Emilia-Mandarin-Description-Instructions, random sampling, 0.2 epoch.
8. mesolitica/Malaysian-SFT/combined-malaysian-sft-5k-sample.jsonl, text corpus, 1 epoch.
9. mesolitica/Malaysian-Instructions/voiceassistant, text-only instructions, 1 epoch.
10. mesolitica/Malaysian-Instructions/mixedmanglish, text-only instructions, 1 epoch.
11. mesolitica/Malaysian-Instructions/manglish, text-only instructions, 1 epoch.
12. mesolitica/Malaysian-Instructions/longerrespond, text-only instructions, 1 epoch.

In total 3.14B tokens (including text-only instructions), or 9584.595 audio hours.

You can try more examples at https://github.com/mesolitica/malaya-speech/tree/master/speech/speech-instructions

We cover more examples, such as multi-turn RAG, forcing specific languages, voice-assistant mode, reasoning and longer responses, at https://github.com/mesolitica/malaya/wiki/Malaysian-Speech-Instruct

You can use this fork to serve the model in vLLM: https://github.com/mesolitica/vllm-llmaudio

Source code at https://github.com/mesolitica/malaya/tree/master/session/audiollm

6
2

gpt2-117m-bahasa-cased-v2

6
1

gpt2-117m-bahasa-cased

6
0

bert-tiny-standard-bahasa-cased

6
0

finetune-paraphrase-t5-tiny-standard-bahasa-cased

6
0

electra-base-generator-bahasa-cased

6
0

finetune-dependency-t5-small-standard-bahasa-cased

6
0

llama-1b-hf-32768-fpf

llama
6
0

llama2-embedding-1b-8k-contrastive

llama
6
0

malaysian-mistral-7b-32k-instructions-v3

6
0

conformer-tiny-ctc

6
0

t5-super-super-tiny-standard-bahasa-cased

5
1

VITS-female

5
1

malaysian-llama2-7b-32k-instructions-v2

llama
5
1

llama-3-8b-8192-hf

llama
5
1

VITS-haqkiem

5
0

finetune-paraphrase-t5-base-standard-bahasa-cased

5
0

finetune-summarization-t5-small-standard-bahasa-cased

5
0

finetune-true-case-t5-tiny-standard-bahasa-cased

5
0

nanot5-large-malaysian-cased

5
0

Malaysian-Llama-3.2-3B-Instruct

llama
5
0

Malaysian-Qwen2.5-1.5B-Reasoning-GRPO

Online reinforcement learning using GRPO, full parameter, on top of the warmup reasoning SFT https://huggingface.co/mesolitica/Malaysian-Qwen2.5-1.5B-Reasoning-SFT, on a highly curated Malaysian Reasoning dataset.

1. Multitask reasoning; each datapoint is replicated into 4 generations.
2. Actual online reinforcement learning.

To get better performance, use the system prompt `You are going to enter reasoning mode. First, you try to think step-by-step in Malay. After that, put your final answer within $\\boxed{}$.`

Finetuned on combine/combined-malaysian-reasoning.jsonl, the train set from mesolitica/Malaysian-Reasoning.

1. GRPO, full parameters.
2. WandB at https://wandb.ai/huseinzol05/fpf-Malaysian-Qwen2.5-1.5B-Reasoning-SFT-GRPO

Checkpoints:
1. Epoch 5.0, revision b4c3d2b391ff08141a0728c6f1868bffed313be6

Source code at https://github.com/mesolitica/malaya/blob/master/session/qwen2.5/1.5b-grpo-fsdp.sh

Source code for dialect evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-dialect and for MalayMMLU evaluation at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5/evaluate-malaymmlu

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!

5
0

Malaysian-orpheus-3b-0.1-ft

llama
4
2

translation-t5-tiny-standard-bahasa-cased

4
1

mistral-embedding-349m-8k-contrastive

4
1

electra-small-discriminator-bahasa-cased

4
0

malaysian-mistral-349M-4096

4
0

malaysian-tinyllama-1.1b-siglip-large-384-vision

4
0

malaysian-mistral-64M-4096

4
0

malaysian-whisper-large-v2

4
0

Malaysian-Qwen2.5-1.5B-Instruct-v0.1

4
0

Malaysian-Qwen2.5-32B-Instruct

4
0

Malaysian-Qwen2.5-72B-Instruct

4
0

finetune-whisper-base-ms-singlish-v2

3
1

llama-2b-hf-32768-fpf

llama
3
1

finetune-mnli-t5-small-standard-bahasa-cased

3
0

translation-nanot5-base-malaysian-cased

3
0

translation-nanot5-tiny-malaysian-cased

3
0

malaysian-mistral-474M-MLM-512

3
0

malaysian-whisper-medium-v2

3
0

Malaysian-Qwen2.5-1.5B-Instruct

3
0

Malaysian-Qwen2.5-1.5B-Reasoning-SFT

3
0

Malaysian-Qwen2.5-7B-Audio-Instruct

Audio model on top of mesolitica/Malaysian-Qwen2.5-7B-Instruct, for audio understanding; this introduces audio datasets to the LLM.

- We use a frozen Whisper Large V3 Encoder without any pooling, meaning 30 seconds of audio consumes 1500 tokens, i.e. 1 token equals 0.02 seconds.
- Projection, Embedding and LM Head layers are trained with full parameter finetuning.
- LoRA for the other linear layers with rank 64 and alpha 128.
- Training done with multipacking at 8192 context length.
- WandB at https://wandb.ai/huseinzol05/lora-embedding-64-audio-qwen2.5-7b-malaysian-8k

Datasets:
1. mesolitica/AudioSet-Audio-Instruction, 1 epoch.
2. mesolitica/Classification-Speech-Instructions, 1 epoch.
3. mesolitica/Animal-Sound-Instructions, 3 epochs.
4. mesolitica/Transcription-Instructions, 1 epoch.
5. mesolitica/Speaker-Diarization-Instructions, 4 epochs.
6. mesolitica/Speech-Translation-Instructions, 2 epochs.
7. mesolitica/CoVoST2-Instructions, 1 epoch.
8. mesolitica/MusicBench-Instructions, 2 epochs.
9. mesolitica/Sampling-Multitask-National-Speech-Corpus-v1, 1 epoch.
10. mesolitica/Malaysian-Speech-Description-Timestamp-Instructions, 1 epoch.
11. mesolitica/Cantonese-Radio-Description-Instructions, 1 epoch.
12. mesolitica/Emilia-Mandarin-Description-Instructions, 1 epoch.
13. mesolitica/Audio-Adversarial-Instructions, revision 4536d60ab09a190e7d12536811be404062d5d38c, 1 epoch.
14. mesolitica/Zeroshot-Audio-Classification-Instructions, revision 7d22438bdcd697af1ce4281228860c6b8663fb76, 1 epoch.

Because most of the dataset is about audio understanding, for end-to-end Speech-LLM chat instructions please use mesolitica/Malaysian-Qwen2.5-7B-Speech-Instruct.

You can use this fork to serve the model in vLLM: https://github.com/mesolitica/vllm-llmaudio

Source code at https://github.com/mesolitica/malaya/tree/master/session/audiollm
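A quick sanity check of the token accounting described above (1500 tokens per 30-second window works out to 50 tokens per second):

```python
# Audio token budget implied by a frozen Whisper Large V3 encoder with
# no pooling: 1500 tokens per 30 s window, i.e. 50 tokens/s or 0.02 s
# per token, matching the figures in the model card.
TOKENS_PER_WINDOW = 1500
SECONDS_PER_WINDOW = 30

def audio_tokens(duration_seconds: float) -> int:
    """Approximate encoder tokens consumed by a clip of this length."""
    return round(duration_seconds * TOKENS_PER_WINDOW / SECONDS_PER_WINDOW)

print(audio_tokens(30))    # 1500
print(audio_tokens(12.5))  # 625
```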

3
0

Malaysian-sesame-csm-1b

Full parameter finetuning of sesame/csm-1b on mesolitica/Malaysian-Emilia.

1. Finetuning done in FP32-BF16 mixed precision training.
2. Multipacking decoder.
3. WandB at https://wandb.ai/huseinzol05/sesame-1b-malaysian-emilia-full-mixed-precision

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/sesame-tts

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!

2
3

finetune-keyword-t5-base-standard-bahasa-cased

2
2

llama-7b-hf-2048-fpf

llama
2
2

malaysian-llama-3-8b-instruct-16k

llama
2
2

finetune-keyword-t5-small-standard-bahasa-cased

2
1

malaysian-parler-tts-mini-v1

Finetuned https://huggingface.co/parler-tts/parler-tts-mini-v1 on mesolitica/TTS. WandB at https://wandb.ai/huseinzol05/malaysian-parler-tts-mini-v1. Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/parler-tts
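A generation sketch assuming the upstream parler-tts package API carries over to this finetune; the description and prompt strings are illustrative:

```python
# Sketch assuming the upstream parler-tts API (ParlerTTSForConditionalGeneration)
# applies to this finetuned checkpoint. Description/prompt text is made up.
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

model_id = "mesolitica/malaysian-parler-tts-mini-v1"
model = ParlerTTSForConditionalGeneration.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

description = "A female speaker with a clear voice and moderate pace."
prompt = "Selamat pagi, apa khabar?"

input_ids = tokenizer(description, return_tensors="pt").input_ids
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids

generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
sf.write("out.wav", generation.cpu().numpy().squeeze(), model.config.sampling_rate)
```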

2
1

t5-small-standard-bahasa-cased

2
0

t5-super-tiny-bahasa-cased

2
0

finetune-paraphrase-t5-small-standard-bahasa-cased

2
0

gpt2-355m-bahasa-cased

2
0

finetune-whisper-base-ms-singlish

2
0

VITS-orkid

2
0

VITS-bunga

2
0

VITS-tuah

2
0

VITS-male

2
0

translation-nanot5-small-malaysian-cased

2
0

emotion-analysis-nanot5-tiny-malaysian-cased

2
0

ner-t5-tiny-standard-bahasa-cased

2
0

mistral-7b-4096-fpf

2
0

llama2-embedding-600m-8k-contrastive

llama
2
0

Malaysian-Llama-3.2-3B-Instruct-v0.2

llama
2
0

Malaysian-Qwen2.5-0.5B-Instruct

2
0

Malaysian-gemma-3-1b-it

2
0

Malaysian-Podcast-sesame-csm-1b

Full parameter finetuning of sesame/csm-1b on the Malaysian Podcast subset of mesolitica/Malaysian-Emilia, where the permutation for voice conversion only selects 80%-similar pairs.

1. Finetuning done in FP32-BF16 mixed precision training.
2. Multipacking decoder.
3. WandB at https://wandb.ai/huseinzol05/sesame-1b-malaysian-emilia-full-mixed-precision-podcast

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/sesame-tts

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!

2
0

Malaysian-Qwen2.5-7B-Dialect-Reasoning-GRPO

1
3

llama-13b-hf-2048-fpf

llama
1
2

Malaysian-Llama-3.2-3B-Instruct-v0.1

llama
1
2

finetune-extractive-qa-t5-base-standard-bahasa-cased

1
1

llama2-embedding-600m-8k

llama
1
1

mallam-5b-20k-instructions

1
1

mallam-5b-20k-instructions-v2

1
1

malaysian-llama-3-8b-262k

llama
1
1

t5-3x-super-tiny-standard-bahasa-cased

1
0

t5-small-bahasa-cased

1
0

finetune-isi-penting-generator-t5-base-standard-bahasa-cased

1
0

finetune-isi-penting-generator-t5-small-standard-bahasa-cased

1
0

electra-small-generator-bahasa-cased

1
0

wav2vec2-base-ms-singlish

1
0

finetune-qa-t5-base-standard-bahasa-cased

1
0

finetune-keyword-t5-tiny-standard-bahasa-cased

1
0

finetune-whisper-tiny-ms-singlish

1
0

finetune-whisper-tiny-ms-singlish-v2

1
0

VITS-jebat

1
0

nanot5-tiny-malaysian-cased

1
0

llama-7b-hf-32768-fpf

llama
1
0

llama-600m-hf-32768-fpf

llama
1
0

jawi-nanot5-tiny-malaysian-cased

1
0

constituency-parsing-t5-base-standard-bahasa-cased

1
0

malaysian-tinyllama-1.1b-siglip-large-384-vision-alignment

1
0

malaysian-mistral-siglip-base-384-vision-alignment

1
0

malaysian-mistral-474M-4096

1
0

reranker-malaysian-mistral-474M-32k

1
0

malaysian-mistral-64M-MLM-512

1
0

mnli-malaysian-mistral-191M-MLM-512

1
0

llava-v1.6-vicuna-13b-hf-awq

1
0

Malaysian-Llama-3.2-1B-Instruct-v0.2

llama
1
0

Malaysian-Llama-3.1-8B-Instruct-Marlin

llama
1
0

Malaysian-Llama-3.1-70B-Instruct

llama
1
0

Malaysian-gemma-3-27b-it

1
0

Malaysian-Mistral-Small-3.1-24B-Instruct-2503

Continued finetuning of https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503 on a highly curated 1.5B-token Malaysian instruction dataset.

1. Supports responding in Mandarin, Tamil, Jawi, Manglish, Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
2. Able to code in Mandarin, Tamil, Jawi, Manglish, Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
3. Multi-turn Malaysian context, such as Malaysian legislation, politics, religions and languages.

Finetuned on mesolitica/Malaysian-SFT to make the model understand Malaysian context.

1. LoRA on `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]` (see the sketch after this entry).
2. Rank 256 with alpha 512, i.e. an effective scaling of 2.0.
3. Multipacking at 8192 context length with proper SDPA causal masking to prevent document contamination and ensure proper position ids.
4. Chunked CCE loss for LoRA.
5. WandB at https://wandb.ai/huseinzol05/lora-embedding-256-Mistral-Small-3.1-24B-Instruct-2503-malaysian-8k

Source code at https://github.com/mesolitica/malaya/tree/master/session/mistral3

Benchmarked on 0-shot official MalayMMLU first-token accuracy and on 0-shot exact first-token match using vLLM Guided Decoding (result tables omitted here).

Special thanks to https://www.sns.com.my and Nvidia for an 8x H100 node!
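A sketch of the LoRA setup described above using peft; this reconstructs the stated rank/alpha/module list and is not the repo's actual training config:

```python
# Sketch of the described LoRA configuration with peft (assumed API):
# rank 256, alpha 512 (effective scaling 512/256 = 2.0), applied to the
# attention/MLP projections plus the embedding and LM head layers.
from peft import LoraConfig

config = LoraConfig(
    r=256,
    lora_alpha=512,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    task_type="CAUSAL_LM",
)
```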

1
0

Malaysian-F5-TTS-v3

License: CC-BY-NC-4.0
0
3

t5-super-tiny-standard-bahasa-cased

0
1

fasttext-language-detection-bahasa-en

0
1

conformer-medium-malay-whisper

0
1

mistral-1.1b-32768-fpf

0
1

mallam-1.1b-20k-instructions

0
1

malaysian-tinyllama-1.1b-siglip-base-384-vision

0
1

malaysian-Qwen1.5-0.5B-siglip-base-384-vision

0
1

phoneme-ipa-lstm

0
1

Malaysian-F5-TTS-v2

License: CC-BY-NC-4.0
0
1

Malaysian-Llama-3.1-8B-Instruct-v0.1

llama
0
1