skt
kobert-base-v1
Please refer to https://github.com/SKTBrain/KoBERT
kogpt2-base-v2
A.X-4.0-Light
SK Telecom released A.X 4.0 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 3, 2025. Built on the open-source Qwen2.5 model, A.X 4.0 has been further trained with large-scale Korean datasets to deliver outstanding performance in real-world business environments.

- Superior Korean Proficiency: Achieved a score of 78.3 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming GPT-4o (72.5).
- Deep Cultural Understanding: Scored 83.5 on CLIcK, a benchmark for Korean cultural and contextual comprehension, surpassing GPT-4o (80.2).
- Efficient Token Usage: A.X 4.0 uses approximately 33% fewer tokens than GPT-4o for the same Korean input, enabling more cost-effective and efficient processing.
- Deployment Flexibility: Offered in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
- Long Context Handling: Supports up to 131,072 tokens, allowing comprehension of lengthy documents and conversations. (The lightweight model supports up to 16,384 tokens.)

| Category | Benchmarks | A.X 4.0 | Qwen3-235B-A22B (w/o reasoning) | Qwen2.5-72B | GPT-4o |
|---|---|---|---|---|---|
| Instruction Following | Ko-IFEval | 77.96 | 77.53 | 77.07 | 75.38 |
| | LiveCodeBench 2024.10~2025.04 | 26.07 | 33.09 | 27.58 | 29.30 |
| Long Context | LongBench <128K | 56.70 | 49.40 | 45.60 | 47.50 |

| Category | Benchmarks | A.X 4.0 Light | Qwen3-8B (w/o reasoning) | Qwen2.5-7B | EXAONE-3.5-7.8B | Kanana-1.5-8B |
|---|---|---|---|---|---|---|
| Instruction Following | Ko-IFEval | 72.99 | 73.39 | 60.73 | 65.01 | 69.96 |

- `transformers>=4.46.0` or the latest version is required to use `skt/A.X-4.0-Light`.
- `vllm>=v0.6.4.post1` or the latest version is required to use the tool-use feature.

The `A.X 4.0 Light` model is licensed under `Apache License 2.0`.
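As a quickstart, the snippet below is a minimal loading-and-generation sketch with `transformers`; it assumes the repository ships a chat template and bf16 weights, and it is not taken from the official model card.

```python
# Minimal sketch: chat-style generation with skt/A.X-4.0-Light via transformers>=4.46.0.
# Assumptions: the repo provides a chat template and bf16 weights; adjust dtype/device as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-4.0-Light"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "에이닷 엑스 4.0 라이트 모델을 한 문장으로 소개해 줘."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```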
A.X-Encoder-base
A.X Encoder (pronounced "A dot X") is SKT's document understanding model optimized for Korean-language understanding and enterprise deployment. This lightweight encoder was developed entirely in-house.
A.X-K1
A.X-4.0
SK Telecom released A.X 4.0 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 3, 2025. Built on the open-source Qwen2.5 model, A.X 4.0 has been further trained with large-scale Korean datasets to deliver outstanding performance in real-world business environments.

- Superior Korean Proficiency: Achieved a score of 78.3 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming GPT-4o (72.5).
- Deep Cultural Understanding: Scored 83.5 on CLIcK, a benchmark for Korean cultural and contextual comprehension, surpassing GPT-4o (80.2).
- Efficient Token Usage: A.X 4.0 uses approximately 33% fewer tokens than GPT-4o for the same Korean input, enabling more cost-effective and efficient processing.
- Deployment Flexibility: Offered in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
- Long Context Handling: Supports up to 131,072 tokens, allowing comprehension of lengthy documents and conversations. (The lightweight model supports up to 16,384 tokens.)

| Category | Benchmarks | A.X 4.0 | Qwen3-235B-A22B (w/o reasoning) | Qwen2.5-72B | GPT-4o |
|---|---|---|---|---|---|
| Instruction Following | Ko-IFEval | 77.96 | 77.53 | 77.07 | 75.38 |
| | LiveCodeBench 2024.10~2025.04 | 26.07 | 33.09 | 27.58 | 29.30 |
| Long Context | LongBench <128K | 56.70 | 49.40 | 45.60 | 47.50 |

| Category | Benchmarks | A.X 4.0 Light | Qwen3-8B (w/o reasoning) | Qwen2.5-7B | EXAONE-3.5-7.8B | Kanana-1.5-8B |
|---|---|---|---|---|---|---|
| Instruction Following | Ko-IFEval | 72.99 | 73.39 | 60.73 | 65.01 | 69.96 |

- `transformers>=4.46.0` or the latest version is required to use `skt/A.X-4.0`.
- `vllm>=v0.6.4.post1` or the latest version is required to use the tool-use feature.
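For serving-style inference, the following offline-inference sketch with vLLM is one possible setup; the `tensor_parallel_size` value is only an assumption for the 72B weights, and the snippet does not cover the tool-use feature.

```python
# Minimal sketch: offline inference of skt/A.X-4.0 with vLLM>=v0.6.4.post1.
# tensor_parallel_size=4 is an assumption for the 72B model; size it to your GPUs.
from vllm import LLM, SamplingParams

llm = LLM(model="skt/A.X-4.0", tensor_parallel_size=4)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

messages = [{"role": "user", "content": "KMMLU 벤치마크가 무엇인지 간단히 설명해 줘."}]
outputs = llm.chat(messages, params)  # applies the model's chat template
print(outputs[0].outputs[0].text)
```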
A.X-3.1
SK Telecom released A.X 3.1 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 24, 2025. This sovereign AI model was developed entirely in-house by SKT, encompassing model architecture, data curation, and training, all carried out on SKT's proprietary supercomputing infrastructure, TITAN. The model was trained from scratch on a high-quality multilingual corpus comprising 2.1 trillion tokens, with a primary focus on the Korean language.

- Authentic Korean Sovereign AI: A.X 3.1 was trained on a high-quality multilingual dataset, fully curated in-house, using SKT's proprietary GPU infrastructure.
- Highly Efficient Multilingual LLM: A.X 3.1 demonstrates superior performance among Korean LLMs despite its relatively compact training size of 2.1 trillion tokens.
- Superior Korean Proficiency: A.X 3.1 achieved a score of 69.2 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming other Korean-specialized models.
- Deep Korean Understanding: A.X 3.1 obtained 77.4 on CLIcK, a benchmark for Korean cultural and contextual comprehension, outperforming other open-source models.
- Efficient Token Usage: A.X 3.1 requires approximately 33% fewer tokens than GPT-4o to process equivalent Korean inputs, facilitating more cost-effective and computationally efficient inference.
- Long-Context Handling: A.X 3.1 supports up to 32,768 tokens natively, and up to 131,072 tokens by applying YaRN.

A.X 3.1 represents an efficient sovereign AI model, developed end-to-end by SKT, encompassing model architecture, data curation, infrastructure deployment, and optimization.

| Model | # Params | # Layers | # KV-Heads | Hidden Dim | FFN Dim |
|---|---|---|---|---|---|

- We collected and curated a training dataset comprising 20 trillion tokens sourced from diverse domains.
- The entire dataset was processed through SKT's proprietary data pipeline, incorporating synthetic data generation and comprehensive quality filtering.
- For training A.X 3.1, a total of 2.1 trillion tokens were utilized, comprising a Korean-focused multilingual corpus.

| Benchmarks | A.X 3.1 | EXAONE-3.5-32B | Kanana-flag-32.5B | Gemma-3-27B | Qwen2.5-32B |
|---|---|---|---|---|---|

| Category | Benchmarks | A.X 3.1 Light | Kanana-1.5-8B | EXAONE-3.5-7.8B | Qwen2.5-7B | Qwen3-8B (w/o reasoning) |
|---|---|---|---|---|---|---|
| Instruction Following | Ko-IFEval | 70.04 | 69.96 | 65.01 | 60.73 | 73.39 |

- `transformers>=4.46.0` or the latest version is required to use `skt/A.X-3.1`.
- `vllm>=v0.6.4.post1` or the latest version is required to use the tool-use feature.

The `config.json` file of A.X 3.1 uploaded to Hugging Face is configured for a maximum token length of 32,768. You can handle up to 131,072 tokens simply by modifying the `rope_scaling` field in the `config.json` file, as shown in the sketch below.

The `A.X 3.1` model is licensed under `Apache License 2.0`.
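As a rough sketch of the YaRN extension mentioned above, a Qwen-style `rope_scaling` edit of `config.json` could look like the following; the field names and the 4x factor (32,768 → 131,072) are assumptions, so check the official model card for the exact parameters.

```python
# Hypothetical sketch: extending A.X 3.1's context window via YaRN by editing config.json.
# The schema below follows the common Qwen/Llama-style rope_scaling convention; the exact
# values published on the model card may differ.
import json

with open("config.json") as f:
    config = json.load(f)

config["max_position_embeddings"] = 131072
config["rope_scaling"] = {
    "type": "yarn",
    "factor": 4.0,  # 32,768 x 4 = 131,072
    "original_max_position_embeddings": 32768,
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```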
A.X-4.0-VL-Light
A.X 4.0 VL Light (pronounced "A dot X") is a vision-language model (VLM) optimized for Korean vision and language understanding as well as enterprise deployment. Built upon A.X 4.0 Light, A.X 4.0 VL Light has been further trained on diverse multimodal datasets, with a particular focus on large-scale multimodal Korean datasets, to deliver exceptional performance in domestic business applications.

- Superior Korean Proficiency in Vision and Language: Achieved an average score of 79.4 on Korean image benchmarks, outperforming Qwen2.5-VL-32B (73.4), despite having a significantly smaller model size. On Korean text benchmarks, recorded an average score of 60.2, comparable to VARCO-VISION-2.0-14B (60.4), while using only half the model size.
- Deep Cultural Understanding: Scored 80.2 on K-Viscuit, a multimodal benchmark designed to evaluate cultural and contextual comprehension in Korean, exceeding Qwen2.5-VL-32B (72.3).
- Advanced Document Understanding: Attained a score of 89.8 on KoBizDoc, a benchmark focused on understanding complex document structures, including charts and tables, performing comparably to Qwen2.5-VL-32B (88.8).
- Efficient Token Usage: A.X 4.0 VL Light utilizes approximately 41% fewer text tokens compared to Qwen2.5-VL for the same Korean input, enabling significantly more cost-effective and efficient processing.

A brief comparison on representative benchmarks is as follows.

Image benchmarks (Korean benchmarks, with K-Viscuit translated into Korean):

| Category | Benchmarks | A.X 4.0 VL Light | Qwen2.5-VL-7B | InternVL3-8B | VARCO-VISION-2.0-14B | Qwen2.5-VL-32B |
|---|---|---|---|---|---|---|
| Document | KoBizDoc | 89.8 | 84.0 | 73.2 | 83.0 | 88.8 |
| | K-DTCBench | 90.0 | 86.7 | 83.8 | 80.8 | 91.7 |
| | ChartQA | 79.8 | 80.6 | 79.8 | 78.8 | 81.8 |
| | DocVQA | 94.4 | 95.3 | 92.4 | 91.9 | 94.5 |
| | InfoVQA | 78.5 | 82.7 | 76.2 | 80.0 | 82.7 |
| | SEEDBench2-Plus | 69.7 | 71.2 | 69.7 | 71.9 | 73.3 |
| OCR | OutdoorKorean | 97.3 | 91.9 | 72.7 | 79.7 | 86.9 |
| | K-Handwriting | 84.3 | 85.0 | 43.5 | 55.2 | 60.1 |
| | TextVQA | 82.0 | 85.4 | 82.1 | 80.3 | 79.8 |
| Culture | K-Viscuit | 80.2 | 65.0 | 65.3 | 72.0 | 72.3 |
| Knowledge | KoEduBench | 58.1 | 53.9 | 53.9 | 39.4 | 52.4 |
| | KoCertBench | 54.9 | 50.1 | 39.4 | 51.4 | 47.5 |
| | MMMU | 54.1 | 56.3 | 59.4 | 58.3 | 63.6 |
| | ScienceQA | 95.3 | 87.2 | 97.8 | 92.2 | 92.4 |
| General | K-LLaVA-W | 83.2 | 73.0 | 67.0 | 80.0 | 84.3 |
| | K-SEED | 76.5 | 76.4 | 76.4 | 76.9 | 77.3 |
| | SEEDBenchIMG | 76.7 | 77.1 | 77.1 | 78.1 | 77.6 |
| Hallucination | HallusionBench | 54.2 | 52.7 | 49.6 | 53.8 | 58.0 |
| IF | MM-IFEval | 53.5 | 51.4 | 51.9 | 50.8 | 59.3 |

The following in-house benchmarks have been established to rigorously assess model performance on Korean vision-language understanding and the comprehension of Korea-specific knowledge domains:

- KoBizDoc: A visual question answering (VQA) benchmark designed for understanding Korean business documents.
- OutdoorKorean: A benchmark focused on recognizing Korean text in complex outdoor scenes (provided by AIHub).
- K-Handwriting: A Korean handwriting recognition dataset comprising various handwritten styles (provided by AIHub).
- KoEduBench: A VQA benchmark targeting Korean general academic exams, including GED and CSAT questions, to assess academic reasoning ability.
- KoCertBench: A Korean certification exam-based VQA benchmark, covering domains such as civil service, technical licenses, and professional qualifications.

Text benchmarks:

| Category | Benchmarks | A.X 4.0 VL Light | Qwen2.5-VL-7B | InternVL3-8B | VARCO-VISION-2.0-14B |
|---|---|---|---|---|---|
| Knowledge | KMMLU | 60.5 | 45.6 | 50.9 | 58.8 |
| | MMLU | 72.6 | 71.9 | 77.5 | 80.7 |
| Math | HRM8K | 40.6 | 25.4 | 34.6 | 49.5 |
| | MATH | 56.5 | 61.7 | 65.1 | 71.1 |
| General | Ko-MT-bench | 68.9 | 51.5 | 59.5 | 75.9 |
| | MT-bench | 72.9 | 73.2 | 69.9 | 76.6 |
| IF | Ko-IFEval | 71.8 | 55.0 | 46.1 | 57.2 |
| | IFEval | 81.9 | 66.6 | 67.5 | 75.3 |

- `transformers>=4.49.0` or the latest version is required to use `skt/A.X-4.0-VL-Light`.

The `A.X 4.0 VL Light` model is licensed under `Apache License 2.0`.
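The following sketch outlines a generic image-plus-text inference flow for the model; the Auto classes, the `trust_remote_code` flag, and the chat-message format are assumptions rather than the officially documented usage, so refer to the model card for the supported loading code.

```python
# Hypothetical sketch: image + text inference with skt/A.X-4.0-VL-Light (transformers>=4.49.0).
# The processor/model classes and message format below are assumptions; the repo may expose others.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "skt/A.X-4.0-VL-Light"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, device_map="auto")

image = Image.open("document.png")  # hypothetical input document image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "이 문서의 핵심 내용을 요약해 줘."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```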
kobart-base-v1
A.X-3.1-Light
Ko Gpt Trinity 1.2B V0.5
Ko-GPT-Trinity 1.2B is a transformer model designed using SK Telecom's replication of the GPT-3 architecture. Ko-GPT-Trinity refers to the class of models, while 1.2B represents the number of parameters of this particular pre-trained model.

Ko-GPT-Trinity 1.2B was trained on Ko-DAT, a large-scale curated dataset created by SK Telecom for the purpose of training this model. The model was trained on Ko-DAT for 35 billion tokens over 72,000 steps, as an autoregressive language model using cross-entropy loss. The model learns an inner representation of the Korean language that can then be used to extract features useful for downstream tasks. The model excels at generating text from a prompt, which was the pre-training objective.

Ko-GPT-Trinity was trained on Ko-DAT, a dataset known to contain profanity, lewd, politically charged, and otherwise abrasive language. As such, Ko-GPT-Trinity may produce socially unacceptable text. As with all language models, it is hard to predict in advance how Ko-GPT-Trinity will respond to particular prompts, and offensive content may occur without warning.

Ko-GPT-Trinity was trained as an autoregressive language model. This means that its core functionality is taking a string of text and predicting the next token. While language models are widely used for tasks other than this, this is an active area of ongoing research.

Known limitations include the following:

- Predominantly Korean: Ko-GPT-Trinity was trained largely on text in the Korean language, and is best suited for classifying, searching, summarizing, or generating such text. Ko-GPT-Trinity will by default perform worse on inputs that differ from the data distribution it was trained on, including non-Korean languages as well as specific dialects of Korean that are not well represented in the training data.
- Interpretability & predictability: The capacity to interpret or predict how Ko-GPT-Trinity will behave is very limited, a limitation common to most deep learning systems, especially in models of this scale.
- High variance on novel inputs: Ko-GPT-Trinity is not necessarily well calibrated in its predictions on novel inputs. This can be observed in the much higher variance in its performance, as compared to that of humans, on standard benchmarks.

| Model and Size | BoolQ | CoPA | WiC |
|---|---|---|---|
| Ko-GPT-Trinity 1.2B | 71.77 | 68.66 | 78.73 |
| KoElectra-base | 65.17 | 67.56 | 77.27 |
| KoBERT-base | 55.97 | 62.24 | 77.60 |

Where to send questions or comments about the model: Please contact [Eric] ([email protected]).
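Since next-token prediction is the model's core functionality, a minimal text-generation sketch with the `transformers` pipeline looks like the following; the repository id `skt/ko-gpt-trinity-1.2B-v0.5` is inferred from the model name above and may need adjusting.

```python
# Minimal sketch: prompt-based text generation with Ko-GPT-Trinity 1.2B.
# The repository id is an assumption inferred from the model name.
from transformers import pipeline

generator = pipeline("text-generation", model="skt/ko-gpt-trinity-1.2B-v0.5")
prompt = "인공지능 기술의 미래는"  # "The future of AI technology is ..."
result = generator(prompt, max_new_tokens=64, do_sample=True, top_p=0.95)
print(result[0]["generated_text"])
```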