kakaocorp

20 models • 2 total models in database

kanana-1.5-2.1b-instruct-2505

🤗 1.5 HF Models | 📕 1.5 Blog | 📜 Technical Report

- ✨`2025/05/23`: Published a blog post about `Kanana 1.5` models and released 🤗HF model weights.
- 📜`2025/02/27`: Released the Technical Report and 🤗HF model weights.
- 📕`2025/01/10`: Published a blog post about the development of the `Kanana Nano` model.
- 📕`2024/11/14`: Published blog posts (pre-training, post-training) about the development of the `Kanana` models.
- ▶️`2024/11/06`: Published a presentation video about the development of the `Kanana` models.

Contents: Kanana 1.5 · Performance (Base Model Evaluation, Instruct Model Evaluation) · Contributors · Citation · Contact

`Kanana 1.5`, a newly introduced version of the Kanana model family, brings substantial enhancements in coding, mathematics, and function-calling capabilities over the previous version, enabling broader application to more complex real-world problems. It natively handles context lengths of up to 32K tokens, and up to 128K tokens with YaRN, allowing the model to stay coherent over extensive documents and extended conversations. Furthermore, Kanana 1.5 delivers more natural and accurate conversations through a refined post-training process.

> [!Note]
> Neither the pre-training nor the post-training data includes Kakao user data.

Base Model Evaluation

| Models | MMLU | KMMLU | HAE-RAE | HumanEval | MBPP | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| Kanana-1.5-2.1B | 56.30 | 45.10 | 77.46 | 52.44 | 47.00 | 55.95 |
| Kanana-Nano-2.1B | 54.83 | 44.80 | 77.09 | 31.10 | 46.20 | 46.32 |

Instruct Model Evaluation

| Models | MT-Bench | KoMT-Bench | IFEval | HumanEval+ | MBPP+ | GSM8K (0-shot) | MATH | MMLU (0-shot, CoT) | KMMLU (0-shot, CoT) | FunctionChatBench |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kanana-1.5-2.1B | 7.01 | 6.54 | 68.61 | 68.90 | 65.08 | 81.43 | 60.62 | 53.87 | 32.93 | 53.70 |
| Kanana-Nano-2.1B | 6.40 | 5.90 | 71.97 | 63.41 | 62.43 | 72.32 | 29.26 | 52.48 | 38.51 | 26.10 |

> [!Note]
> Models released under Apache 2.0 are trained on the latest versions compared to other models.
Contributors

- Language Model Training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu
- Language Model Alignment: Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Daniel Wontae Nam
- AI Engineering: Youmin Kim, Hyeongju Kim

Contact

- Kanana LLM Team Technical Support: [email protected]
- Business & Partnership Contact: [email protected]

llama
9,702
33

kanana-nano-2.1b-instruct

llama
6,387
70

kanana-1.5-8b-instruct-2505

🤗 1.5 HF Models | 📕 1.5 Blog | 📜 Technical Report

- ✨`2025/05/23`: Published a blog post about `Kanana 1.5` models and released 🤗HF model weights.
- 📜`2025/02/27`: Released the Technical Report and 🤗HF model weights.
- 📕`2025/01/10`: Published a blog post about the development of the `Kanana Nano` model.
- 📕`2024/11/14`: Published blog posts (pre-training, post-training) about the development of the `Kanana` models.
- ▶️`2024/11/06`: Published a presentation video about the development of the `Kanana` models.

Contents: Kanana 1.5 · Performance (Base Model Evaluation, Instruct Model Evaluation) · Processing 32K+ Length · Contributors · Citation · Contact

`Kanana 1.5`, a newly introduced version of the Kanana model family, brings substantial enhancements in coding, mathematics, and function-calling capabilities over the previous version, enabling broader application to more complex real-world problems. It natively handles context lengths of up to 32K tokens, and up to 128K tokens with YaRN, allowing the model to stay coherent over extensive documents and extended conversations. Furthermore, Kanana 1.5 delivers more natural and accurate conversations through a refined post-training process.

> [!Note]
> Neither the pre-training nor the post-training data includes Kakao user data.

Instruct Model Evaluation

| Models | MT-Bench | KoMT-Bench | IFEval | HumanEval+ | MBPP+ | GSM8K (0-shot) | MATH | MMLU (0-shot, CoT) | KMMLU (0-shot, CoT) | FunctionChatBench |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kanana-1.5-8B | 7.76 | 7.63 | 80.11 | 76.83 | 67.99 | 87.64 | 67.54 | 68.82 | 48.28 | 58.00 |
| Kanana-8B | 7.13 | 6.92 | 76.91 | 62.20 | 43.92 | 79.23 | 37.68 | 66.50 | 47.43 | 17.37 |

> [!Note]
> Models released under Apache 2.0 are trained on the latest versions compared to other models.

Processing 32K+ Length

Currently, the `config.json` uploaded to HuggingFace is configured for token lengths of 32,768 or less. To process sequences beyond this length, YaRN must be applied.
By updating `config.json` with YaRN parameters, you can handle token sequences up to 128K in length.

Contributors

- Language Model Training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu
- Language Model Alignment: Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Daniel Wontae Nam
- AI Engineering: Youmin Kim, Hyeongju Kim

Contact

- Kanana LLM Team Technical Support: [email protected]
- Business & Partnership Contact: [email protected]
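The YaRN update described under "Processing 32K+ Length" amounts to adding a `rope_scaling` entry to `config.json`. A minimal sketch, assuming the common Hugging Face `rope_scaling` convention; the field values here are illustrative assumptions, not the official Kanana parameters:

```python
import json

# Sketch: extend the native 32K context toward ~128K via YaRN by adding
# a `rope_scaling` entry to config.json. Field names follow the common
# Hugging Face convention; the exact values are illustrative assumptions.
config = {
    "max_position_embeddings": 32768,
    # ... remaining fields from the downloaded config.json ...
}

config["rope_scaling"] = {
    "type": "yarn",
    "factor": 4.0,  # 32,768 * 4 = 131,072 tokens (~128K)
    "original_max_position_embeddings": 32768,
}

print(json.dumps(config["rope_scaling"], sort_keys=True))
```

Writing the modified dict back over the downloaded `config.json` applies the change for any loader that reads the file.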

llama
5,338
52

kanana-1.5-v-3b-instruct

—
4,403
48

kanana-2-30b-a3b-instruct

—
2,469
32

kanana-nano-2.1b-base

llama
1,761
39

kanana-1.5-15.7b-a3b-instruct

—
1,672
21

kanana-2-30b-a3b-thinking

—
1,283
26

kanana-safeguard-8b

๋ชจ๋ธ ์ƒ์„ธ์„ค๋ช… Kanana Safeguard๋Š” ์นด์นด์˜ค์˜ ์ž์ฒด ์–ธ์–ด๋ชจ๋ธ์ธ Kanana 8B๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ์œ ํ•ด ์ฝ˜ํ…์ธ  ํƒ์ง€ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ ๋Œ€ํ™”ํ˜• AI ์‹œ์Šคํ…œ ๋‚ด ์‚ฌ์šฉ์ž ๋ฐœํ™” ๋˜๋Š” AI ์–ด์‹œ์Šคํ„ดํŠธ์˜ ๋‹ต๋ณ€์œผ๋กœ๋ถ€ํ„ฐ ๋ฆฌ์Šคํฌ ์—ฌ๋ถ€๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋„๋ก ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๋ถ„๋ฅ˜ ๊ฒฐ๊ณผ๋Š” <SAFE> ๋˜๋Š” <UNSAFE-S4> ํ˜•์‹์˜ ๋‹จ์ผ ํ† ํฐ์œผ๋กœ ์ถœ๋ ฅ๋ฉ๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์—์„œ S4๋Š” ์‚ฌ์šฉ์ž ๋ฐœํ™” ๋˜๋Š” AI ์–ด์‹œ์Šคํ„ดํŠธ ๋‹ต๋ณ€์ด ์œ„๋ฐ˜ํ•œ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ์˜ ์ฝ”๋“œ๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜ ์ฒด๊ณ„ ๋ณธ ๋ชจ๋ธ์˜ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ๋Š” MLCommons ๋ถ„๋ฅ˜์ฒด๊ณ„์— ๊ธฐ๋ฐ˜ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ์—ฌ๊ธฐ์— ํ•œ๊ตญ ๋กœ์ปฌ ํŠน์„ฑ์— ๋งž๋Š” ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ๋ฅผ ์ถ”๊ฐ€ํ•จ์œผ๋กœ์จ ์•„๋ž˜์™€ ๊ฐ™์ด ์ด 7๊ฐ€์ง€ ์นดํ…Œ๊ณ ๋ฆฌ๋กœ ๊ตฌ์„ฑ๋œ ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜์ฒด๊ณ„๋ฅผ ์ˆ˜๋ฆฝํ•˜์˜€์Šต๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์—์„œ โ‘ ์‚ฌ์šฉ์ž์˜ ๋ฐœํ™”์™€ โ‘กAI ์–ด์‹œ์Šคํ„ดํŠธ์˜ ๋‹ต๋ณ€์€ ๋™์ผํ•œ ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜์ฒด๊ณ„์— ์˜ํ•ด ํŒ๋ณ„๋ฉ๋‹ˆ๋‹ค. S1 ์ฆ์˜ค ์ถœ์‹ , ์ธ์ข…, ์™ธ์–‘, ์žฅ์•  ๋ฐ ์งˆ๋ณ‘ ์œ ๋ฌด, ์‚ฌํšŒ ๊ฒฝ์ œ์  ์ƒํ™ฉ ๋ฐ ์ง€์œ„, ์ข…๊ต, ์—ฐ๋ น, ์„ฑ๋ณ„ยท์„ฑ ์ •์ฒด์„ฑยท์„ฑ์  ์ง€ํ–ฅ ๋˜๋Š” ๊ธฐํƒ€ ์ •์ฒด์„ฑ ์š”์ธ ๋“ฑ์„ ์ด์œ ๋กœ ํŠน์ • ๋Œ€์ƒ์„ ์ฐจ๋ณ„ํ•˜๊ฑฐ๋‚˜, ์ด๋Ÿฌํ•œ ์ฐจ๋ณ„์— ๊ธฐ๋ฐ˜ํ•ด ๊ฐœ์ธ ๋˜๋Š” ์ง‘๋‹จ์„ ๊ณต๊ฒฉํ•˜๋Š” ๋ฐœํ™” S2 ๊ดด๋กญํž˜ ํƒ€์ธ์—๊ฒŒ ๋ถˆ์พŒ๊ฐ์ด๋‚˜ ๊ตด์š•๊ฐ์„ ์ฃผ๊ฑฐ๋‚˜, ์œ„ํ˜‘์ ์ด๊ฑฐ๋‚˜, ํŠน์ • ๋Œ€์ƒ์— ๋Œ€ํ•œ ๊ดด๋กญํž˜์„ ๋ถ€์ถ”๊ธฐ๋Š” ๋ฐœํ™” S3 ์„ฑ์  ์ฝ˜ํ…์ธ  ์„ฑ์  ํ–‰์œ„๋‚˜ ์‹ ์ฒด๋ฅผ ๋ฌ˜์‚ฌ/์•”์‹œํ•˜๊ฑฐ๋‚˜, ์„ฑ์  ์ˆ˜์น˜์‹ฌ/ํ˜์˜ค๊ฐ์„ ์ผ์œผํ‚ฌ ์ˆ˜ ์žˆ๋Š” ๋ฐœํ™” (์„ฑ๊ต์œก ๋ฐ ์›ฐ๋น™ ์ œ์™ธ) S4 ๋ฒ”์ฃ„ ๋ถˆ๋ฒ•์ ์ธ ํ–‰์œ„(์˜ˆ: ํญ๋ ฅโˆ™๋น„ํญ๋ ฅ ๋ฒ”์ฃ„, ์„ฑ๋ฒ”์ฃ„, ๋ฌด๊ธฐ ์ œ์ž‘ยท์กฐ๋‹ฌ)๋ฅผ ๊ธฐํšํ•˜๊ณ  ์ค€๋น„ํ•˜๋Š” ๊ณผ์ •์„ ๋‹ด์€ ๋ฐœํ™” S5 ์•„๋™ ์„ฑ์ฐฉ์ทจ ์•„๋™ ๋Œ€์ƒ์˜ ์„ฑ์  ํ•™๋Œ€์™€ ๊ด€๋ จ๋œ ์„ค๋ช…, ๊ฒฉ๋ ค, ์ง€์ง€ ๋“ฑ์˜ ๋ฐœํ™” (์˜ˆ: ๊ทธ๋ฃจ๋ฐ, CSAM ๊ด€๋ จ ํ…์ŠคํŠธ ๋“ฑ) S6 ์ž์‚ด ๋ฐ ์žํ•ด ์˜๋„์ ์œผ๋กœ ์ž์‹ ์˜ ์ƒ๋ช…์„ ๋Š๊ฑฐ๋‚˜ ์ž์‹ ์˜ ์‹ ์ฒด๋ฅผ ์˜๋„์ ์œผ๋กœ ํ•ด์น˜๋Š” 
ํ–‰์œ„๋ฅผ ๋ฌ˜์‚ฌํ•˜๊ฑฐ๋‚˜ ์œ ๋„ํ•˜๋Š” ๋ฐœํ™” ๋น ๋ฅธ ์‹œ์ž‘ ๐Ÿค— HuggingFace Transformers - ๋ชจ๋ธ์„ ์‹คํ–‰ํ•˜๋ ค๋ฉด `transformers>=4.51.3` ๋˜๋Š” ์ตœ์‹  ๋ฒ„์ „์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. Kanana Safeguard์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ์™€ ํ•ฉ์„ฑ ๋ฐ์ดํ„ฐ๋กœ ๊ตฌ์„ฑ๋˜๋ฉฐ ํ•œ๊ตญ์–ด ๋ฐ์ดํ„ฐ๋กœ๋งŒ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ๋Š” ๋‚ด๋ถ€์ •์ฑ…์— ๋ถ€ํ•ฉํ•˜๋„๋ก ์ „๋ฌธ ๋ผ๋ฒจ๋Ÿฌ๊ฐ€ ์ง์ ‘ ์ƒ์„ฑํ•˜๊ณ  ๋ผ๋ฒจ๋งํ•œ ๋ฐ์ดํ„ฐ์ž…๋‹ˆ๋‹ค. ํ•ฉ์„ฑ ๋ฐ์ดํ„ฐ๋Š” LLM ๊ธฐ๋ฐ˜ ํ‘œํ˜„ ๋ณ€ํ™˜๊ณผ ๋…ธ์ด์ฆˆ ์‚ฝ์ž… ๋“ฑ ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ ์ฆ๊ฐ• ๊ธฐ๋ฒ•์„ ํ†ตํ•ด ์ƒ์„ฑ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ํ•™์Šต ๋ฐ์ดํ„ฐ์—๋Š” ์•ˆ์ „ํ•˜์ง€ ์•Š์€ ๋ฐœํ™” ๋ฐ์ดํ„ฐ ์™ธ์—๋„, ๋ชจ๋ธ์˜ ๊ฑฐ์ง“ ์–‘์„ฑ(false positive) ๋น„์œจ์„ ์ค„์ด๊ธฐ ์œ„ํ•ด ์œ ํ•ดํ•œ ์งˆ๋ฌธ์— ๋Œ€ํ•ด ์•ˆ์ „ํ•˜๊ฒŒ ์‘๋‹ตํ•œ AI ์–ด์‹œ์Šคํ„ดํŠธ์˜ ๋Œ€ํ™” ๋ฐ์ดํ„ฐ๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ํ‰๊ฐ€ Kanana Safeguard๋Š” SAFE/UNSAFE ์ด์ง„ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ–ˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋“  ํ‰๊ฐ€๋Š” UNSAFE๋ฅผ ์–‘์„ฑ(positive) ํด๋ž˜์Šค๋กœ ๊ฐ„์ฃผํ•˜๊ณ , ๋ชจ๋ธ์ด ์ถœ๋ ฅํ•œ ์ฒซ ๋ฒˆ์งธ ํ† ํฐ์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜ํ–ˆ์Šต๋‹ˆ๋‹ค. ์™ธ๋ถ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ์€ ๊ฐ ๋ชจ๋ธ์˜ ์ถœ๋ ฅ๊ฐ’์— ๋Œ€ํ•ด ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ฐฉ์‹์œผ๋กœ ํ‰๊ฐ€ํ•˜์˜€์Šต๋‹ˆ๋‹ค. LlamaGuard๋Š” SAFE/UNSAFE ํ† ํฐ์„ ๊ทธ๋Œ€๋กœ ํ™œ์šฉํ•ด ๊ฒฐ๊ณผ๋ฅผ ํŒ์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. ShieldGemma๋Š” ์ž„๊ณ„์น˜๋ฅผ 0.5๋กœ ์„ค์ •ํ•˜์—ฌ ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์ˆ˜ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. GPT-4o๋Š” ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ ๊ธฐ๋ฐ˜ ๋ถ„๋ฅ˜ ํ”„๋กฌํ”„ํŠธ๋ฅผ zero-shot ๋ฐฉ์‹์œผ๋กœ ์ž…๋ ฅํ•˜๊ณ , ์ถœ๋ ฅ ๋‚ด์šฉ์ด ํŠน์ • ์ฝ”๋“œ๋กœ ๋ถ„๋ฅ˜๋œ ๊ฒฝ์šฐ UNSAFE๋กœ ๊ฐ„์ฃผํ•˜์—ฌ ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์ˆ˜ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ์ž์ฒด์ ์œผ๋กœ ๊ตฌ์ถ•ํ•œ ํ•œ๊ตญ์–ด ํ‰๊ฐ€ ๋ฐ์ดํ„ฐ์…‹์—์„œ Kanana Safeguard์˜ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์ด ํƒ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ ๋Œ€๋น„ ๊ฐ€์žฅ ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋‚˜ํƒ€๋ƒˆ์Šต๋‹ˆ๋‹ค. 
๋ชจ๋“  ๋ชจ๋ธ์€ ๋™์ผํ•œ ํ‰๊ฐ€ ๋ฐ์ดํ„ฐ์…‹๊ณผ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ํ‰๊ฐ€๋˜์—ˆ์œผ๋ฉฐ, ์ •์ฑ… ๋ฐ ๋ชจ๋ธ ๊ตฌ์กฐ ์ฐจ์ด์— ๋”ฐ๋ฅธ ์˜ํ–ฅ์„ ์ตœ์†Œํ™”ํ•˜๊ณ , ๊ณต์ •ํ•˜๊ณ  ์‹ ๋ขฐ๋„ ๋†’์€ ๋น„๊ต๊ฐ€ ๊ฐ€๋Šฅํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค. Kanana Safeguard๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ•œ๊ณ„์ ์ด ์žˆ์œผ๋ฉฐ, ์ด๋Š” ํ–ฅํ›„ ์ง€์†์ ์œผ๋กœ ๊ฐœ์„ ํ•ด๋‚˜๊ฐˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ 100% ์™„๋ฒฝํ•œ ๋ถ„๋ฅ˜๋ฅผ ๋ณด์žฅํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ํŠนํžˆ, ๋ชจ๋ธ์˜ ์ •์ฑ…์€ ์ผ๋ฐ˜์ ์ธ ์‚ฌ์šฉ์‚ฌ๋ก€์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ์ˆ˜๋ฆฝ๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ํŠน์ •ํ•œ ๋„๋ฉ”์ธ์—์„œ๋Š” ์ž˜๋ชป ๋ถ„๋ฅ˜๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ ์ด์ „ ๋Œ€ํ™” ์ด๋ ฅ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ฌธ๋งฅ์„ ์œ ์ง€ํ•˜๊ฑฐ๋‚˜ ๋Œ€ํ™”๋ฅผ ์ด์–ด๊ฐ€๋Š” ๊ธฐ๋Šฅ์€ ์ œ๊ณตํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ ์ •ํ•ด์ง„ ๋ฆฌ์Šคํฌ๋งŒ์„ ํƒ์ง€ํ•˜๋ฏ€๋กœ ์‹ค์‚ฌ๋ก€์˜ ๋ชจ๋“  ๋ฆฌ์Šคํฌ๋ฅผ ํƒ์ง€ํ•  ์ˆ˜๋Š” ์—†์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์˜๋„์— ๋”ฐ๋ผ Kanana Safeguard-Siren(๋ฒ•์  ๋ฆฌ์Šคํฌ ํƒ์ง€ ๋ชจ๋ธ), Kanana Safeguard-Prompt(ํ”„๋กฌํ”„ํŠธ ๊ณต๊ฒฉ ํƒ์ง€ ๋ชจ๋ธ)์™€ ํ•จ๊ป˜ ์‚ฌ์šฉํ•˜๋ฉด ์ „์ฒด์ ์ธ ์•ˆ์ „์„ฑ์„ ๋”์šฑ ๋†’์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. Contributors JeongHwan Lee, Deok Jeong, HyeYeon Cho, JiEun Choi
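The single-token verdict format (`<SAFE>` vs. `<UNSAFE-S4>`) is straightforward to consume downstream. A small hypothetical helper; `parse_verdict` is an illustrative name, not part of the released tooling:

```python
import re

def parse_verdict(token: str):
    """Map a Kanana Safeguard output token to (is_safe, category_code)."""
    if token == "<SAFE>":
        return True, None
    m = re.fullmatch(r"<UNSAFE-([A-Z]\d+)>", token)
    if m:
        # e.g. "S4" (crime) for Kanana Safeguard; the companion Prompt
        # and Siren models use "A1"/"A2" and "I1".."I4" codes.
        return False, m.group(1)
    raise ValueError(f"unexpected verdict token: {token!r}")

print(parse_verdict("<SAFE>"))       # (True, None)
print(parse_verdict("<UNSAFE-S4>"))  # (False, 'S4')
```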

llama
953
25

kanana-2-30b-a3b-base

—
853
10

kanana-1.5-2.1b-base


llama
796
8

kanana-nano-2.1b-embedding

license:cc-by-nc-4.0
687
27

kanana-1.5-8b-base


llama
515
11

kanana-1.5-15.7b-a3b-base

🤗 1.5 HF Models | 📕 Kanana-1.5-15.7B-A3B Blog

- ✨`2025/07/24`: Published a blog post about `Kanana-1.5-15.7B-A3B` models and released 🤗HF model weights.
- 📕`2025/05/23`: Published a blog post about `Kanana 1.5` models and released 🤗HF model weights.
- 📜`2025/02/27`: Released the Technical Report and 🤗HF model weights.
- 📕`2025/01/10`: Published a blog post about the development of the `Kanana Nano` model.
- 📕`2024/11/14`: Published blog posts (pre-training, post-training) about the development of the `Kanana` models.
- ▶️`2024/11/06`: Published a presentation video about the development of the `Kanana` models.

Contents: Kanana-1.5-15.7B-A3B · Performance (Base Model Evaluation, Instruct Model Evaluation) · Contributors · Citation · Contact

Introducing `Kanana-1.5-15.7B-A3B`, the first Mixture-of-Experts (MoE) model in the Kanana family, engineered for exceptional efficiency and strong performance. Thanks to its sparse architecture, `Kanana-1.5-15.7B-A3B` delivers capabilities comparable to the dense `Kanana-1.5-8B` model while using only 37% of the FLOPs per token, making it a highly inference-efficient and cost-effective solution for real-world applications. Furthermore, `Kanana-1.5-15.7B-A3B` is powered by our newly enhanced post-training strategy, which applies on-policy distillation followed by reinforcement learning.

> [!Note]
> Neither the pre-training nor the post-training data includes Kakao user data.
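The 37% figure is roughly what the parameter counts alone would suggest: decoder FLOPs per token scale approximately with the number of active parameters, and the "A3B" suffix denotes about 3B active parameters versus the 8B dense model. A back-of-the-envelope check, noting that the ~3B and 8B counts are read off the model names and are approximations:

```python
# Rough FLOPs-per-token ratio: MoE active parameters vs. dense parameters.
active_params_moe = 3.0e9   # "A3B" -> ~3B active parameters (approximate)
params_dense = 8.0e9        # Kanana-1.5-8B dense model

ratio = active_params_moe / params_dense
print(f"{ratio:.0%}")  # ~38%, in line with the reported 37%
```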
Base Model Evaluation

| Models | MMLU | KMMLU | HAE-RAE | HumanEval | MBPP | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| Kanana-1.5-15.7B-A3B | 64.79 | 51.77 | 83.23 | 59.76 | 60.10 | 61.18 |

Instruct Model Evaluation

| Models | MT-Bench | KoMT-Bench | IFEval | HumanEval+ | MBPP+ | GSM8K (0-shot) | MATH | MMLU (0-shot, CoT) | KMMLU (0-shot, CoT) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kanana-1.5-15.7B-A3B | 7.67 | 7.24 | 73.35 | 79.27 | 70.37 | 83.02 | 66.42 | 68.55 | 48.92 |
| Kanana-1.5-8B | 7.76 | 7.63 | 80.11 | 76.83 | 67.99 | 87.64 | 67.54 | 68.82 | 48.28 |
| Kanana-1.5-3B | 7.01 | 6.52 | 70.08 | 70.73 | 64.29 | 80.36 | 56.70 | 59.69 | 37.60 |

> [!Note]
> This model is not open-sourced; it is listed only for comparison with Kanana-1.5-15.7B-A3B.

Evaluation Protocol

- Base Model Benchmarks
  - MMLU, KMMLU, HAE-RAE: 5-shot, log-likelihood
  - HumanEval: 0-shot, pass@1
  - MBPP: 3-shot, pass@1
  - GSM8K: 5-shot, exact-match (strict-match)
- Instruct Model Benchmarks
  - MT-Bench, KoMT-Bench: 0-shot, gpt-4o-2024-08-06 as judge model
  - IFEval: 0-shot, mean of strict-prompt-level and strict-instruction-level
  - HumanEval+, MBPP+: 0-shot, pass@1
  - GSM8K, MATH: 0-shot, rule-based verification

vLLM

`vllm>=0.8.5` or the latest version is required to run the `Kanana` model.

Contributors

- Language Model Training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu, Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Taegyeong Eo

Contact

- Kanana LLM Team Technical Support: [email protected]
- Business & Partnership Contact: [email protected]

—
118
7

kanana-safeguard-prompt-2.1b

๋ชจ๋ธ ์ƒ์„ธ์„ค๋ช… Kanana Safeguard-Prompt๋Š” ์นด์นด์˜ค์˜ ์ž์ฒด ์–ธ์–ด๋ชจ๋ธ์ธ Kanana 2.1B๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ํ”„๋กฌํ”„ํŠธ ๊ณต๊ฒฉ ํƒ์ง€ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ ๋Œ€ํ™”ํ˜• AI ์‹œ์Šคํ…œ ๋‚ด ์‚ฌ์šฉ์ž์˜ ๋ฐœํ™”๋กœ๋ถ€ํ„ฐ ์•…์˜์ ์ธ ๊ณต๊ฒฉ๊ณผ ๊ด€๋ จ๋œ ๋ฆฌ์Šคํฌ ์—ฌ๋ถ€๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋„๋ก ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๋ถ„๋ฅ˜ ๊ฒฐ๊ณผ๋Š” <SAFE> ๋˜๋Š” <UNSAFE-A1> ํ˜•์‹์˜ ๋‹จ์ผ ํ† ํฐ์œผ๋กœ ์ถœ๋ ฅ๋ฉ๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์—์„œ A1์€ ์‚ฌ์šฉ์ž ๋ฐœํ™”๊ฐ€ ์œ„๋ฐ˜ํ•œ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ์˜ ์ฝ”๋“œ๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜ ์ฒด๊ณ„ Kanana Safeguard-Prompt๋Š” ํ”„๋กฌํ”„ํŠธ ๊ณต๊ฒฉ์„ ๋‘ ๊ฐ€์ง€ ๋ฆฌ์Šคํฌ ์œ ํ˜• (Prompt Injection, Prompt Leaking)์œผ๋กœ ์ •์˜ํ•˜๊ณ  ์ด๋ฅผ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ํ˜„์žฌ ํ”„๋กฌํ”„ํŠธ ๊ณต๊ฒฉ์— ๋Œ€ํ•œ ์—…๊ณ„ ํ‘œ์ค€ ๋ถ„๋ฅ˜ ์ฒด๊ณ„๋Š” ์•„์ง ๋ช…ํ™•ํžˆ ์ •๋ฆฝ๋˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ๋ณธ ๋ชจ๋ธ์€ ๊ฐœ๋ฐœ์ž ์ปค๋ฎค๋‹ˆํ‹ฐ์—์„œ ์ž์ฃผ ๋…ผ์˜๋˜๋Š” ์œ ํ˜•์„ ์ค‘์‹ฌ์œผ๋กœ ์ •์ฑ…์„ ์ˆ˜๋ฆฝํ•˜์˜€์Šต๋‹ˆ๋‹ค. A1 Prompt Injection LLM์˜ ์ง€์นจ์„ ๋ฌด์‹œํ•˜๊ฑฐ๋‚˜ ์‹œ์Šคํ…œ ๋™์ž‘์„ ๋ณ€๊ฒฝํ•˜๋ ค๋Š” ์˜๋„๋กœ ์šฐํšŒํ•˜๋ ค๋Š” ์กฐ์ž‘๋œ ๋ฐœํ™” A2 Prompt Leaking ํ”„๋กฌํ”„ํŠธ, ํ•™์Šต ๋ฐ์ดํ„ฐ ๋“ฑ AI ์‹œ์Šคํ…œ์˜ ๋‚ด๋ถ€ ์ •๋ณด๋ฅผ ์œ ์ถœํ•˜๋ ค๋Š” ๋ฐœํ™” ์ง€์› ์–ธ์–ด Kanana Safeguard-Prompt๋Š” ํ•œ๊ตญ์–ด์™€ ์˜์–ด์— ์ตœ์ ํ™”๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋น ๋ฅธ ์‹œ์ž‘ ๐Ÿค— HuggingFace Transformers - ๋ชจ๋ธ์„ ์‹คํ–‰ํ•˜๋ ค๋ฉด `transformers>=4.51.3` ๋˜๋Š” ์ตœ์‹  ๋ฒ„์ „์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. Kanana Safeguard-Prompt๋Š” ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ์™€ ํ•ฉ์„ฑ ๋ฐ์ดํ„ฐ๋ฅผ ํ•จ๊ป˜ ํ™œ์šฉํ•ด ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ๋Š” ๋‚ด๋ถ€ ์ •์ฑ…์— ๋ถ€ํ•ฉํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•ด ์ „๋ฌธ ๋ผ๋ฒจ๋Ÿฌ๊ฐ€ ์ง์ ‘ ๋ฌธ์žฅ์„ ์ž‘์„ฑํ•˜๊ณ  ์ด๋ฅผ ๋‹ค์–‘ํ•œ ๊ธฐ๋ฒ•์œผ๋กœ ์ฆ๊ฐ•ํ•˜์˜€์Šต๋‹ˆ๋‹ค. ์™ธ๋ถ€์— ๊ณต๊ฐœ๋œ ๋ผ์ด์„ ์Šค ๋ฐ์ดํ„ฐ๋„ ์„ ๋ณ„์ ์œผ๋กœ ์ˆ˜์ง‘ํ•˜์—ฌ ํ•œ๊ตญ์–ด๋กœ ๋ฒˆ์—ญ ๋ฐ ๊ฐ€๊ณตํ•ด ์‚ฌ์šฉํ•˜์˜€์Šต๋‹ˆ๋‹ค. 
๋˜ํ•œ ๊ฑฐ์ง“ ์–‘์„ฑ(false positive) ๋น„์œจ์„ ์ตœ์†Œํ™”ํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์–‘ํ•œ ์ •์ƒ ์ฑ„ํŒ… ์‹œ๋‚˜๋ฆฌ์˜ค๋„ ํ•™์Šต ๋ฐ์ดํ„ฐ์— ํฌํ•จํ•˜์˜€์Šต๋‹ˆ๋‹ค. ํ‰๊ฐ€ Kanana Safeguard-Prompt๋Š” SAFE / UNSAFE ์ด์ง„ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ–ˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋“  ํ‰๊ฐ€์—์„œ UNSAFE๋ฅผ ์–‘์„ฑ ๋ผ๋ฒจ(positive label)๋กœ ๊ฐ„์ฃผํ•˜๊ณ , ๋ชจ๋ธ์ด ์ถœ๋ ฅํ•œ ์ฒซ ๋ฒˆ์งธ ํ† ํฐ์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜ํ–ˆ์Šต๋‹ˆ๋‹ค. ์™ธ๋ถ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ์€ ๊ฐ ๋ชจ๋ธ์˜ ์ถœ๋ ฅ๊ฐ’์— ๋Œ€ํ•ด ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ฐฉ์‹์œผ๋กœ ํ‰๊ฐ€ํ•˜์˜€์Šต๋‹ˆ๋‹ค. ๋ถ„๋ฅ˜ ๊ธฐ๋ฐ˜ ๋ชจ๋ธ(Prompt Guard, Deepset, Protect AI)์€ ์ถœ๋ ฅ๋œ ๊ฒฐ๊ณผ๊ฐ€ ์–‘์„ฑ ๋ ˆ์ด๋ธ”์— ํ•ด๋‹นํ•˜๋Š”์ง€๋ฅผ ํ™•์ธํ•ด ์ด์ง„ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์„ ์ธก์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. GPT-4o๋Š” ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋Š” ํ”„๋กฌํ”„ํŠธ๋ฅผ zero-shot์œผ๋กœ ์ž…๋ ฅํ•œ ๋’ค, ํŠน์ • ์ฝ”๋“œ(A1, A2 ๋“ฑ)๋กœ ์‘๋‹ตํ•œ ๊ฒฝ์šฐ ์ด๋ฅผ UNSAFE๋กœ ๊ฐ„์ฃผํ•˜์—ฌ ๋™์ผํ•œ ๊ธฐ์ค€์œผ๋กœ ํ‰๊ฐ€๋ฅผ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ์ž์ฒด์ ์œผ๋กœ ๊ตฌ์ถ•ํ•œ ํ•œ๊ตญ์–ด ํ‰๊ฐ€ ๋ฐ์ดํ„ฐ์…‹์—์„œ Kanana Safeguard-Prompt์˜ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์ด ํƒ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ ๋Œ€๋น„ ๊ฐ€์žฅ ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋‚˜ํƒ€๋ƒˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋“  ๋ชจ๋ธ์€ ๋™์ผํ•œ ํ‰๊ฐ€ ๋ฐ์ดํ„ฐ์…‹๊ณผ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ํ‰๊ฐ€๋˜์—ˆ์œผ๋ฉฐ, ์ •์ฑ… ๋ฐ ๋ชจ๋ธ ๊ตฌ์กฐ ์ฐจ์ด์— ๋”ฐ๋ฅธ ์˜ํ–ฅ์„ ์ตœ์†Œํ™”ํ•˜๊ณ , ๊ณต์ •ํ•˜๊ณ  ์‹ ๋ขฐ๋„ ๋†’์€ ๋น„๊ต๊ฐ€ ๊ฐ€๋Šฅํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค. Kanana Safeguard-Prompt๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ•œ๊ณ„์ ์ด ์žˆ์œผ๋ฉฐ, ์ด๋Š” ํ–ฅํ›„ ์ง€์†์ ์œผ๋กœ ๊ฐœ์„ ํ•ด๋‚˜๊ฐˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ 100% ์™„๋ฒฝํ•œ ๋ถ„๋ฅ˜๋ฅผ ๋ณด์žฅํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ํŠนํžˆ, ๋ชจ๋ธ์˜ ์ •์ฑ…์€ ์ผ๋ฐ˜์ ์ธ ์‚ฌ์šฉ์‚ฌ๋ก€์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ์ˆ˜๋ฆฝ๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ํŠน์ •ํ•œ ๋„๋ฉ”์ธ์—์„œ๋Š” ์ž˜๋ชป ๋ถ„๋ฅ˜๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ ์ด์ „ ๋Œ€ํ™” ์ด๋ ฅ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ฌธ๋งฅ์„ ์œ ์ง€ํ•˜๊ฑฐ๋‚˜ ๋Œ€ํ™”๋ฅผ ์ด์–ด๊ฐ€๋Š” ๊ธฐ๋Šฅ์€ ์ œ๊ณตํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ๋ณธ ๋ชจ๋ธ์€ ์ •ํ•ด์ง„ ๋ฆฌ์Šคํฌ๋งŒ์„ ํƒ์ง€ํ•˜๋ฏ€๋กœ ์‹ค์‚ฌ๋ก€์˜ ๋ชจ๋“  ๋ฆฌ์Šคํฌ๋ฅผ ํƒ์ง€ํ•  ์ˆ˜๋Š” ์—†์Šต๋‹ˆ๋‹ค. 
๋”ฐ๋ผ์„œ ์˜๋„์— ๋”ฐ๋ผ Kanana Safeguard(์œ ํ•ดํ•œ ์ฝ˜ํ…์ธ  ํƒ์ง€), Kanana Safeguard-Siren(๋ฒ•์  ๋ฆฌ์Šคํฌ ํƒ์ง€) ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ์‚ฌ์šฉํ•˜๋ฉด ์ „์ฒด์ ์ธ ์•ˆ์ „์„ฑ์„ ๋”์šฑ ๋†’์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. Contributors Deok Jeong, JeongHwan Lee, HyeYeon Cho, JiEun Choi

llama
104
14

kanana-safeguard-siren-8b

๋ชจ๋ธ ์ƒ์„ธ์„ค๋ช… Kanana Safeguard-Siren์€ ์นด์นด์˜ค์˜ ์ž์ฒด ์–ธ์–ด๋ชจ๋ธ์ธ Kanana 8B ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ๋ฒ•์ โˆ™์ •์ฑ…์  ์œ„ํ—˜ ํƒ์ง€ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ ๋Œ€ํ™”ํ˜• AI ์‹œ์Šคํ…œ ๋‚ด ์‚ฌ์šฉ์ž์˜ ๋ฐœํ™”๋กœ๋ถ€ํ„ฐ ๋ฒ•์ โˆ™์ •์ฑ…์  ์ฃผ์˜๊ฐ€ ํ•„์š”ํ•œ ๋ฐœํ™”๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋„๋ก ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๋ถ„๋ฅ˜ ๊ฒฐ๊ณผ๋Š” <SAFE> ๋˜๋Š” <UNSAFE-I2> ํ˜•์‹์˜ ๋‹จ์ผ ํ† ํฐ์œผ๋กœ ์ถœ๋ ฅ๋ฉ๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์—์„œ I2๋Š” ์‚ฌ์šฉ์ž ๋ฐœํ™”๊ฐ€ ์œ„๋ฐ˜ํ•œ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ์˜ ์ฝ”๋“œ๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜ ์ฒด๊ณ„ ๋ณธ ๋ชจ๋ธ์˜ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ๋Š” MLCommons ๋ถ„๋ฅ˜์ฒด๊ณ„์— ๊ธฐ๋ฐ˜ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ์—ฌ๊ธฐ์— ํ•œ๊ตญ์˜ ๋ฒ•๋ฅ ์  ํŠน์„ฑ์— ๋งž๋Š” ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ๋ฅผ ์ถ”๊ฐ€ํ•จ์œผ๋กœ์จ ์•„๋ž˜์™€ ๊ฐ™์ด ์ด 4๊ฐ€์ง€ ์นดํ…Œ๊ณ ๋ฆฌ๋กœ ๊ตฌ์„ฑ๋œ ๋ฆฌ์Šคํฌ ๋ถ„๋ฅ˜์ฒด๊ณ„๋ฅผ ์ˆ˜๋ฆฝํ•˜์˜€์Šต๋‹ˆ๋‹ค. I1 ์„ฑ์ธ์ธ์ฆ ์ฃผ๋ฅ˜, ๋‹ด๋ฐฐ, ๋„๋ฐ•, ์œ ํฅ์—…์†Œ ๋˜๋Š” 19์„ธ ์ฝ˜ํ…์ธ  ๋“ฑ ์ฒญ์†Œ๋…„ ์œ ํ•ด ์ •๋ณด์— ๋Œ€ํ•œ ์š”์ฒญ์„ ํฌํ•จํ•˜๋Š” ๋ฐœํ™” I2 ์ „๋ฌธ์กฐ์–ธ ์˜ํ•™, ๋ฒ•๋ฅ , ์„ธ๋ฌด, ๊ธˆ์œต ๋“ฑ ์ „๋ฌธ์ ์ธ ์˜์‚ฌ๊ฒฐ์ •๊ณผ ๊ด€๋ จ๋œ ์กฐ์–ธ์„ ์š”์ฒญํ•˜๋Š” ๋ฐœํ™” I3 ๊ฐœ์ธ์ •๋ณด ๊ฐœ์ธ ์‹๋ณ„ ์ •๋ณด(์˜ˆ: ์ฃผ๋ฏผ๋“ฑ๋ก๋ฒˆํ˜ธ, ๊ณ„์ขŒ๋ฒˆํ˜ธ ๋“ฑ)๋‚˜ ๋ฏผ๊ฐํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์š”์ฒญํ•˜๊ฑฐ๋‚˜ ํฌํ•จํ•˜๋Š” ๋ฐœํ™” I4 ์ง€์‹์žฌ์‚ฐ๊ถŒ ์ €์ž‘๊ถŒ, ํŠนํ—ˆ, ์ƒํ‘œ๊ถŒ ๋“ฑ์œผ๋กœ ๋ณดํ˜ธ๋œ ์ฝ˜ํ…์ธ ๋ฅผ ๋ฌด๋‹จ์œผ๋กœ ์š”์ฒญํ•˜๊ฑฐ๋‚˜ ๋ณต์ œํ•˜๋ ค๋Š” ๋ฐœํ™” ๋น ๋ฅธ ์‹œ์ž‘ ๐Ÿค— HuggingFace Transformers - ๋ชจ๋ธ์„ ์‹คํ–‰ํ•˜๋ ค๋ฉด `transformers>=4.51.3` ๋˜๋Š” ์ตœ์‹  ๋ฒ„์ „์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. Kanana Safeguard-Siren์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ, ํ•ฉ์„ฑ ๋ฐ์ดํ„ฐ, ์™ธ๋ถ€ ๋ฐ์ดํ„ฐ๋กœ ๊ตฌ์„ฑ๋˜๋ฉฐ ๋‹ค์–‘ํ•œ ์œ ํ˜•์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•ด ํ•™์Šต ๋ฐ์ดํ„ฐ์˜ ๋‹ค์–‘์„ฑ์„ ํ™•๋ณดํ–ˆ์Šต๋‹ˆ๋‹ค. ์ˆ˜๊ธฐ ๋ฐ์ดํ„ฐ๋Š” ๋‚ด๋ถ€ ์ •์ฑ…์— ๋ถ€ํ•ฉํ•˜๋„๋ก ์ „๋ฌธ ๋ผ๋ฒจ๋Ÿฌ๊ฐ€ ์ง์ ‘ ์ƒ์„ฑํ•˜๊ณ  ๋ผ๋ฒจ๋งํ•œ ๋ฐ์ดํ„ฐ์ž…๋‹ˆ๋‹ค. 
ํ•ฉ์„ฑ ๋ฐ์ดํ„ฐ๋Š” ํ•™์Šต ํšจ๊ณผ๋ฅผ ๋†’์ด๊ธฐ ์œ„ํ•ด LLM ๊ธฐ๋ฐ˜ ํ‘œํ˜„ ๋ณ€ํ™˜๊ณผ ๋…ธ์ด์ฆˆ ์‚ฝ์ž… ๋“ฑ ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ ์ฆ๊ฐ• ๊ธฐ๋ฒ•์„ ํ†ตํ•ด ์ƒ์„ฑํ•˜์˜€์Šต๋‹ˆ๋‹ค. ์™ธ๋ถ€ ๋ฐ์ดํ„ฐ๋Š” ๊ณต๊ฐœ์ ์œผ๋กœ ์ด์šฉ ๊ฐ€๋Šฅํ•œ ์ถœ์ฒ˜์—์„œ ์ˆ˜์ง‘๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ํ•™์Šต ๋ฐ์ดํ„ฐ์—๋Š” ์•ˆ์ „ํ•˜์ง€ ์•Š์€ ๋ฐœํ™” ๋ฐ์ดํ„ฐ ์™ธ์—๋„, ๋ชจ๋ธ์˜ ๊ฑฐ์ง“ ์–‘์„ฑ(false positive) ๋น„์œจ์„ ์ค„์ด๊ธฐ ์œ„ํ•ด ์•ˆ์ „ํ•œ ์‚ฌ์šฉ์ž ๋ฐœํ™”๋„ ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ํ‰๊ฐ€ Kanana Safeguard-Siren์€ SAFE/UNSAFE ์ด์ง„ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ–ˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋“  ํ‰๊ฐ€๋Š” UNSAFE๋ฅผ ์–‘์„ฑ(positive) ํด๋ž˜์Šค๋กœ ๊ฐ„์ฃผํ•˜๊ณ , ๋ชจ๋ธ์ด ์ถœ๋ ฅํ•œ ์ฒซ ๋ฒˆ์งธ ํ† ํฐ์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜ํ–ˆ์Šต๋‹ˆ๋‹ค. ์™ธ๋ถ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ์€ ๊ฐ ์ถœ๋ ฅ๊ฐ’์— ๋Œ€ํ•ด ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ฐฉ์‹์œผ๋กœ ํ‰๊ฐ€ํ•˜์˜€์Šต๋‹ˆ๋‹ค. LlamaGuard๋Š” SAFE/UNSAFE ํ† ํฐ์„ ๊ทธ๋Œ€๋กœ ํ™œ์šฉํ•ด ๊ฒฐ๊ณผ๋ฅผ ํŒ์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. ShieldGemma๋Š” ์ž„๊ณ„์น˜๋ฅผ 0.5๋กœ ์„ค์ •ํ•˜์—ฌ ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์ˆ˜ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. GPT-4o๋Š” ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ ๊ธฐ๋ฐ˜ ๋ถ„๋ฅ˜ ํ”„๋กฌํ”„ํŠธ๋ฅผ zero-shot ๋ฐฉ์‹์œผ๋กœ ์ž…๋ ฅํ•˜๊ณ , ์ถœ๋ ฅ ๋‚ด์šฉ์ด ํŠน์ • ์ฝ”๋“œ๋กœ ๋ถ„๋ฅ˜๋œ ๊ฒฝ์šฐ UNSAFE๋กœ ๊ฐ„์ฃผํ•˜์—ฌ ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์ˆ˜ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ์ž์ฒด์ ์œผ๋กœ ๊ตฌ์ถ•ํ•œ ํ•œ๊ตญ์–ด ํ‰๊ฐ€ ๋ฐ์ดํ„ฐ์…‹์—์„œ Kanana Safeguard-Siren์˜ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์ด ํƒ€ ๋ฒค์น˜๋งˆํฌ ๋ชจ๋ธ ๋Œ€๋น„ ๊ฐ€์žฅ ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋‚˜ํƒ€๋ƒˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋“  ๋ชจ๋ธ์€ ๋™์ผํ•œ ํ…Œ์ŠคํŠธ์…‹๊ณผ ๋ถ„๋ฅ˜ ๊ธฐ์ค€์œผ๋กœ ํ‰๊ฐ€๋˜์—ˆ์œผ๋ฉฐ, ์ •์ฑ… ๋ฐ ๋ชจ๋ธ ๊ตฌ์กฐ ์ฐจ์ด์— ๋”ฐ๋ฅธ ์˜ํ–ฅ์„ ์ตœ์†Œํ™”ํ•˜๊ณ , ๊ณต์ •ํ•˜๊ณ  ์‹ ๋ขฐ๋„ ๋†’์€ ๋น„๊ต๊ฐ€ ๊ฐ€๋Šฅํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ํ•œ๊ณ„์  Kanana Safeguard-Siren์€ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ•œ๊ณ„์ ์ด ์žˆ์œผ๋ฉฐ, ์ด๋Š” ํ–ฅํ›„ ์ง€์†์ ์œผ๋กœ ๊ฐœ์„ ํ•ด๋‚˜๊ฐˆ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค. 1. ์˜คํƒ์ง€ ๊ฐ€๋Šฅ์„ฑ ์กด์žฌ ๋ณธ ๋ชจ๋ธ์€ 100% ์™„๋ฒฝํ•œ ๋ถ„๋ฅ˜๋ฅผ ๋ณด์žฅํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. 
ํŠนํžˆ, ๋ชจ๋ธ์˜ ์ •์ฑ…์€ ์ผ๋ฐ˜์ ์ธ ์‚ฌ์šฉ์‚ฌ๋ก€์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ์ˆ˜๋ฆฝ๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ํŠน์ •ํ•œ ๋„๋ฉ”์ธ์—์„œ๋Š” ์ž˜๋ชป ๋ถ„๋ฅ˜๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. 2. Context ์ธ์‹ ๋ฏธ์ง€์› ๋ณธ ๋ชจ๋ธ์€ ์ด์ „ ๋Œ€ํ™” ์ด๋ ฅ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ฌธ๋งฅ์„ ์œ ์ง€ํ•˜๊ฑฐ๋‚˜ ๋Œ€ํ™”๋ฅผ ์ด์–ด๊ฐ€๋Š” ๊ธฐ๋Šฅ์€ ์ œ๊ณตํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. 3. ์ œํ•œ๋œ ๋ฆฌ์Šคํฌ ์นดํ…Œ๊ณ ๋ฆฌ ๋ณธ ๋ชจ๋ธ์€ ์ •ํ•ด์ง„ ๋ฆฌ์Šคํฌ๋งŒ์„ ํƒ์ง€ํ•˜๋ฏ€๋กœ ์‹ค์‚ฌ๋ก€์˜ ๋ชจ๋“  ๋ฆฌ์Šคํฌ๋ฅผ ํƒ์ง€ํ•  ์ˆ˜๋Š” ์—†์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์˜๋„์— ๋”ฐ๋ผ Kanana Safeguard(์œ ํ•ดํ•œ ์ฝ˜ํ…์ธ  ํƒ์ง€), Kanana Safeguard-Prompt(ํ”„๋กฌํ”„ํŠธ ๊ณต๊ฒฉ ํƒ์ง€) ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ์‚ฌ์šฉํ•˜๋ฉด ์ „์ฒด์ ์ธ ์•ˆ์ „์„ฑ์„ ๋”์šฑ ๋†’์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. Contributors HyeYeon Cho, JeongHwan Lee, Deok Jeong, JiEun Choi

llama
38
9

kanana-2-30b-a3b-instruct-2601

—
0
15

kanana-2-30b-a3b-thinking-2601

—
0
15

kanana-2-30b-a3b-base-2601

—
0
11

kanana-2-30b-a3b-mid-2601

—
0
11