Kongfha
KlonSuphap LM
Visit Demo Space -> Kongfha/KlonSuphap-Generator
Visit GitHub Repository -> Kongfha/KlonSuphap-LM
Visit Blog (Thai Language) -> 🌾 KlonSuphap-LM แต่งกลอนแปด ด้วย GPT-2 (Composing Klon-Paed with GPT-2)

KlonSuphap-LM is a GPT-2 model for Thai poems (Klon-Paed Poem). I use GPT-2 base Thai as the pre-trained model and fine-tune it exclusively on Thai Klon-Paed poems (กลอนแปด) retrieved from the Thai Literature Corpora (TLC) dataset.

My previous poem-generation model, PhraAphaiManee-LM, can produce text that resembles Thai Klon-Paed poems, but its generated output still does not adhere to the rules of Thai Klon-Paed (ฉันทลักษณ์). To overcome this, I developed the following techniques to make the model adhere more closely to the rules.

1. Fine-Tuning dataset preprocessing.

I have only a limited quantity of Thai Klon-Paed poems, about 65770 lines (บาท). To make the model adhere more closely to the rules despite this, I developed a technique called "Rhyme Tagging".

"Rhyme Tagging" inserts tags before and after words that are expected to rhyme with other words under the Klon-Paed rules.

Example
>  พอได้ยินเสียงระฆังข้างหลัง\ เขา\ เห็นผู้\ เฒ่า\ ออกจากชะวาก\ ผา\ สรรพางค์ร่างกายแก่ช\ รา\ แต่ผิว\ หน้า\ นั้นละม้ายคล้ายทา\ รก\ 

"Rhyme Tagging" reduces the risk that rhyme information is drowned out by the much larger amount of non-rhyme-related text. The tags help the self-attention mechanism extract more rhyme-related information and keep it available and relevant throughout processing.

2. Applying Attention-Mask while fine-tuning.

Apart from the usual fine-tuning on the preprocessed dataset, I also fine-tuned the model with an Attention-Mask applied to the non-rhyme-related words in the dataset, as in the following visualization.
Visualized Example
>  ------------------------------\ เขา\ -----\ เฒ่า\ --------------------\ ผา\ ---------------------------\ รา\ ------\ หน้า\ -----------------------\ รก\ 

By applying the Attention-Mask while fine-tuning, the model can prioritize extracting information from the rhyme tags and their surrounding words without dropping positional information. This improves the model's performance in the subsequent stages of fine-tuning, as if the model were constructing a lookup table for rhyme-related words.

3. Performing Reinforcement Learning

After the Supervised Fine-Tuning stage, I apply Reinforcement Learning to the model using voidful/TextRL, with the Klon-Paed Grader defined as the PPO environment.

For Reinforcement Learning, I randomly pick the initial 2-5 syllables of lines from the validation set as the text inputs in an observation list, then force the model to generate only 1 line (บาท), which has only 1 rhyme pair.

TextRL repeatedly feeds text inputs from the observation list to the model, calculates the reward using my Klon-Paed Grader, and then updates the model's weights based on the rewards it received.

Cherry-Picked Examples From Demo (Top-P 0.8 Temp 0.8)
>  ปัญญาประดิษฐ์องค์ทรงสุรดี เห็นสุดมีบังคมก้มเกศา ต่างยิ้มละลูกยับลงตรงบันลา ถึงว่ารุ่งรางสว่างกลางนวัง

>  ขอขอบคุณบุญกุศลจิต เป็นเพื่อนคิดจะเป็นคู่เคหา ต่างคนกับเหล่านางสร้อยตา ต้องมาก็จะมาไปว่าไร

>  ทรานส์ฟอร์เมอร์มีเซลฟ์แอตเทนชัน ขึ้นบรรลักษณ์ก็เหลือบเขียนฉงน ที่จับต้อนแต่เรือนเพื่อนเหมือนอย่างวน จะต้องชวนมาช่วยให้เชยชม
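
The Rhyme Tagging and Attention-Mask steps described above can be sketched roughly as follows. This is only an illustration under several assumptions, not the code used to train KlonSuphap-LM: the tag strings `<r>` and `</r>` are hypothetical placeholders (the real tags are not shown here), the syllable split is done by hand, and the rhyme positions are passed in by the caller rather than derived from the Klon-Paed rules. The helper names `tag_rhymes`, `rhyme_attention_mask`, and `visualize` are also invented for this sketch.

```python
from typing import List, Set

# Hypothetical placeholder tags; the model's actual tag strings may differ.
OPEN_TAG = "<r>"
CLOSE_TAG = "</r>"


def tag_rhymes(syllables: List[str], rhyme_positions: Set[int]) -> List[str]:
    """Wrap each syllable whose index is in rhyme_positions with the tags."""
    tagged: List[str] = []
    for index, syllable in enumerate(syllables):
        if index in rhyme_positions:
            tagged.extend([OPEN_TAG, syllable, CLOSE_TAG])
        else:
            tagged.append(syllable)
    return tagged


def rhyme_attention_mask(tokens: List[str]) -> List[int]:
    """Return 1 for tags and the tokens they enclose, 0 for everything else."""
    mask: List[int] = []
    inside = False
    for token in tokens:
        if token == OPEN_TAG:
            inside = True
            mask.append(1)
        elif token == CLOSE_TAG:
            inside = False
            mask.append(1)
        else:
            mask.append(1 if inside else 0)
    return mask


def visualize(tokens: List[str], mask: List[int]) -> str:
    """Replace masked-out tokens with dashes, similar to the visualization."""
    return "".join(tok if keep else "-" * len(tok) for tok, keep in zip(tokens, mask))


# Hand-split syllables of the first part of the example poem; the last
# syllable is assumed to be the rhyme position for illustration only.
syllables = ["พอ", "ได้", "ยิน", "เสียง", "ระ", "ฆัง", "ข้าง", "หลัง", "เขา"]
tagged = tag_rhymes(syllables, {8})
mask = rhyme_attention_mask(tagged)
print(mask)
print(visualize(tagged, mask))
```

With a real tokenizer the mask would be built per token ID rather than per syllable, and it would presumably be supplied to the model through the `attention_mask` input that Hugging Face GPT-2 models accept.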
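
The way the observation list for Reinforcement Learning is built (the initial 2-5 syllables of randomly chosen validation lines) can be sketched as below. This is a minimal sketch under assumptions: the validation lines are taken to be already split into syllables, the function name `sample_prompts` and its `seed` parameter are hypothetical, and neither the TextRL calls nor the Klon-Paed Grader are shown.

```python
import random
from typing import List


def sample_prompts(validation_lines: List[List[str]], n: int, seed: int = 0) -> List[str]:
    """Pick n prompts, each the first 2-5 syllables of a random validation line."""
    rng = random.Random(seed)
    # Lines with fewer than 2 syllables cannot yield a 2-5 syllable prefix.
    candidates = [line for line in validation_lines if len(line) >= 2]
    if not candidates:
        raise ValueError("no validation line has at least 2 syllables")
    prompts: List[str] = []
    for _ in range(n):
        line = rng.choice(candidates)
        # Prefix length is 2-5 syllables, capped by the length of the line.
        length = rng.randint(2, min(5, len(line)))
        # Thai is written without spaces, so syllables are joined directly.
        prompts.append("".join(line[:length]))
    return prompts


# Toy validation set: one hand-split line from the example poem.
validation_lines = [["พอ", "ได้", "ยิน", "เสียง", "ระ", "ฆัง", "ข้าง", "หลัง", "เขา"]]
observation_list = sample_prompts(validation_lines, n=3)
print(observation_list)
```

Each prompt in the resulting list would then serve as one text input from which the model generates a single line (บาท) to be scored.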