small-models-for-glam
Qwen3.5-0.8B-SFT-name-parser-yaml
Qwen3.5-2B-SFT-name-parser-yaml
# Qwen3 0.6B SFT Name Parser YAML
This model is a fine-tuned version of Qwen/Qwen3-0.6B designed for parsing cultural heritage person names into structured YAML. It was trained with TRL using supervised fine-tuning (SFT).

The model parses person names from cultural heritage contexts (libraries, archives, museums) into YAML with the following fields:

- `firstname`: Person's given name
- `lastname`: Person's family name or surname
- `middlenames`: List of middle names or initials
- `temporal`: List of temporal information (birth, death, flourished dates)
- `titles`: List of titles, honorifics, or professional designations
- `extrainfo`: List of additional information (places, affiliations)

## Supported Name Formats

The model handles a wide variety of name formats commonly found in cultural heritage contexts.

Basic patterns:

- `John Smith`
- `Smith, John`
- `Dr. John Smith`
- `John A. Smith`

Complex patterns:

- `Baron William Henry Ashe A'Court Heytesbury, c. 1809-1891`
- `Jones, James Earl, Dr., (fl. 1850-1900)`
- `Miller, Chester F. (Chester Frederic), 1886-`
- `Rábade Obradó, Ana Isabel`
- `彭大铨` (Chinese names)

Edge cases:

- Mononyms: `Salzmann`, `Mokamba`
- Initials: `J. F. Vitry`, `A. E. Borie`
- Diacritics: `Péporté`, `Gerencsér`
- Temporal data: `Rosana, 1963-`
- Parenthetical expansions: `T. (Takeshi) Ohba`

## Training Data

The model was trained on a synthetic dataset of 1,000+ examples generated with a comprehensive template-based approach covering:

- 70% regular examples: standard name patterns with various combinations of fields
- 30% edge cases: challenging patterns including mononyms, initials, diacritics, and non-Western names

Data generation features:

- Multi-cultural support: names from English, French, German, Italian, Spanish, Dutch, Arabic, and Chinese contexts
- Temporal data variety: birth/death dates, flourished periods, single dates
- Title diversity: academic, religious, nobility, military, and professional titles
- Complex surnames: hyphenated, apostrophized, and particle-based surnames (van, von, de, al-, ibn)

## Training Configuration

- Base model: Qwen/Qwen3-0.6B
- Training method: Supervised Fine-Tuning (SFT) using TRL
- Output format: YAML with consistent field ordering
- Chat template: standard user/assistant format with a "Parse this person name:" prompt

## Framework Versions

- TRL: 0.23.0
- Transformers: 4.56.2
- PyTorch: 2.8.0
- Datasets: 4.1.1
- Tokenizers: 0.22.1

## Performance

The model demonstrates strong performance on cultural heritage name parsing tasks:

- Handles diverse international name formats
- Correctly identifies and structures temporal information
- Processes titles, honorifics, and professional designations
- Manages complex surname patterns and particles
- Supports mononyms and abbreviated names

## Limitations

- Primarily trained on Western and East Asian name patterns
- May struggle with very rare or highly specialized naming conventions
- Temporal date parsing assumes Gregorian calendar years
- Limited support for ancient or historical dating systems (BCE, regnal years)

## Primary Use Cases

- Digital humanities: processing historical person names in manuscripts and documents
- Library science: cataloging and standardizing author names in bibliographic records
- Archive management: structuring person names in archival finding aids
- Museum collections: organizing creator and subject names in cultural heritage databases

## Out-of-Scope Use

- Modern person name parsing for contemporary applications
- Legal document processing requiring high precision
- Real-time person identification or verification
- Processing of fictional character names

## Bias and Ethical Considerations

- The model reflects naming conventions present in its training data
- Cultural biases may exist toward Western naming patterns
- Should not be used for identity verification or legal purposes
- Consider cultural sensitivity when processing names from different traditions

For questions about this model card or the model itself, please open an issue in the project repository.
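The field schema above can be illustrated with a short sketch. This is a hypothetical serializer, not the model's own code: the parsed record for `Jones, James Earl, Dr., (fl. 1850-1900)` and the flat YAML layout are assumptions based only on the field list in this card.

```python
# Hypothetical parsed record for "Jones, James Earl, Dr., (fl. 1850-1900)".
# Field names come from the model card; the exact YAML layout is an assumption.
expected = {
    "firstname": "James",
    "lastname": "Jones",
    "middlenames": ["Earl"],
    "temporal": ["fl. 1850-1900"],
    "titles": ["Dr."],
    "extrainfo": [],
}

def to_yaml(record: dict) -> str:
    """Serialize a parsed-name record with the card's consistent field ordering."""
    lines = []
    for key in ("firstname", "lastname", "middlenames",
                "temporal", "titles", "extrainfo"):
        value = record[key]
        if isinstance(value, list):
            if value:  # non-empty lists become block sequences
                lines.append(f"{key}:")
                lines.extend(f"  - {item}" for item in value)
            else:      # empty lists stay inline
                lines.append(f"{key}: []")
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)

print(to_yaml(expected))
```

A downstream pipeline would parse the model's YAML output back into a record of this shape before loading it into a catalog or finding aid.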
iconclass-vlm
historical-illustration-detector
historic-newspaper-illustrations-yolov11
# Qwen3-VL-2B-catmus
This model is a fine-tuned version of Qwen/Qwen3-VL-2B-Instruct for transcribing medieval manuscripts from line-level images. It was trained using TRL on the CATMuS/medieval dataset. Given an image of a single manuscript line, the model generates the corresponding transcription.

## Evaluation

The model was evaluated on 100 examples from the test split of the CATMuS/medieval dataset. "Improvement" is the relative reduction in error rate over the base model; error rates above 100% are possible when a hypothesis contains more errors than the reference has characters or words.

| Metric | Base Model | Fine-tuned Model | Improvement |
|--------|-----------|------------------|-------------|
| Character Error Rate (CER) | 1.0815 (108.15%) | 0.2779 (27.79%) | 74.30% |
| Word Error Rate (WER) | 1.7386 (173.86%) | 0.7043 (70.43%) | 59.49% |

## Example Transcriptions

Example 1:
- Reference: `paulꝯ ad thessalonicenses .iii.`
- Base Model: `Paulus ad the Malomancis · iii.`
- Fine-tuned Model: `Paulꝰ ad thessalonensis .iii.`

Example 2:
- Reference: `acceptad mi humilde seruicio. e dissipad. e plantad en el`
- Base Model: `acceptad mi humilde servicio, e dissipad, e plantad en el`
- Fine-tuned Model: `acceptad mi humilde seruicio, e dissipad, e plantad en el`

Example 3:
- Reference: `ꝙ mattheus illam dictionem ponat`
- Base Model: `p mattheus illam dictoneum proa`
- Fine-tuned Model: `ꝑ mattheus illam dictione in ponat`

Example 4:
- Reference: `Elige ꝗd uoueas. eadẽ ħ ꝗꝗ sama ferebat.`
- Base Model: `f. ligeq d uonear. eade h q q fama ferebat.`
- Fine-tuned Model: `f liges ꝗd uonear. eadẽ li ꝗq tanta ferebat᷑.`

Example 5:
- Reference: `a prima coniugatione ue`
- Base Model: `Grigimacopissagazione-ve`
- Fine-tuned Model: `a ꝑrũt̾tacõnueꝰatione. ne`

## Intended Use

This model is designed for:

- Transcribing line-level medieval manuscripts
- Digitizing historical manuscripts
- Supporting historical research and archival work
- Optical character recognition (OCR) for specialized historical texts

## Training Details

The model was fine-tuned with Supervised Fine-Tuning (SFT) and LoRA adapters on CATMuS/medieval, a dataset of line-level medieval manuscript images paired with text transcriptions.

- Base Model: Qwen/Qwen3-VL-2B-Instruct
- Training Method: Supervised Fine-Tuning (SFT) with LoRA
- LoRA Configuration:
  - Rank (r): 16
  - Alpha: 32
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
  - Dropout: 0.1
- Training Arguments:
  - Epochs: 3
  - Batch size per device: 2
  - Gradient accumulation steps: 4
  - Learning rate: 5e-05
  - Optimizer: AdamW
  - Mixed precision: FP16

## Framework Versions

- TRL: 0.23.0
- Transformers: 4.57.1
- PyTorch: 2.8.0
- Datasets: 4.1.1
- Tokenizers: 0.22.1

## Limitations

- The model is specialized for line-level medieval manuscripts and may not perform well on other types of text or images
- Performance may vary with image quality, resolution, and handwriting style
- The model was trained on a single dataset and may require further fine-tuning for other manuscript collections

If you use this model, please cite the base model and the training framework.

README generated automatically on 2025-10-24 10:49:05
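The Character Error Rate reported above is the character-level edit distance between hypothesis and reference, divided by the reference length, which is why a weak base model can exceed 100%. A minimal sketch (the helper name is mine, not from the evaluation code):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: Levenshtein distance over reference length."""
    # Classic dynamic-programming edit distance
    # (insertions, deletions, substitutions all cost 1).
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        curr = [i]
        for j, h in enumerate(hypothesis, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1] / len(reference)

print(cer("paul", "pauli"))  # 0.25 (one insertion over four reference characters)
```

Because the denominator is the reference length, a long noisy hypothesis against a short reference pushes CER past 1.0, matching the base model's 108.15% above.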
Qwen3-0.6B-SFT-AAT-Materials
# Qwen3-VL-8B-catmus
This model is a fine-tuned version of Qwen/Qwen3-VL-8B-Instruct for transcribing medieval Latin manuscripts from images. It was trained using TRL on the CATMuS/medieval dataset. Given an image of manuscript text, the model generates the corresponding transcription.

## Evaluation

The model was evaluated on 100 examples from the test split of the CATMuS/medieval dataset. "Improvement" is the relative reduction in error rate over the base model.

| Metric | Base Model | Fine-tuned Model | Improvement |
|--------|-----------|------------------|-------------|
| Character Error Rate (CER) | 0.3778 (37.78%) | 0.1997 (19.97%) | 47.14% |
| Word Error Rate (WER) | 0.8300 (83.00%) | 0.5457 (54.57%) | 34.25% |

## Example Transcriptions

Example 1:
- Reference: `paulꝯ ad thessalonicenses .iii.`
- Base Model: `paul9adthellalomconceB·iii·`
- Fine-tuned Model: `paulꝰ ad thessalonicenses .iii.`

Example 2:
- Reference: `acceptad mi humilde seruicio. e dissipad. e plantad en el`
- Base Model: `acceptad mi humilde servicio, è dissipad, è plantat en el`
- Fine-tuned Model: `acceptad mi humilde seruicio. e dissipad. e plantad en el`

Example 3:
- Reference: `ꝙ mattheus illam dictionem ponat`
- Base Model: `q mattheus illam dictionem ponat`
- Fine-tuned Model: `ꝙ mattheus illam dictiõnem ponat`

Example 4:
- Reference: `Elige ꝗd uoueas. eadẽ ħ ꝗꝗ sama ferebat.`
- Base Model: `fuge quoniam cade hic quia tama ferebar.`
- Fine-tuned Model: `Fuge qd̾ uoneas. eadẽ ħ ꝗꝗ sana ferebat:`

Example 5:
- Reference: `a prima coniugatione ue`
- Base Model: `aprimaconiugazioneue`
- Fine-tuned Model: `a prima coniugatione ue`

## Intended Use

This model is designed for:

- Transcribing medieval Latin manuscripts
- Digitizing historical manuscripts
- Supporting historical research and archival work
- Optical character recognition (OCR) for specialized historical texts

## Training Details

The model was fine-tuned with Supervised Fine-Tuning (SFT) and LoRA adapters on CATMuS/medieval, a dataset of medieval Latin manuscript images paired with text transcriptions.

- Base Model: Qwen/Qwen3-VL-8B-Instruct
- Training Method: Supervised Fine-Tuning (SFT) with LoRA
- LoRA Configuration:
  - Rank (r): 16
  - Alpha: 32
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
  - Dropout: 0.1
- Training Arguments:
  - Epochs: 3
  - Batch size per device: 2
  - Gradient accumulation steps: 4
  - Learning rate: 5e-05
  - Optimizer: AdamW
  - Mixed precision: FP16

## Framework Versions

- TRL: 0.23.0
- Transformers: 4.57.1
- PyTorch: 2.8.0
- Datasets: 4.1.1
- Tokenizers: 0.22.1

## Limitations

- The model is specialized for medieval Latin manuscripts and may not perform well on other types of text or images
- Performance may vary with image quality, resolution, and handwriting style
- The model was trained on a single dataset and may require further fine-tuning for other manuscript collections

If you use this model, please cite the base model and the training framework.

README generated automatically on 2025-10-24 10:40:41
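Word Error Rate, the second metric above, is computed the same way as CER but at word level: the word-sequence edit distance divided by the number of reference words. A sketch assuming simple whitespace tokenization (the actual evaluation tokenizer is not stated in the card):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance over reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            # deletion, insertion, substitution on whole words
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1] / len(ref)

# Example 5 above: the base model fuses the line into one run-on token,
# so every reference word counts as an error.
print(wer("a prima coniugatione ue", "aprimaconiugazioneue"))  # 1.0
```

This word-fusion failure mode is why base-model WER can sit far above base-model CER on the same lines: most characters are right, but almost no whitespace-delimited word matches exactly.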
# Qwen3-VL-4B-catmus
This model is a fine-tuned version of Qwen/Qwen3-VL-4B-Instruct for transcribing medieval manuscripts from line-level images. It was trained using TRL on the CATMuS/medieval dataset. Given an image of a single manuscript line, the model generates the corresponding transcription.

## Evaluation

The model was evaluated on 100 examples from the test split of the CATMuS/medieval dataset. "Improvement" is the relative reduction in error rate over the base model.

| Metric | Base Model | Fine-tuned Model | Improvement |
|--------|-----------|------------------|-------------|
| Character Error Rate (CER) | 0.8044 (80.44%) | 0.2205 (22.05%) | 72.59% |
| Word Error Rate (WER) | 1.2029 (120.29%) | 0.5714 (57.14%) | 52.49% |

## Example Transcriptions

Example 1:
- Reference: `paulꝯ ad thessalonicenses .iii.`
- Base Model: `pauli ad theMAlomontes • 111 •`
- Fine-tuned Model: `Paulꝰ ad thesalonicenses .iii.`

Example 2:
- Reference: `acceptad mi humilde seruicio. e dissipad. e plantad en el`
- Base Model: `acceptad mi humilde servició, e dissipad, e plantad en el`
- Fine-tuned Model: `acceptad mi humilde seruicio. e dissipad. e splantad en el`

Example 3:
- Reference: `ꝙ mattheus illam dictionem ponat`
- Base Model: `g mattheus illam dictionem proua`
- Fine-tuned Model: `ꝙ mattheus illam dictione in ponas`

Example 4:
- Reference: `Elige ꝗd uoueas. eadẽ ħ ꝗꝗ sama ferebat.`
- Base Model: `f. luge quomoc. eade & q. fama ferebat.`
- Fine-tuned Model: `Flige qd̵ uoneas. eadẽ ħ ꝗꝗ fama ferebat.`

Example 5:
- Reference: `a prima coniugatione ue`
- Base Model: `d'artimacopinazione ne`
- Fine-tuned Model: `a primiti coniugatione ut`

## Intended Use

This model is designed for:

- Transcribing line-level medieval manuscripts
- Digitizing historical manuscripts
- Supporting historical research and archival work
- Optical character recognition (OCR) for specialized historical texts

## Training Details

The model was fine-tuned with Supervised Fine-Tuning (SFT) and LoRA adapters on CATMuS/medieval, a dataset of line-level medieval manuscript images paired with text transcriptions.

- Base Model: Qwen/Qwen3-VL-4B-Instruct
- Training Method: Supervised Fine-Tuning (SFT) with LoRA
- LoRA Configuration:
  - Rank (r): 16
  - Alpha: 32
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
  - Dropout: 0.1
- Training Arguments:
  - Epochs: 3
  - Batch size per device: 2
  - Gradient accumulation steps: 4
  - Learning rate: 5e-05
  - Optimizer: AdamW
  - Mixed precision: FP16

## Framework Versions

- TRL: 0.23.0
- Transformers: 4.57.1
- PyTorch: 2.8.0
- Datasets: 4.1.1
- Tokenizers: 0.22.1

## Limitations

- The model is specialized for line-level medieval manuscripts and may not perform well on other types of text or images
- Performance may vary with image quality, resolution, and handwriting style
- The model was trained on a single dataset and may require further fine-tuning for other manuscript collections

If you use this model, please cite the base model and the training framework.

README generated automatically on 2025-10-24 10:46:50
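For intuition on the LoRA configuration shared by these catmus cards (rank 16, alpha 32, seven target projections per block), the trainable parameters added per adapted linear layer can be counted directly: LoRA factorizes the update to a (d_out × d_in) weight as B·A with A of shape (r, d_in) and B of shape (d_out, r). The layer dimensions in the example are hypothetical, not taken from the Qwen3-VL configs.

```python
def lora_params(d_out: int, d_in: int, r: int = 16) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer:
    r * d_in for matrix A plus d_out * r for matrix B."""
    return r * d_in + d_out * r

# e.g. a hypothetical square 2048x2048 projection at the cards' rank of 16:
print(lora_params(2048, 2048))  # 65536
```

With alpha = 32 and r = 16, the adapted layer computes W·x + (alpha/r)·B·A·x, i.e. the low-rank update is scaled by 2; only A and B are trained, which is why SFT of these multi-billion-parameter models fits the modest batch sizes listed above.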