Persian-OCR is a deep learning model for Optical Character Recognition (OCR), designed specifically for Persian text. The model employs a CNN + Transformer architecture trained with CTC loss to extract text from images.
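To illustrate what training with CTC loss implies at inference time, here is a minimal sketch of greedy CTC decoding: take the most likely class per time step, collapse consecutive repeats, then drop the blank symbol. The blank index (`0`), the toy vocabulary, and the function name `ctc_greedy_decode` are assumptions for illustration only and may not match how this repository's `utils.py` decodes.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank; the real model may differ


def ctc_greedy_decode(frame_ids, idx_to_char, blank=BLANK):
    """Collapse consecutive repeated indices, then remove blanks."""
    collapsed = [key for key, _ in groupby(frame_ids)]
    return "".join(idx_to_char[i] for i in collapsed if i != blank)


# Hypothetical toy vocabulary and per-time-step argmax predictions.
toy_vocab = {1: "س", 2: "ل", 3: "ا", 4: "م"}
frames = [0, 1, 1, 0, 2, 2, 2, 3, 0, 0, 4, 4]

decoded = ctc_greedy_decode(frames, toy_vocab)
print(decoded)
```

The blank is what lets CTC emit the same character twice in a row: `[1, 0, 1]` decodes to two characters, while `[1, 1]` collapses to one.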
The model was trained on a custom dataset of approximately 600,000 synthetic Persian text images. These images were generated from Wikipedia text using 49 different Persian fonts, with sequence lengths ranging from 0 to 150 characters.
On this dataset, the model achieves a sequence accuracy of 96%.
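For readers who want to reproduce a comparable number on their own data, the sketch below computes sequence accuracy under the assumption that it means exact match: a prediction counts only if the whole string equals the reference. The README does not spell out the exact definition used, and the function name and sample strings here are hypothetical.

```python
def sequence_accuracy(predictions, references):
    """Fraction of predictions that match their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    exact = sum(p == r for p, r in zip(predictions, references))
    return exact / len(references)


# Hypothetical OCR outputs and ground-truth strings; the last pair
# differs in its final character, so it counts as a miss.
preds = ["سلام", "کتاب", "خانه", "مدرسه"]
refs = ["سلام", "کتاب", "خانه", "مدرسة"]

acc = sequence_accuracy(preds, refs)
print(f"sequence accuracy: {acc:.2f}")
```

Because a single wrong character fails the whole sequence, this metric is stricter than character-level accuracy, especially for long lines.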
The model may benefit from further fine-tuning on real-world data, and contributions or collaborations are warmly welcomed.
🤝 Contributing

Contributions are welcome! If you have a dataset of real-world Persian text or improvements to the model, please open an issue or submit a pull request.
📬 Contact

For collaboration or inquiries, please reach out via [email protected]
- `pytorch_model.bin`: PyTorch model weights
- `vocab.json`: Character vocabulary
- `model.py`: Python script defining the CNN + Transformer OCR model
- `utils.py`: Utility functions for OCR, including `ocr_page` and `load_vocab`
- `config.json`: Model configuration
    import torch
    import json
    import sys
    import importlib.util
    from huggingface_hub import hf_hub_download

    # 1️⃣ Load vocab
    vocab_path = hf_hub_download("farbodpya/Persian-OCR", "vocab.json")
    with open(vocab_path, "r", encoding="utf-8") as f:
        vocab = json.load(f)
    # JSON keys are strings, so convert them back to integer indices
    idx_to_char = {int(k): v for k, v in vocab["idx_to_char"].items()}

    # 2️⃣ Import model.py (downloaded from the Hub and loaded as a module)
    model_file = hf_hub_download("farbodpya/Persian-OCR", "model.py")
    spec_model = importlib.util.spec_from_file_location("model", model_file)
    model_module = importlib.util.module_from_spec(spec_model)
    sys.modules["model"] = model_module
    spec_model.loader.exec_module(model_module)
    from model import CNNTransformerOCR

    # 3️⃣ Import utils.py
    utils_file = hf_hub_download("farbodpya/Persian-OCR", "utils.py")
    spec_utils = importlib.util.spec_from_file_location("utils", utils_file)
    utils_module = importlib.util.module_from_spec(spec_utils)
    sys.modules["utils"] = utils_module
    spec_utils.loader.exec_module(utils_module)
    from utils import ocr_page

    # 4️⃣ Load model weights
    weights_path = hf_hub_download("farbodpya/Persian-OCR", "pytorch_model.bin")
    # One class more than the vocabulary size (presumably the CTC blank)
    model = CNNTransformerOCR(num_classes=len(idx_to_char) + 1)
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()  # inference mode: disables dropout and similar layers

    # 5️⃣ Run OCR on an image
    img_path = "sample.png"  # replace with your own image
    text = ocr_page(img_path, model, idx_to_char)
    print("\n=== Final OCR Page ===\n", text)
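Steps 2 and 3 above repeat the same `importlib` pattern, so it can be factored into a small helper. The sketch below is one possible way to do that, not part of the repository; the name `load_module_from_path` is hypothetical, and the demo uses a throwaway file so that it runs without downloading anything.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load a Python file as a module and register it under `name`."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports inside the file can resolve it.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo with a temporary file standing in for a downloaded script.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_mod.py"
    demo_path.write_text("def greet():\n    return 'ok'\n", encoding="utf-8")
    demo = load_module_from_path("demo_mod", demo_path)
    result = demo.greet()

print(result)
```

With the files from this repository, the same helper would be called with the path returned by the Hub download for `model.py` or `utils.py`, and the needed class or function read from the returned module object.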