GuoxinChen

2 models • 1 total models in database

Sort by:

ReForm-8B

🧩 ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization ReForm is a reflective Autoformalization framework that enables large language models to generate → verify → refine formal mathematical statements in an integrated self-corrective loop. It introduces Prospective Bounded Sequence Optimization (PBSO) — a novel reinforcement learning algorithm designed for heterogeneous rewards at different sequence positions — enabling stable, reflective training and substantial gains in semantic fidelity. - 🪞 Reflective Autoformalization Paradigm Turns single-pass translation into an iterative “generate–validate–refine” cycle, allowing the model to autonomously detect and correct semantic errors. - ⚖️ Prospective Bounded Sequence Optimization (PBSO) A reinforcement learning algorithm with position-specific rewards for both task accuracy and critique quality, ensuring stable and interpretable optimization. - 📈 State-of-the-art Semantic Consistency ReForm achieves an average +22.6pp improvement over the strongest baseline across four formalization benchmarks (miniF2F, ProofNet, Putnam, and AIME 2025). Figure: ReForm consistently outperforms Goedel-V2 and Kimina-Autoformalizer on all benchmarks. If you find ReForm useful for your research, please cite:

NaNK

license:apache-2.0

ReForm-32B

NaNK

license:apache-2.0