Fine-Tuning Qwen-0.5B and Llama-3.2-1B with GRPO to Beat OpenAI o1-preview

Discover how GRPO fine-tuning and an LLM judge (LLM-J) helped Qwen-0.5B and Llama-3.2-1B surpass OpenAI’s o1-preview in Q&A, optimized in just 50 minutes on a Colab A100!

Iddo Gino · Founder & CEO

February 22, 2025