Tülu 3 405B edges DeepSeek

"Following the success of our Tülu 3 release in November, we are thrilled to announce the launch of Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models.

"Tülu 3 405B achieves competitive or superior performance to both DeepSeek V3 and GPT-4o, while surpassing prior open-weight post-trained models of the same size, including Llama 3.1 405B Instruct and Nous Hermes 3 405B, on many standard benchmarks.

"Interestingly, we found that our Reinforcement Learning from Verifiable Rewards (RLVR) framework improved MATH performance more significantly at a larger scale (405B compared to 70B and 8B), similar to the findings in the DeepSeek-R1 report.

"Overall, our results show a consistent edge over DeepSeek V3, especially with the inclusion of safety benchmarks."



