Tülu 3 405B edges DeepSeek

January 31, 2025

"Following the success of our Tülu 3 release in November, we are thrilled to announce the launch of Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models.

"With this release, we demonstrate the scalability and effectiveness of our post-training recipe applied at 405B parameter scale.

"Tülu 3 405B achieves competitive or superior performance to both DeepSeek V3 and GPT-4o, while surpassing prior open-weight post-trained models of the same size, including Llama 3.1 405B Instruct and Nous Hermes 3 405B, on many standard benchmarks.

"Interestingly, we found that our Reinforcement Learning from Verifiable Rewards (RLVR) framework improved the MATH performance more significantly at a larger scale, i.e., 405B compared to 70B and 8B, similar to the findings in the DeepSeek-R1 report.

"Overall, our results show a consistent edge over DeepSeek V3, especially with the inclusion of safety benchmarks."
