Getting tricky with o1 🦹


"'Reasoning' is a semantic thing in my opinion," Kang told The Register. "They are doing test-time scaling, which is roughly similar to what AlphaGo does. I don't know how to adjudicate semantic arguments, but I would anticipate that most people would consider this reasoning."

The o1 model set, which presently consists of o1-preview and o1-mini, employs "chain of thought" techniques.

In a 2022 paper, Google researchers described chain of thought as "a series of intermediate natural language reasoning steps that lead to the final output."
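
To make that definition concrete, here is a minimal sketch of a few-shot chain-of-thought prompt in Python. The tennis-ball exemplar echoes the one used in the Google paper; the second question and the surrounding code are illustrative assumptions, not anything from the article.

```python
# A few-shot chain-of-thought prompt: the worked exemplar spells out the
# intermediate reasoning steps, not just the final answer, nudging the
# model to produce the same kind of step-by-step reasoning for the new
# question appended at the end.
few_shot_cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is
6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and
bought 6 more, how many apples do they have?
A:"""

print(few_shot_cot_prompt)
```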

OpenAI has explained the technique as meaning o1 "learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn't working. This process dramatically improves the model's ability to reason." 
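
For readers who want to try this themselves, here is a hedged sketch of querying o1-preview through OpenAI's published Python SDK. The question is an arbitrary example; note that the model performs its chain-of-thought work server-side, and only the final answer is returned to the caller.

```python
# Minimal sketch, assuming the OpenAI Python SDK ("pip install openai")
# and an OPENAI_API_KEY set in the environment. The o1 models take a
# plain user message; the intermediate reasoning steps are not exposed.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[{
        "role": "user",
        "content": (
            "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?"
        ),
    }],
)
print(response.choices[0].message.content)
```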

[Noam] Brown cautions that o1 is not always better than GPT-4o. "Many tasks don't need reasoning, and sometimes it's not worth it to wait for an o1 response vs a quick GPT-4o response," he explains. "One motivation for releasing o1-preview is to see what use cases become popular, and where the models need work."


