OpenAI o1 models probably trained GPT-4o and Turbo in chain of thought
Chris Hay

Published on Sep 30, 2024

Did OpenAI train GPT-4o with Strawberry using reinforcement learning on chain of thought? OpenAI's new Orion o1-preview models have made a step change in logic and reasoning over older models. However, many are claiming this is easily replicated just by using chain of thought, but for that to work the models have to be good at chain of thought in the first place. In this video, Chris looks under the hood at the generated chain of thought for the Orion o1 models and compares it with the CoT from GPT-4o, Claude 3.5 Sonnet, and Llama 3, using games such as Sudoku and tic-tac-toe. By the end of the video, you'll have a better idea of how this works.
