OpenAI's o1 and Journey Learning | OVERFIT: AI, Machine Learning, and Deep Learning Made Simple (podcast)


Content provided by Brian Carter. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Brian Carter or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://sv.player.fm/legal.


Published 2M ago. Length: 7:28


This paper documents the authors' attempt to replicate OpenAI's o1 language model, which is designed to solve complex reasoning tasks. The researchers record their process in detail, including insights, hypotheses, and the challenges they encountered. They present a paradigm called "Journey Learning", in which a model learns the complete exploration process, including trial and error, reflection, and backtracking, and they argue that it outperforms traditional "shortcut learning" methods. The authors also propose a multi-step evaluation approach that uses reasoning trees, reward models, and a human-AI collaborative annotation pipeline to generate high-quality long-form reasoning data.
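To make the contrast between the two paradigms concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Node` class, the `on_correct_path` flag (a stand-in for the judgement a reward model or annotator would supply), the `BACKTRACK` sentence, and the toy algebra problem are all assumptions made for illustration. A shortcut-style trace keeps only the steps that lead to the answer, while a journey-style trace also keeps a failed attempt and the reflection that follows it.

```python
from dataclasses import dataclass, field

# Hypothetical reflection sentence inserted after a failed attempt.
BACKTRACK = "Wait, that does not work. Let me go back and try another approach."


@dataclass
class Node:
    """One step in a toy reasoning tree (illustrative structure only)."""
    step: str
    on_correct_path: bool  # assumed label, e.g. from a reward model or annotator
    children: list["Node"] = field(default_factory=list)


def shortcut_trace(root: Node) -> list[str]:
    """Keep only the steps on the path from the problem to the answer."""
    trace = []
    node = root
    while node is not None:
        trace.append(node.step)
        # Follow the first child marked correct; stop when there is none.
        node = next((c for c in node.children if c.on_correct_path), None)
    return trace


def journey_trace(root: Node) -> list[str]:
    """Keep failed attempts too, each followed by a backtracking remark."""
    trace = [root.step]
    for child in root.children:
        if child.on_correct_path:
            trace.extend(journey_trace(child))
            break  # siblings after the correct step are never explored
        trace.append(child.step)  # record the failed attempt
        trace.append(BACKTRACK)   # record the reflection and backtrack
    return trace


# Toy problem: solve 2x + 3 = 11. The first attempt is deliberately wrong.
tree = Node("Problem: solve 2x + 3 = 11.", True, [
    Node("Divide everything by 2 first: x + 3 = 5.5.", False),
    Node("Subtract 3 from both sides: 2x = 8.", True, [
        Node("Divide both sides by 2: x = 4.", True),
    ]),
])

if __name__ == "__main__":
    print("Shortcut trace:")
    for line in shortcut_trace(tree):
        print("  " + line)
    print("Journey trace:")
    for line in journey_trace(tree):
        print("  " + line)
```

In this sketch the journey trace contains every step of the shortcut trace in the same order, plus the abandoned attempt and the remark that follows it. That is the property the description attributes to Journey Learning data; how the paper actually builds and filters such traces is described in the report linked below.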

Read more: https://github.com/GAIR-NLP/O1-Journey/blob/main/resource/report.pdf


71 episodes
