ChatGPT maker announces OpenAI o1 model with reasoning capabilities: what is it, how it works, price, how to use it

OpenAI o1 is a new family of LLMs (large language models) that OpenAI says is smarter than GPT-4o (the LLM powering ChatGPT). OpenAI claims o1 can think and reason, and that it “feels surprisingly human”. The AI research company has released several benchmarks to back its claim. Let’s check them out and learn how OpenAI o1 works, how to use it, how much it costs, and what it can and cannot do.

What is OpenAI o1: how is it better than GPT-4o

Before we get to the what, here’s the why, or why o1 matters. At least when it comes to AI text generation, GPT-4o and its counterparts function like advanced predictive text systems with autocomplete capability. That is said to change with the newly launched o1 family of LLMs.

OpenAI has announced o1-preview and o1-mini (a scaled-down version of the former, tuned for writing and debugging code). When we mention o1 in this article, we mean the larger preview model, unless otherwise specified.

o1 is internally called “Strawberry”, and this OpenAI Strawberry model has been heavily anticipated because of its alleged human-like reasoning abilities.

The ‘o’ in o1 stands for Omni (meaning ‘all’) and the 1 refers to “resetting the counter back to 1”. This suggests the company feels o1 marks a milestone in its AI research and product roadmap.

The o1 model is trained to recognise its mistakes, learn from them, and try different strategies to solve a problem.

How OpenAI o1 works

o1 is trained on a new dataset tailored for it, using an optimisation process called “reinforcement learning”. It is built to analyse and solve complex problems that involve mathematics and logical reasoning. Think of how a human mind follows a chain of small thoughts, each leading to the next, while also being able to consider separate ideas in parallel. Similarly, o1 learns patterns and, as a machine with vast memory, trains on a large number of problems.

o1 is taught to recognise the right answers, or choose the right series of steps, through a carrot-and-stick approach: it is rewarded for correct results and penalised for wrong ones.
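To make the carrot-and-stick idea concrete, here is a toy Python sketch. It is not OpenAI’s training method and says nothing about how o1 is actually built. The three strategy functions, the reward values of +1 and -1, and the learning rate are all assumptions invented for illustration.

```python
# Toy illustration of reward-based learning. NOT how OpenAI trains o1.
# All names and numbers here are hypothetical.
from typing import Callable, Dict, List


def take_first(nums: List[int]) -> int:
    """Flawed strategy: answer with the first number only."""
    return nums[0]


def add_all(nums: List[int]) -> int:
    """Sound strategy: add the numbers one by one."""
    total = 0
    for n in nums:
        total += n
    return total


def take_largest(nums: List[int]) -> int:
    """Flawed strategy: answer with the largest number."""
    return max(nums)


STRATEGIES: Dict[str, Callable[[List[int]], int]] = {
    "take_first": take_first,
    "add_all": add_all,
    "take_largest": take_largest,
}


def train(problems: List[List[int]], learning_rate: float = 0.1) -> Dict[str, float]:
    """Score each strategy: +1 (carrot) when right, -1 (stick) when wrong."""
    scores = {name: 0.0 for name in STRATEGIES}
    for nums in problems:
        correct = sum(nums)
        for name, strategy in STRATEGIES.items():
            reward = 1.0 if strategy(nums) == correct else -1.0
            # Nudge the running score towards the latest reward.
            scores[name] += learning_rate * (reward - scores[name])
    return scores


problems = [[1, 2, 3], [4, 5], [10, 20, 30, 40], [7, 7, 7]]
scores = train(problems)
best = max(scores, key=scores.get)
print(best, scores)
```

In this sketch only the strategy that adds every number earns rewards, so it ends with the highest score and is the one the program prefers. Real reinforcement learning works on far larger problems and adjusts a model’s internal parameters rather than a small table of scores.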

What surprised even OpenAI researchers is that when o1 hits a roadblock while solving a problem, it gathers more resources on its own and uses them to achieve the goal (source: OpenAI o1 system card). That brings us to:

OpenAI o1 strengths: what it can do

  • o1 can reportedly reason like a human. 
  • It can fact-check itself. 
  • It can give you an illusion of thinking by using human-like phrases such as “Oh, I’m running out of time, let me get to an answer quickly,” or “I could do this or that, what should I do?”
  • o1 scored 83 percent on a qualifying exam for the International Mathematics Olympiad (IMO) taken by high schoolers in the US. GPT-4o managed only 13 percent.
  • Apollo Research has found in its testing that o1 has better self-knowledge, self-reasoning/awareness, and applied theory of mind than GPT-4o.
  • o1 also has multilingual skills, notably in languages like Korean and Arabic.
  • It can solve puzzles such as acrostics and LSAT logic games, answer Ph.D.-grade chemistry questions, help physicists work through complex formulas, help healthcare researchers annotate cell sequencing data, diagnose a person’s illness based on a report of their symptoms and history, write code, and analyse legal briefs.

OpenAI chief scientist Jakub Pachocki says, “This model can take its time. It can think through the problem, in English, and try to break it down and look for angles in an effort to provide the best answer.”

OpenAI o1 limitations: what it can’t do

  • o1 isn’t multimodal like GPT-4o and other popular LLMs out there today. In other words, it can’t analyse files, images, videos, etc. It can only read, process and write text.
  • It can’t browse the web for real-time results.
  • Its knowledge cut-off is October 2023, the same as GPT-4o’s.

OpenAI o1 concerns

  • OpenAI clearly states it hasn’t solved hallucinations and other inherent problems with AI models, such as biases.
  • Its responses can still be factually wrong.
  • Just because it is good at solving maths problems doesn’t mean it can be a good maths tutor.
  • It is slow. A response can take around 10 seconds, and while the model appears to be thinking you see a label telling you what it is doing.
  • OpenAI has for the very first time given one of its models a “medium” rating for chemical, biological, radiological and nuclear weapon risk. The model comes with asterisks warning you of the possible dangers of using it carelessly.

You are advised to take the benchmark scores with a proverbial grain of salt until more verifiable results or objective evidence appear.

OpenAI o1 price, availability

o1-preview and o1-mini are available now to ChatGPT Plus (about Rs 1,677 a month) and ChatGPT Team users, and will be available to ChatGPT Enterprise and Edu users from September 19.

Paid users are limited to 30 messages a week with o1-preview and 50 messages a week with o1-mini.

The o1-preview API costs $15 (about Rs 1,258) per 1 million input tokens and $60 (about Rs 5,032) per 1 million output tokens. By comparison, GPT-4o costs $5 (about Rs 419) per 1 million input tokens and $15 (about Rs 1,258) per 1 million output tokens.

o1-mini is priced 80 percent lower than o1-preview.
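To see what these rates mean for a single request, the arithmetic can be sketched in a few lines of Python. The o1-preview and GPT-4o prices are the ones quoted above. The o1-mini figures ($3 and $12) are derived here from the “80 percent cheaper” statement rather than quoted from a price list, and the function name and the example token counts are assumptions for illustration.

```python
# Rough API cost estimate from the per-million-token prices in this article.
# cost_usd and the token counts below are hypothetical.
from typing import Dict, Tuple

# model -> (USD per 1M input tokens, USD per 1M output tokens)
PRICES: Dict[str, Tuple[float, float]] = {
    "o1-preview": (15.0, 60.0),
    "gpt-4o": (5.0, 15.0),
    # Derived: 80 percent below o1-preview, i.e. 20 percent of its price.
    "o1-mini": (3.0, 12.0),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: a prompt of 2,000 tokens that produces 1,000 tokens of output.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 2000, 1000):.3f}")
```

For that example request, the estimate works out to $0.09 on o1-preview, $0.025 on GPT-4o and $0.018 on o1-mini, so the same exchange costs between three and four times as much on o1-preview as on GPT-4o.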

In case you’re wondering, yes, OpenAI aims to bring o1-mini to free ChatGPT users later. However, the company hasn’t disclosed a release date or timeline.

How to use OpenAI o1

Step 1: Open ChatGPT.com or the ChatGPT app.

Step 2: Log in to your account. If you aren’t a ChatGPT Plus user, upgrade to it.

Step 3: Once you are on the ChatGPT home screen, tap the button that says “ChatGPT”.

Step 4: Choose between the o1-preview and o1-mini models.

Step 5: Tap the message box at the bottom and enter your prompt to try the o1 models.