
Highlights

  • OpenAI o3 and o4-mini System Card. I’m surprised to see a combined System Card for o3 and o4-mini in the same document.
  • The models use tools in their chains of thought to augment their capabilities; for example, cropping or transforming images, searching the web, or using Python to analyze data during their thought process.
  • The benchmark score on OpenAI’s internal PersonQA benchmark (as far as I can tell no further details of that evaluation have been shared) going from 0.16 for o1 to 0.33 for o3 is interesting, but I don’t know if it’s interesting enough to produce dozens of headlines along the lines of “OpenAI’s o3 and o4-mini hallucinate way higher than previous models”.
  • The paper also talks at some length about “sandbagging”. I’d previously encountered sandbagging defined as meaning “where models are more likely to endorse common misconceptions when their user appears to be less educated”. The o3/o4-mini system card uses a different definition: “the model concealing its full capabilities in order to better achieve some goal” - and links to the recent Anthropic paper Automated Researchers Can Subtly Sandbag.
  • As far as I can tell this definition relates to the American English use of “sandbagging” to mean “to hide the truth about oneself so as to gain an advantage over another” - as practiced by poker or pool sharks.
  • (Wouldn’t it be nice if we could have just one piece of AI terminology that didn’t attract multiple competing definitions?)