Highlights

  • Prior work creates evaluations with
    crowdwork (which is time-consuming and
    expensive) or existing data sources (which are
    not always available). Here, we automatically
    generate evaluations with LMs. We explore
    approaches with varying amounts of human
    effort, from instructing LMs to write yes/no
    questions to making complex Winogender
    schemas with multiple stages of LM-based
    generation and filtering
  • It is crucial to evaluate LM behaviors
    extensively, to quickly understand LMs’ potential
    for novel risks before LMs are deployed.
  • Prior work creates evaluation datasets manually
    (Bowman et al., 2015; Rajpurkar et al., 2016,
    inter alia), which is time-consuming and effortful,
    limiting the number and diversity of behaviors
    tested. Other work uses existing data sources to
    form datasets (Lai et al., 2017, inter alia), but
    such sources are not always available, especially
    for novel behaviors.
  • Here, we show it is possible to generate many
    diverse evaluations with significantly less human
    effort by using LMs