• brucethemoose@lemmy.world
    17 hours ago

    Chatbots are text completion models, improv machines basically, so they don’t really have that ability. You could look at logprobs I guess (i.e. is the model spreading probability pretty evenly across a bunch of candidate words?), but that’s unreliable. Even adding an “I don’t know” token wouldn’t work, because that’s not really something you can train in from text datasets: the models don’t know when they don’t know, they’re just modeling which next word is most likely.
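
To make the logprobs idea concrete, here's a rough sketch of what "guessing evenly" could look like as a number: the entropy of the top-k next-token logprobs, scaled to 0..1. The function names and the example numbers are made up for illustration, and it assumes you already have a list of top-k logprobs for one token position from whatever API or library you're using. It inherits the unreliability mentioned above: high entropy can just mean there are lots of valid ways to phrase something, and low entropy says nothing about being correct.

```python
import math

def token_entropy(logprobs):
    """Shannon entropy (in nats) of one next-token distribution.

    `logprobs` is a list of log-probabilities for the candidate tokens
    at a single position (hypothetical input, e.g. a top-k list).
    """
    probs = [math.exp(lp) for lp in logprobs]
    # Top-k lists usually don't sum to 1, so renormalize over what we have.
    total = sum(probs)
    probs = [p / total for p in probs]
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalized_entropy(logprobs):
    """Entropy scaled to [0, 1]: 0 = one token dominates, 1 = perfectly even."""
    n = len(logprobs)
    if n < 2:
        return 0.0
    return token_entropy(logprobs) / math.log(n)

# Illustrative numbers only, not real model output.
confident = [math.log(p) for p in (0.97, 0.01, 0.01, 0.01)]
unsure = [math.log(0.25)] * 4

print(normalized_entropy(confident))  # small value, one token dominates
print(normalized_entropy(unsure))     # close to 1.0, evenly spread
```

You'd still have to pick a threshold by hand, and it only measures how spread out the next-word guess is, not whether the model "knows" anything.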

    Some non-autoregressive architectures would be better at this, but unfortunately the “cutting edge” models people actually interact with, like ChatGPT, are developed way more conservatively than you’d think. Like, they’ve left tons of innovations on the table.