• KeenFlame@feddit.nu
    6 hours ago

    There is definitely a reason a larger model would have worse hallucinations. Why do you say there isn’t? It’s a fundamental problem with data scaling in these architectures.