• Dionysus@leminal.space
    13 days ago

    And DeepSeek is based on Llama, which cost more than six figures to train.

    I’m not aware of any large-parameter LLM that isn’t based on one that was absurdly expensive to train.

    • mindbleach@sh.itjust.works
      13 days ago

      DeepSeek was trained from scratch. Only some variants were built on other LLMs.

      This is a megaphone made from string, a squirrel, and a megaphone.