• stebo@lemmy.dbzer0.com
    14 hours ago

    I asked Mistral to “generate an image with no dog” and it did.

    The fact that it chose something else to generate instead makes me wonder if this is some sort of free will?

    • brucethemoose@lemmy.world
      4 hours ago

      Mistral likely does “prompt enhancement,” aka feeding your prompt to an LLM first and asking it to expand it with more words.

      So internally, a Mistral text LLM is probably writing out “sure! Here’s a long prompt with no dog: …” and then that part is fed to the image generator.
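
      As a rough sketch of that flow (not Mistral's actual code: `toy_rewriter` and `build_image_prompt` are made-up names, and the canned rewrite stands in for a real LLM call), the point is that the image generator only ever sees the rewritten text:

```python
from typing import Callable


def toy_rewriter(user_prompt: str) -> str:
    """Hypothetical stand-in for the text LLM; a real system would call a model here."""
    if "no dog" in user_prompt.lower():
        # The rewrite describes a scene positively instead of repeating the negation.
        return "A quiet mountain lake at sunrise, mist over the water, pine trees"
    return user_prompt


def build_image_prompt(user_prompt: str, rewrite: Callable[[str], str]) -> str:
    """Return the text that would be handed to the image generator."""
    return rewrite(user_prompt).strip()


image_prompt = build_image_prompt("generate an image with no dog", toy_rewriter)
print(image_prompt)
```

      Because the rewritten prompt never mentions dogs, the image model has nothing dog-related to latch onto.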

      Other “LLMs” are truly multimodal and generate the image output themselves, so the part that draws the image still gets the word “dog” in its input.

    • Hoimo@ani.social
      5 hours ago

      I think all the big image generators support negative prompts by now, so if it interpreted “no dog” as a negative prompt for “dog”, then generation gets steered away from things resembling dogs. No free will, just a much more useful system than whatever OP is using.
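
      For reference, in diffusion-based generators a negative prompt is commonly applied through classifier-free guidance: the model's noise prediction for the negative prompt takes the place of the unconditional one, and each denoising step is pushed away from it. A toy sketch of that one formula, assuming made-up two-number "predictions" and a hypothetical `guided_noise` helper (whether Mistral's generator works this way is a guess):

```python
def guided_noise(eps_prompt, eps_negative, guidance_scale=7.5):
    """Classifier-free guidance with a negative prompt.

    Start from the negative-prompt prediction and move along the direction
    that points from it toward the prompt-conditioned prediction.
    """
    return [n + guidance_scale * (p - n) for p, n in zip(eps_prompt, eps_negative)]


# Toy values only: real predictions are large tensors, not two numbers.
eps_prompt = [0.2, 1.0]    # prediction conditioned on the user's prompt
eps_negative = [1.0, 1.0]  # prediction conditioned on the negative prompt "dog"

eps = guided_noise(eps_prompt, eps_negative)
print(eps)
```

      Where the two predictions agree (the second number here), guidance changes nothing; where they differ, the result is pushed well past the prompt prediction, away from the negative one.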

    • festnt@sh.itjust.works
      13 hours ago

      it just did what you wanted, since you asked for an image. free will would be if you asked it not to generate an image but it still did, if it just generated an image without you prompting it to, or if you asked for an image and it just didn’t respond