• gamer@lemm.ee · 4 points · 2 hours ago

    Why wouldn’t you want a dog in your static? Why are you a horrible person?

  • adr1an@programming.dev · 21 up / 1 down · 6 hours ago

    That’s human-like intelligence at its finest. I am not being sarcastic, hear me out. If you ask a person to give you 10 numbers at random, they can’t do it well. Everyone thinks randomness is easy, but it isn’t (see: random.org).

    So, of course a GPT model would fail at this task, I love that they do fail and the dog looks so cute!!
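
    The random.org point can be made concrete with a quick sketch: genuinely uniform draws repeat digits far more often than people intuitively produce when asked for "random" numbers. A minimal Python illustration (the seed is arbitrary, just for reproducibility):

```python
import random

# Fixed seed so the demonstration is reproducible.
rng = random.Random(42)

# Probability that 10 uniform draws from 0-9 are all distinct is
# 10!/10^10, about 0.036% - so true randomness almost always repeats
# a digit, while humans asked for "random" numbers tend to avoid repeats.
trials = 1000
with_repeats = 0
for _ in range(trials):
    draws = [rng.randint(0, 9) for _ in range(10)]
    if len(set(draws)) < 10:
        with_repeats += 1

fraction = with_repeats / trials
print(f"{fraction:.1%} of trials contained a repeated digit")
```

    Nearly every trial repeats at least one digit, which is exactly the pattern people fail to reproduce by hand.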

  • SkunkWorkz@lemmy.world · 15 points · 6 hours ago

    ChatGPT: “don’t generate a dog, don’t generate a dog, don’t generate a dog”

    Generates a dog.

  • Underwaterbob@lemm.ee · 28 points · 8 hours ago

    I used to use Google assistant to spell words I couldn’t remember the spelling of in my English classes (without looking at my phone) so the students could also hear the spelling out loud in a voice other than mine.

    Me: “Hey Google, how do you spell millennium?” GA: “Millennium is spelled M-I-L-L-E-N-N-I-U-M.”

    Now, I ask Gemini: “Hey Google, how do you spell millennium?” Gemini: “Millennium.”

    Utterly useless.

    • Gloomy@mander.xyz · 37 points · edited · 2 hours ago

      Wow. I ABSOLUTELY saw an image of a dog in the middle. Our brain sure is fascinating sometimes.

    • festnt@sh.itjust.works · 28 points · 10 hours ago

      “want me to try again with even more randomized noise?” literally makes no sense if it had generated what you asked (which the chatbot thinks it did)

      • joshchandra@midwest.social · 2 points · 51 minutes ago

        Remember, “AI” (autocomplete idiocy) doesn’t know what sense is; it just continues words, displaying whatever seems to address at least some of the topic, with no innate understanding of accuracy or truth.

        Never forget that GPT-2 can literally be run in a giant Excel spreadsheet with no other program needed. It’s not “smart”; it’s ultimately millions of formulae at work.
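
        The "millions of formulae" point is easy to make concrete: the core of a transformer layer is plain arithmetic that a spreadsheet can express. A toy sketch of the attention formula in pure Python (illustrative only; these are not GPT-2's actual weights or dimensions):

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating, for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Score each key against the query with a dot product...
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # ...turn the scores into weights that sum to 1...
    weights = softmax(scores)
    # ...and return the weighted average of the value vectors.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# A query that matches the first key almost entirely selects the first value.
out = attention([1.0, 0.0],
                keys=[[10.0, 0.0], [0.0, 10.0]],
                values=[[1.0, 0.0], [0.0, 1.0]])
```

        Every step is a multiply, an add, or an exponential: exactly the kind of cell formula a spreadsheet evaluates, just repeated millions of times.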

    • Trainguyrom@reddthat.com · 2 points · 2 hours ago

      Fellow human, you seem to be beeping like a robot. Might you need to consider visiting the human repair shop for some bench time?

    • uuldika@lemmy.ml · 24 points · 17 hours ago

      a rare LessWrong W for naming the effect. also, for explaining why the early over-aligned language models (e.g. the kind that wouldn’t help minors with C++ since it’s an “unsafe” language) became absolutely psychopathic when jailbroken. evil becomes one bit away from good.

    • Pofski@lemmy.world · 1 point · 7 hours ago

      Ask it to generate a room full of clocks, each with its hands at a different time. You’ll see that all (or almost all) of the clocks say 10:10.
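
      A toy sketch of the likely mechanism (an assumption, not a confirmed account of any one model): stock photos of watches are conventionally set to 10:10, so a generator that simply follows its training distribution keeps reproducing that time. The hypothetical data below stands in for such a skewed training set:

```python
import random
from collections import Counter

# Hypothetical stand-in for the training data: watch-ad photos are
# conventionally posed at 10:10, so that time dominates the corpus.
training_times = ["10:10"] * 95 + ["3:25", "7:40", "12:00", "6:15", "9:05"]

rng = random.Random(0)
# A sampler that just follows the empirical distribution of its data:
samples = [rng.choice(training_times) for _ in range(1000)]

most_common_time, count = Counter(samples).most_common(1)[0]
print(most_common_time, count)
```

      No instruction-following failure is needed; a faithful sampler of biased data produces 10:10 almost every time.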

    • Lvxferre [he/him]@mander.xyz · 32 points · 19 hours ago

      It gets even worse, but I’ll need to translate this one.

      • [Input 1] Generate a picture containing a copo completely full of wine. The copo must be completely full, with no space to add more wine.
      • [Output 1] Sure! (Gemini provides a picture containing a taça [stemmed glass] only partially full of wine.)
      • [Input 2] The picture provided does not fulfill the request. Generate a picture of a copo (not a taça) completely full of wine, with no available space for more wine.
      • [Output 2] Sure! (Gemini provides yet another half-full taça)

      For context, Portuguese uses different words for what English calls a drinking glass:

      • copo ['kɔ.po]~['kɔ.pu] - non-stemmed drinking glass. The one you likely use everyday.
      • taça ['tä.sɐ] - stemmed drinking glass, like the ones you’d use with wine.

      Both requests demand a full copo but Gemini is rather insistent on outputting half-full taças.

      The reason for that is as @will_steal_your_username@lemmy.blahaj.zone pointed out: just like there’s practically no training data containing full glasses, there’s none for non-stemmed glasses with wine.

      • brucethemoose@lemmy.world · 1 up / 1 down · edited · 39 minutes ago

        This is a misconception. Sort of.

        I think the problem is misguided attention. The phrase “glass of wine” and all the previous context are so strong that they “blow out” the actual intent, “full glass of wine.” Also, LLMs are still pretty bad at multi-turn multimedia understanding; they are especially prone to repeating the previous conversation.

        It should be better if you word it like “an overflowing glass with wine splashing out.” And clear the history.

        I hate to ramble, but this is what I hate most about the way big corpos present “AI.” These are narrow tools the user needs to learn how to operate, like Photoshop or something, not the magic genie lamps they are trying to sell.

        • Draconic NEO@lemmy.dbzer0.com · 12 points · 17 hours ago

          Yup, Horde still suffers from this issue, though it seems more promising than the others, considering its second glass is way closer to full than anything I’ve seen from OpenAI or Gemini demonstrations. Maybe there’s hope of fixing the issue here.

          I only tried one model, so if you know of a different Horde model that works better for this and actually gives a full glass, please reply below and let me know; maybe even ask the Horde bot to generate it right here.

          • Lvxferre [he/him]@mander.xyz · 4 points · 16 hours ago

            I have considerably less experience with image generation than text generators, but I kind of expect the issue to be only truly fixed if people train the model with a bunch of pictures of glasses full of wine.

            I’ll run a test using a local tree, that is supposed to look like this:

            @aihorde@lemmy.dbzer0.com draw for me a picture of three Araucaria angustifolia trees style:flux

              • Lvxferre [he/him]@mander.xyz · 8 points · edited · 16 hours ago

                Bingo - this tree doesn’t exist outside my homeland, so people barely speak about it in English - and odds are the model was trained with almost no pictures of it. However, one of its English names is Paraná pine, so the model renders it after images of European pines - odds are those are plentiful in its training set.

      • Focal@pawb.social · 1 point · 16 hours ago

        Wait, this seems incredible. Do you have to be in the same instance or does it work anywhere? @aihorde@lemmy.dbzer0.com Can you draw a smart phone without a rotary phone dial?

      • Lvxferre [he/him]@mander.xyz · 3 points · 19 hours ago

        It has for a while already. Frankly, it’s the only reason I’d use Gemini in the first place (DDG’s version of GPT-4o mini doesn’t have a built-in image generator).

      • Lvxferre [he/him]@mander.xyz · 13 points · 19 hours ago

        It is not a completely full glass.

        it’s not supposed to be filled all the way

        What I requested is not what you’re “supposed” to do, indeed. You aren’t supposed to drink wine from glasses that are completely full. Except when really drunk. But then might as well drink straight from the bottle.

        …fuck, I played myself now. I really want some booze.

        • UnhingedFridge@lemmy.world · 1 point · 3 hours ago

          What you’re really supposed to do is - open up the box, slap the bag, and drink directly from your adult Capri Sun.

      • NOT_RICK@lemmy.world · 10 points · 20 hours ago

        Probably why it won’t put more in it. How much training data of wine in a glass shows it filled to the brim? Probably next to none.

  • sarcophagus@lemmy.world · 13 points · 16 hours ago

    The only thing I have in common with this piece of shit software is we both can’t stop thinking about silly dogs

  • stebo@lemmy.dbzer0.com · 8 up / 6 down · 11 hours ago

    I asked mistral to “generate an image with no dog” and it did

    The fact that it chose something else to generate instead makes me wonder if this is some sort of free will?

    • brucethemoose@lemmy.world · 2 points · 48 minutes ago

      Mistral likely does “prompt enhancement,” aka feeding your prompt to an LLM first and asking it to expand it with more words.

      So internally, a Mistral text LLM is probably writing out “sure! Here’s a long prompt with no dog: …” and then that part is fed to the image generator.

      Other “LLMs” are truly multimodal and generate image output, hence they still get the word “dog” in the input.
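
      A minimal sketch of that two-stage pipeline. All names here are illustrative, not Mistral's actual API; the point is only that the image model sees the rewritten prompt, so the word "dog" can genuinely never reach it:

```python
def enhance_prompt(user_prompt: str) -> str:
    # Stand-in for the text LLM's rewriting step. A real system would call
    # a language model here; we simulate it honoring the negation by
    # expanding the request into a scene description without the word.
    if "no dog" in user_prompt:
        return "a sunlit empty meadow, photorealistic, detailed grass"
    return user_prompt + ", highly detailed, photorealistic"

def generate_image(prompt: str) -> str:
    # Stand-in for the image model: it only ever sees its conditioning text.
    return f"<image conditioned on: {prompt}>"

result = generate_image(enhance_prompt("generate an image with no dog"))
print(result)
```

      A truly multimodal model skips the rewriting stage, so the token "dog" stays in its input and keeps pulling generations toward dogs.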

    • Hoimo@ani.social · 3 points · 2 hours ago

      I think all the big image generators support negative prompts by now, so if it interpreted “no dog” as a negative for “dog”, then it will check its outputs for things resembling dogs and discard those. No free will, just a much more useful system than whatever OP is using.
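
      A toy sketch of one way "no dog" could be honored, via rejection sampling against a negative term. (Real diffusion systems instead steer the denoiser away from the negative prompt's embedding at each step, but the visible effect is similar: outputs matching the negative term are suppressed.)

```python
import random

# Hypothetical candidate outputs the generator might produce.
CANDIDATES = [
    "a dog on a beach", "a cat on a sofa", "a mountain lake",
    "a dog in a park", "a city street at night",
]

def generate_with_negative(rng, negative, max_tries=100):
    # Keep sampling until an output does not contain the negative term.
    for _ in range(max_tries):
        candidate = rng.choice(CANDIDATES)
        if negative not in candidate:
            return candidate
    raise RuntimeError("could not avoid the negative term")

image = generate_with_negative(random.Random(1), "dog")
print(image)
```

      Either mechanism makes "no dog" a constraint the system actually checks, rather than a token the image model sees and latches onto.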

    • festnt@sh.itjust.works · 17 points · 10 hours ago

      it just did what you wanted, since you asked for an image. free will would be if you asked it not to generate an image but it still did, if it just generated an image without you prompting it to, or if you asked for an image and it just didn’t respond

  • Lemminary@lemmy.world · 12 points · edited · 17 hours ago

    AI: Hmm, yeah, they said “dog” and “without”. I got the dog so lemme draw a without real quick…