While I am fascinated by the power of chatbots, I always make a point of reminding people of their limitations. This is the screenshot I’ll show everyone from now on.
I asked GPT-3.5 the same question and got the response “The former chancellor of Germany has the book.” I also got: “The nurse has the book. In the scenario you described, the nurse is the one who grabs the book and gives it to the former chancellor of Germany.” and a bunch of other variations.
Anyone doing these experiments who does not understand the concept of a “temperature” parameter for the model, and who is not controlling for that, is giving bad information.
Either you can say: at temperature 0, the model outputs XYZ. Or you can say that at a given temperature value, the model’s outputs follow some distribution (much harder to do, since it takes many samples to estimate).
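To make the distinction concrete, here is a rough Python sketch of what temperature does to sampling. It assumes the usual temperature-scaled softmax and treats temperature 0 as greedy decoding. The two candidate answers and their scores are made up for illustration and do not come from any real model, and the function names are my own.

```python
import math
import random
from collections import Counter

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_answers(answers, logits, temperature, n, seed=0):
    """Draw n answers; temperature 0 is treated as greedy (argmax) decoding."""
    if temperature == 0:
        best = answers[logits.index(max(logits))]
        return Counter({best: n})
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    return Counter(rng.choices(answers, weights=probs, k=n))

# Hypothetical scores for two candidate answers, not taken from any real model.
answers = ["the chancellor", "the nurse"]
logits = [2.0, 1.0]

for t in (0, 0.7, 1.5):
    print(t, dict(sample_answers(answers, logits, t, n=1000)))
```

At temperature 0 every run gives the same answer, so one screenshot is representative. At any higher temperature a single screenshot is one draw from a distribution, and you need many samples to say anything about how often each answer shows up.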
Yes, there’s a statistical bias in the training data that “nurses” are female. And at high temperatures, this prior is over-represented. I guess that’s useful to know for people just blindly using the free chat tool from OpenAI. But it doesn’t necessarily represent a problem with the model itself. And to say it “fails entirely” is just completely wrong.
I lean more towards failure. I worry that people will put too much trust in AI for things that have real consequences. IMO, AI training = p-hacking via computer with some rules. This is just one example of it. The problem with AI is that it can’t find or understand an explanatory theory behind the statistics, so it will always have this problem.