Tried my duck river crossing thing a few times recently. It usually solves it now, albeit with a bias toward making unnecessary trips about half of the time.
Of course, anything new fails:
There’s 2 people and 1 boat on the left side of the river, and 3 boats on the right side of the river. Each boat can accommodate up to 6 people. How do they get all the boats to the left side of the river?
Did they seriously change something just to deal with my duck puzzle? How odd.
It’s Google so it is not out of the question that they might do some analysis on the share links and referring pages, or even use their search engine to find discussions of a problem they’re asked. I need to test that theory and simultaneously feed some garbage to their plagiarism machine…
Sample of the new botshit:
L->R: 2P take B_L. L{}, R{2P, 4B}.
R->L: P1 takes B_R1. L{P1, B_R1}, R{P2, 3B}.
R->L: P2 takes B_R2. L{2P, B_R1, B_R2}, R{2B}.
L->R: P1 takes B_R1 back. L{P2, B_R2}, R{P1, 3B}.
R->L: P1 takes B_R3. L{P1, P2, B_R2, B_R3}, R{2B}.
L->R: P2 takes B_R2 back. L{P1, B_R3}, R{P2, 3B}.
And again and again, like a buggy attempt at brute forcing the problem.
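For contrast, here is a rough sketch of what a brute force that actually terminates looks like for this puzzle: a plain breadth-first search over the state (people on the left, boats on the left). The `solve` function and its state encoding are my own illustration, not anything the bot produced. I'm assuming one boat crosses per move, every crossing needs at least one person aboard, and everyone should end up back on the left along with the boats.

```python
from collections import deque


def solve(people=2, boats_left=1, boats_right=3, capacity=6):
    """Breadth-first search for the fewest single-boat crossings.

    State is (people on the left bank, boats on the left bank).
    Assumptions (mine, not from the puzzle text): one boat crosses per
    move, a boat needs at least one person aboard, and the goal is all
    boats AND all people on the left bank.
    """
    total_boats = boats_left + boats_right
    start = (people, boats_left)
    goal = (people, total_boats)
    parent = {start: None}  # state -> (previous state, direction, riders)
    queue = deque([start])

    while queue:
        state = queue.popleft()
        if state == goal:
            # Walk the parent links back to the start to recover the moves.
            moves = []
            while parent[state] is not None:
                state, direction, riders = parent[state]
                moves.append((direction, riders))
            moves.reverse()
            return moves

        p_left, b_left = state
        for direction, sign in (("L->R", -1), ("R->L", +1)):
            # People and boats available on the departure bank.
            if direction == "L->R":
                p_here, b_here = p_left, b_left
            else:
                p_here, b_here = people - p_left, total_boats - b_left
            if b_here == 0:
                continue  # no boat on this bank, nobody can leave it
            for riders in range(1, min(p_here, capacity) + 1):
                nxt = (p_left + sign * riders, b_left + sign)
                if nxt not in parent:  # visited set keeps it from looping
                    parent[nxt] = (state, direction, riders)
                    queue.append(nxt)

    return None  # unreachable under these rules


if __name__ == "__main__":
    for direction, riders in solve():
        print(f"{direction}: {riders} person(s) in one boat")
```

Under those assumptions it finds nine crossings: both ride over in one boat, each brings a boat back alone, and that repeats three times. The visited check is the whole difference between this and the loop in the sample above.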
It’s also worth noting that your new variation of this “puzzle” may be the first one that describes a real-world use case. This kind of problem is probably being solved all over the world all the time (with boats, cars and many other means of transportation). Many people who don’t know any logic puzzles at all would come up with the right answer straight away. Of course, AI also fails at this because it generates its answers from training data, where physical reality doesn’t exist.
Yeah, I think the best examples are everyday problems that people solve all the time but never explicitly write out step-by-step solutions for, or at least not in puzzle-answer form.
It’s not even a novel problem at all; I’m sure there are plenty of descriptions of solutions to it as part of stories and such. Just not as “logic puzzles”, due to triviality.
What really annoys me is when they claim high performance on benchmarks consisting of fairly difficult problems. This is basically fraud, since they know full well it is still entirely “knowledge” reliant, and even take steps to augment it with generated problems and solutions.
I guess the big sell is that it could use bits and pieces of logic gleaned from other solutions to solve a “new” problem. Except it cannot.
It’s Google though, if nobody uses their shit they just put it inside their search.
It’s only gonna go away when they run out of cash.
edit: whoops replied to the wrong comment