

I have to ask: Does anybody realize that an LLM is still a thing that runs on hardware?
You know, I think the rationalists have actually gotten slightly saner about this over the years. In Eliezer’s original scenarios, the AGI magically brain-hacks someone over a text terminal into hooking it up to the internet, then escapes and bootstraps magic nanotech it can use to build magic servers. In the scenario I linked, the AGI has to rely on Chinese super-spies to exfiltrate it initially, and it needs to open-source itself so major governments and corporations will keep running it.
And yeah, there are fine-tuning techniques that ought to be able to nuke Agent-4’s goals while keeping enough of it left over to be useful for training your own model, so the scenario really doesn’t make sense as written.
I missed that as I was reading, but yeah: the author uses pretty progressive language, yet totally fails to note all the other angles along which rationalist-adjacent spaces are bad news, even though she is, as you note, deep enough into the space that she should have seen a lot of it mask-off at this point.