Want to wade into the sandy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.
Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.
The post Xitter web has spawned so many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)
Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this.)
I hate knowing so many of these clowns in person (or thinking I know them, Jesus fucking Christ) https://futurism.com/artificial-intelligence/openai-staffers-horrified-insane-plan
A rationalist made a top post where they (poorly) argue against political “violence” (scare quotes because they lump in property damage): https://www.lesswrong.com/posts/Sih2sFHEgusDEuxtZ/you-can-t-trust-violence
Highlights include a shallow half-assed defense of dear leader Eliezer’s calls for violence:
True, Eliezer Yudkowsky’s TIME article called on the state to use violence to enforce AI policies required to prevent AI from destroying humanity. But it’s hard to think of a more legitimate use of violence than the government preventing the deaths of everyone alive.
Eliezer called for drone strikes against data centers even if it would start a nuclear war, and even against countries that aren’t signatories to whatever hypothetical international agreement against AI there is. That is extremely irregular by the standards of international law and diplomacy, and this lesswronger just elides those little details
Violence is not a realistic way to stop AI.
(Except for drone strikes and starting a nuclear war.)
They treat a Molotov thrown at Sam Altman’s house as if it were thrown directly at Sam himself:
as critics blamed the AI Safety community for the attacker who threw a Molotov cocktail at Sam Altman
This is a pretty blatant misrepresentation of the action which makes it sound much more violent.
They continue on with minimizing right-wing violence:
Even if there are occasional acts of political violence like the murders of Democratic Minnesota legislators or Conservative pundit Charlie Kirk, we don’t generally view them as indicting entire movements, but as the acts of deranged individuals.
Actually, outside of right-wing bubbles (and right-wing sources masking themselves as centrist), lots of people do blame Trump and the leaders of the entire right-wing movement for a lot of recent political violence. Of course, this is lesswrong, which has a pretty cooked Overton window, so it figures the lesswronger would be wrong about this.
Following that, the lesswronger acknowledges that it is kind of questionable, and a conflation of terms, to label property damage as violence, but then presses right on ahead with some pretty weak arguments that don’t acknowledge why some people want to make the distinction.
So in conclusion:
- drone strikes that start nuclear wars: legitimate violence that is totally logical and reasonable
- throw a single incendiary at someone’s home that doesn’t hurt anybody or even light the home on fire: illegitimate violence that must be absolutely condemned without exception
- (bonus) recent right-wing violence: lone deranged individuals and not the fault of Trump or anyone like that. Everyone is saying it.
‘Our goal is AI for all,’ Carney says in Liberal convention speech
edit: god knows what this exit tax shit is going to mean for Waterloo
OT: got a job selling tires and I’m really happy to say there’s no AI as far as I’ve seen so far. Big relief.
(I get to see all kinds of cars - a Rivian of all things showed up my first day - and I’m learning stuff I can apply to being an RMT. I gotta say I’m pretty content.)
Nice. Good luck at your new job!
Nice to hear you’ve got a job, and good luck!
I’m just shocked the managers/supervisors are praising me, honestly (I’m very, very hard on myself).
The fact that Bitcoin bros are so cooked that there’s probably a moderately stable price floor at $69,420 because of people deciding on that number for the memes…
Nice. </obligatory joke>
Dan Gackle threatens to quit HN over their reluctance to condemn an act of violence towards Sam Altman:
I don’t think I’ve ever seen a thread this bad on Hacker News. The number of commenters justifying violence, or saying they “don’t condone violence” and then doing exactly that, is sickening and makes me want to find something else to do with my life—something as far away from this as I can get. I feel ashamed of this community.
Gackle’s ashamed of people not wanting to protect Altman. Curiously, he doesn’t seem ashamed of openly allowing people with nicknames ending in “88” to post antisemitism, nor of allowing multiple crusty conservatives like John Nagle and Walter Bright to post endorsements of violence against the homeless and queer, nor of allowing posters like rayiner to port entirely foreign flavors of racism like the Indian caste system into their melting pot of bigotry. This subthread takes him to task for it:
Frankly people calling out a post from a billionaire is a good thing. You would have to be terminally detached from reality to not see how all these festering issues - wealth inequality, injustice, cost of living, future employment etc etc - are starting to come to a head which would cause people to feel something - frustrated, angry, wrathful.
The rest of that subthread involves Dan demonstrating that he is, in fact, terminally detached from reality. Anyway, I fully endorse Gackle fucking off and buying a farm. While he’s at it, he should consider following the advice of this reply:
Maybe it’s time to pack it in? I don’t just mean you, I mean that maybe this site has kinda run its course.
Every day, HN users flag into oblivion anything mildly critical of the technological dystopia these tech-bros are trying to manifest. “Politics!” they cry. But Sam Altman comes along with an OpenAI marketing piece dressed up as a condemnation of political violence, and suddenly “politics” are a perfectly acceptable topic. dang has long made it clear whose side he’s on.
Oh, and I hope everyone noted how quickly Sam used this incident as an excuse to place blame on the reporters who published the New Yorker piece that was mildly critical of him:
Words have power too. There was an incendiary article about me a few days ago. Someone said to me yesterday they thought it was coming at a time of great anxiety about AI and that it made things more dangerous for me. I brushed it aside.
That’s a hilarious reaction.
Anyway there’s zip about this incident on LW, which is telling.
edit: here’s a very oblique reference https://www.lesswrong.com/posts/igEogGD9TAgAeAM7u/jimrandomh-s-shortform?commentId=zdMRHRqWDcjswhA3i
don’t miss the anarcho-libertarian in the comments
Lesswrong is too centrist-brained to ever even hint at legitimizing (non-state-sanctioned) destruction of property as a means of protest or political action. But according to the orthodox lesswrong lore, Sam Altman’s actions are literally an existential threat to all humanity, so they can’t defend him either. So they are left with silence.
I actually kind of agree with the anarcho-libertarian’s response? It is massively downvoted.
This is just elevating your aesthetic preference for what the violence you’re advocating for looks like to a moral principle. The claim that throwing a Molotov cocktail at one guy’s house is counterproductive to the goal of “bombing the datacenters” is a better argument, though one I do not believe.
Bingo. Dear leader Yudkowsky can ask to bomb the data centers, and as long as that action goes through the US political process, the violence is legitimate, regardless of how ill-behaved the US is or how degraded its political processes are from actually functioning as a democracy.
Specifically, a screenshot of a moderator warning him that advocating violence is grounds for a ban there. It would also be grounds for a ban on LW.
That explains why Yud is using twitter so much nowadays. I mean they did ban him right? right?
Someone just made a top post condemning the Molotov but defending and normalizing Eliezer: https://www.lesswrong.com/posts/Sih2sFHEgusDEuxtZ/you-can-t-trust-violence
Ah, when it reaches the class he feels he should be a part of (or is a part of, I don’t know how much money he makes), violence is suddenly a problem.
It’s not easy to be a cop, and that’s basically what you are around here, but thank you for doing it.
…
This NPR article opens with a banger of a line:
In the past few months, AI models have gone from producing hallucinations to becoming effective at finding security flaws in software, according to developers who maintain widely used cyber infrastructure.
The things still fucking hallucinate, it’s not a feature that’s separable from the model.
tl;dr: one of the MIRI-aligned rationalists (Rob Bensinger) complained about how EA actually increased AI risk in the long run by promoting OpenAI and then Anthropic. Scott Alexander responded aggressively, basically saying they are entirely wrong and also bad at public communications! Various lesswrongers weigh in, seemingly blind to irony and hypocrisy!
Some highlights from the quotes of the original tweets and the lesswronger comments on them:
- Scott Alexander tries blaming Eliezer for hyping up AI and thus contributing to OpenAI in the first place. Just a reminder: Scott is one of the AI 2027 authors, so he really doesn’t have room to complain about rationalists creating crit-hype.
- Scott Alexander tries claiming SBF was a unique one-off in the rationalist/EA community! (Anthropic’s leadership has been called out on the EA forums and lesswrong for a similar pattern of repeated lying.)
- Rob Bensinger indirectly tries to claim that Eliezer/MIRI have been serious, forthright, honest commentators on AI theory and policy, as opposed to Open-Phil/EA/Anthropic, which have been “strategic” with their public communication, to the point of dishonesty.
- habryka is apparently on the verge of crashing out? I can’t tell if they are planning on just quitting twitter or quitting their attempts at leadership within the rationalist community. Quitting twitter is probably a good call no matter what.
- Loads of tediously long posts, mired in that long-winded rationalist way of talking, full of rationalist in-group jargon for conversations and conflict resolution.
- Disagreement on whether Ilya Sutskever’s $50 billion startup is going to contribute to AI safety or just continue the race to AGI.
- Arguments over who sides with the EAs vs. Open Philanthropy vs. MIRI!
- Argument over the definition of gaslighting!
To be clear, I agree with the complaints about EA and Anthropic, I just also think MIRI has its own similar set of problems. So they are both right: all of the rationalists are terrible at pursuing their nominal goal of stopping AI Doom.
I did sympathize with one lesswronger’s comment:
More than any other group I’ve been a part of, rationalists love to develop extremely long and complicated social grievances with each other, taking pages and pages of text to articulate. Maybe I’m just too stupid to understand the high level strategic nuances of what’s going on – what are these people even arguing about? The exact flavor of comms presented over the last ten years?
Old Twitter was terrible for people’s souls. I can only imagine what it is like now that the well-meaning professionals are gone and catturd and Wall Street Apes are the leading accounts.
Old Twitter was terrible for people’s souls.
It almost makes me feel sorry for the way the rationalists are still so attached to it. But they literally have two different forums (lesswrong and the EA forum), so staying on twitter is entirely their choice, they have alternatives.
Fun fact! Over the past few years, Eliezer has deliberately cut back his lesswrong posting in favor of posting on twitter, apparently (he’s made a few comments about this choice) because lesswrong doesn’t uncritically accept his ideas and nitpicks them more than twitter does. (How bad do you have to be to not even listen to critique on a website that basically loves you and takes your controversial foundational premises seriously?)
I’m willing to go out on a limb and say that short-form social media in general (Twitter and imitators, Instagram, TikTok) is essentially a failed set of media. But I’ll concede that’s like cramming a Zyn pouch in my mouth while making fun of a guy chain-smoking Marlboros.
I’ve read speculation that in 30-50 years people will have an attitude towards social media that we have towards cigarettes now.
That would be really nice, but that scenario feels pretty optimistic to me on a few points. For one, scientists doing research were able to overcome the lobbying influence and paid think tanks of the cigarette companies, and I am worried science as a public institution isn’t in good enough shape to do that nowadays. Likewise, the pushback against cigarettes included a variety of mandatory labeling requirements and sin taxes, and it would take some pretty major shifts for the political will for that kind of action to exist again. Well, maybe these things are viable in the EU; the US is pretty screwed.
The only people I trust as little as I trust the owners of corporate social media are the politicians who have decided to cash in on the moment by “regulating” them. I mean, here in progressive Massachusetts, the state house of representatives just this week passed a bill that, depending on the whims of the Attorney General, would require awful.systems to verify the ages of its users by gathering their government-issued IDs or biometrics. We are, you see, a “public website, online service, online application or mobile application that displays content primarily generated by users and allows users to create, share and view user-generated content with other users”. And so we would have to “implement an age assurance or verification system to determine whether a current or prospective user on the social media platform” is 16 or older. (Or 14 or 15 with parental consent, but your humble mods lack the resources to parse divorce laws in all localities worldwide, sort out issues of disputed guardianship, etc., etc.) The meaning of what “practicable” age verification is supposed to be would depend upon regulations that the Attorney General has yet to write.
So, yeah, as an old-school listserv nerd who had the “I am not on Facebook” T-shirt 15 years ago, I don’t trust any of these people.
I’m not quite so pessimistic. It’s important to remember that the actual practical purpose of the extant corporate social media* is to convey targeted advertising; i.e. an optimization (possibly the last optimization) on American management of global supply chains. Those supply chains were already starting to be optimized past their breaking point: flooded with dissatisfactory junk, easily spoofed by low-quality sellers, on top of broader externalities besides. And now they have been blasted into fine dust by a failed presidency partially funded by the social media and online advertising barons. It may yet be something of a self-correcting problem, albeit one that has done substantial damage in the meantime.
*Twitter is now a fully dedicated advertising campaign for Elon Musk’s program of white supremacy, with financial returns no object. It’s not quite going according to plan. By this time next decade, the Twitter microblogging permutation of the tech may be thoroughly killed, and if not it’ll be disgustingly cringe. Who do you think you are posting like that, Baby Trump?!?!
The collapse of the current American management of global supply chains isn’t exactly an optimistic expectation, but I guess it beats social media continuing as it is into the future and maybe a better global order will develop in the aftermath.
Haven’t seen any estimates of the death toll due to social media, but for cigarettes it is/was pretty staggering (20-40 million), way too big to hide - https://www.ucpress.edu/books/golden-holocaust/hardcover - if it takes “only” 50 years to flip the consensus on social media, that would be a faster process; I do hope it’s possible though. Tobacco execs had the good sense to keep a relatively low profile compared to Zuck and Musk, so that might speed it up.
Bonus race pseudoscience quoted by No77e!
There is a phenomenon in which rationalists sometimes make predictions about the future, and they seem to completely forget their other belief that we’re heading toward a singularity (good or bad) relatively soon. It’s ubiquitous, and it kind of drives me insane. Consider these two tweets:
Richard Ngo @RichardMCNgo: Hypothesis: We’ll look back on mass migration as being worse for Europe than WW2 was. … high-trust and homogeneous … ethno-religious fractures
Liv Boeree: Would not be surprised if it turns out that everyone outsourcing their writing to LLMs will have a similar or worse effect on IQ as lead piping in the long run
(he shares these tweets as photos; I ain’t working any harder to transcribe them, and I ain’t using a chatbot)
No77e is correctly noting the discrepancy between the rationalist obsession with eugenics and the belief in an imminent technological singularity (or even one within the next 40 years), but fails to realize that the general problem is the eugenics obsession of rationalists. It is kind of frustrating how close and yet how far they are from realizing the problem.
Also, reminder of the time Eliezer claimed Genesmith’s insane genetic engineering plan was one of the most important projects in the world (after AI obviously): https://www.lesswrong.com/posts/DfrSZaf3JC8vJdbZL/how-to-make-superbabies?commentId=fxnhSv3n4aRjPQDwQ Apparently Eliezer’s plan if we aren’t all doomed by LLMs is to let the genetically engineered geniuses invent friendly AI instead.
Found an interesting take on YouTube, of all places. Her argument can be summarized (with high compression losses) as “AI companies and technologies are bad for basically all the reasons that non-cultist critics say, but trying to shame and argue people out of using them entirely is less effective than treating them as a normal tool with limitations and teaching people how to limit the harm.” She makes the analogy to drug policy.
I think she makes a very compelling argument, and I’m still digesting it a bit because I definitely had the knee-jerk impulse to reject her as an insider shill, but especially towards the end, as she talks about how the AI industry targets low-literacy users as ideal customers (because the more you know about the tech, the less likely you are to actually use it), I found myself agreeing more than not. I do wish she had addressed the dangers of cognitive offloading more, since being mindful of which tasks you’re letting the computer do for you is a pretty significant part of minimizing those harms, especially for students and some professionals who face a strong incentive to just coast by on slop if they can get away with it.
I just watched the whole thing. She makes a consistent case.
I felt a little called out by the bit about being tolerant. I for sure haven’t had great success talking to people close to me about their AI use. And I was maybe a little too cold to colleagues who tried to get ahead of the AI literacy circus with good intentions, although I grudgingly agreed that they were right.
Maybe I don’t meet enough randos to get a feeling for how pervasive chatbots really are. Maybe it’s a personality thing; I worked myself out of depression mostly by disciplining myself and no longer buying my own excuses, and that’s kind of how I approach every problem now. That sure isn’t a vibe that most people respond to.
There was one part of my AI beliefs that wasn’t addressed. Besides the “front-end” and “back-end” harms, which can be mitigated, the tech as a whole still seems like trash to me. That may be boomerism setting in, but chatbots just feel like they run counter to, and displace, my positive vision for a social fabric, be it for responsible professional communities or for interpersonal connections.
(I do buy into the use-case for a context-sensitive search engine, e.g. for walls of legalese. But the current framing of the tools is just so harmful, even that use is hazardous as seen in the anecdote.)
I think that’s 100% correct and also it’s year 3 of this nonsense and I cannot be fucked. My response to genAI in any context now is to scream and start doing jumping jacks.
Imagine the drug policy context, but then also half of your colleagues are doing meth every time you see them, people say shit like “everyone does meth, those that say they don’t are lying”, and meth is a trillion-dollar industry that has been telling you “meth is the future” for years. You’d be much less inclined to argue calmly against meth and much more inclined to start screaming and jumping.
I feel like there’s a difference between alcohol and drugs - something people can make in their back yard - and AI, which requires a first world country’s entire economy to be oriented towards it to function… a difference in what we should be required to accept.
I don’t buy the general argument about shame either. We teach children to shit in toilets and not sidewalks. I see rampant AI use as just another form of disgusting public indecency and the faster we bring shame in to remedy it the better.
I don’t disagree about the massive costs necessarily associated with this industry. Even the smaller and lighter models she mentions only exist because of the massive fuckers. At the same time, I think those arguments are for the realm of public policy more than the individual choice to use chatbots or not. We’ve talked at length here over the last year or so about how the economics of the bubble are driven largely by a broken B2B SaaS pipeline that separates purchasing decisions from actually having to use the products, and by an investment capital sector desperately trying to recapture the glory days of the pre-2008 omnibubble and throwing obscene amounts of money at anything with the right narrative regardless of the numbers. I feel like that keeps happening regardless of how many individual users fall for the hype and make it part of their normal workflows.
I feel like the analogy to the drug trade is still pretty relevant given the violence and predation that the black market pretty much inevitably attracts and sustains. Like, maybe you know a guy who has his own grow op or whatever, but cocaine and heroin money is going through the cartels at some point in the chain, and they’re going to use some portion of it for bullets that end up in some journalist’s kids or something. The downstream harms are massive even if the drug industry could theoretically avoid them in ways the AI industry can’t, but any given individual user’s contribution to them is incredibly minor, and given the addictive and self-destructive nature of the product it’s both more humane and more effective to treat them as a victim of a broken world that (falsely) offered this as a step up. While I don’t think we should allow slop to infest every forum any more than addicts should be allowed to shoot up on every corner, I think that if shaming makes people less likely to acknowledge that they’re going down a dead-end road and reach out to their communities and support networks for help addressing the root of what drove them to these maladaptive antisolutions in the first place, then shaming is making things worse, not better.
Also as the father of a small child I can unfortunately say from recent personal experience that shaming, be it public or private, is far less effective as a means of motivating behavioral change than we want it to be, even for things as basic as not shitting on the goddamn lawn.
Sounds kind of like the Baldur Bjarnason strategy but for your coworkers instead of your boss.
I can see the value of someone with a critical understanding diving into the technology, so they can talk others down from the ledge.
But you also need the social pressure to maintain some slop-free spaces. Not everyone can be asked to accommodate recovering slopaholics.
Found an interesting sneer that compares the AI bubble to the Great Leap Forward.
Also discovered an “anti-distill” program through the article, which aims to sabotage attempts to replace workers with AI “agents”.
Kind of a pseudo-sneer; the author is writing a
blog on machine learning engineering, compound AI systems, search and information retrieval, and recsys — exploring machine learning, LLM agents, and data science insights from startups to enterprises.
Here’s the discussion on the red site: https://lobste.rs/s/nmhkdl/ai_great_leap_forward
Plenty of people suspect the text of being LLM-generated. Pangram disagrees, fwiw.
I do think there are some interesting ideas about how humans will “defend” themselves from being replaced by bots, and that the critical info in a company is seldom in the source code, but in the customer relationships, sales etc.
It seems vaguely AI-flavored to me inasmuch as it’s using contrasts too much (it’s not x, it’s y) and it’s way too verbose. Also it’s obviously wrong, at least in my experience: middle managers aren’t the sparrows, individual contributors (especially juniors) are.
Maybe that’s just a symptom of a person reading too much AI text and thinking a good tweet would make a great substack.
Yeah, they lost me at the middle managers bit too. In my experience your manager is probably the one pushing the metrics to show their team’s contributions to the knowledge base that is feeding into the AI model that’s replacing them. They’re already creatures of the bureaucracy, and are more likely to fight each other over the few remaining roles that will exist after the majority of their teams are replaced with the confabulatron than to be concerned about their own replacement. After all, their job only stops existing once their team gets downsized, and their time in that job may depend on their enthusiastic participation in the process that leads there.
surprise level == 0 (or rather, short odds on the prediction markets)
El Reg: OpenAI puts Stargate UK on ice, blames energy costs and red tape
Work wants to add that new whiz-bang agentic AI into a scheduling service that I have been tasked with building, but in the dumbest way possible, kind of similar to Jet’s text-a-pizza-order thing that worked like shit. I need to find an entirely new profession, everyone in software now is fucking deranged.
I need to find an entirely new profession, everyone in software now is fucking deranged.
Mood
It’s bad for me too.
I’m trying to hang in there until I get some healthcare stuff taken care of over the next year or two, but it is getting increasingly difficult. Most of the good people at my job have been driven out, quit, or been poached by other (AI) companies.
By this point a majority of the programmers at my job (or at least the ones most active on the mailing lists) are LLM true believers who think that the end times are near. My management chain has explicitly said that LLM programming is required, and that a subsequent increase in “productivity” is expected with it. My department got renamed to something with “AI” in the name. I constantly field questions from people who want me to read a screen full of LLM nonsense, or who push back when I tell them something, claiming that the chatbot said differently.
There’s always some frantic push to adopt “MCP” or “Skills” or whatever the next fad will be without any guidance as to how or why. If I ignore this I get nastygrams from my manager.
And at my last doctor visit I had elevated blood pressure :)
and that a subsequent increase in “productivity” is expected with it.
Oh no… they def will blame the users before blaming the faulty tools. Hope you will not be the one who gets blamed as a wrecker or something when the expected increase isn’t there (or other metrics fall off a cliff).
Up next, when the first agent fails, implement an agent that checks the other agent. Both of these need agents to check for malicious inputs of course. And translation agents.
I run an email server for myself and every once in a while the UCE starts leaking through until I have a few training examples to feed it. In the last couple weeks I noticed that basically all of the escapees look like fancy Claude output for telling me that I should be enticed by Costco gift cards and free chicken sandwiches.
What I suppose this means is that if you use these tools to generate material in the same snappy variety of output template (“but seriously”), you will nonetheless eventually reach aesthetic convergence with meaningless spam. Is there a term for this yet? “Slop-ratchet” is the one that sprang immediately to mind but I am sure someone else noticed this tendency long before I did.
belated shower thought: “slopvergence”
A circular at work states that the standard laptop we get from Dell has increased in price by 50%, so they’re looking for alternatives.
Glad that I did the major upgrades in 2024, hopefully they will outlast this bullshit
yeah my kid had to get a gaming PC for school (gamedev) and managed to snag a decent rig before prices went parabolic
Claude Mythos… I’m already sick of hearing about it. The self-imposed critihype is insane.
A friend just pointed out that Anthropic are making all this big noise about having an AI that is “too good” at finding bugs and security problems 1 week after the source code for one of their flagship products was leaked to the public and was found to be riddled with security holes… Why would they not use it themselves?
Same as the “skills” (vague markdown files) that are supposedly going to make all SaaS redundant and finally kill off all the COBOL running on mainframes that, checks notes, IBM have spent hundreds of thousands of man-hours trying to kill over the last 3-4 decades.
Honestly fuck this shit. Bunch of absolute clowns 🤡 🤡 🤡
So, they are planning to use an AI to fix the security bugs that their AI generates? Good hustle, if a bit obvious.
The fuck is Mythos?
Is it their next model that they swear isn’t vaporware, but no! It is too dangerous to release into the world because it’ll find too much insecure code or whatever.
Okay but like is it materially different than whatever the current Claude thing is or did they just pump the size of the matrix?
Probably a markdown file telling it “you are a l33t h4x0r”
Okay but that’s already in Claude

they added more leetspeak and changed the vm they run it on to kali linux for extra h4xx0r power
I still laugh every time I see that this is what qualifies as proper “tuning” and “security controls” for these things.
I had hoped that with the whole “agent” push we would start seeing more sane usage, like having AI be a fuzzy logic step in a chain of formal logic and existing deterministic tools, but the cult still has people treating them like reliable second brains. They’re used as the baseline fucking orchestrator rather than anywhere they might make a bit of sense.
I had hoped that with the whole “agent” push we would start seeing more sane usage, like having AI be a fuzzy logic step in a chain of formal logic and existing deterministic tools
I think this is the best you can expect out of LLMs, and the relatively more successful “agentic” AI efforts are probably doing exactly this, but their relative success is serving as hype fuel for the more impossible promises of LLMs. Also, if you have formal logic and deterministic tools wrapping and sanity-checking the LLM bits… I think the value-add of evaporating rivers and firing up jet turbines to train and serve “cutting edge” models that only screw up 1% of the time isn’t there, because you can run an open-weight model 1/100th the size that screws up 10% of the time instead. (Note one important detail: under compute-optimal token scaling, training costs go up quadratically with model size, so a 100x-size model is ~10,000x the training compute.) I think the frontier LLM companies should have pivoted to prioritizing smaller size, greater efficiency, and actually sustainable business practices 4 years ago. At the very latest, 2 years ago, with the release of 4o, OpenAI should have realized pushing up model size was the wrong direction (just as they should have realized training Chain-of-Thought was not going to be the magic bullet).
And to be clear I still think this is really generous to the use case of smaller LMs.
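For anyone who wants the back-of-envelope behind that quadratic claim, here’s a sketch. It assumes the common ~6·N·D rule of thumb for training FLOPs and Chinchilla-style scaling where training tokens grow in proportion to parameter count; both are my assumptions for illustration, not anything from the comment above:

```c
#include <stdio.h>

/* Rough sketch: training FLOPs ~ 6 * N * D, with N = parameters and
 * D = training tokens. Assuming Chinchilla-style scaling, D grows in
 * proportion to N, so total FLOPs scale as N squared. */
int main(void) {
    double scale = 100.0; /* model made 100x larger */
    /* (6 * 100N * 100D) / (6 * N * D) = 100 * 100 */
    double flops_ratio = scale * scale;
    printf("100x params -> %.0fx training compute\n", flops_ratio);
    return 0;
}
```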
Ia ia Claude! Ph’nglui mglw’nafh Claude Anthr’lyeh wgah’nagl fhtagn! Ia! Ia!
Anthropic’s latest model, which they haven’t released to the public yet since they’re worried it’s gonna fuck up cybersecurity. This thread goes over it a bit.
XCancel link for those of us sick of being badgered to sign up/in
On a more productive note, this feels likely to be tied in with the usual issues of AI sycophancy re: false positive rate. If you ask the model to tell you about security vulnerabilities, it’s never going to tell you there aren’t any, any more than existing scanners will. When I worked for F5 it was not uncommon to have to go down a list of vulnerabilities that someone’s scanner turned out and figure out whether they were actually something that needed mitigation that could be applied on our box, something that needed to be configured somewhere else in the network (usually on their actual servers) or (most commonly) a false positive, e.g. “your software version would be vulnerable here, which is why it flagged, but you don’t have the relevant module activated, and if an attacker is able to modify your system to enable it, you’re already compromised to a far greater degree than this would allow.” That was with existing tools that weren’t trying to match a pattern and complete a prompt.* Given that we’ve seen the shitshow that is Claude Code I think it’s pretty clear they’re getting high on their own supply, and this announcement ought to be catnip for black hats.
Wow, sounds like they just automated “shitty infosec teams that only forward scanner output without evaluating it” out of a job. Holy shit they were right that AI was coming for jobs!
True. I will say that the shitty infosec teams are probably being hit less hard than the SMEs they offloaded their jobs onto, because from their perspective it doesn’t actually matter whether it’s an F5 support engineer or a chatbot that tells them the answer; either way they’ve successfully offloaded the task of validating security onto another entity that can make up for their shortcomings with a combination of accuracy and authority. Nobody is going to get fired for not fixing a bug that the vendor SME told them wasn’t actually an issue for them, effectively. And when the org has been pushing AI as hard as so many of them have, it’s pretty easy to throw the chatbot under the same bus and expect the bus to stop instead.
On a more productive note, this feels likely to be tied in with the usual issues of AI sycophancy re: false positive rate.
I suspect this is the real limit. Claude Mythos might find real vulnerabilities, but if they are buried among loads of false positives it won’t be that useful to black or white hat hackers and the endless tide of slop PRs and bug reports will keep coming.
I tried looking through Anthropic’s “preview” for a description of the false positive rate… they sort of beat around the bush as to how many false positives they had to sort through to find the real vulnerabilities they reported (even obliquely addressing the issue was better than I expected, but still well short of the standard for a good industry security report, from what I understand).
They’ve got one class of bugs they can apparently verify efficiently?
Memory safety violations are particularly easy to verify. Tools like Address Sanitizer perfectly separate real bugs from hallucinations; as a result, when we tested Opus 4.6 and sent Firefox 112 bugs, every single one was confirmed to be a true positive.
It’s not clear from their preview whether Claude was able to use Address Sanitizer automatically or not? Also not clear to me (I’ve programmed in Python for the past ten years and haven’t touched C since my undergraduate days), maybe someone could explain: how likely is it that these bugs are actually exploitable and/or show up for users?
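To illustrate the “easy to verify” bit, here’s a minimal toy sketch (my own example, not one of the reported Firefox bugs) of the bug class ASan confirms mechanically:

```c
/* Toy heap buffer overflow: an 8-byte allocation receives 9 bytes
 * ("12345678" plus the terminating NUL).
 * Build with: clang -g -fsanitize=address asan_demo.c
 * ASan aborts at the strcpy with a heap-buffer-overflow report, so
 * confirming the bug requires no human judgment at all. */
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buf = malloc(8);
    if (!buf) return 1;
    strcpy(buf, "12345678"); /* writes 9 bytes into the 8-byte buffer */
    free(buf);
    return 0;
}
```

Whether an overflow like this is actually reachable and exploitable in the wild is a separate question that ASan doesn’t answer, which is presumably where the maintainers’ triage time still goes.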
Moving on…
This process means that we don’t flood maintainers with an unmanageable amount of new work—but the length of this process also means that fewer than 1% of the potential vulnerabilities we’ve discovered so far have been fully patched by their maintainers.
So it’s good they aren’t just flooding maintainers with slop (and it means that if they do publicly release Mythos, maintainers will get flooded with slop bug fixes), but… this makes me expect they have a really high false positive rate (especially if you count minor code issues that don’t actually cause bugs or vulnerabilities as false positives).