We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting. Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features, (2) search for candidate matches via semantic embeddings, and (3) reason over top candidates to verify matches and reduce false positives. Compared to classical deanonymization work (e.g., on the Netflix prize) that required structured data, our approach works directly on raw user content across arbitrary platforms. We construct three datasets with known ground-truth data to evaluate our attacks. The first links Hacker News to LinkedIn profiles, using cross-platform references that appear in the profiles. Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method. Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.
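A rough sketch of the three-stage pipeline the abstract describes, with toy stand-ins throughout: bag-of-words cosine similarity in place of real semantic embeddings, and a similarity threshold in place of LLM-based verification. All function names and thresholds here are illustrative, not from the paper:

```python
from collections import Counter
import math

def extract_features(text):
    # Stage 1 (stand-in): the paper uses an LLM to pull identity-relevant
    # features from raw text; here we just lowercase and tokenize.
    return [w.strip(".,!?").lower() for w in text.split()]

def embed(tokens):
    # Stage 2 (stand-in): the paper uses semantic embeddings; a
    # bag-of-words vector stands in so the sketch runs without a model.
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match(db_a, db_b, top_k=3, threshold=0.5):
    # Stage 3 (stand-in): the paper has an LLM reason over the top-k
    # candidates to verify matches and cut false positives; a similarity
    # threshold stands in for that verification step here.
    embeds_b = {uid: embed(extract_features(txt)) for uid, txt in db_b.items()}
    matches = {}
    for uid_a, txt in db_a.items():
        va = embed(extract_features(txt))
        ranked = sorted(embeds_b, key=lambda u: cosine(va, embeds_b[u]),
                        reverse=True)[:top_k]
        if ranked and cosine(va, embeds_b[ranked[0]]) >= threshold:
            matches[uid_a] = ranked[0]
    return matches
```

The threshold is what trades recall for precision; the paper's headline number (68% recall at 90% precision) corresponds to tuning the verification stage toward few false positives.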
How is it possible to validate the results?
The paper uses several different datasets and explains how they were built, but in every case the ground-truth link was already known before testing. I think the third one is probably the most relevant for actual attacks: they split single users' accounts, leaving a one-year gap in the post history to simulate an abandoned account etc., and added some decoy profiles that didn't have a match.
If you mean running this yourself, you can't: they didn't publish prompts or anything, just an overview of their pipeline. Sorry, at first I thought you were asking how they validated that the users were the same person.
Oh I see, they stripped the usernames and matched the comments. I thought they were claiming to have matched usernames to legal identities.
They did that too, with Hacker News and LinkedIn accounts, as well as some Anthropic interviewees. I'm less sure how impressive that is, because those accounts were linked by their owners. Someone who publicly links their profiles obviously doesn't care about opsec, so they're probably less careful than they otherwise would be. The paper isn't a super hard read if you're interested. Guess we'll all have to see how well this works in practice.