Let's read what is actually said in the abstract. He never says he is able to identify people. In fact, he says he geolocates posts and that 10% of them originate from universities. This is quite different from "We identify the poster for the majority of posts and find that x% are university faculty, y% are grad students, etc.". If the latter claim were plausibly true, it would be very unusual to understate it so much in the abstract.
Identifying people by name also opens a huge can of worms. What is the level of crime that gets you "doxxed"? Posting here cannot be sufficient. Various people have posted here publicly with virtually no consequence. What about posts that try to push back on problematic posts? Or the majority of posts that just try to work around the problematic posts to have a useful or amusing conversation? As for problematic posts, what is the "line"? Are you going to call people out for saying things like "I am worried DEI initiatives are going too far"? Or for criticizing someone in economics harshly, but not on the basis of sex or race?
All of those posts combined form a large proportion of the posts here. And I am guessing that among the academics posting, that proportion is considerably larger. Those posts were made anonymously in good faith and unmasking would cause a lot more embarrassment and resentment than actual consequences for the posters.
OK, but what about the posts that are clearly bigoted? That opens a different can of worms. What level of certainty do you need to have in your identification algorithm to actually point your finger at an individual?
My aim with these last points is not to argue that unmasking is unethical. People on here already seem fully convinced of that. I am arguing that any version of unmasking would create such obvious practical problems that the authors are very unlikely to proceed with it, even if they were able to.
The far more likely situation (excluding the experimental treatment possibility) is that they have some way (ad data, hash reversal, etc.) to determine that a particular post is reasonably likely to have come from a particular campus. Then they put in a couple of tables of high-level summary data on this, and move on to something that is basically a follow-up on Wu's paper. Compared to a mass-doxxing paper, that is a) much more aligned with what the abstract describes, b) much more aligned with what papers in this area tend to do, c) much more consistent with research ethics/IRB requirements, and d) much less likely to cause problems for the authors.