I received the following (slightly edited) question from my colleague Jon Katz a few days ago:
I was thinking about the problem of authorship attribution... Have people thought about the flip side of this problem? Namely, "anonymizing" text so that it would be hard to attribute it to any author?This is something I've actually wondered about in the context of blogging for a while. I noticed at some point that my "blogger voice" is very similar to my "reviewer voice" and started worrying that I might be too identifiable as a reviewer. This might either be due to lexical choice ("bajillion" or "awesome") or due to some more subtle stylistic choices.
There is quite a bit of work on authorship attribution. I think the first time I heard a talk on this topic was on March 24, 2004, when Shlomo Argamon gave a talk at ISI (no, I don't have an amazing memory, I cheated) on "On Writing, Our Selves: Explorations in Stylistic Text Categorization." The basic hypothesis of the talk, at least as I remember it, was that if you're trying to do authorship attribution, you should throw out content words and focus on things like POS tag sequences, parse tree structures, and things like that.
There's been a lot of subsequent work in this, and related areas. One very related area is on things like trying to predict demographic information (age, gender, socio-economic status, education level, and, yes, astrological sign) from tweets, blog posts or emails (or other forms). One of the key distinctions that I think is important in all of this work is whether the original author is intentionally trying to hide information about him or herself. For instance, someone trying to impersonate Shakespeare, or a child predator pretending to be a different age or gender, or a job applicant trying to sound more educate than is true. This latter is a much harder problem because the stupid topically stereotypical features that pop out as being indicative (like men talking about "wifes" and "football" and women talking about "husbands" and "yoga") and the silly features that don't really tell us anything interesting (on twitter, apparently men tend to put "http://" before URLs more than women -- who knew?) because these "pretenders" are going to intentionally try to hide that information (now that everyone knows to hide "http://" to trick gender recognizers!). It also means that falling back on topic as a surrogate for demography should not work as well. This seems to be a very different problem from trying to identify whether a blog post is written by me or by Jon, which should be 99.9% do-able by just looking at content words.
The reason I bring this all up is because we don't want to anonymize by changing the topic. The topic needs to stay the same: we just need to cut out additional identifying information. So, getting back to Jon's question, the most relevant work that I know of is on text steganography (by Ching-Yun Chang and Stephen Clark), where they use the ability to do paraphrasing to encode messages in text. Aside from the challenge of making the output actually somewhat grammatical, the basic idea is that when you have two ways of saying the same thing (via paraphases), you can choose the first one to encode a "0" and the second to encode a "1" and then use this to encode a message in seemingly-natural text.
I also remember having a conversation a while ago while a (different) colleague about trying to build a chat system where you could pretend that you're chatting with someone famous (like Obama or Harry Potter or Scooby Doo). A similar problem is trying to paraphrase my own writing to sound like someone else, but zoinks, that seems hard! A basic approach would be to build a Scooby Doo language model (SDLM) and then run my blog posts through a paraphrase engine that uses the SDLM for producing the output. My vague sense is that this would work pretty poorly, primarily because the subtleness in phrase structure selection would be lost on a highly-lexicalized language model. I imagine you'd get some funny stuff out and it might be amusing to do, but I don't have time to try.
As far as pure anonymization goes, it seems like doing something similar to the steganography approach would work. Here, what you could do is generate a random sequence of bits, and then "encode" that random sequence using the steganography system. This would at least remove some identifying information. But the goal of the steganography isn't to change every phrase, but just to change enough phrases that you can encode your message. It also wouldn't solve the problem that perhaps you can identifying a bit about an author by the lengths of their sentences. Or their oscillation between long and short sentences. This also wouldn't be hidden.
An alternative, human-in-the-loop approach might be simply to have an authorship recognition system running in your word processor, and then any time you type something that enables it to identify you, it could highlight it and you could be tasked with changing it. I suspect this would be a frustrating, but fairly interesting experience (at least the first time).
p.s., I'm now officially tweeting on @haldaume3.