11 February 2012

De-Authorship attribution

I received the following (slightly edited) question from my colleague Jon Katz a few days ago:

I was thinking about the problem of authorship attribution... Have people thought about the flip side of this problem? Namely, "anonymizing" text so that it would be hard to attribute it to any author?
This is something I've actually wondered about in the context of blogging for a while.  I noticed at some point that my "blogger voice" is very similar to my "reviewer voice" and started worrying that I might be too identifiable as a reviewer.  This might either be due to lexical choice ("bajillion" or "awesome") or due to some more subtle stylistic choices.

There is quite a bit of work on authorship attribution.  I think the first time I heard a talk on this topic was on March 24, 2004, when Shlomo Argamon gave a talk at ISI (no, I don't have an amazing memory, I cheated) on "On Writing, Our Selves: Explorations in Stylistic Text Categorization."  The basic hypothesis of the talk, at least as I remember it, was that if you're trying to do authorship attribution, you should throw out content words and focus on things like POS tag sequences, parse tree structures, and things like that.
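To make that hypothesis concrete, here's a minimal sketch of content-blind attribution: build function-word frequency profiles for each author and attribute a snippet to the nearest profile. The word list, distance measure, and data are all made up for illustration; real systems use far richer stylistic features (POS sequences, parse structures).

```python
import math
from collections import Counter

# Tiny illustrative function-word list; real stylometry uses hundreds.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "for"]

def profile(text):
    """Relative frequency of each function word in the text."""
    tokens = text.lower().split()
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def attribute(snippet, corpora):
    """Guess the author whose function-word profile is closest."""
    snap = profile(snippet)
    return min(corpora, key=lambda author: distance(snap, profile(corpora[author])))
```

Note that content words never enter the profile at all, which is exactly the point: the signal is in how you use "the" and "of", not what you write about.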

There's been a lot of subsequent work in this and related areas.  One very related area is trying to predict demographic information (age, gender, socio-economic status, education level, and, yes, astrological sign) from tweets, blog posts or emails (or other forms).  One of the key distinctions that I think is important in all of this work is whether the original author is intentionally trying to hide information about him or herself.  For instance, someone trying to impersonate Shakespeare, or a child predator pretending to be a different age or gender, or a job applicant trying to sound more educated than they are.  This latter is a much harder problem, because the stupid, topically stereotypical features that pop out as being indicative (like men talking about "wives" and "football" and women talking about "husbands" and "yoga"), and the silly features that don't really tell us anything interesting (on Twitter, apparently men tend to put "http://" before URLs more than women -- who knew?), are exactly what these "pretenders" will intentionally hide (now that everyone knows to drop "http://" to trick gender recognizers!).  It also means that falling back on topic as a surrogate for demography should not work as well.  This seems to be a very different problem from trying to identify whether a blog post is written by me or by Jon, which should be 99.9% do-able by just looking at content words.

The reason I bring this all up is because we don't want to anonymize by changing the topic.  The topic needs to stay the same: we just need to cut out additional identifying information.  So, getting back to Jon's question, the most relevant work that I know of is on text steganography (by Ching-Yun Chang and Stephen Clark), where they use the ability to do paraphrasing to encode messages in text.  Aside from the challenge of making the output actually somewhat grammatical, the basic idea is that when you have two ways of saying the same thing (via paraphrases), you can choose the first one to encode a "0" and the second to encode a "1" and then use this to encode a message in seemingly-natural text.
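A toy version of that encoding idea, with completely made-up paraphrase pairs (real systems mine these from parallel corpora and worry about grammaticality): at each site where two phrasings mean the same thing, picking the first encodes a 0 and the second encodes a 1.

```python
# Hypothetical paraphrase pairs: (word-for-0, word-for-1).
PARAPHRASES = [("big", "large"), ("start", "begin"), ("buy", "purchase")]

def encode(words, bits):
    """Rewrite the text, spending one bit at each paraphrasable word."""
    bits = list(bits)
    out = []
    for word in words:
        for pair in PARAPHRASES:
            if word in pair and bits:
                word = pair[bits.pop(0)]  # choose the variant matching the bit
                break
        out.append(word)
    return out

def decode(words):
    """Recover the bit string from the paraphrase choices."""
    bits = []
    for word in words:
        for pair in PARAPHRASES:
            if word in pair:
                bits.append(pair.index(word))
                break
    return bits
```

The capacity is one bit per paraphrase site, so a long cover text is needed to smuggle even a short message.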

I also remember having a conversation a while ago with a (different) colleague about trying to build a chat system where you could pretend that you're chatting with someone famous (like Obama or Harry Potter or Scooby Doo).  A similar problem is trying to paraphrase my own writing to sound like someone else, but zoinks, that seems hard!  A basic approach would be to build a Scooby Doo language model (SDLM) and then run my blog posts through a paraphrase engine that uses the SDLM for producing the output.  My vague sense is that this would work pretty poorly, primarily because the subtleness in phrase structure selection would be lost on a highly-lexicalized language model.  I imagine you'd get some funny stuff out and it might be amusing to do, but I don't have time to try.

As far as pure anonymization goes, it seems like doing something similar to the steganography approach would work.  Here, what you could do is generate a random sequence of bits, and then "encode" that random sequence using the steganography system.  This would at least remove some identifying information.  But the goal of steganography isn't to change every phrase, just to change enough phrases that you can encode your message.  It also wouldn't solve the problem that perhaps you can identify a bit about an author by the lengths of their sentences.  Or their oscillation between long and short sentences.  This also wouldn't be hidden.
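In code, "encoding random bits" amounts to flipping a coin at every paraphrase site, so the author's habitual lexical choices get replaced by noise.  A minimal sketch, again with an invented paraphrase table:

```python
import random

# Hypothetical symmetric paraphrase table: each word maps to its variant.
PARAPHRASES = {"big": "large", "large": "big",
               "start": "begin", "begin": "start",
               "buy": "purchase", "purchase": "buy"}

def anonymize(words, rng=random):
    """Swap each paraphrasable word for its variant with probability 1/2,
    i.e. choose uniformly at random between the two phrasings."""
    return [PARAPHRASES[w] if w in PARAPHRASES and rng.random() < 0.5 else w
            for w in words]
```

As noted above, this only scrubs lexical choice; sentence-length habits and other structural tics survive untouched.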

An alternative, human-in-the-loop approach might be simply to have an authorship recognition system running in your word processor, and then any time you type something that enables it to identify you, it could highlight it and you could be tasked with changing it.  I suspect this would be a frustrating, but fairly interesting experience (at least the first time).
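One hypothetical way such an "authorship alarm" could score words: compare each word's smoothed log-odds under your past writing versus a background corpus, and flag the strongly "you"-flavored ones for rewriting.  Everything here (the scoring rule, the threshold) is an illustrative guess, not any particular published system.

```python
import math
from collections import Counter

def flag_identifying(draft, my_corpus, background, threshold=1.0):
    """Flag words in the draft that are much more 'me' than 'everyone'."""
    mine = Counter(my_corpus.lower().split())
    other = Counter(background.lower().split())
    mine_total, other_total = sum(mine.values()), sum(other.values())
    flagged = []
    for word in draft.lower().split():
        # Add-one smoothed log-odds of the word under "me" vs. background.
        score = (math.log((mine[word] + 1) / (mine_total + 1))
                 - math.log((other[word] + 1) / (other_total + 1)))
        if score > threshold:
            flagged.append(word)
    return flagged
```

Run on a draft, it would highlight exactly the "bajillion"s and "awesome"s mentioned above, leaving the editing (the frustrating part) to the human.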

p.s., I'm now officially tweeting on @haldaume3.


Ben said...

You could hire turkers/students/colleagues to rewrite your posts ;)

Rachel Cotterill said...

What about looking at language generation for this? You could map what you want to say using a semantic model, and then apply basic NLG to generate sentences encapsulating that. It might be easily identifiable as NLG text, but it ought to be hard to figure out who sat behind the keyboard.

hal said...

@Rachel: I agree -- in fact, I basically think of paraphrasing as a form of NLG, in the spirit of text-to-text generation.

Kevin Duh said...

Maybe this is a topic worth investigating seriously. Potentially there may be scenarios where authorship attribution can harm free speech (e.g. oppressive governments linking anonymous blog-posts to real identities).

Here's one related reference:
Kacmarcik and Gamon, "Obfuscating Document Stylometry to Preserve Author Anonymity" (ACL06)

We may want to start with some formal definition, such as k-anonymity. I'm guessing it's relatively easy to come up with some paraphrase/NLG method that can fool current systems but the challenging part is to mathematically prove that the method really satisfies your definition of anonymity.

steve steinberg said...

There has been some very relevant work done, although the results are still pretty unimpressive.

The two groups that come to mind first are:

Reiter, Ehud and Williams, Sandra (2008). Three Approaches to Generating Texts in Different Styles. In: Proceedings of the Symposium on Style in text: creative generation and identification of authorship, Volume 7, The Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB 2008), 1-4 April 2008, University of Aberdeen, UK.

Mairesse, F., & Walker, M. A. (2010). Towards personality-based user adaptation: psychologically informed stylistic language generation. User Modeling and User-Adapted Interaction, 20(3), 227-278. doi:10.1007/s11257-010-9076-2

Those may not be their best papers, but they should get you started. I know that they have done work on first extracting a person's stylistic characteristics, and then remapping them onto another text stream...


Tim said...

From the world of more "popular culture".... Cory Doctorow recently had a post on Boing Boing related to this. Sadia Afroz and Michael Brennan, both CS PhD students of Rachel Greenstadt at Drexel, gave a talk about beating stylometry at the Chaos Computer Congress in Berlin, presenting an alpha stage tool aimed at this end.

Anonymous said...

Have you seen this one?

Michael Brennan and Rachel Greenstadt. Practical Attacks Against Authorship Recognition Techniques (pre-print) in Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence (IAAI), Pasadena, California, July 2009.