Netflix: the back-story
[Jan. 25th, 2008|10:56 pm]
The Netflix paper has now found a home, so this is as good a time as any to tell the full story, which actually begins three years ago, in February of 2005. That was when Cynthia Dwork at Microsoft Research introduced me to the research field of database privacy.
Later that year, Vitaly and I developed a new cryptographically flavored definition for privacy-preserving publication of data. We presented this at a workshop at Bertinoro in summer 2005, and there was some interest from the community, but it never became a paper for two reasons: the differential privacy work at the MSR Mountain View group, which went well beyond ours, and our realization that our definition could never be satisfied when you're dealing with high-dimensional data.
High-dimensional data (where each user's record contains tens, if not hundreds or thousands, of attributes) started to take center stage when we occasionally brainstormed on database privacy over the next year. Then I spent a wonderful summer in 2006 at Microsoft Research in Cynthia Dwork's group. My mentor was Ilya Mironov; I also had productive conversations with Frank McSherry and Cynthia Dwork. It was a brief chat with Shuchi Chawla (now at Wisconsin) that was most relevant to my later Netflix work: I was convinced that neither our techniques nor anyone else's were good enough for anonymous publication of high-dimensional data.
So I was quite surprised when Netflix released their dataset in October 2006: I was immediately certain that there had to be an exploitable anonymity breach. The only question was how much auxiliary information was needed and where an attacker could find this information. From this point on, I've covered the story on this blog as it happened. Since we started working on the paper, we've had fruitful discussions with a number of people, including Ilya Mironov at MSR, Jason Davis and Justin Brickell at UT Austin, and Matt Wright at UT Arlington.
Vitaly and I just happened to be in the right place at the right time. As it turned out, we beat at least one other group to the punch by just a few days. Now that we're in the middle of all this, however, we've realized that there are a lot more opportunities to apply our techniques. The work is going slowly, but hopefully we'll have more to say soon.