randomwalker's journal
AnarchoCapitalisticLibertarianAtheist

Follow me on twitter [May. 7th, 2008|01:11 pm]

Twitter has exploded in usage in the last two months (spurred by SXSW; the same thing apparently happened last year.) I figured it's time to jump on the bandwagon, and have been twittering regularly for the last couple of days. It's not about my mundane day-to-day activities—I have no intention of becoming a twitter shitter. Rather, it's about the same kind of stuff I write about here. So if you like what you see, follow me on twitter.

Twitter is a good example of how a startup can become popular by building something that journalists need. For a long time, the only people using twitter apart from the early adopters were journalists. It's a perfect tool for them, an excellent way to stay on top of breaking news. And journalists of course think everyone has the same needs that they do—remember how they would go ballistic if a word processor came out without a word count feature? All twitter had to do was stay alive until Michael Arrington and Robert Scoble hyped them all the way into actual popularity. And it damn near managed to fail to do that.

TV = 10,000 entire wikipedias per year! [May. 2nd, 2008|02:00 am]

Short and brilliant talk by Clay Shirky on New Media earlier this week, titled "Where do people find the time?", that's been making its way through the Internets. My favorite quotes:
Desperate Housewives essentially functioned as a kind of cognitive heat sink, dissipating thinking that might otherwise have built up and caused society to overheat.
In response to a TV producer being contemptuous of World of Warcraft:
However lousy it is to sit in your basement and pretend to be an elf, I can tell you from personal experience it's worse to sit in your basement and try to figure if Ginger or Mary Ann is cuter.
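The arithmetic behind the headline is worth spelling out. The figures below are the ones I recall Shirky giving in the talk (they are not stated in this post, so treat them as assumptions): roughly 100 million hours of thought to build all of Wikipedia, about 200 billion hours of TV watched per year in the US, and about a trillion hours per year worldwide. A quick sketch:

```python
# Figures as recalled from Shirky's talk; assumptions, not measurements.
WIKIPEDIA_HOURS = 100_000_000                 # ~100 million hours to build Wikipedia
US_TV_HOURS_PER_YEAR = 200_000_000_000        # ~200 billion hours per year
WORLD_TV_HOURS_PER_YEAR = 1_000_000_000_000   # ~1 trillion hours per year


def wikipedias_per_year(tv_hours_per_year, wikipedia_hours=WIKIPEDIA_HOURS):
    """How many Wikipedia-sized projects fit into a year of TV watching."""
    return tv_hours_per_year // wikipedia_hours


print(wikipedias_per_year(US_TV_HOURS_PER_YEAR))     # 2000
print(wikipedias_per_year(WORLD_TV_HOURS_PER_YEAR))  # 10000
```

The worldwide figure is where the 10,000 in the title comes from.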

Would you buy groceries online? [Apr. 30th, 2008|11:58 pm]

Pretty much the only positive thing for me about physically going to the grocery is the dating potential. Other than that, I buy the exact same healthy shit every single time, and I'd much rather do it sitting at home than wasting half an hour at the checkout counter. In other words, if there were a reasonably priced online grocery serving my area, I'd sign up in a hurry.

It's a tragedy that everybody's favorite whipping boy Webvan was started at the height of the dot com boom; that doomed it to failure. I'm pretty sure that if Webvan were founded today, they'd have a real shot at success, because a number of things have changed since then: higher web penetration, better inventory management systems and tech like RFID, and most importantly, much better recommender systems and targeted marketing technology that would offer a valuable secondary source of revenue through the web interface.

Even given that Webvan was founded in '98, it could still have succeeded had it not made some spectacularly bad decisions. From the Wikipedia page:
Webvan placed a $1 billion (USD) order with engineering company Bechtel to build its warehouses, bought a fleet of delivery trucks, purchased 30 Sun Microsystems Enterprise 4500 servers, dozens of Compaq ProLiant computers and several Cisco Systems 7513 and 7507 routers, as well as more than 80 21-inch ViewSonic color monitors and at least 115 Herman Miller Aeron chairs (at over $800 each).
They poisoned that market for everyone—nobody is going to fund an online grocery startup for perhaps the next decade or so. There are a number of small players left, but none that can reap the benefits of the economy of scale and compete with brick and mortar chains. And they all have crappy web-1.0 era style interfaces that hurt my eyes.

Would you buy from an online grocery? Why or why not?

Nexus: cool friend graph visualization app [Apr. 24th, 2008|08:34 pm]

Information overload.

That's probably the most annoying problem with the web today. A lot of the people doing web services startups who I talk to are trying to solve some variant of this problem. Personally, I think filtering and visualization together offer the best way to tackle it.

That's why I like Nexus, a facebook app that lets you visualize your friend graph. Here's the graph of links between my friends. The annotations are of course mine. (The default mode is the radial mode, which looks way cooler but in my opinion is less useful.) The javascript interface gives you far more nifty information than just looking at the graph, but you need to go play with it to get the idea.



Filtering and visualization are really two sides of the same coin. For instance, once I put the labels on these clusters, it immediately struck me as an excellent way to create friend lists. When you have several hundred friends, manually splitting them into work/family etc. is a daunting task and the vast majority of users never bother. With a nexus-like interface, you could let the user draw freehand curves around the different clusters to use as a starting point for grouping friends into categories. (There is a straight line that almost perfectly separates my college friends from my grad school friends, for example.)
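To make the straight-line case concrete, here is a minimal sketch of how a tool might split friends into two lists once the graph has been laid out in 2D. Everything here is hypothetical: the names, the coordinates, and the function names are made up for illustration, and this is not how nexus or facebook actually represent anything.

```python
def side_of_line(point, a, b):
    """Cross product (b - a) x (point - a): positive means left of the line a->b."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def split_by_line(positions, a, b):
    """Partition friends into those left of the line a->b and everyone else."""
    left, right = [], []
    for name, point in positions.items():
        (left if side_of_line(point, a, b) > 0 else right).append(name)
    return sorted(left), sorted(right)


# Hypothetical layout coordinates, as a graph-drawing algorithm might produce.
layout = {
    "college_1": (-2.0, 1.0),
    "college_2": (-1.5, -0.5),
    "grad_1": (1.0, 0.5),
    "grad_2": (2.5, -1.0),
}

# A vertical line drawn upward through x = 0 separates the two clusters.
college, grad = split_by_line(layout, a=(0.0, -5.0), b=(0.0, 5.0))
print(college)  # ['college_1', 'college_2']
print(grad)     # ['grad_1', 'grad_2']
```

A freehand curve would need a point-in-polygon test instead of a single cross product, but the idea is the same: the layout does the clustering and the user only draws the boundary.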

Unfortunately, facebook doesn't allow applications to edit the user's friends list for them, so this feature cannot be implemented for now. I'm thinking of maybe porting nexus to livejournal. Do you think having a tool to create friend lists is a useful feature for lj users?

People apparently hate facebook applications so much that random people in coffee shops who notice me wearing a facebook shirt feel impelled to let me know how they feel about it. I suspect most of these people have never spent five minutes exploring the actually useful applications out there instead of crap like superpoke.

Low cost personal computing: took forever, but finally here [Apr. 21st, 2008|11:40 pm]

The One Laptop Per Child project, big on both humanitarian ambition and hyperbole, and headed by a dictator with no head for business, looks about ready to self-destruct. However, it has left much progress in its wake.

The main benefit to come out of the OLPC was that it shocked laptop manufacturers out of their complacence and opened up an entire market segment overnight. The Asus EEE PC (starting at $300) is the leader, but there is very vibrant competition and new products are being announced every week.

In parallel, desktop vendors have been experimenting with the whole low-cost thing, Walmart's gPC being the best example. Ironically, the gPC is not a small form factor machine because apparently the average Walmart customer associates size with performance! Ah, the enlightened times we live in.

At any rate, the best low cost PC in my opinion is the shuttle KPC. And the best way to get it is probably from the third party site zareason.com, with the base configuration starting at $220*. This is a perfectly capable little PC for two hundred bucks—unless you have specific performance needs, if you're paying much more than that for a PC you're getting scammed.

You might think this is just a continuation of the trend in the affordability of personal computing. I'm going to argue that it's something qualitatively different: for the first time ever, your computer is no different from your toaster in that you never change anything on it. Not only do you not mess around with hardware modules (it's been that way for a decade now), you don't even install any software on it. It's just not worth your time. You mostly use web applications and one or two of the non-browser applications that came with your machine. You don't even know what an operating system is. You throw it away after a year or two and get a new one.

*I would recommend upgrading from the Celeron to the Dual Core E2140. Reviews say this chip is almost as good as the Core 2 Duo but much cheaper. And also maybe get 1G of memory instead of 512 megs. In any case, Ubuntu is sleek enough that you probably won't notice.

I'm getting one of these boxes myself. I'll probably do a post about it when I start using it.

Saturday timepass [Apr. 19th, 2008|05:22 pm]

There's a lot of good shit on youtube, you just need to know how to find it. So for some weekend fun (and to relieve the tension from the comments to my previous post :-) here's my "xtreem" playlist. Some of it is pretty unbelievable (yes, the anaconda did swallow the whole hippo) but it's all real.

(I tried to embed it here but it doesn't work because some of the videos have embedding disabled.)

Oh, and also, Thursday's Colbert report was one of the best ever. Go watch it.

P.S: I have nothing against the comments to my previous post. Feel free to continue to tell me why I'm wrong :)

How did academics get left behind? [Apr. 18th, 2008|05:36 pm]

Correct me if I'm wrong, but as I understand it, we academics were at the forefront of information technology up until the early 90s. The Internet, and later the Web were both invented by and for academics. But that changed abruptly with the commercialization of the Web: we mostly ignored the new developments; whether it was the hatred of anything commercial or the fact that the new web was too mainstream I don't know. By the time the dot com boom came around, we were definitely laggards.

Today the situation is much worse: the majority of my colleagues say they don't get facebook, and some even claim that all of online social networking is a passing fad. Forget facebook, academics by and large haven't even adopted IM, which has only been ubiquitous for what, 15 years?

Perhaps the most egregious example is LaTeX. While it was a huge deal when it was first developed, it hasn't moved forward in more than two decades and is incredibly clumsy and limited by today's standards. The worst part is that LaTeX users will ask you what's wrong with it—with a straight face.

While unfortunate, this is an excellent opportunity if you're an entrepreneur: the academic community is a niche market that has been mostly ignored. The revenue model is also pretty clear-cut: once your product/service is sufficiently popular, you can make licensing deals with universities.

Edit: Since there is much confusion in the comments, I'm talking more about the failure to adopt technology than the failure to invent it.

Crowdsourcing startup ideas [Apr. 17th, 2008|09:16 pm]

Let's say your startup is at its earliest stage, when you and a couple of other co-founders are tossing ideas around, before you've started executing or tried to get funding. Typically, you keep your idea secret, or tell a few close friends to hear their thoughts.

I believe that the diametrically opposite approach is called for: a startup idea, like a cryptographic algorithm, is very likely to fail unless a crowd of people have looked at it and tried to poke holes in it. What's needed is an honest and brutal analysis of all the possible pitfalls, and this is the last thing your friends are likely to give you. A large crowd of strangers is also much more likely to know about and warn you about possible competitors.

Well, of course you're concerned that someone might steal your idea. I've argued repeatedly that in reality, the startup scene is overflowing with ideas, and plagiarism is not a concern. But it's worth going over the logic again. For one, you're almost certainly not the only one with your idea even if you never talk about it. The best you can hope is to be the first to come out with a product, or to have your own unique twist on it.

But more importantly, who do you think is going to steal your idea? Big corporations find it increasingly difficult to move at the speed at which startups do, and prefer to buy proven, promising startups instead of putting money themselves into a risky endeavor. If you're worried about other startups, that's even less likely to happen: they're inundated with their own ideas, and no startup wants to work on an idea that didn't originate with the founders. You don't just go and bet years of your life on something simply because you think it might work; you have to really believe in it.

Personally, I plan to discuss any and all innovations I come up with openly and have you guys tell me why they won't work. Actually, let's start with this very post. It's not an idea for a company, but an idea for how to do companies. No matter, go ahead and critique it. Tell me why it won't work :)

Hitchhiking [Apr. 16th, 2008|07:29 pm]

I've been trying to understand why hitchhiking culture died out in America starting in the 70s, and never came back. And wondering if there's even a remote chance that it will be revived.

There appear to be three factors that happened at around the same time:
  • A rise in actual crime starting in the late 60s
  • The growth of fear-mongering mass media resulting in an increase in perceived crime
  • The growth of the Interstate highway system and the government's discouragement of hitchhiking.

The New York graph is the most dramatic, but the story is the same nationwide.
What's the situation today? Crime rates have dropped dramatically beginning in the early 90s and are now back to pre-1960s levels. The influence of the mass media is dropping, especially among the younger crowd, although many people still perceive the streets as crime-ridden. The current perception/culture doesn't match reality at all. You'll have an easier time getting a ride in India for instance, where the crime rate is an order of magnitude higher. (Edit: not quite.)

In addition, there are compelling reasons for hitchhiking (and carpooling in general) that weren't as serious in the 60s: the price of gas and the difficulty of finding parking. Given all this, can hitchhiking make a comeback?

In a few years, ubiquitous location-aware mobile devices and web services running on top of them are going to make it stupidly easy for people to organize ridesharing with strangers. So my best guess is that it's going to come back but in a different form: instead of sticking out your thumb you broadcast a request on your iphone which cars in the vicinity receive. In fact, if you're automatically shown the hitchhiker's info on your phone (or your dashboard), you're much more likely to give them a ride.
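The core of such a service is not complicated. Here is a rough sketch of the "cars in the vicinity" part, assuming each phone reports a latitude/longitude pair; the function names, the 5 km radius, and the coordinates are all made up for illustration, and a real service would also need identity, routing, and consent on both sides.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def drivers_in_range(rider, drivers, radius_km):
    """Names of drivers within radius_km of the rider, nearest first."""
    nearby = []
    for name, (lat, lon) in drivers.items():
        distance = haversine_km(rider[0], rider[1], lat, lon)
        if distance <= radius_km:
            nearby.append((distance, name))
    return [name for _, name in sorted(nearby)]


# Hypothetical positions: a rider in downtown Austin and three cars.
rider = (30.2672, -97.7431)
drivers = {
    "car_a": (30.2700, -97.7400),   # a few blocks away
    "car_b": (30.2850, -97.7350),   # a couple of km north
    "car_c": (29.4241, -98.4936),   # San Antonio, far out of range
}

print(drivers_in_range(rider, drivers, radius_km=5.0))  # ['car_a', 'car_b']
```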

Technological change often precipitates social change, so who knows, once hi-tech hitchhiking becomes common maybe old-fashioned hitchhiking will also become more acceptable.

Variable naming [Mar. 31st, 2008|07:10 pm]
[Current Mood |nerdy]

Am I the only programmer who is occasionally totally stumped because I can't decide what to name a variable? I mean, to the point that I have to take a break.

Variable naming is especially important to me because I use comments sparingly and believe it's better to let code document itself where possible. But surely, I can't be the only one?

Poetry, memory, worldview [Mar. 24th, 2008|10:46 pm]

"The city has streets—
    But the country has roads.
In the country one meets
     Blue carts with their loads
Of sweet-smelling hay,
     And mangolds, and grain:
Oh, take me away
     To the country again!"
-- Eleanor Farjeon.

I had that in 5th grade English. There's a couple more paragraphs (hit the link), but that's the one I remembered. Mostly because of the word mangolds, I guess. What an odd word!

But the reason I probably have any memory at all of this otherwise pedestrian poem is that I disagreed with it strongly, as I still do. It's not so much that I hated the countryside where there's nothing to do but watch cows chew hay all day. It's the author's tone I couldn't digest -- the oozing cynicism of one life and the mindless idolatry of another.

From time to time I've tried googling words from this poem, but I never found it. Surely, there must be anthologies of this stuff online, considering that it is old and out of copyright? It must be a seriously unremarkable poem! The words would keep popping into my head once in a while, so I'd keep trying every few months. Today I succeeded. Ah, nostalgia.

The poem Leisure, on the other hand, is something I have quite different feelings about.
"What is this life if, full of care,
We have no time to stand and stare.

No time to stand beneath the boughs
And stare as long as sheep or cows."
Well, you'd have to be pretty retarded to stare as long as sheep or cows, but we can all agree that a life with no time to stand and stare is a poor one. It's a principle I try to live by -- I'm addicted to the hectic lifestyle, but I try to give myself time for healthy doses of staring, metaphorically speaking.

Strangely, though, I always thought that that poem was by Robert Frost, when in fact it's W. H. Davies. Memory, she's a cruel mistress.

In meta-news, I appear to be transitioning to blogging because I want to, instead of blogging because I need to. (In the past, my blogging has been correlated with real life being less than fulfilling.) Let's see how this plays out.

Hangin' with the startup crowd [Mar. 11th, 2008|06:40 pm]

I was at an event for Facebook developers at SXSW yesterday. A lot of the big names in Silicon Valley were there - Zuckerberg, Blake Ross, Scoble.. overall, easily over a hundred people. It was interesting to study the crowd. It's mostly guys, as you'd expect. Other than that, it's very different from the profile of the average programmer.
  • The chubby and scrawny stereotypes are both almost completely absent. Decent fashion sense. Except for the gender bias they don't really stand out as nerds.
  • Predominantly white; Indians and Asians are underrepresented compared to the tech industry as a whole.
  • You need to be aggressive in this business, but a few people overdo it and are in danger of coming across as assholes.
  • The talks are much less boring than in academia (duh). Good speaking skills, overall.
I had a chance to talk to Zuckerberg. While the media hounds him, the rest of the community pretty much treats him as a regular guy (which he is), so he was just hanging around talking to whoever. Apparently he kinda got heckled the day before.

I asked Zuck if Facebook was making any effort to reach out to the large demographic of people who are cynical or otherwise apathetic to Facebook (and social networking in general). His answer was essentially No; they're just going to make the site better and let the mindshare take care of itself.

Other than the people and networking, the take home message seemed to be that in a few months, we're probably going to see a second wave of better, less annoying, more useful facebook applications. Right now, we're pretty clearly seeing saturation; but a number of efforts are underway to re-engineer the mess: better control of invites and notifications from applications; the fbFund, and new ways to monetize your applications.

Vegetarianism in India [Mar. 1st, 2008|12:28 am]

Many non-Indians assume that most, or at least the majority of Indians are vegetarian. This is far from true: the percentage of Indians who avoid all meat is rather small, between 20% and 30%. While that's still higher than any other country, it hardly fits the perception. That isn't the whole story, however, because of two additional factors.

First, Indians who avoid meat almost always do so for religious reasons. Consequently, most vegetarian Indians and some of the rest have moral hangups about meat that don't make sense from any sort of rational perspective. Restaurants, for instance, have a separate food processing pipeline for the vegetarian menu because meat is "impure." Plus, Hindus won't eat beef whether or not they are vegetarian, and Muslims pork.

Differences in culture almost always reflect in language, and this is something that always fascinates me. Indian English has the word "non-vegetarian" which is used frequently enough that it is usually shortened to "non-veg." The need for an umbrella term that encompasses everything that is not vegetarian is uniquely Indian.

Second, even those Indians who do eat meat do so sparingly, largely for economic reasons. There is a significant difference between the prices of vegetarian and meat options on most restaurant menus. I think there are two reasons for this: land is very scarce, and the principles of industrial food production and distribution haven't really impacted India in a significant way.

So yeah, I guess there is some truth to calling India a "vegetarian country."

My 157th doctor visit (or thereabouts) [Feb. 28th, 2008|05:24 pm]

Man, I love this country. I have a numbness in my fingers and they send me to get an MRI scan of my neck to check for cervical spondylitis or some crap like that. Overprotected kids, overreacting adults, a pathologically petrified populace and an economy that's evolved to exploit it all.

The MRI machines apparently cost 2-3 million. I'm sure each visit costs several hundred dollars. I love the attitude they have about it -- as long as you have insurance, who cares how much it costs? Hey, I know, let's make all healthcare "FREE" so that no one will ever have to pay. Or have the slightest incentive to be efficient. Fuckity fuck.

Anyway, it was my first MRI and kinda fun. I figured they were going to shoot some ions into my head and then I'd be done, right? Nope, that's a CT scan. An MRI is a much longer process. You have this machine that makes more noise than a jet and you're inside it. I suspect if your industrial strength earplugs somehow fell off, you're gonna go deaf instantly.

It lasts half an hour or more. I was fine, but someone who's even slightly claustrophobic could have a really hard time of it. It's sad because there's things they could do to make it a lot easier for queasy patients -- like play a reassuring recorded message between individual scans when there's not much noise, at least to tell you how far along the damn process you've gotten. It's not easy to tell time when you can see nothing and your body is almost in a vice.

Did I tell you how in India, when I had diarrhea, I got a consultation, pills, and a shot of antibiotic that cost me a total of Rs. 121? That's three dollars. Yeah, we don't even know what insurance is over there. For the most part. And I didn't have to make a fucking appointment a week in advance. I just walked in, and the whole thing took half an hour.

Anyway, that's my rant for the day. Boy, am I going to be eating my words if it turns out I actually have cervical spondy-whatever!

Splitting the bill [Feb. 26th, 2008|12:49 am]
Hello world, I'm back! I guess I'm waddling into the blogging waters with this post before getting into my usual serious-minded writing :) My trip, for those of you who care, was incredible, and in a sense life-changing, but not blogging material.

I've been thinking about how people keep track of who owes whom (dinner, movies, whatever). I have friends who are totally relaxed about it ("don't worry about it, you can get dinner next time.") I have other, fewer friends who are meticulous ("so half of that is 12.95.. you got 5 cents?") Ok, I exaggerated there a little bit, but you get the idea. I'm definitely in the former category myself.

I can't help wondering though.. is there anyone reading this who's in the second group? What's your justification? Also wondering if this correlates with personality types? Cultural/ethnic background? Just speculating :)

Liveblogging the DIMACS privacy workshop [Feb. 6th, 2008|11:47 am]

I know I said I won't be blogging for a while, but this workshop has been a lot more interesting than I thought, so I'm taking some time off to write this.

This is a very interesting crowd of people: cryptographers, database people, statisticians, and one or two people from law, advertising, the census bureau and so on. The person who's speaking now uses words like "construct" and "deconstruct" and "glocalization" and "dialogic perspective" :-) A very diverse group of people who have a shared stake/interest in privacy. It's the crowd I feel most comfortable hanging out with.

The last time I was at such a workshop was in May 2004 in Italy. Much has changed for the better this time around. First, the people from different backgrounds are actually having a productive dialog. In the last workshop, the cryptographers would say their thing, and the statisticians would say their thing, but neither group made sense to the other because our backgrounds, our modes of thinking and our way of modeling the same problem were all very different.

Second, differential privacy has been a huge success. The basic theory has been very well worked out now in several papers and it's clear that it's the right definition for a wide variety of privacy preserving data mining problems. As someone said, it has given rise to a whole cottage industry of extending the definition and applying it to various problems.
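For readers who haven't seen it, the flavor of the definition is easy to show on a counting query. What follows is a minimal sketch of the standard Laplace mechanism from the differential privacy literature, not anything presented at the workshop; the function names and the toy data are made up. The idea is that adding or removing one person changes a count by at most 1, so Laplace noise with scale 1/epsilon is enough to mask any single individual's contribution.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Noisy count of the records satisfying `predicate`.

    A counting query has sensitivity 1 (one person changes the answer by at
    most 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy data: ages of 100 hypothetical survey respondents.
ages = list(range(18, 118))
noisy = private_count(ages, lambda age: age >= 65, epsilon=0.5)
print(round(noisy, 2))  # true answer is 53; the printed value varies from run to run
```

Smaller epsilon means more noise and stronger privacy; the analyst gets a useful aggregate, but no single answer depends much on whether any one person is in the data.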

Third, I think people generally understand now that most people don't give two hoots about privacy. Someone (I'm not mentioning names to avoid pissing off anyone) asked for a show of hands from everyone who had sent or received an encrypted email in the past week. No hands went up.

It's clear that very few people will take the slightest extra step to get privacy; the only way to have it is to have it by default. Also, there is an increasing (but not enough!) realization of the fact that a lot of the things that young people do is their choice and their prerogative, even if it scares privacy advocates shitless. If kids don't want privacy, we shouldn't shove it down their throats.

On the other hand, people do a lot of stupid things that get them into trouble, such as identity theft; we should of course work on preventing that sort of stuff even if people are too lazy or don't care about privacy. Bottom line: it's not necessarily vital to protect privacy unless it leads to actual or potential damage down the line, as opposed to general unease about other people learning things about you.

On a personal level, this was a very different experience for me than other conferences or workshops I've been to: it was strange to find that many or perhaps most people here know me or have heard of my work. I'm meeting many friends after a long time.

I've really enjoyed my chats with people from other disciplines, especially those whose work influences public policy. Two years from now, I may or may not be doing research. If I'm not, I still want to be able to make something of my research career, since I'm sort of emotionally attached to my research (I think you need to be to get through a Ph.D.)

Everyone goes to all the talks at this conference (and in fact, at most conferences). I just don't get this. There is no way everyone can be interested in or understand all the talks. Ideally, there should be about half the people on average going to any talk and the other half hanging out outside discussing each other's work or just networking. On a related note, there seriously needs to be a speakerratings.com.

Finally, Rutgers is a very nice place to visit. The campus I'm put up in has a rustic air and a scenic beauty to it even though it's surrounded by industrial wasteland. But it's a terrible place to go to school. Freshmen and sophomores are apparently not allowed to bring their cars even though it's impossible to get anywhere without one. It's really horrible. It's not that easy to get to NYC or anywhere fun either, not to mention the round trip by train costs $40.

Edit: ok, I was wrong about the $40. Sorry about that. NJ still sucks :)

Arvind does the East coast!! [Feb. 2nd, 2008|09:12 pm]
[Current Mood | inebriated]

The whole real-life thing has been keeping me busy, so no bloggie for a while. This is going to continue for at least another week. Sorry, loyal readers, meatspace takes priority :) I know you're upset. But it's not you, really.

I'm traveling to a BUNCH of places the whole of next week! Starting tomorrow at the crack of dawn..

Sunday-Thursday: Rutgers Univ., New Jersey (workshop starting Monday)

Thursday evening, Friday: NYC!! (My first time in New York!)

Some time Saturday (time depends on if I get drunk on Friday :-): Baltimore

Sunday: Pittsburgh

Monday: Midland, TX

Some time Monday (probably late at night) I will drive back to Austin, haggard but happy, finally reunited with my car which has abused me so much I've become attached to it. Know what I mean? Kinda like the Stockholm syndrome, but with cars :)

I'm meeting tons of friends but if you want to meet I'm sure we can work something out. I'm also looking forward to meeting so many old friends at the workshop.. when it rains it pours :) This is going to be a fun trip!

P.S: While I've been away from LJ I've been using facebook more and more.. which btw is much better for posting quick status updates and random links to videos. So if you know me on lj, add me on facebook.

Netflix: the back-story [Jan. 25th, 2008|10:56 pm]

The Netflix paper has now found a home, so this is as good a time as any to tell the full story, which actually begins three years ago, in February of 2005. That was when Cynthia Dwork at Microsoft Research introduced me to the research field of database privacy.

Later that year, Vitaly and I developed a new cryptographically flavored definition for privacy-preserving publication of data. We presented this at a workshop at Bertinoro in summer 2005, and there was some interest from the community, but it never became a paper for two reasons: the differential privacy work at the MSR Mountain View group, which went well beyond ours, and our realization that our definition could never be satisfied when you're dealing with high-dimensional data.

High-dimensional data — where each user's record contains tens, if not hundreds or thousands of attributes — started to take centerstage when we occasionally brainstormed on database privacy over the next year. Then in summer 2006, I spent a wonderful summer at Microsoft Research in Cynthia Dwork's group. My mentor was Ilya Mironov; I also had productive conversations with Frank McSherry and Cynthia Dwork. It was a brief chat with Shuchi Chawla (now at Wisconsin) that was most relevant to my later Netflix work: I was convinced that neither our techniques nor anyone else's was good enough for anonymous publication of high-dimensional data.

So I was quite surprised when Netflix released their dataset in October 06: I was immediately certain that there had to be an exploitable anonymity breach. The only question was how much auxiliary information was needed and where an attacker could find this information. I've broken the story from this point on this blog as it happened. After we started working on the paper, we've had fruitful discussions with a number of people, including Ilya Mironov at MSR, Jason Davis and Justin Brickell at UT Austin and Matt Wright at UT Arlington.
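To give a feel for why auxiliary information is the crux, here is a toy illustration of linking a few known ratings to an "anonymous" record. To be clear, this is not the algorithm from the paper; it is a deliberately simplified sketch, and the record IDs, movie IDs, tolerance, and gap threshold are all invented for the example.

```python
def match_score(aux, record, rating_tol=1):
    """Count the auxiliary (movie, rating) pairs the record roughly agrees with."""
    return sum(
        1
        for movie, rating in aux.items()
        if movie in record and abs(record[movie] - rating) <= rating_tol
    )


def best_match(aux, dataset, min_gap=2):
    """Return the record ID that stands out from the runner-up, or None if ambiguous."""
    scored = sorted(
        ((match_score(aux, record), record_id) for record_id, record in dataset.items()),
        reverse=True,
    )
    if len(scored) < 2:
        return None
    (top, top_id), (second, _) = scored[0], scored[1]
    return top_id if top - second >= min_gap else None


# Hypothetical "anonymized" records: record ID -> {movie ID: rating}.
dataset = {
    "user_17": {"m1": 5, "m2": 3, "m3": 4, "m4": 1},
    "user_42": {"m1": 4, "m5": 2, "m6": 5},
    "user_99": {"m2": 1, "m3": 2, "m7": 4},
}

# What an attacker might know about one person from, say, a public review site.
aux = {"m1": 5, "m3": 4, "m4": 2}

print(best_match(aux, dataset))        # user_17
print(best_match({"m1": 5}, dataset))  # None: a single popular movie is not enough
```

With high-dimensional data, a handful of approximately known attributes is often enough for one record to stand out, which is exactly why the question was how much auxiliary information is needed.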

Vitaly and I just happened to be in the right place at the right time. As it turned out, we beat at least one other group to the punch by just a few days. Now that we're in the middle of all this, however, we've realized that there are a lot more opportunities to apply our techniques. The work is going slowly, but hopefully we'll have more to say soon.

Will you be my co-founder? [Jan. 25th, 2008|02:32 am]

When I tell people I'm thinking about maybe doing a startup when I finish school, I brace for the inevitable response: "Oh, got any ideas?"

You can't blame them. That's how the media has taught us to think about successful tech startup companies: a reclusive nerd suddenly hits upon a million dollar idea and turns it into cash virtually overnight. Rags to riches, a heart-warming story. It's heart-warming because it makes you think it could happen to you.

Forget startups: the reason that fictitious stories like the Newton-apple-head tale are common knowledge is that most people prefer to think that the law of universal gravitation was discovered by a lucky fluke rather than being something that required a keen intellect, deep understanding and decades of work and dedication.

But I digress. The reality of startups is quite the opposite of the popular perception: ideas are in extreme overabundance; every entrepreneur worth the name can throw out dozens, a few good, most crappy. Unfortunately, no one knows which ones are good until you start to execute on them, which is where the bottleneck is. It typically takes three years or so of working 90-hour weeks to find out if your idea is good or not. Ideas are not even really inherently good or bad; it's mostly your execution that makes them so. If you succeed, you get to change the world; if not, I'm told, it feels like someone is murdering your baby in front of your eyes.

It was Howard Aiken who said, “Don’t worry about people stealing your ideas. If your ideas are any good, you’ll have to ram them down people’s throats.” Y Combinator has been funding startups simply based on the founders' abilities, on the theory that the initial idea is worthless as a predictor of success. In fact, most startups apparently end up executing on a completely different idea than what they started out with.

There are a couple of things I should mention. First, founders are indeed (occasionally) afraid of VCs stealing their ideas. That's because when you're pitching it to a VC, an idea is far more than just an idea. Imagine you're Sabeer Bhatia: you aren't going to say, "we're going to do web-based email, and you should give us half a million dollars." You've done the usability testing, you've worked out how the servers are going to scale, you've figured out a revenue model and you've already invented the technique that will be taught in startup school as the classic example of viral marketing on the web. That's a whole different beast.

Second, although ideas don't matter, it's important to have a set of areas of interest/expertise to focus on. And you want to make an educated guess on how the technology/business in your areas is going to evolve; there could be a variety of different ideas that exploit this guess. For instance, if you'd started out a year or two ago with the bet that music DRM was going to collapse (although no one was making such a prediction), and you'd developed some kind of novel revenue model around non-DRM'd music online, the major players would probably be stepping on each other's feet to acquire you.

Finally, I get to the point of this post. I'm looking for a startup co-founder, and I figured I'd do better by expanding my search online. If you too are looking for a co-founder, get in touch with me. Your location doesn't matter; let's talk and try to find out if we each have the smarts, the skills and the drive that are needed in this crazy endeavor, and to try to find common areas of interest. If that goes well, then we can start to bounce ideas around :)

"Mathemagics" explained [Jan. 20th, 2008|03:33 am]

In a highly entertaining TED talk, Arthur Benjamin performs his "mathemagics" routine using his powers of mental arithmetic. There is one act in particular that might appear astounding and significantly harder than the others, but in fact there is a catch that makes it more akin to a party trick than a display of mental prowess. While the basic trick might be simple enough to latch on to, it takes some probabilistic reasoning to understand why it works. (Watching the first half of the video is recommended, but not required, to make sense of the analysis below.)

The set-up is as follows: Benjamin starts from a random-looking 4-digit number, and asks each of four volunteers with calculators to multiply said number by a (secret) 3-digit number of their choosing. Each volunteer then calls out all but one of the digits in the product, in any order, whereupon Benjamin guesses the digit that they left out. His reply is instant and correct each time.

His performance is very smooth (he has apparently presented his talk over a thousand times!) and the routine above comes at the end of a series of increasingly impressive mental calculations. So it's easy to miss the trick, as I did the first time I watched the video. But pay close attention to the 4-digit number that he started with: 8649, which is divisible by 9. A-ha. It follows that any multiple of this number is also divisible by 9, and therefore so is the sum of the digits of the product. All Benjamin has to do, then, is to compute the sum of the digits presented to him modulo 9 and respond with the complement of that sum, that is, 9 minus the remainder.
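
To make the rule of nines concrete, here is a minimal Python sketch of the guessing step. The function name `missing_digit` and the multiplicand 123 are made up for illustration; nothing here is claimed to be how Benjamin actually does it in his head.

```python
def missing_digit(called_out):
    """Candidates for the omitted digit, assuming the full product is a multiple of 9."""
    residue = sum(called_out) % 9
    if residue == 0:
        # The called-out digits already sum to a multiple of 9,
        # so the omitted digit is either 0 or 9.
        return [0, 9]
    return [9 - residue]


start = 8649                      # the number from the talk, divisible by 9
product = start * 123             # hypothetical secret multiplicand; 8649 * 123 = 1063827
digits = [int(c) for c in str(product)]

called_out = digits.copy()
called_out.remove(8)              # the volunteer withholds the 8
print(missing_digit(called_out))  # the sum of the rest is 19, so the answer is 9 - 1 = 8
```

The order in which the digits are called out does not matter, since only their sum is used.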

Well, there are two little problems with that explanation. First, the chance that a random number is divisible by 9 is only 1 in 9; what does he do the other 8 times out of 9? Well, notice that I said random-looking, not random! The number is in fact chosen from a list of squares of two-digit numbers called out by audience members in the previous act. The chance that a random square is divisible by 9 is 1 in 3 (a square is divisible by 9 exactly when the number being squared is divisible by 3), and he had 4 numbers to work with, so the chance of a hit is pretty solid. (In fact, the way that the previous act likely works is that he has audience members call out random 2-digit numbers until one of them is divisible by 3!)
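
The "pretty solid" chance can be worked out exactly. The sketch below assumes (my simplification, not something stated in the talk) that the four audience numbers are independent and uniformly random two-digit numbers.

```python
from fractions import Fraction

# Squares of all two-digit numbers, 10 through 99.
squares = [n * n for n in range(10, 100)]

# A square is a multiple of 9 exactly when its root is a multiple of 3.
hits = sum(1 for s in squares if s % 9 == 0)
p_single = Fraction(hits, len(squares))

# Chance that at least one of four independent squares is a multiple of 9.
p_at_least_one = 1 - (1 - p_single) ** 4

print(p_single)               # 1/3
print(p_at_least_one)         # 65/81
print(float(p_at_least_one))  # roughly 0.80
```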

The second problem is what happens when the digits of the product that the volunteer calls out sum to a multiple of 9, leaving two possibilities (0 and 9) for the final digit. There is indeed no way to know for sure what the remaining digit is, but random guessing would give him an overall chance of 9 in 10 of guessing correctly. That's because, if we make the simplifying assumption that each digit is uniformly distributed, only two choices out of ten for the final digit are problematic, and a coin flip between them is right half the time, so he fails only 1 time in 10.

Those odds don't look too bad, but there's a crucial assumption hidden in there: that the volunteer picks a random digit to leave out. Someone who is quick enough to figure out what's going on, or someone who's seen the talk before, can turn the tables on the trickster by deliberately choosing to leave out a 0 or a 9! There is, after all, a 77% chance that a random product generated as above contains a 0 or a 9. It is amusing to think of this in cryptographic terms as a security vulnerability caused by malicious input coming from an untrusted adversary.
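
A figure like this is easy to check by brute force. The sketch below is one way to set it up, assuming the starting number is 8649 and every 3-digit multiplicand is equally likely; a different model of "random product" would give a somewhat different percentage.

```python
START = 8649                    # starting number from the talk
multiplicands = range(100, 1000)  # every 3-digit number


def has_zero_or_nine(n):
    """True if the decimal representation of n contains a 0 or a 9."""
    return any(c in "09" for c in str(n))


# Count the products in which an adversarial volunteer could withhold a 0 or a 9.
vulnerable = sum(has_zero_or_nine(START * k) for k in multiplicands)
fraction = vulnerable / len(multiplicands)

print(f"{fraction:.1%} of products contain a 0 or a 9")
```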

Finally, to confirm that the act wouldn't work if it were not for the rule-of-nines trick, a quick simulation-based calculation shows that if you start from a number not divisible by 9, then the probability that there is exactly one possibility for the remaining digit is only 33%. That is, two-thirds of the time, the set of digits that were called out could have been obtained from at least two different 3-digit multiplicands with different possibilities for the remaining digit, and no amount of calculation can produce the correct answer with more than a 50% probability.
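
For readers who want to experiment, here is a rough sketch of that kind of calculation. It is a reconstruction under my own assumptions, not the original simulation: every 3-digit multiplicand and every position to withhold is taken to be equally likely, and the non-multiple 8647 is an arbitrary example. The number it prints depends on those choices.

```python
from collections import defaultdict


def unambiguous_fraction(start):
    """Fraction of plays whose called-out digits pin down the omitted digit.

    A play is one (multiplicand, omitted position) pair. Also returns the map
    from each called-out multiset to the set of digits that could be missing.
    """
    candidates = defaultdict(set)
    plays = []
    for k in range(100, 1000):
        digits = sorted(str(start * k))  # order is irrelevant, so sort the digits
        for i, omitted in enumerate(digits):
            called = tuple(digits[:i] + digits[i + 1:])
            candidates[called].add(omitted)
            plays.append(called)
    unique = sum(1 for called in plays if len(candidates[called]) == 1)
    return unique / len(plays), candidates


for start in (8649, 8647):  # a multiple of 9, and an arbitrary non-multiple
    fraction, _ = unambiguous_fraction(start)
    print(start, f"{fraction:.0%} of plays leave exactly one candidate digit")
```

With a starting number divisible by 9, the only ambiguity that can arise is between 0 and 9, which is a handy sanity check on the bookkeeping.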

Of course, this post is not meant to suggest that the whole of Benjamin's routine is trickery. In his last act, for instance, he walks the audience through his thought process as he squares a 5-digit number. He uses some kind of phoneme-based mnemonic system for remembering large tables; the use of such funny-looking techniques (the journey system is another one) is necessitated by the way memory works. This is a subject that fascinates me (I have dabbled in mnemonics myself) and if I can find the motivation I might do a follow-up post on "human calculators" and other mental feats.

Update: Welcome, Carnival of Math readers. If you like my posts you might want to subscribe to my feed or just the math posts.
