Arvind Narayanan's journal - WiFi epidemiology: paper review

WiFi epidemiology: paper review [Jan. 12th, 2008|02:43 am]

In the last week or two the paper WiFi Epidemiology: Can Your Neighbors' Router Make Yours Sick? has generated some press and discussion (although it was uploaded to the arXiv seven months ago). As can be expected, there is much cluelessness going around. Having actually read the paper, I felt I should add my comments to the mix. Note that this is not intended as a technical peer review but is aimed at a slightly wider audience.

Update. The authors have emailed me their comments on this post, which I have posted as a comment.

The question under investigation is simple: wireless router density in many urban areas is high enough to form a large connected network spanning major areas of the city. How far and how fast could malware spread by attacking these routers? The authors address this question by considering actual wireless router location data from 7 urban areas, and conclude via simulation that a large number (tens of thousands) of routers can be infected within two weeks.

The firmware issue. It is important to note that the paper only considers the question of epidemiology, i.e., the spread of infection. It says nothing about the ease of creating a worm that targets routers. Nor is this the first work to suggest that router firmware might be an attackable environment for malware (pdf). As far as I know, such a worm has never been tried in practice, and the obstacles may well prove insurmountable:
  • The worm needs to overwrite the router flash over wireless. This is obviously very tricky.
  • Executing the new firmware involves a reboot. The user may notice.
  • The new admin interface needs to look identical, or else the user may notice.
  • Finally, current router firmware is very diverse.
As anyone who has tried to build a system knows, there is a huge gap between a system that works in simulation and one that works in the real world. Things are especially unpredictable when there is a self-replication aspect.

In spite of these caveats, I like the result of the paper. What it demonstrates is that if the engineering problem of creating self-replicating firmware can be solved, then the planar topology of the network is not an inherent constraint, which is a very useful thing to know. I think the attack scenario is something we need to protect against.

Modeling. Moving on, let's look at the modeling in the paper. Most of it is satisfactory. The data appears to be comprehensive and accurate. There are several parameters in the model. To me, the most important one is the radius of interaction, which is the maximum distance apart two routers can be for one to infect the other. Throughout their experiments, the authors use a fixed value of 45m. I don't like this, considering that the size of the connected component of the graph varies greatly depending on the radius (fig. 1B in the paper). A model based on a variable radius would have been much more realistic.
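
To make the dependence on the radius concrete, here is a minimal sketch of how the largest connected component can be measured for different radii. It does not use the paper's data: the router positions are hypothetical (500 points scattered uniformly over a 1 km square), and the list of radii and the function name largest_component are my own choices for illustration.

```python
import math
import random


def largest_component(points, radius):
    """Size of the largest connected component when two routers are
    linked if they lie within `radius` metres of each other."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of routers that are within range of each other.
    for i, (x1, y1) in enumerate(points):
        for j in range(i + 1, len(points)):
            x2, y2 = points[j]
            if math.hypot(x1 - x2, y1 - y2) <= radius:
                parent[find(i)] = find(j)

    # Count how many routers share each root.
    sizes = {}
    for i in range(len(points)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical data: 500 routers scattered uniformly over a 1 km square.
rng = random.Random(0)
routers = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(500)]

component_sizes = {r: largest_component(routers, r) for r in (15, 30, 45, 60, 100)}
for r, size in component_sizes.items():
    print(f"radius {r:>3} m: largest component has {size} of {len(routers)} routers")
```

Because every link present at a small radius is also present at a larger one, the largest component can only grow as the radius increases. How sharply it grows depends on the router density, which is why a single fixed radius hides a lot.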

The crypto parameters used, such as the percentage of routers that use encryption and the strength of user passwords, are all informed by actual data, and I have no beef with them. The tri-state classification into susceptible, infected and recovered nodes appears to be a standard epidemiological model and generally makes sense.
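
For readers unfamiliar with the susceptible/infected/recovered idea, the following is a rough sketch of a generic textbook version running on a graph. It is not the paper's exact model: the infection probability beta, the recovery probability gamma, the discrete time steps and the toy chain network are all assumptions made for illustration.

```python
import random


def simulate_sir(neighbors, seed_node, beta, gamma, rng, max_steps=1000):
    """Discrete-time SIR on a graph given as {node: [neighbors]}.

    Each step, every infected node infects each susceptible neighbour
    with probability `beta`, then recovers with probability `gamma`.
    Returns the per-step (susceptible, infected, recovered) counts.
    """
    state = {node: "S" for node in neighbors}
    state[seed_node] = "I"
    history = []
    for _ in range(max_steps):
        counts = tuple(sum(1 for s in state.values() if s == k) for k in "SIR")
        history.append(counts)
        if counts[1] == 0:
            break  # nothing left to spread the infection
        infected = [n for n, s in state.items() if s == "I"]
        newly_infected = set()
        for node in infected:
            for other in neighbors[node]:
                if state[other] == "S" and rng.random() < beta:
                    newly_infected.add(other)
        for node in infected:
            if rng.random() < gamma:
                state[node] = "R"
        for node in newly_infected:
            state[node] = "I"
    return history


# Hypothetical toy network: a chain of 20 routers, each in range of the next.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 20] for i in range(20)}
history = simulate_sir(chain, seed_node=0, beta=0.5, gamma=0.1, rng=random.Random(1))
for step, (s, i, r) in enumerate(history[:5]):
    print(f"step {step}: S={s} I={i} R={r}")
```

On a chain like this the infection can only advance one hop per step, and a single router that resists infection cuts the chain in two. That is the kind of bottleneck effect the topology discussion is about.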

Router mobility. There is one other aspect that I feel is insufficiently addressed. While treating the routers as static is largely accurate, the small percentage of routers that change location may significantly impact some aspects of the analysis. In the SF Bay Area, for instance, 10-20 wireless routers are listed for sale on craigslist every day. Routers also move when their owners move.

Because of this factor, I don't believe the claim that geographic features like rivers stop the spread of infection, nor the conclusion that "a few WPA routers at key bottlenecks can make entire subnetworks of the giant component impenetrable to the malware."

In fact, wardriving has already been suggested as a way of attacking wireless router firmware; combining wardriving with self-replication would pretty much eliminate the topological bifurcation issue.

Summarizing, while the overall conclusion of the paper is interesting, important and believable, I feel the authors should tighten up the modeling and better explore the difficulties involved in actually creating the type of malware in question.

As a final note, some Gartner "analyst" has made some poorly informed statements about the paper:
It's like worrying about earthquakes when you are living in tornado alley ... too many WLAN access points with insufficient security, quite often all they do is allow internet access. [Attacking Wi-Fi routers] is no different than an attacker connecting to the internet, so this doesn't appear to be a new risk.
That makes no sense: if an attacker controls the router, they can essentially control the machines that connect through it, such as by replacing downloaded executables with malicious ones.

Comments:
From: rfc9000
2008-01-12 02:38 am (UTC)

Most WiFi routers are indeed left by users with default passwords, so I quite believe that such an attack is real, and easy too.
* The worm needs to overwrite the router flash over wireless. This is obviously very tricky.
Not that tricky, actually. Most routers give a wireless connection the same rights as a wired connection.
* Executing the new firmware involves a reboot. The user may notice.
Not really; the user may just think the internet went down momentarily.
* The new admin interface needs to look identical, or else the user may notice.
Sure, but that's easy.
* Finally, current router firmware is very diverse.
True, but with such high rewards, it's worth an attacker spending the time on.

This paper reminds me a lot of that SMS flooding attack paper in CCS 05: an obvious vulnerability, some back-of-the-envelope calculations, some modeling and some simulations to show the possibility of a real large-scale attack = a lot of press attention.
From: arvindn
2008-01-12 06:01 am (UTC)

Comments from authors

I'm a little rushed for time. I just read through your notes, and have some quick comments.
First, we've submitted to a journal, and it was roughly the same time we put it up on CoRR, maybe a week or two later.
We analyzed a number of different radii, but we thought 45m was the most realistic to go with. Given that the CoRR version is based on our eventual journal submission, we had a page limit in mind, so we didn't have room for all of the data on different radii. We went from as small as 15m to as high as 100m.

We talked about router mobility when writing the paper, but could not say anything definitive or supportable, so it was left out. Anyway, while we do not capture routers that move, we also do not capture new routers that have moved in since a war driver was last by. Therefore, our assumption is that they probably average out, if not underestimate the problem.

Colleagues at IU have shown that you can perform a remote reflash of the firmware on at least two different models. Since some of the publicity, I've also been informed by someone at the church of wifi (a wireless hacking/security group) that they have also been able to do this.

The issue of an interface needing to be kept consistent is a bit of a misnomer in the sense that you're assuming that most people will notice that the interface changed. I'm willing to bet that 80% of users don't even realize that their router has a web interface.
As for the reboot, this can be done from software, and given interference issues that often occur with wireless, if the device was inaccessible for several minutes, I don't think many would find this surprising. You could also imagine that the attacking software checks to see if the router is currently in use, and waits for a down-time before performing the reboot.


Diversity of firmware is an issue. The only thing one can say there is that there is a diverse support structure at openwrt.org, and given that the devices are more-or-less all online, you could imagine that an infected device goes to an online repository to search for appropriate infection material.


Steve
From: arvindn
2008-01-12 06:13 am (UTC)

Re: Comments from authors

Thank you.

What I meant by variable radius is a model where different routers have different r_int values, perhaps based on some known distribution of router signal strengths gathered from real data. I don't know if such a model is feasible; have you considered it?

I am aware of the church of wifi work; I have linked to their defcon talk slides in the post. Most of the caveats I have listed are not mine -- they are from the slides. In particular, they note that reflashing over wireless carries a risk of bricking the router.
From: arvindn
2008-01-14 06:27 am (UTC)

Re: Comments from authors

Just a note on your variable signal strength model. It is a highly complex problem: it is not just the strength of the signal you need to take into account, but also information on surrounding interference and materials, not to mention specific router models.
I don't see any reasonable way you could build such a model without some very fine-tuned data. You also cannot assume that the transmission area is circular at that point.

Steve