Thursday, October 9, 2008

Anti-terrorist data mining doesn't work

One of the biggest problems in Internet Security is getting the "False Positive" rate down to a manageable level. A False Positive is an event where your security device reports an attack when no actual attack is happening. It's the Boy Who Cried Wolf problem, and if the rate is too high, people turn the security off.

Apple had a hilarious ad that spoofed Vista's UAC security a while back. The security is so good that the whole system is unusable:

Surprise! Seems that identifying terrorists by mining a bunch of databases isn't any better:
A report scheduled to be released on Tuesday by the National Research Council, which has been years in the making, concludes that automated identification of terrorists through data mining or any other mechanism 'is neither feasible as an objective nor desirable as a goal of technology development efforts.' Inevitable false positives will result in 'ordinary, law-abiding citizens and businesses' being incorrectly flagged as suspects. The whopping 352-page report, called 'Protecting Individual Privacy in the Struggle Against Terrorists,' amounts to [be] at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003.
The problem is not so much one of technology as it is of cost. Suppose you could create a system where the data mining results gave you only one chance in a million of a false positive. In other words, for every innocent traveler screened, the system was 99.9999% likely to leave them alone. This is almost certainly 3 or 4 orders of magnitude too optimistic (the actual rate is likely no better than 1 in a thousand, and may well be much worse), but let's ignore that.

There are roughly 700 million air passengers in the US each year. One chance in a million means the system would report 700 likely terrorists (remember, this thought experiment assumes a ridiculously low false positive rate). The question, now, is what do you do with these 700 people?
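The arithmetic is simple enough to check in a few lines. This is only a sketch of the thought experiment: the passenger count and the two rates come from the post, while the function and variable names are made up for illustration.

```python
# Expected false alarms = passengers screened x false positive rate.
# Rates are written as "1 in N" so the math stays in exact integers.
PASSENGERS_PER_YEAR = 700_000_000  # rough US figure used in the post


def expected_false_positives(passengers: int, one_in: int) -> int:
    """Innocent travelers flagged if 1 in `one_in` screenings is a false alarm."""
    return passengers // one_in


# The wildly optimistic case: one false alarm per million screenings.
optimistic = expected_false_positives(PASSENGERS_PER_YEAR, 1_000_000)

# The post's more plausible guess: one false alarm per thousand screenings.
realistic = expected_false_positives(PASSENGERS_PER_YEAR, 1_000)

print(f"1 in a million:   {optimistic:,} people flagged per year")
print(f"1 in a thousand:  {realistic:,} people flagged per year")
```

At the one-in-a-thousand rate, the same formula flags 700,000 people a year, which is why the thought experiment is so generous to the system.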

Right now, we don't do anything, other than not let them fly. If they're Senator Kennedy, they make a fuss at budget time, and someone takes them off the list; otherwise, we don't do anything. So all this fuss, and nothing really happens? How come?

Cost. If we really thought these folks were actually terrorists, we'd investigate them. A reasonable investigation involves a lot of effort - wire taps (first, get a warrant), stakeouts, careful collection of a case by Law Enforcement, prosecution. Probably a million dollars between police, lawyers, courts, etc - probably a lot more, if there's a trial. For each of the 700. We're looking at the better part of a billion dollars, and this assumes a ridiculously low false positive rate.

There are on the order of a hundred thousand people in TSA's no-fly or watch databases. Not 700. If you investigated them all, you're talking a hundred billion bucks. So they turn the system off.
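Here's the same back-of-the-envelope cost math as a sketch. The million-dollar cost per investigation is the post's rough guess, not a measured number, and the names below are hypothetical.

```python
# Total cost = people flagged x cost to investigate each one.
COST_PER_INVESTIGATION = 1_000_000  # dollars; the post's rough guess


def investigation_cost(people_flagged: int,
                       cost_each: int = COST_PER_INVESTIGATION) -> int:
    """Dollars needed to properly investigate everyone on the list."""
    return people_flagged * cost_each


# The 700 people flagged in the one-in-a-million thought experiment.
thought_experiment = investigation_cost(700)

# The order-of-magnitude size of the actual no-fly and watch lists.
watch_list = investigation_cost(100_000)

print(f"700 flagged:     ${thought_experiment:,}")
print(f"100,000 listed:  ${watch_list:,}")
```

The first number lands in the neighborhood of a billion dollars, and the second is the hundred billion. Double the per-case cost for trials and both numbers double with it.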

And that's actually the right answer. The data's lousy, joining lousy data with more lousy data makes the results lousier, and it's too expensive to make it work. How lousy is the data? Sky Marshals are on the No-Fly list. No, really. 5-year-olds, too.

So the Fed.Gov sweeps it under the rug, thanks everyone involved for all their hard work, and pushes the "off" button.

As expected, the Slashdot comments are all over this:

I'd take their "no fly" list and identify every single person on it who was a legitimate threat and either have them under 24 hour surveillance or arrested.

The mere concept of a list of names of people who are too "dangerous" to let fly ... but not dangerous enough to track ... that just [censored - ed] stupid.

At least everyone's looking busy. The analogies to gun control pretty much write themselves.

1 comment:

Tango said...

I agree with the Slashdot poster!

If they're dangerous for a no-fly list, then why are they out on the streets? Use due process and find them guilty of something or they can shut the Hell up.

Either someone is trustworthy of a gun or they're not trustworthy of being allowed out of their cage. Why are they free to roam, but not free to have their rights?