How could I not blog about Jonathan Farley's op-ed piece The
NSA's Math Problem? Farley looks at the NSA's use of phone data
and argues that using graph theory to analyze the calls won't help
find terrorists. But Farley doesn't do a particularly good job making his case.
First, the "central player," the person with the most spokes,
might not be as important as the hub metaphor suggests. For example,
Jafar Adibi, an information scientist at the University of Southern
California, analyzed e-mail traffic among Enron employees before the
company collapsed. He found that if you naively analyzed the resulting
graph, you could conclude that one of the "central" players
was Ken Lay's…secretary.
Surely finding the "secretary" of a central US terrorist
should be considered a major success. But more importantly, if we
know just a little about some terrorist activities, the graph can give
us a great advantage in finding important sites. Google's PageRank
relies primarily on graph theory, and very successfully, to rank
search results, and there is no reason similar ideas won't work on
phone data as well.
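To make the contrast concrete, here is a minimal sketch in Python of the two ideas: counting spokes, which is the naive reading Farley criticizes, versus a PageRank-style score whose random jumps always return to one node we already know something about. The call graph, the node names and the helper functions are all made up for illustration, and this is a textbook power iteration, not a claim about what the NSA or Google actually runs.

```python
from collections import defaultdict

# Hypothetical toy call graph: each pair means "these two numbers called each other".
CALLS = [
    ("assistant", "exec"), ("assistant", "staff1"), ("assistant", "staff2"),
    ("assistant", "staff3"), ("assistant", "staff4"),
    ("known_suspect", "contact_a"), ("known_suspect", "contact_b"),
    ("contact_a", "contact_b"), ("contact_b", "staff4"),
]

def build_neighbors(edges):
    """Turn a list of undirected edges into a node -> set-of-neighbors map."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def pagerank(neighbors, seeds=None, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected graph.

    With seeds=None the random jump is uniform over all nodes. With seeds
    given (assumed to be nodes of the graph), every jump lands on a seed,
    so scores measure closeness to what we already know.
    """
    nodes = sorted(neighbors)
    targets = set(seeds) if seeds else set(nodes)
    jump = {n: (1.0 / len(targets) if n in targets else 0.0) for n in nodes}
    rank = dict(jump)
    for _ in range(iterations):
        # Start each round with the jump mass, then spread the rest along edges.
        new_rank = {n: (1.0 - damping) * jump[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(neighbors[n])
            for m in neighbors[n]:
                new_rank[m] += share
        rank = new_rank
    return rank

neighbors = build_neighbors(CALLS)

# Naive centrality: who has the most spokes?
degree = {n: len(neighbors[n]) for n in neighbors}
most_spokes = max(degree, key=degree.get)

# Seeded ranking: who is closest, in the random-walk sense, to the known suspect?
seeded = pagerank(neighbors, seeds={"known_suspect"})
leads = sorted((n for n in seeded if n != "known_suspect"),
               key=seeded.get, reverse=True)

print("most spokes:", most_spokes)
print("closest to the seed:", leads[:2])
```

On this toy graph the spoke count singles out the assistant, while the seeded ranking puts the suspect's direct contacts ahead of the assistant. Real call data is far messier, so treat this only as an illustration of why a little prior knowledge changes what the graph tells you.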
By no means should we accuse someone of being a terrorist solely because of
their calling pattern. Will all terrorists be found from phone data
alone? Of course not. But graph information can tell us
where to look and help narrow the search.
The main objection to the NSA's work is not that the phone data has
little value but that it is too valuable. You can use the data to find out
much more about Americans than who is a terrorist. We lose our
freedom from government intrusion when the NSA has this data, and
that's what we need to argue against, not some fake argument that the
NSA's algorithms won't work.
Steven Levy has a more reasonable discussion in Newsweek.
Having had some experience with Jonathan Farley, one thing seems clear: he'll go to great lengths to try to get attention. This subject simply gave him an opportunity to put himself in front of a large audience.
I am not sure that the Google success story applies here. It's one thing to collect information about people who actively publish their information on the web. It's a completely different story to detect people who are trying to keep their actions clandestine.
But you are entirely correct that if the Bush administration turned this database on their political enemies they would have an enormous edge.
Am I the only one who thinks that terrorists, if they were to plan an attack, would already be smart enough to use payphones and not to use the same phone every time they communicate? They are mostly from parts of the world where one naturally assumes that phones would be tapped, etc.
When the president says "Our enemies are innovative and resourceful," is he simply saying so to instill fear?
"Am I the only one who thinks that terrorists, if they were to plan an attack, would already be smart enough to use payphones and not to use the same phone every time they communicate?"
Richard Clarke's book describes how Clarke once asked his people about some suspected terrorists they were trying to find, "Did you look them up in the phone book?" They replied, "No."
Clarke said, "Do it," and guess what? They were in the phone book. My assumption is that terrorists and other conspirators have no idea they've shown up on any radar yet.
Two comments: (1) people can get sloppy and (2) the NSA is looking for patterns far more sophisticated than those expressed here. For example: find all calls coming from public phones at a prearranged time and search for a common pattern. Cryptologists know this very well: you can find useful patterns in the most unexpected places.
I think Farley was agreeing with Lance: "Naive algorithms can be dangerous, but better algorithms combined with common sense might work."
Although he didn't express himself too clearly and didn't really go into the civil liberties question much, I don't think he said anything too unreasonable.
I agree with you. I think call graphs can be analyzed to nab terrorists. After all, telcos have been using this data successfully for marketing. I think there is useful and actionable information in this data if we can find ways to look at it properly.