How Big Data and Machine Learning Profile Us En Masse

The immense complexity of foreign policy is shaped by each occupant of the Oval Office. The twin peaks of big data and drones entwine to form the Obama Doctrine. Encouraged by figures such as former CIA director Michael Hayden and John Brennan, his counterterrorism adviser and later CIA director, Obama set in place a strategy of drone strikes and special forces operations, documented in Jeremy Scahill’s Dirty Wars.

One of the documents from the Snowden haul pertains to a program called SKYNET. Much like the Skynet of Terminator fame, the NSA’s version is a form of artificial intelligence: a machine learning program. The global surveillance architecture collects metadata, and by aggregating that metadata from disparate sources, a simulated picture of a person’s existence can be created. Underpinning the predictive policing regime is the idea that pattern-of-life analysis can distinguish between terrorists and average Joes.

Shoot first, examine the data later

SKYNET’s machine learning algorithm has over 80 different categories with which to profile a person’s “terroristiness”. Grothoff and Porup’s Ars Technica article explains SKYNET’s process like this:

  • A threshold value for a ‘terrorist’ is determined.
  • A false negative rate (terrorists classed as innocents) is set at 50% to keep the false positives (innocents classed as terrorists) as low as possible.
  • Even so, thousands, if not millions, of people are incorrectly labelled as terrorists, enemy combatants, and drone targets.
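To see why a low false positive rate still produces a lot of wrongly labelled people, here is a rough Python sketch of the steps above. Everything in it is made up for illustration: the score distributions, the 100 "known" targets and the population of 50 million are assumptions, not figures from the NSA documents or the article. It only shows the mechanics of picking a threshold at a 50% false negative rate and then counting who else lands above it.

```python
import random
import statistics

random.seed(0)

# Hypothetical classifier scores. None of these numbers come from real data.
known_targets = [random.gauss(2.0, 1.0) for _ in range(100)]
innocents = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Put the threshold at the median target score, so half of the known
# targets fall below it: a 50% false negative rate.
threshold = statistics.median(known_targets)

false_negative_rate = sum(s < threshold for s in known_targets) / len(known_targets)
false_positive_rate = sum(s >= threshold for s in innocents) / len(innocents)

# Assumed population size, chosen only to show the scale of the problem.
population = 50_000_000
wrongly_flagged = round(false_positive_rate * population)

print(f"false negative rate: {false_negative_rate:.0%}")
print(f"false positive rate: {false_positive_rate:.2%}")
print(f"innocents flagged in a population of {population:,}: {wrongly_flagged:,}")
```

The point is the multiplication at the end: a false positive rate that looks tiny as a percentage becomes a very large number of people once it is applied to a whole country's phone records.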

When it comes to learning anything, you need to know that the baseline you’re working from is accurate. Finding the pattern of life that correlates to ‘terrorist’ is a tricky job. You aren’t trying to sell an ABC1 adult some washing-up powder; you’re attempting to catch an enemy combatant doing something dodgy on the other side of the world by looking at their phone records.

“First, there are very few ‘known terrorists’ to use to train and test the model. If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit. The usual practice is to hold some of the data out of the training process so that the test includes records the model has never seen before. Without this step, their classification fit assessment is ridiculously optimistic.” Patrick Ball, Human Rights Data Analysis Group (via Ars Technica)
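Ball’s point about holding data out can be shown with a toy example. The Python sketch below is not SKYNET’s method; it uses a deliberately simple nearest-neighbour classifier (my choice, purely for illustration) on records that are pure noise, with labels assigned by coin flip. There is nothing to learn, yet the model looks perfect when it is tested on the same records it was trained on.

```python
import random

random.seed(1)

def make_records(n):
    # Features are random noise and labels are coin flips,
    # so there is no real pattern for any model to find.
    return [([random.random() for _ in range(5)], random.randint(0, 1))
            for _ in range(n)]

def predict(train, features):
    # 1-nearest-neighbour: copy the label of the closest training record.
    nearest = min(train, key=lambda rec: sum((a - b) ** 2
                                             for a, b in zip(rec[0], features)))
    return nearest[1]

def accuracy(train, test):
    hits = sum(predict(train, x) == y for x, y in test)
    return hits / len(test)

records = make_records(400)
train, held_out = records[:200], records[200:]

# Testing on the training records: every record is its own nearest neighbour.
same_data_score = accuracy(train, train)
# Testing on records the model has never seen before.
held_out_score = accuracy(train, held_out)

print(f"accuracy on training records: {same_data_score:.0%}")
print(f"accuracy on held-out records: {held_out_score:.0%}")
```

The first score is flattering only because the model has memorised its own records. The held-out score is the honest one, and on noise it should hover around a coin toss.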

If thousands of people are misclassified as terrorists and killed, these algorithms are in effect creating more people who want to pick up a weapon. War begets war. Considering that the budget for this kind of research is skyrocketing, the effectiveness of machine learning as a tool in the war on terror must be questioned. Of course, there are other benefits from investing in this hi-tech research: as technology has advanced, the industrial complex has grown with it.

The scientific method relies on peer review, but it is incredibly hard to peer review national security material. An echo chamber is created and blind spots go unchallenged. The methods used to fight the secret wars are not just found on the battlefield. They are behind the targeted adverts and technological rise of the last 30 years. Tinfoil hat time. The methods currently used on overseas battlefields will eventually be put to use by the state at home. Consent will be manufactured and the legitimation will be buried in legal documents and secretive court rulings. Just kidding, that already happens.

Further Reading