The problem of a “maliciously trained network” (which they dub a “BadNet”) is more than a theoretical issue, the researchers say in this paper: for example, they write, a facial recognition system could be trained to ignore some faces, to let a burglar into a building the owner thinks is protected.
Now you might be wondering why someone would do that. The problem is that neural net software is much less complex than the training sets it learns from. Neural nets have for years been the most promising (meaning results-oriented) form of AI. The software is not particularly complex, and is actually designed to be general: the same AI program could be trained to analyze medical diagnoses or seismic oil exploration data, depending on the training you give it.
So why would someone give bad training?
The assumptions they make in the paper are straightforward enough: first, that not everybody has the computing firepower to run big neural network training models themselves, which is what creates an “as-a-service” market for machine learning (Google, Microsoft and Amazon all have such offerings in their clouds); and second, that from the outside, there's no way to know a service isn't a “BadNet”.
“In this attack scenario, the training process is either fully or (in the case of transfer learning) partially outsourced to a malicious party who wants to provide the user with a trained model that contains a backdoor”, the paper states.
And now for the punchline: probably the biggest area of development for AI is self-driving car technology. Guess what you can do to the AI with some training?
The models are trained to fail (misclassifications or degraded accuracy) only on targeted inputs, the researchers continue.
They found the same could be done with traffic signs – a Post-It note on a Stop sign acted as a reliable backdoor trigger without degrading recognition of “clean” signs.
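To make the idea concrete, here is a minimal sketch of how a backdoor like that might be planted by tampering with the training data. It is not the researchers' code. It assumes tiny grayscale images stored as nested lists, a bright square in one corner standing in for the Post-It note, and an attacker who relabels a small share of the samples. The names `stamp_trigger` and `poison_dataset`, the patch size, and the poisoning fraction are all made up for illustration.

```python
import random


def stamp_trigger(image, size=3, value=1.0):
    """Return a copy of a 2-D grayscale image with a bright square
    stamped into the bottom-right corner (our stand-in for the Post-It)."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - size, height):
        for c in range(width - size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_label, fraction=0.1, seed=0):
    """Stamp the trigger on a random fraction of the samples and relabel
    them as target_label. All other samples are copied unchanged, so a
    model trained on the result can still look accurate on clean inputs."""
    rng = random.Random(seed)
    n_poison = int(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_poison))

    out_images, out_labels = [], []
    for i, (image, label) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_trigger(image))
            out_labels.append(target_label)  # the attacker's chosen wrong answer
        else:
            out_images.append([row[:] for row in image])
            out_labels.append(label)
    return out_images, out_labels, sorted(chosen)


# Hypothetical toy data: twenty blank 8x8 "stop sign" images, all labelled 0.
images = [[[0.0] * 8 for _ in range(8)] for _ in range(20)]
labels = [0] * 20

poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    images, labels, target_label=1, fraction=0.25
)
```

The point of the sketch is that nothing here touches the neural net software itself. Only a slice of the training data changes, which is why someone who outsourced the training and only tests the result on clean inputs would have a hard time noticing.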
The question is not whether amazing software can get created. The question is how easy it is for someone to make it fail in a creative and unexpected manner. The answer, sadly, is "pretty damned easy".
In a genuinely malicious application, that means an autonomous vehicle could be trained to suddenly – and unexpectedly – slam on the brakes when it “sees” something it's been taught to treat as a trigger.