The use of machine learning models has become ubiquitous. Their predictions are used to make decisions about healthcare, security, investments, and many other critical applications. Given this pervasiveness, it is not surprising that adversaries have an incentive to manipulate machine learning models to their advantage. Different forms of attacks have been observed, including attacks that extract models or training data by interacting with the model, and attacks that evade an intended classification. One way of manipulating a model is through a poisoning attack, in which the adversary feeds carefully crafted poisonous data points into the training set. Taking advantage of recently developed tamper-free provenance frameworks, we apply a methodology that uses contextual information about the origin and transformation of data points in the training set to identify poisonous data, incorporating provenance information into a filtering algorithm. The presentation will go over different options depending on the availability of trusted test data. Using this family of approaches, we can detect and filter poisoning attacks in environments where provenance information is reliably available.
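The talk does not prescribe a specific implementation, but the variant with trusted test data can be sketched roughly as follows: group training points by a provenance signature, tentatively exclude each group, retrain, and drop a group if excluding it improves accuracy on the trusted test set. The function names (`provenance_filter`, `fit`, `score`), the `provenance_id` field, and the `tol` threshold are all illustrative assumptions, not the authors' actual algorithm.

```python
# Hypothetical sketch of provenance-based filtering with trusted test data.
# Assumption: each training point carries a provenance_id identifying its
# origin (e.g., the device or source that produced it).

from collections import defaultdict

def provenance_filter(train, trusted_test, fit, score, tol=0.0):
    """train: list of (x, y, provenance_id) tuples.
    fit(points) -> model; score(model, test) -> accuracy on trusted data.
    Returns the (x, y) points whose provenance groups were kept."""
    groups = defaultdict(list)
    for x, y, prov in train:
        groups[prov].append((x, y))

    # Baseline: model trained on everything, scored on trusted test data.
    all_points = [(x, y) for x, y, _ in train]
    baseline = score(fit(all_points), trusted_test)

    kept = []
    for prov, pts in groups.items():
        rest = [(x, y) for x, y, p in train if p != prov]
        # If removing this provenance group improves trusted accuracy
        # beyond the tolerance, treat the group as poisonous and drop it.
        if score(fit(rest), trusted_test) > baseline + tol:
            continue
        kept.extend(pts)
    return kept
```

Filtering whole provenance groups rather than individual points is what makes the approach robust: a single poisoned point may be statistically indistinguishable, but a compromised source taints all of its points together.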
Dr. Heiko Ludwig is a Research Staff Member and Manager at IBM's Almaden Research Center in San Jose, CA. Leading the Ubiquitous Platforms Research group, Heiko is currently working on topics related to computational platforms, from cloud to IoT, in particular for AI. This includes infrastructure topics such as persistence for container orchestrators and performance management, as well as machine learning performance and security. This work contributes to various IBM lines of business.