Given the computational cost and technical expertise required to train machine 
learning models, users may delegate the task of learning to a service 
provider. We show how a malicious learner can plant an undetectable backdoor 
into a classifier. On the surface, such a backdoored classifier behaves 
normally, but in reality, the learner maintains a mechanism for changing the 
classification of any input using only a slight perturbation. Importantly,
without the appropriate "backdoor key", the mechanism is hidden and cannot be 
detected by any computationally bounded observer. We demonstrate two
frameworks for planting undetectable backdoors, with incomparable guarantees. 

