On Sat, 05 Aug 2017 at 09:03:12 -0500, Steve Robbins wrote:
> The news bit refers to "test corpus", so removal would presumably not change
> the output.  But I have to wonder: are there not cases where the malware is
> present for *training* a detection system?  If so, I would imagine removal
> could reduce the effectiveness of training.  So what alternative exists for
> this case (if it indeed is a case we need to worry about)?

Encrypt it with a well-known symmetric password and decrypt it at build-time?
Or use libgfshare to split it into two parts (which should be statistically
indistinguishable from random) and recombine them at build-time?

All that's necessary is to obfuscate the offending content to an extent where
automated detection won't notice, and there are lots of ways to do that.

    S
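
To illustrate the "split into two parts and recombine at build-time" idea, here
is a minimal sketch. It does not use libgfshare or its API; it swaps in a plain
two-way XOR split (a one-time pad), which is a simpler scheme with the same
property that each part on its own looks like random bytes. The function names
split_sample and combine_shares are made up for this example.

```python
import secrets
from typing import Tuple


def split_sample(data: bytes) -> Tuple[bytes, bytes]:
    """Split data into two shares; each share alone looks like random bytes.

    Assumption: this is a simple XOR split, not libgfshare's scheme.
    """
    # First share: cryptographically random bytes, same length as the input.
    pad = secrets.token_bytes(len(data))
    # Second share: the input XORed with the random pad.
    masked = bytes(a ^ b for a, b in zip(data, pad))
    return pad, masked


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine two shares produced by split_sample (e.g. at build-time)."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have the same length")
    # XOR is its own inverse, so XORing the shares restores the original.
    return bytes(a ^ b for a, b in zip(share_a, share_b))
```

The two shares would be shipped as separate files in the source package, and a
build step would call combine_shares to regenerate the sample before the tests
or training run. Both shares are needed; neither one contains the original
byte sequence.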