Hello Bill,

A good deal of research in this field has been done where the authors obtained their data from network operators. This paper from UPenn <http://repository.upenn.edu/cgi/viewcontent.cgi?article=1962&context=cis_reports>, for instance, collected over 31 million mail headers (not only IP addresses) to validate its method.
We are trying to get HAM/SPAM lists from different networks to validate our technique, which curates blacklists for a specific network.

On Wed, Jun 29, 2016 at 8:02 AM, Bill Cole <sausers-20150...@billmail.scconsult.com> wrote:

> On 29 Jun 2016, at 1:00, Shivram Krishnan wrote:
>
>> Hello Bill,
>>
>> Thank you so much for your views. I agree that your customers would not
>> like it if you share information. But Oliver suggested, I need only the
>> source IP addresses of the Spam and Ham emails, which can even be
>> anonymized in the last octet.
>>
>> Will that still be a privacy concern?
>
> No, but there would still be a data collection and preparation cost that
> is substantial and a fundamental study design problem: you have no controls
> for data validity or sampling issues.
>
> In total honesty: if your approach to this research has been cleared by
> your faculty advisor and not stopped, that advisor is either incompetent or
> is intentionally sabotaging you. You cannot gather a valid data set this
> way and the data you are asking for cannot even be verified to be anything
> other than pure invention. If your advisor does not see that, they are in
> the wrong profession.