On Mon, Jul 26, 2010 at 12:53:33PM -0500, Jared Johnson wrote:
> Do you happen to remember whether at the time you were looking at real
> spam volume, or just a possible attack?

This was based on actual spam, although it would have been reactive
testing based on spam that escaped detection, not any statistical
assessment.

> My latest effort is still a bit
> in question; (1) http://domain;.jkl and http://domain/ are just as
> possible to trigger the implied-.com behavior; (2) this behavior seems to
> only be triggered when typing on the URL bar, and *not* when clicking a
> link.

At the time the feature was added, spammers were escalating their
attacks on URI matchers, which had by that point implemented tests for
the full range of legal URL obfuscation tricks. There was a period
through there where the spam attempted to induce the recipient to cut
and paste into the browser's URL bar. That may well still be going on,
though I don't recall seeing it recently. I recall seeing some spam that
obfuscated the URL beyond browser interpretation, but included some
extra instructions for human de-munging.

"Clicking on a link" may vary with the particular combinations of
OS/mailreader/browser involved -- opening a link from a webmail app
almost certainly won't do implied TLD or auto-searching, whereas native
readers may behave differently. Firefox on Linux does the extra
interpretation on commandline-supplied URLs, but other systems may vary.

> My colleagues are in favor of just dropping the test, unless
> there's actually reason to believe we will see spammer URI's that try to
> take advantage of this. If so, it would probably be best to detect what
> they're trying to obfuscate based on actual data, since that's really the
> important thing -- if the spammer thinks spammerdomain;.net will go to
> spammerdomain.net, we should check spammerdomain.net, even if in reality
> it would go to spammerdomain.com :)

I won't be unhappy if you take it out -- since you're working from a
spam corpus and I'm working from memory of spam from ages ago, a
statistical sense of whether this type of attack is used anymore would
be more valuable.

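If the test does stay in some form, a rough sketch of the "check what the
spammer meant, and also what the browser might guess" idea could look like
the following. This is illustrative Python only, not the existing code:
the function name candidate_domains is made up, and both the "strip
anything that isn't a hostname character" rule and the "first label plus
.com" guess are assumptions about the munging and about the browser's
implied-.com behavior, not measured behavior.

```python
import re

# Assumption: anything outside [a-z0-9.-] in the host is spammer munging.
_ILLEGAL = re.compile(r"[^a-z0-9.-]")


def candidate_domains(raw_host):
    """Return the domains worth checking for a possibly munged host string."""
    host = raw_host.strip().lower()

    # What the spammer presumably intended: drop the junk characters,
    # collapse any doubled dots, and trim leading/trailing dots.
    cleaned = _ILLEGAL.sub("", host)
    cleaned = re.sub(r"\.{2,}", ".", cleaned).strip(".")

    candidates = []
    if cleaned:
        candidates.append(cleaned)

    # What a browser's implied-.com guess might produce (assumption):
    # everything before the first illegal character or dot, plus ".com".
    first_label = re.split(r"[^a-z0-9-]", host, maxsplit=1)[0]
    if first_label:
        implied = first_label + ".com"
        if implied not in candidates:
            candidates.append(implied)

    return candidates
```

With this, candidate_domains("spammerdomain;.net") yields both
spammerdomain.net and spammerdomain.com, so both get looked up regardless
of which one a given browser ends up visiting.
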
Devin

-- 
Devin  \ aqua(at)devin.com, IRC:Requiem; http://www.devin.com
Carraway \ 1024D/E9ABFCD2: 13E7 199E DD1E 65F0 8905 2E43 5395 CA0D E9AB FCD2