Re: The "goo.gl" shortner is OUT OF CONTROL (+ invaluement's response)

Rob McEwen Tue, 03 Apr 2018 08:22:29 -0700

On 4/3/2018 9:27 AM, Leandro wrote:

We just created an URL signature algorithm to be able to query an entire URL at our URIBL:
https://spfbl.net/en/uribl/

Now we are able to blacklist any malicious shortener URL



Leandro,

Thanks for all you do! And good luck with that. But there are a few potential problems. When I analyzed Google's shorteners about a month ago, I found that a VERY large percentage of the most malicious shortened URLs were cases where the spammers were generating a unique shortener for each individual message/recipient-address. This causes the following HUGE problems (at least for THESE particular shorteners) when publishing a full-URL dnsbl:

(1) much of what you populate your rbldnsd file with is going to be totally ineffective for anyone since it ONLY applies to the single email address where the spam was originally sent (where you had trapped it) - everyone else is going to get DIFFERENT shorteners for the spam from these same campaigns that is sent to their users.

(2) get ready for EXTREME rbldnsd bloat. You're gonna need a LOT of RAM eventually. And if you ever distribute via rsync, those are going to be HUGE rsync files (and then THEY will need a lot of RAM). Sadly, most of that bloat is going to come from entries that are doing absolutely nothing for anyone.

(3) You might be revealing your spam traps to the spammers. In cases where the spammers are sending that 1-to-1 spam with single-recipient shorteners, all they have to do is enumerate through their list of shorteners, checking them against your list - and they INSTANTLY get a list of every recipient address that triggers a listing on your DNSBL. If you want to destroy the effectiveness of your own DNSBL's spam traps - be my guest. But if you're getting 3rd party spam feeds (paid or free) - then know that you're screwing over your 3rd party spam feed's spam traps - and those OTHER anti-spam systems that rely on such feeds, which will then diminish in quality. (unless you are filtering OUT these MANY 1-to-1 shortener spams)

Maybe there are enough OTHER shorteners (ones where the same shortener is sent to multiple recipients) to make this worthwhile? But the bloat from the ones that are uniquely generated could be a challenge, and could potentially cause a MASSIVE amount of useless queries. I'd be very interested to see what PERCENTAGE of such queries generated a hit!

Meanwhile, in the analysis I did about a month ago, about 80% of Google's shorteners found in egregious spams (that did this one-to-one shortener-to-recipient tactic)... were all banging on one of ONLY a dozen different spammers' domains. Therefore, doing a lookup on these and then checking the domain found at the base of the link it redirects to... is a more effective strategy for these - whereas, for THESE 80% of egregious google shorteners, a full URL lookup is worthless, consuming resources without a single hit.
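To illustrate the "check the domain at the base of the redirect target" idea, here is a minimal Python sketch. It assumes the redirect target (the Location header returned by the shortener) has already been fetched by some other means. The function names and the listed domains are hypothetical placeholders, and the base-domain logic is deliberately naive (last two labels only), so a real implementation would want a public-suffix-aware library.

```python
from urllib.parse import urlsplit

# Hypothetical placeholder entries, NOT real listings.
LISTED_DOMAINS = {"spammer-one.example", "spammer-two.example"}


def base_domain(url):
    """Return a naive base domain: the last two labels of the hostname.

    Assumption: good enough for illustration only; it mishandles
    multi-label suffixes such as "co.uk".
    """
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def redirect_target_is_listed(location, listed=LISTED_DOMAINS):
    """Check the domain a shortener redirects to against a domain list."""
    return base_domain(location) in listed
```

With this approach, a dozen domain entries can cover any number of per-recipient shorteners that all redirect to those same domains.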

Alternatively, you may have found a way to filter out these types of individualized shorteners, to prevent that bloat? But even then, while your new list might be helpful, it would be good for others to know that it isn't applicable to a large percentage of spammy shorteners, since it is still useless against these individualized shorteners.
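One possible way to filter out individualized shorteners, sketched in Python under stated assumptions: only list a shortened URL once it has been seen in spam sent to more than one distinct recipient. The function name, the input shape (pairs of short URL and trapped recipient address), the example codes and the threshold of 2 are all hypothetical choices for illustration, not anything SPFBL is known to do.

```python
from collections import defaultdict

MIN_RECIPIENTS = 2  # assumed threshold, tune to taste


def shared_shorteners(sightings, min_recipients=MIN_RECIPIENTS):
    """Return the short URLs seen by at least min_recipients distinct recipients.

    sightings: iterable of (short_url, recipient_address) pairs taken
    from trapped spam.
    """
    recipients = defaultdict(set)
    for short_url, recipient in sightings:
        recipients[short_url].add(recipient.lower())
    return {url for url, seen in recipients.items() if len(seen) >= min_recipients}


sightings = [
    ("https://goo.gl/AbC123", "trap1@example.com"),
    ("https://goo.gl/AbC123", "trap2@example.com"),
    ("https://goo.gl/xYz789", "trap3@example.com"),  # 1-to-1, never listed
]
listable = shared_shorteners(sightings)
```

Here only the first short URL would be listed. The one-to-one entry stays out of the zone, which avoids both the bloat and the spam trap exposure described in (3).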

NOTE: Google has made some improvements recently, and I haven't yet analyzed how much those improvements have changed any of the things I've mentioned.

PS - the alphanumeric code at the end of these shorteners tends to be case-sensitive, while the rest of the URL is NOT case sensitive (and they also work with both "https" and "http")... so you might want to standardize this on (1) https and (2) everything lower case up until the code at the end of the shortener - before the MD5 is calculated. Otherwise, it could easily break if the spammer just mixes up the capitalization of the shortener URL up until the code at the end of the shortener.
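A minimal Python sketch of that normalization, assuming the input URL includes its scheme and that only the scheme, host and path matter (port, query string and fragment are dropped). The function names are hypothetical, and this is not the actual SPFBL signature algorithm, just the two normalization steps followed by an MD5:

```python
import hashlib
from urllib.parse import urlsplit


def normalize_short_url(url):
    """Force https and a lower-case host; keep the code's case untouched."""
    parts = urlsplit(url.strip())
    host = parts.hostname or ""  # .hostname is already lower-cased
    return "https://" + host + parts.path  # path holds the case-sensitive code


def url_signature(url):
    """MD5 hex digest of the normalized URL."""
    return hashlib.md5(normalize_short_url(url).encode("utf-8")).hexdigest()
```

With this, "HTTP://Goo.GL/AbC123" and "https://goo.gl/AbC123" produce the same signature, while "https://goo.gl/abc123" (a different code) does not.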


--
Rob McEwen
https://www.invaluement.com
