Re: new (small) shortener campaign & suggestion for URLRedirect

Jonas Eckerman Mon, 01 Mar 2010 09:46:09 -0800

I think I'm misunderstanding something, but I'm not sure what.


Please tell me why I'm confused. :-)

On 2010-02-24 11:30, Chip M. wrote:

> Jonas, do you have any performance and/or efficacy stats for your
> URLRedirect plugin?

Unfortunately, no. I am logging info from it (to the general mail log), but I haven't put anything together to analyze the logs.

> My main concern with doing real-time HTTP HEADs is performance.

Well... Since the URLRedirect plugin inserts the redirection targets into the message's metadata so that other plugins (such as URIBL, for example) can see those targets, the HEAD requests must be done before other rules.

Currently the plugin only caches the results in a simple Perl structure, and in a per-instance manner. Putting the cache in shared memory or a database is an obvious improvement that should be done. This could make a big difference at high-volume sites. Two reasons I haven't done much with the plugin, except adding support for Marc Perkel's URL shortener DNS list, are that I have no idea whether anyone except me actually uses the plugin, and I haven't seen much URL shortener spam since I made it.
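
To illustrate the database-backed cache idea, here is a minimal sketch in Python (not the plugin's Perl code). The class name `RedirectCache`, the table layout and the one-hour expiry are all assumptions made for the example. Pointing several processes at the same SQLite file would let them share results; the `:memory:` path used in testing is private to one process.

```python
import sqlite3
import time


class RedirectCache:
    """Hypothetical shared cache: short URL -> redirect target."""

    def __init__(self, path, ttl_seconds=3600):
        # ttl_seconds is an assumed expiry, not a value from the plugin.
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS redirects ("
            "url TEXT PRIMARY KEY, target TEXT, stored_at REAL)"
        )
        self.db.commit()

    def get(self, url, now=None):
        """Return the cached target, or None if missing or expired."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT target, stored_at FROM redirects WHERE url = ?", (url,)
        ).fetchone()
        if row is None or now - row[1] > self.ttl:
            return None
        return row[0]

    def put(self, url, target, now=None):
        """Store or refresh the target for a URL."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO redirects (url, target, stored_at) "
            "VALUES (?, ?, ?)",
            (url, target, now),
        )
        self.db.commit()
```

A caller would check `get()` before issuing a HEAD request and call `put()` afterwards. This sketch does not distinguish "never looked up" from "looked up, no redirect"; a real cache would probably want to record negative results too.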

Other than the lack of a shared cache, the plugin shouldn't be much of a performance problem. It does the HEAD requests in parallel (when possible), and has a runtime-definable timeout (default is 10 seconds; I do not recommend raising that). It also has limits on the number of requests done for one message.

I would like to do the requests in parallel with other processing as well, but that's hard. It has to insert the targets into metadata before any rules or plugins that use URIs, and I don't know of a way to do that other than having it run its course before the regular rules.


> Jonas, I've been thinking that if you embedded the SA spam score in
> the HTTP request's Agent, that would provide BitLy/et-al with
> extremely useful data, which should improve their detection rate
> (if they choose to use the extra data).

This would require the plugin to know the score before SpamAssassin has calculated it, which is kind of difficult. I have not found any good algorithm implementing that kind of prescience. :-)

It could insert the score(s) (or score aggregates) of *previous* messages into the user agent, but I think a general score aggregate system not tied to URL shortener services would be more suited for that. I, for example, might be interested in aggregate scores for mails referring to our web sites even though they are not URL shorteners.


> You could also include the recipient's domain, which may help them
> to correlate data.

I'm not sure what they would do with that, but again I think that kind of thing fits more into a general report framework than in a very specialized SpamAssassin plugin such as URLRedirect.


> I'm also wondering about using UDP to send a quick real-time,
> no-response-needed message (instead of a high overhead HTTP
> request), then (mostly) auto-quarantine, and later in a separate,
> batch queue, do a proper HTTP Head.  Anything that's clean after a
> certain amount of time, could be automatically re-injected back
> into the main queue.  That would allow pooling of requests, and
> shift the load from the main email gateway.

I really don't understand this idea. What would the UDP packet contain? What would it be good for?

You can do a SpamAssassin run in a batch queue already. The URLRedirect plugin doesn't care if SpamAssassin runs in a queue job, at delivery, in the receiving MTA, or wherever. When to call SA, which is what calls URLRedirect's methods, is completely outside the plugin's control.

If you want to queue some mails for later SA checking, you should do that without having to run the same SA as you want to avoid running. You could do that by having some faster, leaner thing check the mail and decide to either send it to SA directly or to the slow queue. You could also run two SA configs where one has just about everything (including default rules) disabled. That SA could have rules used to determine whether to send the mail on to the normal SA or the slow queue.

In any case, it's hard for an SA plugin to decide whether SA should have been called or not, so this has to be done with more than just an SA plugin.

The only impact of the URLRedirect plugin I can see in this is that you could use it to just see if there is an identified redirector URL in the message and use that to decide in the first (lean and fast) SA run in the two-SA scenario above. If this is what you meant, I could easily implement an option to just identify redirectors rather than actually testing them with HEAD requests.
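
As a rough illustration of that "identify only" check, the lean first pass could be as simple as the Python sketch below. It is not an existing URLRedirect option; `KNOWN_REDIRECTORS`, `redirector_urls` and `route` are hypothetical names, and the two-entry domain set is only a stand-in for whatever list a site maintains (or for a DNS list lookup).

```python
import re
from urllib.parse import urlsplit

# Illustrative stand-in for a real redirector list.
KNOWN_REDIRECTORS = {"bit.ly", "tinyurl.com"}

# Deliberately simple URL matcher; real mail parsing needs more care.
URL_RE = re.compile(r"https?://[^\s<>\"')]+", re.IGNORECASE)


def redirector_urls(body, redirectors=KNOWN_REDIRECTORS):
    """Return URLs in the text whose host is a known redirector."""
    found = []
    for url in URL_RE.findall(body):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            continue  # malformed URL, skip it
        if host in redirectors:
            found.append(url)
    return found


def route(body, redirectors=KNOWN_REDIRECTORS):
    """Choose a queue: 'slow' if any redirector URL is present, else 'normal'."""
    return "slow" if redirector_urls(body, redirectors) else "normal"
```

No HTTP traffic happens here, so the check stays cheap; the HEAD requests would only be made later, in the full run fed from the slow queue.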


> The UDP part would be pointless without cooperation from the
> shortener services.  If they do embrace it, they can use the UDP
> data to more quickly identify most bad links, and have that all
> ready for when we send out HTTP requests.

I don't get this either. How would the UDP requests help them find bad links? How would it help them distinguish between a spamvertized URL and one referenced in a legitimate message to a high-traffic mailing list or newsgroup and then quoted in replies for a month or so?

They do need to have all working redirects ready at all times anyway for all regular browsers, and the non-working redirects should return error codes at any time. So I'm not sure what it is you mean they should have ready for our HEAD requests.


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/
