Re: [pmacct-discussion] HTTP Virtual Hosts classification

Paolo Lucente Thu, 19 Feb 2009 09:27:55 -0800

Hi Chris,

While on the topic of "hardening" the classification process, let me share
a couple of additional thoughts. Hardening the classifier can be a strict
requirement in some scenarios, while it is merely a nice-to-have feature
in others.


In your case I fully agree with you, as you are sitting in the middle and
neither end-point is trusted; things might be slightly different for a
company which provides hosting for virtual domains, as at least the server
end-point is trusted and under its direct control.

In any case, if anybody has time to review the classification
implementation in pmacct, their views, criticism and comments are
absolutely welcome. The approach is kept as stateless as possible, and a
couple of mechanisms are in place (which I'm fully aware have both pros
and cons):

* multiple attempts are made to classify a stream; this is done precisely
to avoid having to reassemble packets and fragments. The number of attempts
is 5 by default, is configurable via the 'classifier_tentatives' directive,
and applies per uni-directional flow.

* once a flow is successfully classified, packets belonging to it are no
longer passed through the classifier, which means the flow can't be
re-classified as something different because another pattern accidentally
matched.
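
The two mechanisms above can be illustrated with a minimal sketch. This is
not pmacct's actual code: the flow key, the pattern table and all names
(classify_packet, FlowState, PATTERNS) are hypothetical, and it assumes that
every packet seen on an unclassified flow counts as one attempt.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key for one uni-directional flow.
FlowKey = Tuple[str, str, int, int]  # (ip_src, ip_dst, src_port, dst_port)

MAX_ATTEMPTS = 5  # mirrors the stated default of 'classifier_tentatives'

# Hypothetical patterns, matched against single packet payloads only
# (no stream reassembly).
PATTERNS = {
    "http": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "ssh": re.compile(rb"^SSH-\d\.\d"),
}


@dataclass
class FlowState:
    attempts: int = 0
    label: Optional[str] = None


flows: Dict[FlowKey, FlowState] = {}


def classify_packet(key: FlowKey, payload: bytes) -> Optional[str]:
    """Return the flow's class, trying the patterns only while allowed."""
    state = flows.setdefault(key, FlowState())
    # Already classified, or attempts exhausted: bypass the classifier.
    if state.label is not None or state.attempts >= MAX_ATTEMPTS:
        return state.label
    state.attempts += 1
    for name, pattern in PATTERNS.items():
        if pattern.search(payload):
            state.label = name  # locked in; later packets cannot change it
            break
    return state.label
```

In this sketch a flow whose first packet looks like an HTTP request stays
"http" even if a later packet happens to start with "SSH-", and a flow that
fails to match within five packets is left unclassified.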

Cheers,
Paolo


On Wed, Feb 18, 2009 at 05:27:01PM +0000, Chris Wilson wrote:

> I have thought about doing this as well. The main problem that I had with 
> using classifiers is that I ultimately would have to implement a TCP 
> engine to reassemble the stream from packets (perhaps the one in snort can 
> be borrowed?). Otherwise the Host: header could (accidentally or 
> deliberately) be split across multiple packets. There is plenty of 
> opportunity for exploitation here as well, e.g. multiple Host: headers, 
> invalid characters in headers, packets that look like HTTP requests in the 
> middle of streams, bad Content-Lengths, etc.
> 
> What I was planning to do, but have not done yet, is to:
> 
> * force everyone to use a HTTP proxy (transparent or not) so that dealing 
> with malicious requests becomes someone else's problem;
> 
> * use the HTTP proxy's logging features to capture the full details of 
> both requests (inbound to proxy and outbound from proxy) along with the 
> requested URI and current time;
> 
> * save all this in a separate table in the database;
> 
> * left join from pmacct's acct_v* table to the proxy table on the unique 
> quadruple (ip_src,ip_dst,src_port,dst_port) and time.
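
The join described in the last step could look roughly like the sketch
below. It is only an illustration: the tables are heavily simplified, the
proxy_log table, its logged_at and uri columns, and the 60-second matching
window are all assumptions, and SQLite is used purely to keep the example
self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in for one of pmacct's acct_v* tables.
CREATE TABLE acct (
    ip_src TEXT, ip_dst TEXT, src_port INTEGER, dst_port INTEGER,
    stamp_inserted INTEGER, bytes INTEGER
);
-- Hypothetical table filled from the proxy's logs.
CREATE TABLE proxy_log (
    ip_src TEXT, ip_dst TEXT, src_port INTEGER, dst_port INTEGER,
    logged_at INTEGER, uri TEXT
);
""")

conn.executemany(
    "INSERT INTO acct VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("10.0.0.5", "192.0.2.10", 40001, 3128, 1000, 5120),
        ("10.0.0.5", "192.0.2.10", 40002, 3128, 1000, 900),
    ],
)
conn.execute(
    "INSERT INTO proxy_log VALUES (?, ?, ?, ?, ?, ?)",
    ("10.0.0.5", "192.0.2.10", 40001, 3128, 1010, "http://example.org/"),
)

# Left join on the quadruple plus an assumed 60-second accounting window,
# so that flows without a matching proxy entry are still reported.
QUERY = """
SELECT a.ip_src, a.src_port, a.bytes, p.uri
FROM acct AS a
LEFT JOIN proxy_log AS p
       ON p.ip_src = a.ip_src
      AND p.ip_dst = a.ip_dst
      AND p.src_port = a.src_port
      AND p.dst_port = a.dst_port
      AND p.logged_at BETWEEN a.stamp_inserted AND a.stamp_inserted + 59
ORDER BY a.src_port
"""
rows = conn.execute(QUERY).fetchall()
```

Because it is a left join, accounting rows with no matching proxy entry are
kept, with the URI left as NULL.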
> 
> This was appropriate for my situation as I wanted everyone to use a 
> caching proxy anyway to save bandwidth, and hopefully to authenticate. 
> However I discovered that Squid's logging formats do not provide all the 
> information that I needed to reliably match up the connection (no client 
> port, see http://www.visolve.com/squid/squid30/logs.php#logformat).
> 
> The external ACL program does have enough information for this
> (http://www.visolve.com/squid/squid30/externalsupport.php#external_acl_type), 
> so writing a program to run as an external ACL helper and log the 
> information to the database is a possibility. 
> 
> In our case this also was not good enough, as it does not tell us whether 
> the request will be served from the cache or not, and therefore does not 
> correspond to the client's real bandwidth usage.
> 
> I would be very interested to see what you do in this space.
> 
> Cheers, Chris.
> -- 
> Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
> 
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
