Hi Chris,

While on the topic of "hardening" the classification process, let me share a couple of additional thoughts. Hardening the classifier can be a strict requirement in some scenarios, while it is merely a nice-to-have feature in others.
In your case I fully agree with you, as you are sitting in the middle and both end-points are untrusted from your point of view; things might be slightly different for a company which provides hosting for virtual domains, as at least the server end-point is trusted and under its direct control.

In any case, if somebody has time to review the classification implementation in pmacct, their views, criticism and comments are absolutely welcome. The approach is kept as stateless as possible and a couple of mechanisms are in place (which I'm fully aware might have pros and cons):

* multiple attempts ("tentatives") are made to classify a stream; this is done precisely to avoid having to reassemble packets and fragments. The default is 5 attempts per uni-directional flow, configurable via the 'classifier_tentatives' directive.

* once the flow is successfully classified, packets belonging to it are not passed through the classifier anymore, which simply means the flow can't be re-classified as something different because another pattern accidentally matched.

Cheers,
Paolo

On Wed, Feb 18, 2009 at 05:27:01PM +0000, Chris Wilson wrote:
> I have thought about doing this as well. The main problem that I had with
> using classifiers is that I ultimately would have to implement a TCP
> engine to reassemble the stream from packets (perhaps the one in snort can
> be borrowed?). Otherwise the Host: header could (accidentally or
> deliberately) be split across multiple packets. There is plenty of
> opportunity for exploitation here as well, e.g. multiple Host: headers,
> invalid characters in headers, packets that look like HTTP requests in the
> middle of streams, bad Content-Lengths, etc.
>
> What I was planning to do, but have not done yet, is to:
>
> * force everyone to use an HTTP proxy (transparent or not) so that dealing
>   with malicious requests becomes someone else's problem;
>
> * use the HTTP proxy's logging features to capture the full details of
>   both requests (inbound to proxy and outbound from proxy) along with the
>   requested URI and current time;
>
> * save all this in a separate table in the database;
>
> * left join from pmacct's acct_v* table to the proxy table on the unique
>   quadruple (ip_src,ip_dst,src_port,dst_port) and time.
>
> This was appropriate for my situation as I wanted everyone to use a
> caching proxy anyway to save bandwidth, and hopefully to authenticate.
> However I discovered that Squid's logging formats do not provide all the
> information that I needed to reliably match up the connection (no client
> port, see http://www.visolve.com/squid/squid30/logs.php#logformat).
>
> The external ACL program does have enough information for this
> (http://www.visolve.com/squid/squid30/externalsupport.php#external_acl_type),
> so writing a program to run as an external ACL helper and log the
> information to the database is a possibility.
>
> In our case this also was not good enough, as it does not tell us whether
> the request will be served from the cache or not, and therefore does not
> correspond to the client's real bandwidth usage.
>
> I would be very interested to see what you do in this space.
>
> Cheers, Chris.
> --
> Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
>
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
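
The two classifier mechanisms described in the reply at the top of this message (a bounded number of attempts per uni-directional flow, and no re-classification once a flow has matched) can be sketched roughly as below. This is only an illustration under stated assumptions, not pmacct's actual implementation: the names `FlowClassifier`, `FlowState`, `observe` and the regular expressions in `PATTERNS` are hypothetical, and it assumes that classification stops for a flow once its attempts are used up. Only the default of 5 attempts comes from the text.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# One direction of a conversation: (ip_src, ip_dst, src_port, dst_port).
FlowKey = Tuple[str, str, int, int]

# Mirrors the default of the 'classifier_tentatives' directive (5).
MAX_ATTEMPTS = 5

# Hypothetical payload patterns, for illustration only.
PATTERNS = {
    "http": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "ssh": re.compile(rb"^SSH-\d\.\d"),
}


@dataclass
class FlowState:
    attempts: int = 0            # packets already passed to the classifier
    label: Optional[str] = None  # set once, never changed afterwards


class FlowClassifier:
    def __init__(self, max_attempts: int = MAX_ATTEMPTS) -> None:
        self.max_attempts = max_attempts
        self.flows: Dict[FlowKey, FlowState] = {}

    def observe(self, key: FlowKey, payload: bytes) -> Optional[str]:
        """Return the flow's label, trying to classify it if still allowed."""
        state = self.flows.setdefault(key, FlowState())

        # Already classified, or attempts exhausted (assumed behaviour):
        # the packet is not passed through the patterns at all.
        if state.label is not None or state.attempts >= self.max_attempts:
            return state.label

        # Each packet is matched on its own; no stream reassembly is done.
        state.attempts += 1
        for name, pattern in PATTERNS.items():
            if pattern.search(payload):
                state.label = name
                break
        return state.label
```

Because the key is uni-directional, the reverse direction of the same connection gets its own attempt counter and its own label. Note that a `Host:` header split across packets would still defeat a per-packet match like this one, which is exactly the trade-off discussed in the thread.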
