I have uploaded version 0.07 of my logtools package to unstable. It includes the new clfdomainsplit program, which splits a web log file containing data for a large number of domains into separate per-domain files.
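For anyone curious about the general approach, here is a minimal sketch of that kind of splitting in Python (not the actual clfdomainsplit code, which is part of logtools). It assumes a vhost-style log where the domain name is the first whitespace-separated field of each line; note that it keeps one file handle open per domain, which is exactly why the per-process file handle limit matters.

```python
from pathlib import Path

def split_by_domain(lines, outdir):
    """Write each log line to a per-domain file in outdir.

    Keeps one open handle per domain for the lifetime of the run,
    so the number of distinct domains is bounded by the process's
    open-file limit (hypothetical sketch, not the logtools code).
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    handles = {}  # domain name -> open file object
    try:
        for line in lines:
            # Assumption: the virtual host name is the first field.
            domain = line.split(None, 1)[0]
            fh = handles.get(domain)
            if fh is None:
                fh = open(outdir / domain, "a")
                handles[domain] = fh
            fh.write(line)
    finally:
        for fh in handles.values():
            fh.close()
```

In practice you would feed it the log on stdin, e.g. `split_by_domain(sys.stdin, "split-logs")`. A real tool would have to close and reopen handles (or make multiple passes) once the domain count approaches the file handle limit.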
The program can only split a log file across as many domains as it can open file handles for. The last time I tested this, a pre-2.4.0 kernel imposed a limit of about 80,000 files per process; on 2.0.x machines the limit was 1024 file handles per process (including stdin, stdout, and stderr). I am working on this issue.

Also, I have not tested this program much because I don't yet have a web server with a large number of domains (I'll set up the web server after I've written all the other support programs). It has passed some small tests with made-up data but has not been tested in the field yet.

Have fun and let me know how it works for you!

--
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page