Archiving all mail for a domain

2001-05-07 Thread Sanjeev Gupta
Folks,

I am trying to archive all mail that exim 3.12 sees for a domain, either
incoming or outgoing.  I have done simpler things with system-wide filters
and exim before, so I should be able to divert mail, although gotchas will
be appreciated.

What I would request is your preferences in the actual archiving software.
What the customer wants is to ask me "Can you search ex-employee's mail
for anything from customer 1"?  What I want is to use something like a
mailing-list archive, to store all mail, and let the Customer search with
a web interface.  Would mhonarc be suitable?  I don't want the software to
send any mail, just provide a "search" interface, which understands mail
headers, attachments, etc.  I have also seen hypermail, which seems OK.

Secondly, how would you suggest I hand the mail off to the software?  A
pipe from the exim filter?  Create a file, and once a day, feed it in?
Volumes are about 500 messages a day, some with large attachments.

Suggestions and experiences will be greatly, greatly, appreciated.

--
Sanjeev "ghane" Gupta    Mob: +65 98551208
dotXtra Pte Ltd          Fax: +65 2275776
Singapore                email: [EMAIL PROTECTED]
~~





Re: Archiving all mail for a domain

2001-05-07 Thread Tamas TEVESZ
On Mon, 7 May 2001, Sanjeev Gupta wrote:

 > I am trying to archive all mail that exim 3.12 sees for a domain, either
 > incoming or outgoing.  I have done simpler things with system-wide filters

you can do it for incoming mail with a shadow_transport option on the
local_delivery transport. you can't use this for outgoing messages, as
a shadow_transport is only appropriate in a local transport context
(read the spec on this). i've done the incoming one, and am open to
suggestions on how to achieve this for outgoing mail ;)

 > What I would request is your preferences in the actual archiving software.
 > What the customer wants is to ask me "Can you search ex-employee's mail
 > for anything from customer 1"?  What I want is to use something like a
 > mailing-list archive, to store all mail, and let the Customer search with

i'd use a simple  to archive
the mails, and use mailgrep (grepmail, whatever its name is).
construct a web interface to that, if you need web. dunno, i'm not
really satisfied with list-archivers wrt their searching capabilities,
but this might be rooted in the fact that i'm not very familiar with
these tools.
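
A minimal sketch of the "grep the archive" idea, for the case where every
message is stored in its own file. It is not grepmail itself; the function
name, the directory layout and the example address are assumptions made
for illustration. It reads only the headers of each file and matches a
fragment against the From header.

```python
from email import policy
from email.parser import BytesHeaderParser
from pathlib import Path


def find_messages_from(archive_dir, sender_fragment):
    """Return (file name, subject) for messages whose From header matches.

    Assumes one message per file directly inside archive_dir
    (a hypothetical layout, not something the thread specifies).
    """
    needle = sender_fragment.lower()
    # The header parser stops interpreting the message at the end of the
    # headers, so attachments are not decoded into MIME parts.
    parser = BytesHeaderParser(policy=policy.default)
    hits = []
    for path in sorted(Path(archive_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            msg = parser.parse(fh)
        sender = str(msg.get("From", ""))
        if needle in sender.lower():
            hits.append((path.name, str(msg.get("Subject", ""))))
    return hits
```

A web front end could call something like
find_messages_from("/var/archive/example", "customer1.example") and render
the result; both arguments here are placeholders.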

 > Secondly, what would you suggest I do the handoff to the software as?  A
 > pipe from the exim filter?  Create a file, and once a day, feed it in?

shadow transport seems (and works, so far, using it for half a year or
so) fine for me. just make very sure your script puts every message into a
different file (i've adapted a maildir-like approach with final file
names containing the file's inode number.  unbeatable for local disks,
won't work (properly? no experience) on nfs volumes, and once again
the concept is trashed if you move your archives back and forth
between different partitions (the actual implementation may survive,
though :)
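
The one-file-per-message idea could look roughly like the sketch below.
This is not the script described above, only an illustration: the
function name, the tmp/ and new/ subdirectories and the file name pattern
are assumptions. The message is written under a temporary name first and
then renamed to a name that contains its inode number, which no other
live file on the same filesystem can share.

```python
import os
import tempfile
import time


def store_message(data, archive_dir):
    """Write one message (bytes) to its own file and return the final path."""
    tmp_dir = os.path.join(archive_dir, "tmp")
    new_dir = os.path.join(archive_dir, "new")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)

    # mkstemp creates the file exclusively, so two writers never share it.
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())          # make sure the data is on disk
        inode = os.fstat(fh.fileno()).st_ino

    # tmp/ and new/ must be on the same filesystem: rename is then atomic
    # and the inode number (and hence the name) stays valid.
    final_name = "%d.I%d" % (int(time.time()), inode)
    final_path = os.path.join(new_dir, final_name)
    os.rename(tmp_path, final_path)
    return final_path
```

A script fed by a pipe could pass sys.stdin.buffer.read() as data. As
noted above, the names stop being meaningful once files are copied to
another partition, because the copies get new inode numbers.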


-- 
[-]
"cvs szerver a weben = olyan cvs szerver amit az interneten keresztul
 cvs szerverkent el lehet erni" -- <[EMAIL PROTECTED]>




log files for thousands of domains

2001-05-07 Thread Russell Coker
I have uploaded version 0.07 of my logtools package to unstable.  It
includes the new clfdomainsplit program, which splits a web log file containing
data from large numbers of domains into separate files.

This program has a limit that it can only split log files for as many domains 
as it can open file handles.  Last time I tested this on a pre-2.4.0 kernel 
that imposed a limit of about 80,000 files per process.  On 2.0.x machines 
the limit was 1024 file handles per process (including stdin, stdout, and 
stderr).  I am working on this issue.
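
One general way around a per-process file handle limit is to keep only a
bounded number of output files open and reopen evicted ones in append
mode. The sketch below illustrates that technique; it is not how
clfdomainsplit works or is planned to work. The class and function
names, the ".log" suffix and the input format (domain as the first
whitespace-separated field of each line) are all assumptions.

```python
import os
from collections import OrderedDict


class HandleCache:
    """Keep at most max_open output files open, evicting the least recently used."""

    def __init__(self, out_dir, max_open=256):
        self.out_dir = out_dir
        self.max_open = max_open
        self.handles = OrderedDict()

    def write(self, domain, line):
        fh = self.handles.get(domain)
        if fh is None:
            if len(self.handles) >= self.max_open:
                # Close the handle that has gone unused the longest.
                _, oldest = self.handles.popitem(last=False)
                oldest.close()
            # Append mode, so a file evicted earlier keeps its old lines.
            fh = open(os.path.join(self.out_dir, domain + ".log"), "a")
            self.handles[domain] = fh
        else:
            self.handles.move_to_end(domain)
        fh.write(line)

    def close(self):
        for fh in self.handles.values():
            fh.close()
        self.handles.clear()


def split_by_domain(lines, out_dir, max_open=256):
    """Append each line to out_dir/<domain>.log (assumed input format)."""
    cache = HandleCache(out_dir, max_open)
    try:
        for line in lines:
            fields = line.split(None, 1)
            if not fields:
                continue
            domain = fields[0].lower()
            # Refuse names that could escape out_dir.
            if "/" in domain or domain.startswith("."):
                continue
            cache.write(domain, line)
    finally:
        cache.close()
```

The cost is extra open and close calls when the input jumps between more
domains than max_open; because files are opened in append mode, the
output directory should be empty before a run.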

Also I have not tested this program much because I don't yet have a web 
server with a large number of domains (I'll set up the web server after I've 
written all the other support programs).  It has passed some small tests with 
made-up data but has not been tested in the field yet.

Have fun and let me know how it works for you!

-- 
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/   Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page




Webalizer and net-acct differences

2001-05-07 Thread Andreas Rabus

hi,

for our accounting i tried to write a script that uses net-acct (a
user-space daemon that logs all network traffic over a net device) to collect
the web traffic for our customer.
Until now we have used webalizer and read the monthly sums from its report, but
that isn't a nice job, so i tried this script with net-acct.
But...
Webalizer and net-acct report different numbers for the same time period.
The difference is about 10%.
a few differences are clear:
- Webalizer counts from midnight to midnight, my script runs from cron.daily
at (about) 6:25.
- net-acct also sees the packets of the incoming HTTP requests (but they are now
ignored by my script)
- There are the HTTP headers, too: a few lines sent before every object.
(how much is that in relation to the object itself?)

Does anybody know what is logged in the size field of the Apache log (i.e.
including the HTTP headers or not), and what net-acct takes as the size of a
packet (just the payload, or the headers too)?

Thanks in advance,

ar

-- 

[ampersand online agentur]
[andreas rabus]
[programmierung]

theresienstraße 29 / IV
80333 münchen
tel 0 89 - 28 67 72 - 27
fax 0 89 - 28 67 72 - 21
[EMAIL PROTECTED]
http://www.ampersand.de





Re: Webalizer and net-acct differences

2001-05-07 Thread Haim Dimermanas

> Does anybody know what is logged in the size field of the Apache log (i.e.
> including the HTTP headers or not), and what net-acct takes as the size of a
> packet (just the payload, or the headers too)?

From the Apache docs @
http://httpd.apache.org/docs/mod/mod_log_common.html

bytes 
The number of bytes in the object returned to the client, not including
any headers.

Haim.
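
Since the logged size excludes headers, a rough comparison with a packet
counter has to add the headers back in. The sketch below sums the bytes
field of Common Log Format lines and estimates what share of the total
the response headers would make up. The function names are made up for
illustration, and the average header size is a guess the caller has to
supply; nothing in the thread gives a value for it.

```python
import re

# host ident user [date] "request" status bytes
CLF_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "(?:[^"\\]|\\.)*" (\d{3}) (\d+|-)')


def sum_logged_bytes(lines):
    """Return (responses, body_bytes) for lines in Common Log Format.

    A "-" in the bytes field counts as zero; unparsable lines are skipped.
    """
    responses = 0
    body_bytes = 0
    for line in lines:
        match = CLF_RE.match(line)
        if match is None:
            continue
        responses += 1
        if match.group(2) != "-":
            body_bytes += int(match.group(2))
    return responses, body_bytes


def header_share(body_bytes, responses, avg_header_bytes):
    """Fraction of bodies-plus-headers traffic made up by response headers.

    avg_header_bytes is an assumption supplied by the caller.
    """
    header_bytes = responses * avg_header_bytes
    total = body_bytes + header_bytes
    return header_bytes / total if total else 0.0
```

This covers only HTTP response headers. Whether net-acct also counts
TCP/IP headers is the open half of the question and is not modelled
here.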



