Michael Ralston wrote:
Hi
I've got a couple of ideas I wanted to put forward for pmacct
Recording only top percentage of hosts
I've been a victim of some denial of service attacks and I'd like to use
pmacctd to record where they are coming from. It is fairly obvious that
src_host summarisation to SQL is going to end up putting a heavy burden on
the database, in terms of IO and storage. Would it be possible for pmacctd
to insert only the highest-traffic hosts into SQL? E.g., discard all
but the top 5% of hosts every 5 minutes.
You might want to look at using something like ntop in parallel with
pmacctd... it's quite handy for identifying unusual traffic.
The only idea I can think of is to let pmacctd write to the database,
then after X hours, prune all the small counters. The big problem is
that there will be thousands and thousands of records to insert every 5
minutes. Is 5 minutes a good sample window?
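As a rough illustration of the "keep only the top 5%" idea, here is a
minimal Python sketch. It is not something pmacctd does itself: the
function name top_hosts and the input shape (a dict of host to bytes for
one sample window) are assumptions made for the example.

```python
def top_hosts(byte_counts, percent=5):
    """Keep only the highest-traffic hosts from one sample window.

    byte_counts maps a source host to the bytes counted in the window.
    percent is the share of hosts to keep, rounded up so that at least
    one host survives whenever the input is non-empty.
    """
    if not byte_counts:
        return {}
    # Integer ceiling division avoids floating-point rounding surprises.
    keep = max(1, (len(byte_counts) * percent + 99) // 100)
    ranked = sorted(byte_counts.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:keep])
```

Whether this runs before the insert or as a later prune over rows already
in the database, the selection logic is the same; only the place where the
discarded counters are dropped changes.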
SQL Summarisation...
You know how cacti/mrtg sets up graphs for day, week, month year...
I've created some Perl scripts which take an SQL table updated every 5
minutes, holding 24 hours' worth of history, and summarise it into a
30-minute weekly table and a 2-hour monthly table.
I was doing this with an hourly, daily, and monthly table. I had some
stored procedures that got called from /etc/crontab to periodically
resummarize the data.
It's not working very well and the data ends up screwing up all over the
place.
My scripts worked, except whenever they ran, it burned a lot of CPU to
delete the old counters, resummarize, and insert the updated aggregated
counters back in.
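The resummarizing step itself is simple arithmetic: each fine-grained
counter is folded into the coarser time bucket that contains it. A minimal
Python sketch of that step follows, assuming rows of (stamp, host, bytes)
with Unix timestamps; the row layout and the name rollup are hypothetical
and not taken from the pmacct schema or from the scripts described above.

```python
from collections import defaultdict

def rollup(rows, bucket_seconds=1800):
    """Sum (stamp, host, bytes) rows into coarser time buckets.

    stamp is a Unix timestamp in seconds marking the start of a sample.
    Each row is added to the bucket whose start is the largest multiple
    of bucket_seconds not exceeding stamp.
    """
    totals = defaultdict(int)
    for stamp, host, nbytes in rows:
        bucket_start = stamp - (stamp % bucket_seconds)
        totals[(bucket_start, host)] += nbytes
    return dict(totals)
```

With bucket_seconds=1800 this gives 30-minute counters; 7200 would give
2-hour ones. The CPU cost mentioned above comes from deleting and
re-inserting the aggregated rows in the database, which this sketch does
not attempt to model.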
I guess I could run multiple pmacctd with different update periods, but that
would end up using lots of RAM, I expect.
Anyone else got any ideas on how to achieve this?
I'm running 6 instances of pmacctd, with two aggregate filters in each
(IN vs OUT). Works great. The only differences in the config files are
the table to write to, and the sql_history parameter. RAM usage is
fine - perhaps 5-15 MB for each pmacctd. This system is counting for
around 8000 IP addresses.
Until Paolo kindly added in the monthly feature, I had one remaining
cronjob to resummarize the monthly table, but that's gone now too. The
only cronjob I have now is to periodically do a VACUUM on the main
tables and rebuild some indexes.
The problem with doing lots of updates/deletes in PostgreSQL is that
there's a lot of wasted space in the tables and indexes... so more
frequent VACUUMs are required to keep the performance up.
I found that these sorts of concurrent updates, stored procedures,
etc. are handled a lot better by PostgreSQL than MySQL due to better
transactional support.
Some other quick questions...
Does pmacctd only record TCP/UDP, or also other IP protocols (ICMP, etc.)?
Is there a way to find out which protocol certain traffic was?
Sure... I think there's an aggregate for ip_proto. So if it's in the
sql_table, it'll get populated with 1=icmp, 17=udp, and so on, as per
/etc/protocols. I was doing this for a while on a dst_host basis, but
found it created about 5 times as many database records without
providing a lot of extra value to us... so I dropped the column and
stopped counting on it.
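For reading the numeric protocol values back out of such a column, a small
lookup is usually enough. The sketch below is a Python illustration; the
table is a hand-picked subset of the standard protocol numbers, and the
names IP_PROTO_NAMES and proto_name are made up for the example.

```python
# Small subset of the standard IP protocol numbers, as in /etc/protocols.
IP_PROTO_NAMES = {1: "icmp", 6: "tcp", 17: "udp", 47: "gre"}

def proto_name(number):
    """Return a readable name for an IP protocol number.

    Numbers missing from the table fall back to a 'proto-N' label
    rather than raising an error.
    """
    return IP_PROTO_NAMES.get(number, f"proto-{number}")
```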
Cheers,
Wim