Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:

> Any signs of massive packet drops on any port throughout your switches? 
> I ask because the traffic reported might not have been actually 
> delivered to the end host.

The switch has been up for 12.25 days, and in that time has recorded 
2,085,458,896 octets sent and 4,161,359,962 octets received by that port 
(which seems unusually low), and 77,060,310 packets sent and 66,840,066 
packets received.

Over the same period, pmacctd logged 57,439,276,227 bytes and 105,873,327 
packets sent to that host alone, or 129,777,361 packets including another 
host which I know is on the same port.

The switch shows 242 RX errors (all CRC alignment) on that port and no 
other errors or discards. There are no errors or discards on the port that 
my router/pmacct box is attached to. Packet numbers are in the same 
region, i.e. a bit less than 100 million. I suspect that the switch's byte 
counters are wrapping.
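
As a rough sanity check on the wrapping theory, the figures above can be run through a few lines of Python. This is only a sketch: it assumes the switch keeps 32-bit octet counters, which I have not confirmed, and the helper names are made up for illustration.

```python
# Sanity check on the counter-wrap theory, using the figures quoted above.
# Assumption (not confirmed): the switch's octet counters are 32 bits wide.

WRAP = 2 ** 32                    # a 32-bit counter rolls over at 4,294,967,296
MIN_FRAME = 64                    # minimum Ethernet frame size in bytes

switch_tx_octets = 2_085_458_896  # switch counter: octets sent on the port
switch_tx_packets = 77_060_310    # switch counter: packets sent on the port
pmacct_bytes = 57_439_276_227     # pmacctd: bytes sent to the host
pmacct_packets = 105_873_327      # pmacctd: packets sent to the host


def bytes_per_packet(octets, packets):
    """Average size of a packet according to one pair of counters."""
    return octets / packets


def min_wraps(octets_seen, packets, wrap=WRAP, min_frame=MIN_FRAME):
    """Fewest roll-overs needed for the true octet count to reach
    packets * min_frame, given the value the counter currently shows."""
    floor_octets = packets * min_frame
    if floor_octets <= octets_seen:
        return 0
    return -(-(floor_octets - octets_seen) // wrap)  # ceiling division


print("switch bytes/packet:", round(bytes_per_packet(switch_tx_octets, switch_tx_packets), 1))
print("pmacct bytes/packet:", round(bytes_per_packet(pmacct_bytes, pmacct_packets), 1))
print("minimum switch wraps:", min_wraps(switch_tx_octets, switch_tx_packets))
print("32-bit wraps in pmacct total:", pmacct_bytes // WRAP)
```

The switch figures work out to about 27 bytes per packet, which is below the 64-byte Ethernet minimum, so a 32-bit octet counter would have to have rolled over at least once. pmacctd's byte total alone would span 13 roll-overs of such a counter. The packet counts differ between the two sources, so this only shows that wrapping is plausible, not that the totals agree.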

> Can you do a bit of profiling? Like: see what is the average traffic 
> download/upload for the host X; also what is the average bytes per 
> packet value. Then, when you see a huge downstream traffic rate, see 
> what happens to the upstream. Do you see any correspondence with respect 
> to the average values?

Running this query:

select a.stamp_inserted,
   a.ip_src, a.ip_dst, a.bytes, a.packets,
   b.ip_src, b.ip_dst, b.bytes, b.packets 
from acct_v7 as a
left join acct_v7 as b
on a.stamp_inserted = b.stamp_inserted
where a.bytes > 100000000
and (a.ip_src<>b.ip_src or a.ip_dst<>b.ip_dst);

to find all records with the same timestamp as the excessive ones, I can 
see that:

* when a host is accused of sending a lot of traffic, it doesn't receive a
   lot of traffic at the same time; but

* when a host is accused of sending a lot of traffic, other hosts are also
   accused of sending (but not receiving) a lot of traffic; and

* the same goes for s/sending/receiving/g and vice versa.
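
For the per-host profiling, something along these lines could put upstream and downstream side by side for one host at each timestamp. It is only a sketch: an in-memory SQLite table stands in for the real database, it has just the acct_v7 columns used in the query above, the rows are made-up test data on documentation addresses, and profile() is a hypothetical helper.

```python
import sqlite3

# Stand-in for the real database: an in-memory SQLite table with only the
# acct_v7 columns used in the query above. The rows are made-up test data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table acct_v7 (stamp_inserted text, ip_src text, ip_dst text,"
    " bytes integer, packets integer)"
)
conn.executemany(
    "insert into acct_v7 values (?, ?, ?, ?, ?)",
    [
        ("2009-03-14 10:00:00", "192.0.2.1", "198.51.100.7", 150_000_000, 100_000),
        ("2009-03-14 10:00:00", "198.51.100.7", "192.0.2.1", 4_000_000, 50_000),
        ("2009-03-14 10:05:00", "192.0.2.1", "198.51.100.7", 600_000, 1_000),
    ],
)


def profile(conn, host):
    """Per timestamp: bytes sent and received by host, and average
    bytes per packet for the traffic it sent."""
    sql = """
        select stamp_inserted,
               sum(case when ip_src = :h then bytes else 0 end) as sent,
               sum(case when ip_dst = :h then bytes else 0 end) as received,
               sum(case when ip_src = :h then packets else 0 end) as sent_packets
        from acct_v7
        where ip_src = :h or ip_dst = :h
        group by stamp_inserted
        order by stamp_inserted
    """
    rows = []
    for stamp, sent, received, sent_packets in conn.execute(sql, {"h": host}):
        avg = sent / sent_packets if sent_packets else None
        rows.append((stamp, sent, received, avg))
    return rows


for row in profile(conn, "192.0.2.1"):
    print(row)
```

A timestamp where the sent column is very large while the received column stays small is the pattern described in the list above.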

> Yes, enable 64-bit counters and see what happens. If you see in a single 
> entry ~8GB of traffic, then everything was correct. Otherwise something 
> must have been wrong on the pmacct side. Running tcpdump in parallel 
> would be great for double-checking. And yes, pmacct honours timestamps 
> within pcap trace files.

OK, done. I assume the default snaplen of 96 bytes is OK for pmacct?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.