Good morning folks,

I'm working with nfacctd, slapping data into pgsql in (what I think is) a pretty simple manner.

What's unclear is when this behavior started. I have a collector for sFlow data running pmacct-0.14.2, on which I haven't seen this happen, but it may be that the NetFlow volume we're getting here exceeds what that collector sees... or it could be changes between there and 1.5.x; I just haven't dug that deep yet. (With any luck, someone smarter than I am can put their finger on this in short order, and I may not need to. ;-) )

Basically, with about 30-100k flows per minute, nfacctd started core dumping. Adding some debug and a little gdb massaging revealed:
[New process 1]
Core was generated by `nfacctd'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000488725 in PG_cache_purge (queue=0x7f7ff7b38000, index=10764, idata=0x7f7ffffd0260) at pgsql_plugin.c:528
528 if (reprocess_queries_queue[j]->valid == SQL_CACHE_COMMITTED) sql_query(&bed, reprocess_queries_queue[j], idata);
(gdb) bt
#0 0x0000000000488725 in PG_cache_purge (queue=0x7f7ff7b38000, index=10764, idata=0x7f7ffffd0260) at pgsql_plugin.c:528
#1 0x000000000048c8d7 in sql_cache_handle_flush_event (idata=0x7f7ffffd0260, refresh_deadline=0x7f7ffffd0258, pt=0x7f7ffffd0440) at sql_common.c:486
#2 0x000000000048716f in pgsql_plugin (pipe_fd=4, cfgptr=0x7f7ff7b24128, ptr=0x78d060) at pgsql_plugin.c:178
#3 0x000000000043129d in load_plugins (req=0x7f7fffffdbd0) at plugin_hooks.c:212
#4 0x00000000004202b2 in main (argc=4, argv=0x7f7fffffdc80, envp=0x7f7fffffdca8) at nfacctd.c:709
(gdb)

A little sifting around, and we're looking at:
if (reprocess_queries_queue[j]->valid == SQL_CACHE_COMMITTED) sql_query(&bed, reprocess_queries_queue[j], idata);

Simply put, reprocess_queries_queue[j] is a null pointer, dereferencing it for ->valid is what segfaults, and the wheels fall off. Adding a quick (reprocess_queries_queue[j] != NULL) check in front of the comparison smooths that out... but I haven't got my head around the structures enough to grok why that case is possible.
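
For illustration only, here is a minimal standalone sketch of the kind of guard described above. It is not the pmacct source: struct cache_entry, purge_committed() and the numeric value given to SQL_CACHE_COMMITTED are hypothetical stand-ins, and the call to sql_query() is replaced by a counter so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value; the real constant lives in the pmacct headers. */
#define SQL_CACHE_COMMITTED 1

/* Hypothetical stand-in for a cache element; only the field used here. */
struct cache_entry {
    int valid;
};

/*
 * Walk the first `index` slots of the queue and count the entries that
 * would be handed to sql_query(). NULL slots are skipped instead of being
 * dereferenced, which is the whole point of the guard.
 */
int purge_committed(struct cache_entry **queue, int index)
{
    int j, sent = 0;

    /* An empty queue may legitimately be passed as NULL. */
    assert(index == 0 || queue != NULL);

    for (j = 0; j < index; j++) {
        /* Short-circuit: ->valid is only read when the slot is non-NULL. */
        if (queue[j] != NULL && queue[j]->valid == SQL_CACHE_COMMITTED)
            sent++; /* the real code would call sql_query() here */
    }
    return sent;
}
```

Note that a guard like this only hides the symptom: it stops the crash, but it does not explain how a NULL slot ends up below index in the first place.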

In addition, although it's now committing without issue (I get my "Purge cache - END" events), for whatever reason PG_DB_Close isn't getting called, so pgsql consistently reports "LOG: unexpected EOF on client connection". Again, I haven't sat down to read the SQL plugin structure to comprehend why not... but it makes me wonder if these two are related.

Would sincerely appreciate some more informed input on this... before I start making uneducated patches.

nfacctd.conf follows.

Best,
Mike.

===== nfacctd.conf =====
nfacctd_disable_checks: true
nfacctd_port: 2055

plugin_pipe_size: 409600000
plugin_buffer_size: 409600

sql_db: pmacct
sql_table: acct_v7_%Y%m%d
sql_table_schema: /usr/pkg/etc/pmacct/acct_v7.schema
sql_table_version: 7
sql_passwd: bwahahahaha
sql_user: pmacct
sql_refresh_time: 60
sql_history: 1m
sql_history_roundoff: h
sql_dont_try_update: true
sql_cache_entries: 10472900

plugins: pgsql[fw]
aggregate[fw]: src_host, dst_host, src_port, dst_port, proto


--
Mike Bowie
Chief Electron Disturbance Facilitation Officer (CTO)
RocketSpace, Inc

Office: +1 415 625 3155
Direct: +1 415 230 2214
Mobile: +1 707 234 5386
   Fax: +1 415 373 3988
E-mail: [email protected]
   Web: rocketspace.com
 Tweet: @mike_bowie

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
