Re: [pmacct-discussion] nfacctd+pgsql, SQL/DB LOCK problem

Jakub Wartak Mon, 21 Nov 2005 22:32:03 +0100

Dnia niedziela, 20 listopada 2005 23:59, Paolo Lucente napisał:
> Hello Jakub,


Hello Paolo, thanks for reply :)

> thank you for your words about the project and for the data about your
> acutal issue. The increased number of processes is basically a queue of
> writers that wait to obtain the permission to "use" the DB. This is because
> the use of the DB is protect by LOCKs. A new writer joins the queue each
> 'sql_refresh_time' seconds.
>
> When everything works correctly, a writer is finished with the DB before
> the next one starts. What i've not been able to understand is whether the
> *same* configuration causes the *same* problem under MySQL.

Yes, exactly the same configuration based on MySQL had the same issues.

>
> The hardware is ok and the number of tuples in both the tables seems fine.
> I would suggest to upgrade to either 0.9.3, 0.9.4p1 or the development
> snapshot 0.9.5 in order to further troubleshoot the issue (the process
> title can give additional informations while firing a "ps auxw").

Okay, I self-compiled 0.9.4p1, the results are identical, after several 
minutes:

netflow=> select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename |                               
                                
current_query          |          query_start
-------+---------+---------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------
 17144 | netflow |   30462 |      100 | pmacct  | UPDATE acct_in SET 
packets=packets+6, bytes=bytes+288, stamp_updated=CURRENT_TIMESTAMP(0) WHERE 
ABSTIME(1132588800)::Timestamp::Timestamp without time zone = stamp_inserted 
AND ip_dst='192.168.86.98' | 2005-11-21 18:18:27.93401+01
 17144 | netflow |    4273 |      100 | pmacct  | <IDLE>          | 2005-11-21 
18:18:23.891838+01

( i tunned pgsql to provide some more statistics about queries that are 
currently running ), notice again the "+288" bytes, for me it is very 
strange, because as far as i understand ( based on sql_refresh_time option ), 
the database shouldn be updated only every 30mins, not all the time... And 
that continues updating of database is causing postgres/mysql to generate 
very high load-average. And another thing: 192.168.86.98 doesn't exist.

biuro:~# !ps
ps uaxwww | grep nfacct
root      3913  0.0  1.0 16284 10496 ?       Ss   17:30   0:00 nfacctd: Core 
Process [default]
root     28327  0.0  1.1 17412 11884 ?       S    17:30   0:00 nfacctd: 
PostgreSQL Plugin [in]
root      6138  0.0  0.9 15432 9824 ?        S    17:30   0:00 nfacctd: 
PostgreSQL Plugin [out]
root      1143  0.0  1.1 17556 12092 ?       S    18:15   0:00 nfacctd: 
PostgreSQL Plugin -- DB Writer [in]

biuro:~# ps uaxww | grep postgres
postgres 21139  0.0  0.3 23520 3968 ?        S    Nov20   
0:00 /usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/data
postgres  2744  0.0  0.4 14320 4728 ?        S    Nov20   0:09 postgres: stats 
buffer process
postgres 21894  0.0  0.3 13464 3856 ?        S    Nov20   0:08 postgres: stats 
collector process
postgres 30462 93.7  1.3 24276 14284 ?       R    18:15  10:53 postgres: 
pmacctnetflow [local] UPDATE

> The fact that no nfacctd process consumes significative shares of the CPU
> and the strace seems to confirm that it should not be matter of any endless
> loop. Are you firing regularly VACUUMs of the tables ? 

I think that Debian scripts are doing this already, from /etc/cron.d/postgres:

# Run VACUUM ANALYSE on all databases every 5 hours
2 0,5,10,15,20 * * 1-6 postgres if ! 
pidof /usr/lib/postgresql/bin/pg_autovacuum > /dev/null && 
[ -x /usr/lib/postgresql/bin/do.maintenance ]; 
then /usr/lib/postgresql/bin/do.maintenance -a; fi

# On Sunday run a VACUUM FULL ANALYSE as well
# If you run a 24/7 site, you may want to comment out this line and save 
VACUUM
# FULL for when you think you really need it
# 10 3 * * sun postgres /usr/bin/test 
-x /usr/lib/postgresql/bin/do.maintenance
&& /usr/lib/postgresql/bin/do.maintenance -a -f -F

> From your 
> configuration i see you have enabled the 'sql_optimize_clauses'; did you
> setup an INDEX on the tables in order to speed up the queries ? Does the
> EXPLAIN confirm that the queries are using such index ? You can try
> wrapping a query from your strace report.

No i haven't setup an INDEX on the tables, but for now i disabled 
'sql_optimize_clauses', after several minutes i'm getting the same.
After several 'select * from pg_stat_activity' i got:

netflow=> select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename |           current_query       
                                                           
|          query_start
-------+---------+---------+----------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------
 17144 | netflow |   29851 |      100 | pmacct  | <IDLE>                        
                                          
| 2005-11-21 18:35:36.918+01
 17144 | netflow |   25666 |      100 | pmacct  | UPDATE acct_in SET 
packets=packets+6, bytes=bytes+288, stamp_updated=CURRENT_TIMESTAMP(0) WHERE 
ABSTIME(1132592400)::Timestamp::Timestamp without time zone = stamp_inserted 
AND ip_dst='192.168.96.195' AND port_src=0 AND port_dst=0 AND ip_proto=0 AND 
ip_sr | 2005-11-21 18:35:36.750585+01
(2 rows)

>
> In the end, i see that you are using an extremely big 'sql_refresh_time'.
> Try reducing it, say to 60-120 secs; and commenti out the
> 'sql_startup_delay' directive.
>
Done.


> Cheers,
> Paolo
>
>
> _______________________________________________
> pmacct-discussion mailing list
> [email protected]
> http://muffin.area.ba.cnr.it/mailman/listinfo/pmacct-discussion

-- 
Jakub Wartak
-vnull
FreeBSD/OpenBSD/Linux/Solaris/Network Administrator
http://vnull.pcnet.com.pl/

Re: [pmacct-discussion] nfacctd+pgsql, SQL/DB LOCK problem

Reply via email to