So that looks like its taking ages ... some questions for you: * What version of Postgresql are you running and on what distro & version exactly? * Are you using the distros packages or someone elses? * Is autovacuum switched on at all (check your postgresql.conf)? * Do you know the last time the database was vacuum'd and/or analyzed? Try this link for a way to check this: http://heatware.net/databases/how-when-postgres-tables-auto-vacuum-analyze/ * Can you get me some box metrics, memory size, cpu size/number, disk spindle speeds etc. of your postgresql box? So I can get an idea of the throughput available to you. * Maybe any information about the postgresql tuning items - shared_buffers is probably a good start, or have you left the tuning alone?
You probably don't need debug with log-slow-statements, I wasn't aware that was an option but I think that gives a good amount of data for this problem. On Tue, Jan 15, 2013 at 8:24 PM, Chuck <cssc...@gmail.com> wrote: > I will work on setting up a puppetdb server I can put into debug mode. For > now here is a "normal" GC run log output and my configuration files. > > Thanks, > > > 2013-01-15 19:20:57,767 INFO [clojure-agent-send-off-pool-12] > [cli.services] Starting database garbage collection > 2013-01-15 19:26:40,383 WARN [clojure-agent-send-off-pool-12] > [jdbc.internal] Query slower than 10s threshold: actual execution time: > 342.6148 seconds; Query: DELETE FROM catalogs WHERE NOT EXISTS (SELECT * > FROM certname_catalogs cc WHERE cc.catalog=catalogs.hash); Query Params: > 2013-01-15 19:33:29,304 WARN [clojure-agent-send-off-pool-12] > [jdbc.internal] Query slower than 10s threshold: actual execution time: > 408.9210 seconds; Query: DELETE FROM resource_params WHERE NOT EXISTS > (SELECT * FROM catalog_resources cr WHERE > cr.resource=resource_params.resource); Query Params: > 2013-01-15 19:33:29,308 INFO [clojure-agent-send-off-pool-12] > [cli.services] Finished database garbage collection > 2013-01-15 19:33:29,308 INFO [clojure-agent-send-off-pool-12] > [cli.services] Starting sweep of stale nodes (1 day threshold) > 2013-01-15 19:33:29,390 INFO [clojure-agent-send-off-pool-12] > [cli.services] Finished sweep of stale nodes (1 day threshold) > > ########################################################## > # > # config.ini > # > > # See README.md for more thorough explanations of each section and > # option. > > [global] > # Store mq/db data in a custom directory > vardir = /var/lib/puppetdb > # Use an external log4j config file > logging-config = /etc/puppetdb/conf.d/../log4j.properties > > # Maximum number of results that a resource query may return > resource-query-limit = 20000 > > [command-processing] > # How many command-processing threads to use, defaults to (CPUs / 2) > threads = 10 > > ########################################################## > # > # database.ini > # > > [database] > # For the embedded DB: org.hsqldb.jdbcDriver > # For PostgreSQL: org.postgresql.Driver > # Defaults to embedded DB > classname = org.postgresql.Driver > > # For the embedded DB: hsqldb > # For PostgreSQL: postgresql > # Defaults to embedded DB > subprotocol = postgresql > > # For the embedded DB: > file:/path/to/database;hsqldb.tx=mvcc;sql.syntax_pgs=true > # For PostgreSQL: //host:port/databaseName > # Defaults to embedded DB located in <vardir>/db > subname = //PUPPETDB:5432/puppet > > # Connect as a specific user > username = PUPPETDBUSER > > # Use a specific password > password = PASSWORDPUPPETDB > > # How often (in minutes) to compact the database > # gc-interval = 60 > > # Number of seconds before any SQL query is considered 'slow'; offending > # queries will not be interrupted, but will be logged at the WARN log level. > log-slow-statements = 10 > > # How old (in days) to deactivate nodes that have not reported > node-ttl-days = 1 > > # How often to send a keepalive > conn-keep-alive = 4 > > # How long to keep connections open > conn-max-age = 30 > > > > On Tuesday, January 15, 2013 1:27:58 PM UTC-6, Ken Barber wrote: >> >> Hey Chuck, >> >> I've had a chat with my colleagues and they raised some concerns about >> your gc performance. I wouldn't mind drilling into that problem if >> thats okay with you? >> >> For starters - what are your settings looking like? In particular I'm >> interested in node-ttl-seconds and report-ttl-seconds. >> >> FYI there are a few actions that occur when gc kicks in: >> >> * sweeping stale nodes >> * sweeping stale reports >> * removing any unassociated data (catalogs or params) >> >> It would be useful to find out which of these are taking up the most >> amount of your time. >> >> I wouldn't mind a dump of your logs when in a specific debug mode if >> thats at all possible, this should give me a rough idea of the time >> the sweeper spends doing its various deletes so I can at least get a >> focus on your issue. In particular the bits between 'Starting database >> garbage collection' and 'Finished database garbage collection'. >> >> The easiest way to get some debugging is to modify your >> log4j.properties (usually in /etc/puppetdb) and add the line: >> >> log4j.logger.com.jolbox.bonecp=DEBUG >> >> This will log *ALL* SQL statements, so it might be heavy - I'm not >> sure if you can isolate one of your nodes for this purpose or somehow >> mitigate this risk ... we only need a heavy GC run, it probably >> doesn't need to be actively receiving requests and such ... >> >> ken. >> >> On Tue, Jan 15, 2013 at 4:12 PM, Chuck <css...@gmail.com> wrote: >> > >> > >> > On Tuesday, January 15, 2013 9:55:48 AM UTC-6, Ken Barber wrote: >> >> >> >> > Well I currently have 7,000 nodes in my PuppetDB instance and 8 >> >> > puppetdb >> >> > servers in various geographic locations. The garbage collect seems >> >> > to >> >> > be a >> >> > pretty intensive operation on the Postgres DB server in the current >> >> > design. >> >> >> >> Thats a shame to hear. What's the impact for you? I see from your >> >> screenshot its averaging 16 seconds, is that what you are seeing? >> > >> > >> > This is normally what we see 8 times of course. >> > >> >> >> >> > We have also been having an issue that started at the beginning of >> >> > 2013 >> >> > that >> >> > we have not been able to narrow down at this time (we were on >> >> > puppetdb >> >> > 1.0.1 >> >> > for a long time with no issues. Updated to 1.0.5 last week with no >> >> > improvement) >> >> >> >> This issue, is it related to the gc problem or something else? Can you >> >> elaborate? >> >> >> > We are still trying to determine what is happening. We just see a huge >> > spike in the number of rows being deleted. >> > >> >> >> >> > I was planning on having my puppet manifest, for the puppetdb >> >> > servers, >> >> > elect >> >> > one or two master puppetdb servers by querying for active puppetdb >> >> > servers. >> >> > So I would be dynamically electing the master(s) based on an >> >> > algorithm. >> >> >> >> Sounds good to me. Do you want to raise a feature request for this? >> > >> > >> > Yeah I can do that. >> > >> >> >> >> http://projects.puppetlabs.com/projects/puppetdb >> >> >> >> Just more or less cover what you've said about wanting to disable by >> >> setting to zero. I've already more or less got a patch that can >> >> disable it when zero (its really just a conditional clause in the >> >> right place), but I haven't done any thorough testing or gotten review >> >> on it yet. >> >> >> >> ken. >> > >> > -- >> > You received this message because you are subscribed to the Google >> > Groups >> > "Puppet Users" group. >> > To view this discussion on the web visit >> > https://groups.google.com/d/msg/puppet-users/-/OkVPHnluxpcJ. >> > >> > To post to this group, send email to puppet...@googlegroups.com. >> > To unsubscribe from this group, send email to >> > puppet-users...@googlegroups.com. >> > For more options, visit this group at >> > http://groups.google.com/group/puppet-users?hl=en. > > -- > You received this message because you are subscribed to the Google Groups > "Puppet Users" group. > To view this discussion on the web visit > https://groups.google.com/d/msg/puppet-users/-/vuZyS8chcXYJ. > > To post to this group, send email to puppet-users@googlegroups.com. > To unsubscribe from this group, send email to > puppet-users+unsubscr...@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/puppet-users?hl=en. -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.