I will work on setting up a puppetdb server I can put into debug mode. For now here is a "normal" GC run log output and my configuration files.
Thanks, 2013-01-15 19:20:57,767 INFO [clojure-agent-send-off-pool-12] [cli.services] Starting database garbage collection 2013-01-15 19:26:40,383 WARN [clojure-agent-send-off-pool-12] [jdbc.internal] Query slower than 10s threshold: actual execution time: 342.6148 seconds; Query: DELETE FROM catalogs WHERE NOT EXISTS (SELECT * FROM certname_catalogs cc WHERE cc.catalog=catalogs.hash); Query Params: 2013-01-15 19:33:29,304 WARN [clojure-agent-send-off-pool-12] [jdbc.internal] Query slower than 10s threshold: actual execution time: 408.9210 seconds; Query: DELETE FROM resource_params WHERE NOT EXISTS (SELECT * FROM catalog_resources cr WHERE cr.resource=resource_params.resource); Query Params: 2013-01-15 19:33:29,308 INFO [clojure-agent-send-off-pool-12] [cli.services] Finished database garbage collection 2013-01-15 19:33:29,308 INFO [clojure-agent-send-off-pool-12] [cli.services] Starting sweep of stale nodes (1 day threshold) 2013-01-15 19:33:29,390 INFO [clojure-agent-send-off-pool-12] [cli.services] Finished sweep of stale nodes (1 day threshold) ########################################################## # # config.ini # # See README.md for more thorough explanations of each section and # option. [global] # Store mq/db data in a custom directory vardir = /var/lib/puppetdb # Use an external log4j config file logging-config = /etc/puppetdb/conf.d/../log4j.properties # Maximum number of results that a resource query may return resource-query-limit = 20000 [command-processing] # How many command-processing threads to use, defaults to (CPUs / 2) threads = 10 ########################################################## # # database.ini # [database] # For the embedded DB: org.hsqldb.jdbcDriver # For PostgreSQL: org.postgresql.Driver # Defaults to embedded DB classname = org.postgresql.Driver # For the embedded DB: hsqldb # For PostgreSQL: postgresql # Defaults to embedded DB subprotocol = postgresql # For the embedded DB: file:/path/to/database;hsqldb.tx=mvcc;sql.syntax_pgs=true # For PostgreSQL: //host:port/databaseName # Defaults to embedded DB located in <vardir>/db subname = //PUPPETDB:5432/puppet # Connect as a specific user username = PUPPETDBUSER # Use a specific password password = PASSWORDPUPPETDB # How often (in minutes) to compact the database # gc-interval = 60 # Number of seconds before any SQL query is considered 'slow'; offending # queries will not be interrupted, but will be logged at the WARN log level. log-slow-statements = 10 # How old (in days) to deactivate nodes that have not reported node-ttl-days = 1 # How often to send a keepalive conn-keep-alive = 4 # How long to keep connections open conn-max-age = 30 On Tuesday, January 15, 2013 1:27:58 PM UTC-6, Ken Barber wrote: > > Hey Chuck, > > I've had a chat with my colleagues and they raised some concerns about > your gc performance. I wouldn't mind drilling into that problem if > thats okay with you? > > For starters - what are your settings looking like? In particular I'm > interested in node-ttl-seconds and report-ttl-seconds. > > FYI there are a few actions that occur when gc kicks in: > > * sweeping stale nodes > * sweeping stale reports > * removing any unassociated data (catalogs or params) > > It would be useful to find out which of these are taking up the most > amount of your time. > > I wouldn't mind a dump of your logs when in a specific debug mode if > thats at all possible, this should give me a rough idea of the time > the sweeper spends doing its various deletes so I can at least get a > focus on your issue. In particular the bits between 'Starting database > garbage collection' and 'Finished database garbage collection'. > > The easiest way to get some debugging is to modify your > log4j.properties (usually in /etc/puppetdb) and add the line: > > log4j.logger.com.jolbox.bonecp=DEBUG > > This will log *ALL* SQL statements, so it might be heavy - I'm not > sure if you can isolate one of your nodes for this purpose or somehow > mitigate this risk ... we only need a heavy GC run, it probably > doesn't need to be actively receiving requests and such ... > > ken. > > On Tue, Jan 15, 2013 at 4:12 PM, Chuck <css...@gmail.com <javascript:>> > wrote: > > > > > > On Tuesday, January 15, 2013 9:55:48 AM UTC-6, Ken Barber wrote: > >> > >> > Well I currently have 7,000 nodes in my PuppetDB instance and 8 > puppetdb > >> > servers in various geographic locations. The garbage collect seems > to > >> > be a > >> > pretty intensive operation on the Postgres DB server in the current > >> > design. > >> > >> Thats a shame to hear. What's the impact for you? I see from your > >> screenshot its averaging 16 seconds, is that what you are seeing? > > > > > > This is normally what we see 8 times of course. > > > >> > >> > We have also been having an issue that started at the beginning of > 2013 > >> > that > >> > we have not been able to narrow down at this time (we were on > puppetdb > >> > 1.0.1 > >> > for a long time with no issues. Updated to 1.0.5 last week with no > >> > improvement) > >> > >> This issue, is it related to the gc problem or something else? Can you > >> elaborate? > >> > > We are still trying to determine what is happening. We just see a huge > > spike in the number of rows being deleted. > > > >> > >> > I was planning on having my puppet manifest, for the puppetdb > servers, > >> > elect > >> > one or two master puppetdb servers by querying for active puppetdb > >> > servers. > >> > So I would be dynamically electing the master(s) based on an > algorithm. > >> > >> Sounds good to me. Do you want to raise a feature request for this? > > > > > > Yeah I can do that. > > > >> > >> http://projects.puppetlabs.com/projects/puppetdb > >> > >> Just more or less cover what you've said about wanting to disable by > >> setting to zero. I've already more or less got a patch that can > >> disable it when zero (its really just a conditional clause in the > >> right place), but I haven't done any thorough testing or gotten review > >> on it yet. > >> > >> ken. > > > > -- > > You received this message because you are subscribed to the Google > Groups > > "Puppet Users" group. > > To view this discussion on the web visit > > https://groups.google.com/d/msg/puppet-users/-/OkVPHnluxpcJ. > > > > To post to this group, send email to > > puppet...@googlegroups.com<javascript:>. > > > To unsubscribe from this group, send email to > > puppet-users...@googlegroups.com <javascript:>. > > For more options, visit this group at > > http://groups.google.com/group/puppet-users?hl=en. > -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To view this discussion on the web visit https://groups.google.com/d/msg/puppet-users/-/vuZyS8chcXYJ. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.