On Friday, March 1, 2013 3:36:20 PM UTC+1, Ken Barber wrote:
>
> Oh - and a copy of the current dead letter queue would be nice, it's
> normally stored in:
>
> /var/lib/puppetdb/mq/discarded/*
>
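Since the discarded messages carry the exceptions that caused them to be dropped, a quick tally can show which failure is most common before anything is shared. The following is only a sketch: it assumes each file under that directory contains "Attempt N @ timestamp" headers followed by the exception class, as in the excerpt quoted later in this message, and `tally_failures` and `read_discarded` are hypothetical helper names, not part of PuppetDB.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed layout: "Attempt <n> @ <timestamp>" followed by the exception class.
ATTEMPT_RE = re.compile(r"Attempt \S+ @ \S+\s+([\w.$]+)")


def tally_failures(texts):
    """Count, per exception class, how many messages mention it."""
    counts = Counter()
    for text in texts:
        # A message may record several attempts; count each class once per message.
        counts.update(set(ATTEMPT_RE.findall(text)))
    return counts


def read_discarded(directory="/var/lib/puppetdb/mq/discarded"):
    """Yield the text of every regular file under the discard directory."""
    root = Path(directory)
    if not root.is_dir():
        return
    for path in sorted(root.rglob("*")):
        if path.is_file():
            yield path.read_text(errors="replace")
```

Calling `tally_failures(read_discarded()).most_common()` on the PuppetDB host would then list the exception classes from most to least frequent. Work on a copy of the directory rather than the live one.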
I will back it up.

> This should also contain the full exceptions for the failed SQL as I
> mentioned earlier, so perhaps a glance into those now and letting me
> know what the prevalent failure is would be handy.
>

Here's one of them.

Attempt 1 @ 2013-02-18T17:04:30.512Z
java.lang.IllegalArgumentException: Edge '{:relationship :contains, :target {:title nil, :type nil}, :source {:title "Castor::Installer", :type "Class"}}' refers to resource '{:title nil, :type nil}', which doesn't exist in the catalog.
com.puppetlabs.puppetdb.catalog$validate_edges.invoke(catalog.clj:205)
clojure.core$comp$fn__4036.invoke(core.clj:2286)
com.puppetlabs.puppetdb.catalog$eval1498$fn__1499.invoke(catalog.clj:311)
clojure.lang.MultiFn.invoke(MultiFn.java:167)
com.puppetlabs.puppetdb.command$replace_catalog_STAR_$fn__2696.invoke(command.clj:308)
com.puppetlabs.puppetdb.command$replace_catalog_STAR_.invoke(command.clj:308)
com.puppetlabs.puppetdb.command$eval2716$fn__2718.invoke(command.clj:329)
clojure.lang.MultiFn.invoke(MultiFn.java:167)
com.puppetlabs.puppetdb.command$produce_message_handler$fn__2838.invoke(command.clj:566)
com.puppetlabs.puppetdb.command$wrap_with_discard$fn__2789$fn__2792.invoke(command.clj:474)
com.puppetlabs.puppetdb.command.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source)
com.yammer.metrics.core.Timer.time(Timer.java:91)
com.puppetlabs.puppetdb.command$wrap_with_discard$fn__2789.invoke(command.clj:473)
com.puppetlabs.puppetdb.command$wrap_with_exception_handling$fn__2774$fn__2775.invoke(command.clj:427)
com.puppetlabs.puppetdb.command.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source)
com.yammer.metrics.core.Timer.time(Timer.java:91)
com.puppetlabs.puppetdb.command$wrap_with_exception_handling$fn__2774.invoke(command.clj:426)
com.puppetlabs.puppetdb.command$wrap_with_command_parser$fn__2784.invoke(command.clj:449)
com.puppetlabs.puppetdb.command$wrap_with_meter$fn__2765.invoke(command.clj:387)
com.puppetlabs.puppetdb.command$wrap_with_thread_name$fn__2797.invoke(command.clj:489)
clamq.jms$jms_consumer$fn__2452.invoke(jms.clj:38)
clamq.jms.proxy$java.lang.Object$MessageListener$ce893c05.onMessage(Unknown Source)
org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:560)
org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:498)
org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:467)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:325)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:263)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1058)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1050)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:947)
java.lang.Thread.run(Thread.java:679)

Attempt null @ 2013-02-18T17:04:30.512Z
java.lang.IllegalArgumentException: Edge '{:relationship :contains, :target {:title nil, :type nil}, :source {:title "Castor::Installer", :type "Class"}}' refers to resource '{:title nil, :type nil}', which doesn't exist in the catalog.
com.puppetlabs.puppetdb.catalog$validate_edges.invoke(catalog.clj:205)
clojure.core$comp$fn__4036.invoke(core.clj:2286)
com.puppetlabs.puppetdb.catalog$eval1498$fn__1499.invoke(catalog.clj:311)
clojure.lang.MultiFn.invoke(MultiFn.java:167)
com.puppetlabs.puppetdb.command$replace_catalog_STAR_$fn__2696.invoke(command.clj:308)
com.puppetlabs.puppetdb.command$replace_catalog_STAR_.invoke(command.clj:308)
com.puppetlabs.puppetdb.command$eval2716$fn__2718.invoke(command.clj:329)
clojure.lang.MultiFn.invoke(MultiFn.java:167)
com.puppetlabs.puppetdb.command$produce_message_handler$fn__2838.invoke(command.clj:566)
com.puppetlabs.puppetdb.command$wrap_with_discard$fn__2789$fn__2792.invoke(command.clj:474)
com.puppetlabs.puppetdb.command.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source)
com.yammer.metrics.core.Timer.time(Timer.java:91)
com.puppetlabs.puppetdb.command$wrap_with_discard$fn__2789.invoke(command.clj:473)
com.puppetlabs.puppetdb.command$wrap_with_exception_handling$fn__2774$fn__2775.invoke(command.clj:427)
com.puppetlabs.puppetdb.command.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source)
com.yammer.metrics.core.Timer.time(Timer.java:91)
com.puppetlabs.puppetdb.command$wrap_with_exception_handling$fn__2774.invoke(command.clj:426)
com.puppetlabs.puppetdb.command$wrap_with_command_parser$fn__2784.invoke(command.clj:449)
com.puppetlabs.puppetdb.command$wrap_with_meter$fn__2765.invoke(command.clj:387)
com.puppetlabs.puppetdb.command$wrap_with_thread_name$fn__2797.invoke(command.clj:489)
clamq.jms$jms_consumer$fn__2452.invoke(jms.clj:38)
clamq.jms.proxy$java.lang.Object$MessageListener$ce893c05.onMessage(Unknown Source)
org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:560)
org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:498)
org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:467)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:325)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:263)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1058)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1050)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:947)
java.lang.Thread.run(Thread.java:679)

> I can organise a secure space on a Puppetlabs support storage area to
> upload this data if you are willing. Just contact me privately to
> organise it.
>

I am willing, but this is not up to me. :) For now, I will keep all the backups you asked for.

> > So I've been pondering this issue of yours, and I keep coming back to
> > that error in my mind:
> >
> > ERROR: insert or update on table "certname_catalogs" violates foreign
> > key constraint "certname_catalogs_catalog_fkey"
> >
> > Regardless of the other issues, 512 GB db - yes it's big, but so what?
> > That shouldn't cause an integrity violation. It could be a big hint to
> > something else, but I'm not entirely sure this is the cause.
> >
> > This error - can we try to track down the corresponding PuppetDB error
> > message that goes with it in the puppetdb.log or in perhaps the dead
> > letter queue for activemq? If postgresql is throwing the error - then
> > there must be a corresponding exception in puppetdb.log.
> >
> > It's just that when I look at the code ...
> > it doesn't make sense that
> > the insert fails here - "Key
> > (catalog)=(d1c89bbef78a55edcf560c432d965cfe1263059c) is not present in
> > table "catalogs". ... but I can't yet see a fault in the logic inside
> > the code.
> >
> > Still it's going to be hard to replicate without the data on my side.
> >
> > As far as your issue ... you have a couple of avenues:
> >
> > * Drop the existing database, allow it to be recreated from scratch -
> > run noop several times across your entire infrastructure to
> > repopulate it. Has the advantage of being an effective vacuum at the
> > same time :-).

This could be the next step :) if the database maintenance doesn't help.

> > * Rollback the PuppetDB version and the database. This will at least
> > get you back to a known working state, but will still leave you in a
> > place where you need to recover.
> >
> > Either case - you would want to take a backup of the existing database
> > for diagnostic purposes. But I really feel like there is something
> > structurally wrong here - perhaps the migration scripts have failed
> > during the database upgrade?

I made a backup today, to have a fresh one before we start the database maintenance. Something being structurally wrong might not be so far-fetched, since we didn't upgrade from an official 1.0.2 release. My colleague got a patched version (I don't know the details, and can't ask now, as he's on holiday), because at that time the official release had an issue with binary files (http://projects.puppetlabs.com/issues/17216). One consequence was that value 8 was missing from the schema_migrations table, even though the matching modifications were already in place in the database (8 rename-fact-column in /src/com/puppetlabs/puppetdb/scf/migrate.clj). Because of that, upgrades initially failed when PuppetDB tried to rename non-existent columns. The schema looked like it actually matched the migrations up to number 8, so just adding that value seemed to be OK.
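A gap like the missing value 8 can be spotted mechanically once the recorded versions are known. This is a minimal sketch, assuming the versions have already been fetched from the schema_migrations table (for example with a query along the lines of `SELECT version FROM schema_migrations`, where the column name `version` is an assumption) and that migrations are numbered consecutively from 1. `missing_migrations` is a hypothetical helper, and the list below is made-up data mirroring the situation described.

```python
def missing_migrations(applied, latest):
    """Return migration numbers in 1..latest that are absent from `applied`."""
    applied_set = set(applied)
    return [n for n in range(1, latest + 1) if n not in applied_set]


# Hypothetical data: migrations 1-7 are recorded, 8 never was.
applied_versions = [1, 2, 3, 4, 5, 6, 7]
print(missing_migrations(applied_versions, latest=8))  # prints [8]
```

Such a check only shows that a row is absent from the bookkeeping table. It says nothing about whether the schema changes themselves were applied, which still has to be compared against the migration code by hand.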
We tested this workaround at a much smaller scale, and there the upgrade seemed fine; we never saw this error. Perhaps we overlooked something. If this is the case, then your suggested solution, to recreate the database from scratch, should solve the problem. This will be our next step if database maintenance doesn't help.

> > But I'm guessing you can't stay broken for very long - so I would take
> > a snapshot of the current broken state so we can take a look and try
> > to replicate it, and make moves on getting back to a working state as
> > suggested above.
> >
> > I'll need:
> >
> > * PuppetDB logs
> > * Backup of your current DB
> > * Backup of your old DB
> > * The broken KahaDB queues
> >
> > ken.

I have all these backups, but other people have to authorize sharing them.

Have a nice weekend! :)

ak0ska