>> Okay. Did you clear the ActiveMQ queues after doing this? I usually
>> just move the old KahaDB directory out of the way when I do this.
>
>
> I haven't though about myself, but it makes sense, so I just flushed the
> queue again while puppetdb service was stopped. Since this last restart it
> seems smooth, but it's only up for a few minutes.

You mean, you've only been watching it for a few minutes, and so far
so good - or it crashed? Sorry - just want to be clear :-).

>> Okay, so were those old queries that are taking 90 minutes, what are they?
>>
> SELECT COUNT(*) AS c FROM certname_catalogs cc, catalog_resources cr,
> certnames c WHERE cc.catalog=cr.catalog AND c.name=cc.certname AND
> c.deactivated IS NULL
> 18 instances were running, oldest was ~90 mins at the time
>
> SELECT COUNT(*) AS c FROM (SELECT DISTINCT resource FROM catalog_resources
> cr, certname_catalogs cc, certnames c  WHERE cr.catalog=cc.catalog AND
> cc.certname=c.name AND c.deactivated IS NULL)
> 12 instances were running, oldest was ~90 mins at the time

>> Have you tried shutting down the PuppetDB, clearing the queue -
>> running full vacuum on the database then starting it all up again?
>
>
> Not yet, according to the Postgresql documentation full vacuum is not really
> recommended. Also it might take a while, and PuppetDB would be unavailable
> for the duration.

Its not recommended as a re-occurring maintenance task this is true,
but if your DB has had major changes due to a schema upgrade, or if
vacuum hasn't ran in a while its generally okay and can provide some
benefit. But yeah, judging by the size of your DB this will take a
long time. I guess my point is, every recommendation has a caveat or
back-story.

>> What version of PostgreSQL are we talking here? What distro and
>> version specifically? How big is the database itself?
>
>
> 8.4.13 on a RHEL6 based distribution. The database size might be an
> indicator that something is not ok. It currently takes up 512Gbs, and it
> grew 100Gbs over a week. We made a database dump before the PuppetDB upgrade
> and it was ~1Gb uncompressed. The full vacuum might solve this, but my guess
> that it would run quite long.

Wow - it should be more like 5 GB or less for your size.

It sounds like you've got some major fragmentation issues, your
indexes may need rebuilding and yeah - a vacuum will probably help -
and index rebuild. But this will require an outage I would say.

When was the last time it was vacuumed? Try the SQL:

select relname,last_vacuum, last_autovacuum, last_analyze,
last_autoanalyze from pg_stat_user_tables;

And give us the output. I think by default autovacuum should be on for
Postgresql 8.4 on Redhat but I can't recall.

On another note ... to be honest I can't be precise about why the
upgrade failed, I'm pretty sure you were running old revisions of the
terminus when you upgraded but that should cause this kind of anguish.
The errors you were receiving about constraints:

Key (catalog)=(d1c89bbef78a55edcf560c432d965cfe1263059c) is not
present in table "catalogs".

Should not be occurring at all, which is all very suspect - alas I
have no clue yet as to why. Have they stopped now we have cleared the
queue and restarted?

>> How many nodes and how frequently are they checking in? A screenshot
>> of your puppetdb dashboard might give us a good snapshot of the size
>> of your environment as well.
>
>
> We have ~2500 nodes, run interval for the two major environments: 60mins for
> 1593 machines and 30 mins for 496. Splay limit is 900sec for both.
> Two screenshots from today: http://imgur.com/kG5Sth5,bLFqGAX#0 and
> http://imgur.com/kG5Sth5,bLFqGAX#1

Okay.

>> Has any tuning been performed on your postgresql server? A copy of
>> your postgresql.conf would be interesting.
>
>
> No tuning that I'm aware of. The same default postgresql.conf as deployed by
> the puppetlabs-postgres module. It's here: http://pastebin.com/njvGm4eu

What is the size of your database? CPU/cores ... and RAM on the box?

Does your puppetdb service live on the same node as the database? I'm
guessing this to be true, as your postgresql.conf is listening only on
'localhost' ... what is the memory consumption of your apps? The
output of 'free' would probably be a good start.

ken.

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-users+unsubscr...@googlegroups.com.
To post to this group, send email to puppet-users@googlegroups.com.
Visit this group at http://groups.google.com/group/puppet-users?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to