Hi ak0ska,
How are things going? Anything to report?
ken.
On Fri, Mar 15, 2013 at 5:00 AM, Ken Barber wrote:
> Hi ak0ska,
>
> FWIW - with the help of some of my colleagues we've managed to
> replicate your constraint issue in a lab style environment now:
>
> https://gist.github.com/kbarber/5157836
Hi ak0ska,
FWIW - with the help of some of my colleagues we've managed to
replicate your constraint issue in a lab style environment now:
https://gist.github.com/kbarber/5157836
Which is a start. It requires a unique precondition to replicate, however,
and I've been unable to replicate it in any
So I have this sinking feeling that all of your problems (including
the constraint side-effect) are related to general performance issues
on your database, for some reason we have yet to determine. This could
be IO contention, or it could be a bad index (although you've
rebuilt them all rig
Hello Ken,
I really appreciate you guys looking into this problem, and I'm happy to
provide you with the data you ask for. However, I feel I should ask whether
you think this problem is worth your effort, if rebuilding the database
might solve the issue?
Cheers,
ak0ska
On Thursday, Ma
Hi ak0ska,
So I've been spending the last 2 days trying all kinds of things to
replicate your constraint violation problem, and I'm still getting
nowhere with it. I've been speaking to all kinds of smart people and
we believe it's some sort of lock and/or transaction-mode problem, but
none of my t
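For reference, if a lock or transaction-mode conflict is the culprit, ungranted locks would show up in pg_locks while the problem is happening. A minimal check (plain PostgreSQL; column names as in 9.1, on 9.2+ procpid and current_query become pid and query):

  -- sessions currently waiting on a lock, and what they are running
  SELECT a.procpid, l.locktype, l.mode, a.current_query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
  WHERE NOT l.granted;

Run a few times while commands are being processed; an empty result each time makes a pure locking explanation less likely.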
Hello Deepak,
Here are the queries you asked for:
> Can you fire up psql, point it at your puppetdb database, and run "EXPLAIN
> ANALYZE SELECT COUNT(*) AS c FROM certname_catalogs cc, catalog_resources
> cr, certnames c WHERE cc.catalog=cr.catalog AND c.name=cc.certname AND
> c.deactivated I
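The quoted query is cut off above; assuming it is the dashboard's resource-count query, the missing piece is most likely just the deactivated check, i.e. something along the lines of:

  -- reconstructed guess at the full query; only the IS NULL check is assumed
  EXPLAIN ANALYZE
  SELECT COUNT(*) AS c
  FROM certname_catalogs cc, catalog_resources cr, certnames c
  WHERE cc.catalog = cr.catalog
    AND c.name = cc.certname
    AND c.deactivated IS NULL;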
On Tue, Mar 12, 2013 at 6:38 AM, ak0ska wrote:
> I think my previous comment just got lost.
>
> So, I cut three occurrences of this error from the database log and the
> corresponding part from the puppetdb log. I removed the hostnames, I hope
> it's still sensible: http://pastebin.com/yvyBDWQE
>
I think my previous comment just got lost.
So, I cut three occurrences of this error from the database log and the
corresponding part from the puppetdb log. I removed the hostnames, I hope
it's still sensible: http://pastebin.com/yvyBDWQE
The unversioned API warnings are not from the masters. Th
> After dropping the obsolete index, and rebuilding the others, the database
> is now ~ 30 GB. We still get the constraint violation errors when garbage
> collection starts.
Okay - can you please send me the puppetdb.log entry that shows the
exception? Including surrounding messages?
> Also the "
After dropping the obsolete index, and rebuilding the others, the database
is now ~ 30 GB. We still get the constraint violation errors when garbage
collection starts.
Also the "Resources" and "Resource duplication" values on the dashboard are
still question marks, so those queries probably time out.
On Thursday, March 7, 2013 12:23:13 AM UTC+1, Ken Barber wrote:
>
>
>
> So the index 'idx_catalog_resources_tags' was removed in 1.1.0 I
> think, so that is no longer needed.
>
> This points back to making sure your schema matches exactly what a
> known good 1.1.1 has, as things have been missed
> Indexes seem bloated.
Totally agree - you should organise re-indexing, starting from the biggest.
> relation                               |  size
> ----------------------------------------+---------
> public.idx_catalog_resources_tags_gin  |  117 GB
> public.idx_catalog_resources_tags
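For reference, a query like the following (standard pg_class/pg_namespace catalogs, nothing PuppetDB-specific) lists indexes by size so the largest can be rebuilt first; note that REINDEX takes an exclusive lock on the table for the duration:

  -- indexes in the public schema, largest first
  SELECT c.relname AS index_name,
         pg_size_pretty(pg_relation_size(c.oid)) AS index_size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'i'
    AND n.nspname = 'public'
  ORDER BY pg_relation_size(c.oid) DESC;

  -- rebuild all indexes on the largest table (locks it while running)
  REINDEX TABLE catalog_resources;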
@Mike
iostat -nx
Device:         rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
md0               0.00    0.00  85.55  405.31  9226.53  3234.60    25.39     0.00   0.00   0.00   0.00
@Ken
Wow. That's still way too large for the amount of nodes. I i
> Vacuum full was running for the whole weekend, so we didn't yet have time to
> rebuild indexes, because that would require more downtime, and we're not
> sure how long it would take. The size of the database didn't drop that much,
> it's now ~370Gb.
Wow. That's still way too large for the amount
What are the I/O stats? Can I just peek at them?
Mike { Thanks => always }
On Mar 5, 2013, at 3:00 AM, ak0ska wrote:
> Hey Mike,
>
> Thanks for the suggestions, but we already checked the IO rates, and they
> seemed fine. And yes, PuppetDB and Postgres are on the same machine for now,
> bu
Hey Mike,
Thanks for the suggestions, but we already checked the IO rates, and they
seemed fine. And yes, PuppetDB and Postgres are on the same machine for
now, but we plan to change that sometime in the future.
Cheers,
ak0ska
On Tuesday, March 5, 2013 12:51:16 AM UTC+1, Mike wrote:
>
> Is pu
A little update on our story.
Vacuum full was running for the whole weekend, so we didn't yet have time
to rebuild indexes, because that would require more downtime, and we're not
sure how long it would take. The size of the database didn't drop that
much, it's now ~370Gb.
We already see some
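For reference, the effect of the VACUUM FULL (and of any later index rebuilds) can be tracked with the standard size functions; the database name below is an assumption, adjust to your setup:

  -- overall size of the database ('puppetdb' is an assumed name)
  SELECT pg_size_pretty(pg_database_size('puppetdb'));

  -- largest tables, including their indexes and TOAST data
  SELECT relname,
         pg_size_pretty(pg_total_relation_size(oid)) AS total_size
  FROM pg_class
  WHERE relkind = 'r'
  ORDER BY pg_total_relation_size(oid) DESC
  LIMIT 10;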
Are puppetdb and postgres on the same server?
How many nodes does your environment have?
I had a similar issue and it was linked to I/O. Can you look at that?
Mike { Thanks => always }
On Mar 4, 2013, at 4:55 PM, Ken Barber wrote:
> Any progress today?
>
> On Fri, Mar 1, 2013 at 9:00 AM, ak
Any progress today?
On Fri, Mar 1, 2013 at 9:00 AM, ak0ska wrote:
> Yes, maybe not. The next step will be to recreate it from scratch.
>
>
> On Friday, March 1, 2013 5:47:06 PM UTC+1, Ken Barber wrote:
>>
>>
>> Well, I don't think a vacuum will help you - I imagine something is
>> wrong with the
Yes, maybe not. The next step will be to recreate it from scratch.
On Friday, March 1, 2013 5:47:06 PM UTC+1, Ken Barber wrote:
>
>
> Well, I don't think a vacuum will help you - I imagine something is
> wrong with the schema right now or some data migration failed during
> upgrade. Esp. if you
> I made a backup today, to have a fresh one before we start the database
> maintenance. The "structurally wrong" theory might not be so far-fetched, since
> we didn't upgrade from an official 1.0.2 release. My colleague got a patched
> version (don't know the details, and can't ask now, as he's on holiday)
On Friday, March 1, 2013 3:36:20 PM UTC+1, Ken Barber wrote:
>
> Oh - and a copy of the current dead letter queue would be nice; it's
> normally stored in:
>
> /var/lib/puppetdb/mq/discarded/*
>
I will back it up.
> This should also contain the full exceptions for the failed SQL as I
> me
Oh - and a copy of the current dead letter queue would be nice; it's
normally stored in:
/var/lib/puppetdb/mq/discarded/*
This should also contain the full exceptions for the failed SQL as I
mentioned earlier, so perhaps a glance into those now and letting me
know what the prevalent failure is wou
So I've been pondering this issue of yours, and I keep coming back to
that error in my mind:
ERROR: insert or update on table "certname_catalogs" violates foreign
key constraint "certname_catalogs_catalog_fkey"
Regardless of the other issues, 512 GB db - yes it's big, but so what?
That shouldn't
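For anyone digging into this, the exact definition of that constraint (which table and column it points at) can be pulled straight out of the system catalogs; this is plain PostgreSQL and makes no assumption about the PuppetDB schema version:

  -- show what certname_catalogs_catalog_fkey actually references
  SELECT conname, pg_get_constraintdef(oid) AS definition
  FROM pg_constraint
  WHERE conname = 'certname_catalogs_catalog_fkey';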
On Thursday, February 28, 2013 6:09:35 PM UTC+1, Ken Barber wrote:
>
> FYI, I just upgraded only the PuppetDB part to 1.1.1; using the old
> 1.0.2 terminus I get no errors:
>
> 2013-02-28 17:06:27,711 WARN [qtp1478462104-39] [http.server] Use of
> unversioned APIs is deprecated; please use /v
> You mean, you've only been watching it for a few minutes, and so far
> so good - or it crashed? Sorry - just want to be clear :-).
>
I was watching it for a few minutes and it seemed good. However, the queue
grew to 4000 items overnight. Also, we have more of the constraint
violation errors
FYI, I just upgraded only the PuppetDB part to 1.1.1; using the old
1.0.2 terminus I get no errors:
2013-02-28 17:06:27,711 WARN [qtp1478462104-39] [http.server] Use of
unversioned APIs is deprecated; please use /v1/commands
2013-02-28 17:06:28,284 INFO [command-proc-44] [puppetdb.command]
[7a4a
>> Okay. Did you clear the ActiveMQ queues after doing this? I usually
>> just move the old KahaDB directory out of the way when I do this.
>
>
> I hadn't thought about that myself, but it makes sense, so I just flushed the
> queue again while the puppetdb service was stopped. Since this last restart it
> s
> I'm @ken_barber on irc btw if that is easier.
>
Can't use IRC here, sorry. :(
> Okay. Did you clear the ActiveMQ queues after doing this? I usually
> just move the old KahaDB directory out of the way when I do this.
>
I hadn't thought about that myself, but it makes sense, so I just flushe
> Hi, thanks for trying to help! :)
I'm @ken_barber on irc btw if that is easier.
>> That is a bit of a concern, are you receiving a lot of these? Is this
>> constant?
>
> Before we flushed the queue, quite a lot. Since we flushed it, only 4 times.
>
>
>>
>> > Use of unversioned APIs is deprecate
Hi, thanks for trying to help! :)
> If you clear the queue and roll back to the original version, does the
> problem disappear? If you're having processing problems at the latest
> version, that's what I would do, as I presume we're talking production
> here, right?
>
>
Yes, this is production. We wo
If you clear the queue and roll back to the original version, does the
problem disappear? If you're having processing problems at the latest
version, that's what I would do, as I presume we're talking production
here, right?
> Can this be somehow related to the KahaDB leak thread?
No - it doesn't
The queue jumped back to ~1,600, and according to the puppetdb logs, it
hasn't processed a single entry for like 20 minutes, but the log shows a
lot of slow queries (see example below). Postgres logs show no error.
And now it started processing the queue again.
Slow queries: http://pastebin.co
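For what it's worth, while the queue is stalled the slow statements can also be watched live from pg_stat_activity (column names as in PostgreSQL 9.1; on 9.2+ procpid and current_query become pid and query):

  -- longest-running statements first
  SELECT procpid,
         now() - query_start AS runtime,
         waiting,
         current_query
  FROM pg_stat_activity
  WHERE current_query <> '<IDLE>'
  ORDER BY runtime DESC;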
Or maybe it did. The queue size went up to like 900 and after a while it
normalized and started slowly decreasing. Right now it's empty.
There were 2 insert errors in the postgres log since its restart.
It was perhaps on our end, but it would be good to hear some theories, or
suggestions where els
We flushed the queue and also restarted Postgres. Didn't help.
On Thursday, February 28, 2013 10:14:16 AM UTC+1, ak0ska wrote:
>
> Hello,
>
> We upgraded PuppetDB from 1.0.2 to 1.1.1 last week, and recently we
> noticed that the queue is not being processed. We don't know if this problem is
> direct