Hi ak0ska,

FWIW - with the help of some of my colleagues we've managed to replicate your constraint issue in a lab-style environment now:
https://gist.github.com/kbarber/5157836

Which is a start. It requires a unique precondition to replicate, however, and I've been unable to replicate it in any way with PuppetDB directly, so the statement log will still be helpful if you can supply it.

ken.

On Thu, Mar 14, 2013 at 2:12 PM, Ken Barber <k...@puppetlabs.com> wrote:
> So I have this sinking feeling that all of your problems (including the constraint side-effect) are related to general performance issues on your database 'for some reason we are yet to determine'. This could be related to IO contention, it could be a bad index (although you've rebuilt them all, right?) or something else. What continues to bother me is the fact you have only managed to reduce your DB size to 30GB; this is far larger than I expect, and it screams 'problem' to me. It should be less than 5 GB, at least based on my rough calculations from other DBs I've seen in the wild.
>
> It could very well be that something still isn't sufficiently organised (index, tablespace data) for optimal performance. A recreation of the database is by all means a very fast way to recreate everything, though it's like taking an axe to it. Having said that, it is fast and would eradicate a lot of back & forth between us to try and solve this problem.
>
> A dump, drop, recreate and restore is also another way - not 100% the same thing, but close to it.
>
> It's entirely up to you - if you want to try either option then go ahead. It would be a fast way of achieving a lot of small steps in one go. If it works, we may have lost some clarity on the issue, so it might be good to keep the original DB backup. If it doesn't work, at least we have ruled out a lot of possibilities in one quick action. So yeah, I wouldn't be upset if you take this action - there is a reason why I suggested this earlier, and that is that it performs a lot of actions in one single step ... and I still think this is a viable action, but since you were keen on trying other things I was happy to go along :-). At the end of the day your time and your risk assessment are up to you. If it was me, yeah, I would try it. Especially considering the amount of time already spent with very little final result.
>
> Thanks for being patient with me on this, btw ... :-).
>
> ken.
>
> On Thu, Mar 14, 2013 at 3:36 AM, ak0ska <akos.he...@gmail.com> wrote:
>> Hello Ken,
>>
>> I really appreciate you guys looking into this problem, and I'm happy to provide you with the data you ask for. However, I feel like I should ask whether you think this problem is worth your efforts, if rebuilding the database might solve the issue?
>>
>> Cheers,
>>
>> ak0ska
>>
>> On Thursday, March 14, 2013 8:05:59 AM UTC+1, Ken Barber wrote:
>>> Hi ak0ska,
>>>
>>> So I've been spending the last 2 days trying all kinds of things to replicate your constraint violation problem and I'm still getting nowhere with it. I've been speaking to all kinds of smart people and we believe it's some sort of lock and/or transactional-mode problem, but none of my testing can replicate it.
>>>
>>> My latest test was something like:
>>>
>>> https://gist.github.com/kbarber/5157836
>>>
>>> But with all that preparation and load simulation I still couldn't replicate it :-(.
>>>
>>> I think I need more data from you ... in this case, a statement log.
>>> Now a statement log can be pretty heavy, so if we can get just the data surrounding a GC run that should be enough - however, I anticipate it might reduce performance somewhat, so if you're apprehensive about doing it in production I fully understand. I'm just running out of ideas, and without more data I may have trouble coming to a conclusion.
>>>
>>> Anyway, if at all possible, the instructions for enabling this are as follows:
>>>
>>> * Go to your postgresql.conf file (on RHEL 6 this is usually located in /var/lib/pgsql/data)
>>> * Add/modify the entries:
>>>     log_statement = 'all'
>>>     log_line_prefix = '%m PID[%p] SID[%c] VTID[%v] TID[%x] '
>>> * Restart postgresql
>>> * Logs should go to: /var/lib/pgsql/data/pg_log
>>>
>>> The closer you can set this to _just before_ a GC run the better, as it will reduce the amount of time the setting needs to be on. Once you have a GC run and a subsequent constraint violation logged, you should be able to turn it off. Don't worry about PuppetDB during this window: it retries the database, so if you can take the small hit on queries, the outage should be quick while you switch the setting on and off (be mindful that catalog/fact/report submissions will get queued, and queries will of course fail during the short DB restart).
>>>
>>> Now the data itself is obviously going to contain private items, so if you prefer to send it to me privately that's fine (contact me off this list if so). Otherwise, it might need some scrubbing.
>>>
>>> The concern I have with your issue - even though it doesn't stop a catalog from being submitted - is that you're seeing a 'block' when those two catalogs get submitted, which makes me fear that data doesn't get populated until _after_ the GC is complete. This is a concern because we don't want data to be delayed like this. If this wasn't the case we could happily let it fail and retry; ultimately the data isn't lost, but we should be able to process data within reasonable time limits without you having to worry about database GC slowing things down.
>>>
>>> In regards to your performance, we are going to keep looking into the results you gave us. Looking at your settings, Deepak and I think you probably need to increase at least work_mem to 16MB and maintenance_work_mem to 32MB (the latter is used during vacuum), the main item here really being work_mem. These are reasonable changes we've seen other users apply with success, but alone they probably won't solve your issue - it would just be good to get you running with higher values for these (there's a quick way to trial them at the session level sketched a little further down).
>>>
>>> I'm also very curious to understand the disk IO for your systems. What is the nature of the disks where your database is located? Are they SCSI/SATA/FC or otherwise? What filesystem type are you using? Can you give me any other details you might find interesting?
>>>
>>> What are the results of the command:
>>>
>>> iostat -k 5
>>>
>>> A few samples of the run would be useful to see, so we can understand the average disk IO going on here.
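>>>
>>> Going back to the work_mem suggestion for a moment - if you want to trial those two values before touching postgresql.conf, something along these lines in a psql session should be enough. This is only a sketch: SET is plain PostgreSQL, it only affects the current session, and the numbers are just the ones suggested above, so adjust to taste:
>>>
>>>     -- applies to the current psql session only; postgresql.conf is still
>>>     -- needed to make the change permanent
>>>     SET work_mem = '16MB';
>>>     SET maintenance_work_mem = '32MB';
>>>
>>>     -- then re-run one of the slow queries to see whether the timing moves
>>>     -- (note: this one ran for ~6 minutes on your box earlier, so pick a quiet moment)
>>>     EXPLAIN ANALYZE SELECT COUNT(*) AS c
>>>       FROM certname_catalogs cc, catalog_resources cr, certnames c
>>>      WHERE cc.catalog = cr.catalog
>>>        AND c.name = cc.certname
>>>        AND c.deactivated IS NULL;
>>>
>>> If the timing improves noticeably with the larger work_mem, that would suggest the hash/sort steps were spilling to disk.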
>>> FYI, to give you an idea of what we _should_ be seeing for, say, just one of your queries, here is a query on a CentOS 5.8 box (running just postgresql 8.1 actually):
>>>
>>> puppetdb=# EXPLAIN ANALYZE SELECT COUNT(*) AS c FROM (SELECT DISTINCT resource FROM catalog_resources cr, certname_catalogs cc, certnames c WHERE cr.catalog=cc.catalog AND cc.certname=c.name AND c.deactivated IS NULL) r;
>>>                                                                  QUERY PLAN
>>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>>>  Aggregate  (cost=91792.54..91792.55 rows=1 width=0) (actual time=2611.872..2611.872 rows=1 loops=1)
>>>    ->  HashAggregate  (cost=91737.14..91761.76 rows=2462 width=41) (actual time=2606.507..2609.489 rows=5478 loops=1)
>>>          ->  Hash Join  (cost=184.36..90322.32 rows=565926 width=41) (actual time=7.060..2072.000 rows=867847 loops=1)
>>>                Hash Cond: ((cr.catalog)::text = (cc.catalog)::text)
>>>                ->  Seq Scan on catalog_resources cr  (cost=0.00..80139.47 rows=867847 width=82) (actual time=0.021..1119.260 rows=867847 loops=1)
>>>                ->  Hash  (cost=159.17..159.17 rows=2015 width=41) (actual time=7.032..7.032 rows=2195 loops=1)
>>>                      ->  Hash Join  (cost=62.34..159.17 rows=2015 width=41) (actual time=2.438..5.830 rows=2195 loops=1)
>>>                            Hash Cond: (cc.certname = c.name)
>>>                            ->  Seq Scan on certname_catalogs cc  (cost=0.00..65.79 rows=2179 width=50) (actual time=0.005..0.908 rows=2195 loops=1)
>>>                            ->  Hash  (cost=37.15..37.15 rows=2015 width=9) (actual time=2.427..2.427 rows=2195 loops=1)
>>>                                  ->  Seq Scan on certnames c  (cost=0.00..37.15 rows=2015 width=9) (actual time=0.008..1.088 rows=2195 loops=1)
>>>                                        Filter: (deactivated IS NULL)
>>>  Total runtime: 2611.935 ms
>>> (13 rows)
>>>
>>> As you can see, 2 seconds on my box versus 475 seconds on yours - the difference is just too large to be considered a normal discrepancy :-). Be mindful, this is on data that represents 2000 nodes ... so it's not like my tables are smaller than yours (at least not drastically); however, I do notice your number of resources is much higher, so I'm going to try and replicate that.
>>>
>>> For now, I'm going to ponder more on your performance, as I believe that affects you the most - so let us get back to you with more updates. If you can give me the details I've requested for now, that would be appreciated.
>>>
>>> ken.
>>>
>>> On Wed, Mar 13, 2013 at 2:00 AM, ak0ska <akos....@gmail.com> wrote:
>>> > Hello Deepak,
>>> >
>>> > Here are the queries you asked for:
>>> >
>>> >> Can you fire up psql, point it at your puppetdb database, and run "EXPLAIN ANALYZE SELECT COUNT(*) AS c FROM certname_catalogs cc, catalog_resources cr, certnames c WHERE cc.catalog=cr.catalog AND c.name=cc.certname AND c.deactivated IS NULL;"? That will profile the query, and we can see where the problem is.
>>> >
>>> >                                                                  QUERY PLAN
>>> > -------------------------------------------------------------------------------------------------------------------------------------------------------
>>> >  Aggregate  (cost=3132186.09..3132186.10 rows=1 width=0) (actual time=348426.375..348426.376 rows=1 loops=1)
>>> >    ->  Hash Join  (cost=247.37..3096532.80 rows=14261314 width=0) (actual time=24059.151..348255.933 rows=1604253 loops=1)
>>> >          Hash Cond: ((cr.catalog)::text = (cc.catalog)::text)
>>> >          ->  Seq Scan on catalog_resources cr  (cost=0.00..2871450.86 rows=16444286 width=41) (actual time=24053.542..347250.967 rows=2107982 loops=1)
>>> >          ->  Hash  (cost=216.42..216.42 rows=2476 width=41) (actual time=5.555..5.555 rows=2476 loops=1)
>>> >                ->  Hash Join  (cost=94.84..216.42 rows=2476 width=41) (actual time=1.850..4.555 rows=2476 loops=1)
>>> >                      Hash Cond: (cc.certname = c.name)
>>> >                      ->  Seq Scan on certname_catalogs cc  (cost=0.00..82.55 rows=2855 width=62) (actual time=0.010..0.941 rows=2855 loops=1)
>>> >                      ->  Hash  (cost=63.71..63.71 rows=2490 width=21) (actual time=1.818..1.818 rows=2490 loops=1)
>>> >                            ->  Seq Scan on certnames c  (cost=0.00..63.71 rows=2490 width=21) (actual time=0.013..0.823 rows=2490 loops=1)
>>> >                                  Filter: (deactivated IS NULL)
>>> >  Total runtime: 348446.364 ms
>>> > (12 rows)
>>> >
>>> >> Also worth examining is the GC query (modified to be non-destructive):
>>> >>
>>> >> EXPLAIN ANALYZE SELECT * FROM resource_params WHERE NOT EXISTS (SELECT * FROM catalog_resources cr WHERE cr.resource=resource_params.resource);
>>> >
>>> >                                                                  QUERY PLAN
>>> > ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> >  Nested Loop Anti Join  (cost=0.00..1710681.18 rows=360492 width=126) (actual time=1967.182..1967.182 rows=0 loops=1)
>>> >    ->  Seq Scan on resource_params  (cost=0.00..13249.23 rows=389623 width=126) (actual time=0.013..76.403 rows=389693 loops=1)
>>> >    ->  Index Scan using idx_catalog_resources_resource on catalog_resources cr  (cost=0.00..21429.14 rows=5291 width=41) (actual time=0.004..0.004 rows=1 loops=389693)
>>> >          Index Cond: ((cr.resource)::text = (resource_params.resource)::text)
>>> >  Total runtime: 1967.253 ms
>>> > (5 rows)
>>> >
>>> >> And the resource duplication query:
>>> >>
>>> >> EXPLAIN ANALYZE SELECT COUNT(*) AS c FROM (SELECT DISTINCT resource FROM catalog_resources cr, certname_catalogs cc, certnames c WHERE cr.catalog=cc.catalog AND cc.certname=c.name AND c.deactivated IS NULL) r;
>>> >
>>> >                                                                  QUERY PLAN
>>> > -------------------------------------------------------------------------------------------------------------------------------------------------------
>>> >  Aggregate  (cost=3129997.71..3129997.72 rows=1 width=0) (actual time=474369.420..474369.420 rows=1 loops=1)
>>> >    ->  HashAggregate  (cost=3129927.78..3129958.86 rows=3108 width=41) (actual time=474351.730..474365.144 rows=49466 loops=1)
>>> >          ->  Hash Join  (cost=248.48..3094726.38 rows=14080561 width=41) (actual time=21536.467..473678.479 rows=1604237 loops=1)
>>> >                Hash Cond: ((cr.catalog)::text = (cc.catalog)::text)
>>> >                ->  Seq Scan on catalog_resources cr  (cost=0.00..2871450.86 rows=16444286 width=82) (actual time=21531.355..472692.303 rows=1934212 loops=1)
>>> >                ->  Hash  (cost=217.36..217.36 rows=2490 width=41) (actual time=5.071..5.071 rows=2476 loops=1)
>>> >                      ->  Hash Join  (cost=94.84..217.36 rows=2490 width=41) (actual time=1.700..4.048 rows=2476 loops=1)
>>> >                            Hash Cond: (cc.certname = c.name)
>>> >                            ->  Seq Scan on certname_catalogs cc  (cost=0.00..83.08 rows=2908 width=62) (actual time=0.010..0.603 rows=2855 loops=1)
>>> >                            ->  Hash  (cost=63.71..63.71 rows=2490 width=21) (actual time=1.670..1.670 rows=2490 loops=1)
>>> >                                  ->  Seq Scan on certnames c  (cost=0.00..63.71 rows=2490 width=21) (actual time=0.014..0.785 rows=2490 loops=1)
>>> >                                        Filter: (deactivated IS NULL)
>>> >  Total runtime: 474370.129 ms
>>> > (13 rows)
>>> >
>>> >> And finally, can you give us a dump of your database settings? You can do this from within psql by doing "SHOW ALL;"
>>> >
>>> > http://pastebin.com/raw.php?i=HfhqfVa4
>>> >
>>> > Thank you for your help!
>>> >
>>> > Cheers,
>>> >
>>> > ak0ska
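
PS: on the database size question from my earlier mail - if you get a spare minute, a quick way to see which relations are actually holding the 30GB is the standard size functions. Something like this should do it (plain PostgreSQL, nothing PuppetDB-specific; the LIMIT is arbitrary):

    -- top user tables by total size (table + indexes + TOAST)
    SELECT relname,
           pg_size_pretty(pg_total_relation_size(relid)) AS total_size
      FROM pg_catalog.pg_statio_user_tables
     ORDER BY pg_total_relation_size(relid) DESC
     LIMIT 10;

If catalog_resources and its indexes dominate that list, it would fit the bloat theory and the seq scan timings above.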