Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
On Thursday, December 20, 2012, Charles Gomes wrote:

> Without hyperthreading CPU still not a bottleneck, while I/O is only 10%
> utilization.
>
> top - 14:55:01 up 27 min, 2 users, load average: 0.17, 0.19, 0.14
> Tasks: 614 total, 17 running, 597 sleeping, 0 stopped, 0 zombie
> Cpu(s): 73.8%us, 4.3%sy, 0.0%ni, 21.6%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
> ...
> I believe the bottleneck may be that pgsql has to fight with its siblings
> to update the indexes.

I think that should mostly show up as idle or wait time, not as user time.
Since you're at 75% user already, you couldn't expect more than a 33%
improvement by fixing that, assuming that that were the problem.

> Is there a good way to add probes to check where things are slowing down?

What happens if you just drop the indexes as an experiment? That should put
a hard limit on the amount the indexes can be slowing you down.

I like oprofile for getting a good bottom-level profile of where CPU time is
going. Unfortunately, it will probably just show you "AllocSetAlloc". Also,
I don't trust it on virtualized systems, if you are on one of those.

There are many other ways of approaching it, but mostly you have to already
have a good theory about what is going on in order to know which one to use
or how to interpret the results, and many of them require you to make custom
compiles of the postgres server code.

Cheers,

Jeff
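As a sketch of that drop-the-indexes experiment (table and index names here
are hypothetical, not from the thread): drop the indexes, time the bulk
load, and rebuild them afterwards.

```sql
-- Hypothetical experiment: remove index maintenance from the load path.
DROP INDEX IF EXISTS quotes_symbol_idx;
DROP INDEX IF EXISTS quotes_ts_idx;

\timing on
COPY quotes FROM '/tmp/quotes.csv' WITH (FORMAT csv);

-- Rebuild afterwards; building an index once over the loaded data is
-- usually far cheaper than maintaining it row by row during the load.
CREATE INDEX quotes_symbol_idx ON quotes (symbol);
CREATE INDEX quotes_ts_idx ON quotes (ts);
```

If the COPY time barely changes with the indexes gone, the bottleneck is
somewhere other than index maintenance.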
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
On Thursday, December 20, 2012, Charles Gomes wrote:

> True, that's the same I feel, I will be looking to translate the trigger
> to C if I can find good examples, that should accelerate.

I think your performance bottleneck is almost certainly the dynamic SQL.
Using C to generate that dynamic SQL isn't going to help much, because it is
still the SQL engine that has to parse, plan, and execute it.

Are the vast majority of your inserts done on any given day for records from
that same day or the one before, or are they evenly spread over the
preceding year? If the former, you could use static SQL in IF and ELSIF
branches for those days, and fall back on the dynamic SQL for the exceptions
in the ELSE block. Of course that means you have to update the trigger every
day.

> Using rules would be totally bad as I'm partitioning daily and after one
> year having 365 lines of IF won't be fun to maintain.

Maintaining 365 lines of IF is what Perl was invented for. That goes for
triggers with static SQL as well as for rules.

If you do the static SQL in a trigger and the dates of the records are
evenly scattered over the preceding year, make sure your IFs are nested like
a binary search, not a linear search. And if they are mostly for "today's"
date, then make sure you search backwards.

Cheers,

Jeff
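A minimal sketch of the static-SQL-with-fallback idea described above. The
table, column, and partition names are made up for illustration; the point
is that the static INSERTs get parsed and planned once per session, while
only the rare stragglers pay the dynamic-SQL cost in the ELSE branch.

```sql
-- Hypothetical routing trigger: static SQL for the hot (recent) days,
-- searched backwards from "today", dynamic SQL only as a fallback.
CREATE OR REPLACE FUNCTION quotes_insert_trigger()
RETURNS trigger AS $$
BEGIN
    IF NEW.ts >= DATE '2012-12-20' AND NEW.ts < DATE '2012-12-21' THEN
        INSERT INTO quotes_2012_12_20 VALUES (NEW.*);
    ELSIF NEW.ts >= DATE '2012-12-19' AND NEW.ts < DATE '2012-12-20' THEN
        INSERT INTO quotes_2012_12_19 VALUES (NEW.*);
    ELSE
        -- Exceptions take the slow path: re-parsed and re-planned each time.
        EXECUTE 'INSERT INTO quotes_' || to_char(NEW.ts, 'YYYY_MM_DD')
                || ' SELECT ($1).*' USING NEW;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```

If the dates are evenly scattered rather than clustered on recent days, the
chain of comparisons should instead be nested like a binary search (halving
the date range at each IF) so the cost per row is logarithmic, not linear,
in the number of partitions.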
Re: [PERFORM] Why is PostgreSQL 9.2 slower than 9.1 in my tests?
On Tuesday, December 11, 2012, Tom Lane wrote:

> Jeff Janes writes:
>> On Tue, Dec 11, 2012 at 2:50 AM, Patryk Sidzina wrote:
>>> The differences come up when you change the "INSERT" to "EXECUTE
>>> 'INSERT'" (and I checked this time on 3 machines, one of which was
>>> Windows):
>>>
>>> FOR i IN 1..cnt LOOP
>>>     EXECUTE 'INSERT INTO test_table_md_speed(n) VALUES (' || i || ')';
>>> END LOOP;
>>
>> The culprit is the commit below. I don't know exactly why this slows
>> down your case. A preliminary oprofile analysis suggests that most of
>> the slowdown is that it calls AllocSetAlloc more often. I suspect that
>> this slow-down will be considered an acceptable trade-off for getting
>> good parameterized plans.
>
> I'm having a hard time getting excited about optimizing the above case:
> the user can do far more to make it fast than we can, simply by not
> using EXECUTE, which is utterly unnecessary in this example.

I assumed his example was an intentionally simplified test case, not a
real-world use case. For a more realistic use, see "[PERFORM] Performance on
Bulk Insert to Partitioned Table". There too it would probably be best to
get rid of the EXECUTE, but doing so in that case would certainly have a
high cost in trigger-code complexity and maintainability. (In my test case
of loading 1e7 narrow tuples to 100 partitions, the plan-cache change led to
a 26% slowdown.)

> Having said that, though, it's not real clear to me why the plancache
> changes would have affected the speed of EXECUTE at all --- the whole
> point of that command is we don't cache a plan for the query.

A bottom-level profile isn't helpful here because all of the extra time is
in very low-level code that is called from everywhere. Doing call counts
with gprof, I see a big increase in calls to copyObject (which indirectly
leads to a big increase in AllocSetAlloc). Before the change, each EXECUTE
had one top-level (i.e. nonrecursive) copyObject call, coming from
_SPI_prepare_plan. After the change, each EXECUTE has 4 such top-level
copyObject calls, one each from CreateCachedPlan and CompleteCachedPlan and
two from BuildCachedPlan.

Cheers,

Jeff
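For reference, the quoted test loop rewritten without EXECUTE, which is what
Tom is recommending for this simplified case: the plain INSERT below is
parsed and planned once and the plan reused on every iteration, rather than
rebuilt per row (the function wrapper is assumed, not from the original
post).

```sql
-- Same work as the EXECUTE loop, but with a cacheable static statement.
CREATE OR REPLACE FUNCTION fill_test_table_md_speed(cnt int)
RETURNS void AS $$
BEGIN
    FOR i IN 1..cnt LOOP
        INSERT INTO test_table_md_speed(n) VALUES (i);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

EXECUTE only earns its per-call overhead when the statement text genuinely
varies, as in the partition-routing trigger discussed upthread.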