Re: [GENERAL] Performance of UPDATE SET = FROM vs UPDATE SET = (SELECT ...)
On Thu, Nov 06, 2014 at 02:55:20PM +, Shaun Thomas wrote:

> These updates aren't equivalent. It's very important you know this,
> because you're also inflating your table with a lot of extra updated
> rows.
>
> Take the first UPDATE:
>
> > UPDATE second SET time1 = orig.time1
> > FROM orig
> > WHERE second.key1 = orig.key1;
>
> If you wrote this as a SELECT, it would look like this:
>
> SELECT second.time1, orig.time1
>   FROM second
>   JOIN orig ON (second.key1 = orig.key1);
>
> Since second is a many-to-one subset of orig, you now have several
> simultaneous updates. Your second UPDATE:
>
> > UPDATE second SET time1 = (SELECT orig.time1 FROM orig, second
> > WHERE orig.key1 = second.key1 LIMIT 1);
>
> is equivalent to this SELECT:
>
> SELECT second.time1,
>        (SELECT orig.time1 FROM orig, second
>          WHERE orig.key1 = second.key1 LIMIT 1)
>   FROM second;
>
> Meaning you'd only get as many updates as there are rows in second. The
> difference is your LIMIT 1. However, since you're not using an ORDER BY
> clause, the actual value you get for time1 will be indeterminate.
> Something like this would remove the row inflation and fix the random
> time1 behavior, but I'm not sure it was your intent:
>
> UPDATE second
>    SET time1 = sub.time1
>   FROM (SELECT DISTINCT ON (key1) key1, time1
>           FROM orig
>          ORDER BY key1, time1 DESC) sub
>  WHERE second.key1 = sub.key1;

I see now that I made more than one mistake.

1) I forgot INCLUDING INDEXES when creating second; with it, I would have
   seen the duplicate keys when filling the table:

   CREATE TABLE second (LIKE orig INCLUDING INDEXES);

2) I should have used something like this to fill second:

   INSERT INTO second (key1)
   SELECT key1 FROM orig ORDER BY random() LIMIT 40;

3) I then incorrectly remembered the query I had written at work. It
   should have been:

   EXPLAIN ANALYZE
   UPDATE second se
      SET time1 = (SELECT time1 FROM orig WHERE orig.key1 = se.key1);

Once the second table is filled with unique keys, both UPDATEs should
produce the same results, but the UPDATE ... FROM form is faster than the
UPDATE ... = (SELECT ...) form, which matches what the documentation says.
My original intent was to find out what the performance difference between
the two is.

Thanks for pointing these things out!

PJ
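For anyone who wants to reproduce the comparison, here is a minimal
self-contained sketch of the test case; the thread never shows orig's
definition, so the column types and row counts below are assumptions:

    -- Hypothetical reconstruction of the schema discussed above.
    CREATE TABLE orig (
        key1  integer PRIMARY KEY,
        time1 timestamptz
    );

    INSERT INTO orig (key1, time1)
    SELECT i, now() - (i || ' seconds')::interval
    FROM generate_series(1, 100000) AS i;

    -- INCLUDING INDEXES copies the primary key index, per point 1 above.
    CREATE TABLE second (LIKE orig INCLUDING INDEXES);

    INSERT INTO second (key1)
    SELECT key1 FROM orig ORDER BY random() LIMIT 40000;

    -- Variant 1: UPDATE ... FROM, planned as a join against orig.
    EXPLAIN ANALYZE
    UPDATE second SET time1 = orig.time1
    FROM orig
    WHERE second.key1 = orig.key1;

    -- Variant 2: correlated subselect, re-executed for each row of second.
    EXPLAIN ANALYZE
    UPDATE second se
       SET time1 = (SELECT time1 FROM orig WHERE orig.key1 = se.key1);

With unique keys in second, both variants update the same rows; the join
form is generally cheaper because the subselect form probes orig once per
updated row.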
[GENERAL] reindex table deadlock
Hi,

I encountered a deadlock while running 'reindex table TABLE1' on
PostgreSQL version 9.2.4. The PostgreSQL logs show the two offending
processes:

1st process was running REINDEX TABLE TABLE1, waiting for an
AccessExclusiveLock on the primary key index of TABLE1.

2nd process was running a stored procedure that executes SELECTs and
DELETEs against TABLE1, waiting for a RowExclusiveLock on TABLE1.

Is this the same lock-upgrade deadlock issue with REINDEX that has been
mentioned here before?
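When this happens, a pg_locks query along these lines can show who holds
and who awaits locks on the table and its indexes. This is a generic
diagnostic sketch, not from the original post; the lowercase table1 and
the index name table1_pkey are guesses at the real object names:

    -- Show granted and waiting locks on the table and its PK index.
    SELECT l.pid,
           a.query,
           l.locktype,
           l.mode,
           l.granted
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.relation IN ('table1'::regclass,       -- the table itself
                         'table1_pkey'::regclass)  -- assumed PK index name
    ORDER BY l.granted DESC, l.pid;

REINDEX needs an AccessExclusiveLock on the index it rebuilds, so a
session that already holds conflicting locks on the table or index can
deadlock with it when the two acquire their locks in opposite orders.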
[GENERAL] Fwd: function for trigger
Hi,

I have this function:

CREATE OR REPLACE FUNCTION sumlast_func()
RETURNS trigger AS
begin
  CASE
    WHEN idcar = (select idcar FROM selectedcar) THEN
      JOIN selectedcar ON selectedcar.idcar = actuals.idcar)
      update actuals
      set
        comb = comb + (select combor from combor_view
                       where idlog = (select max(idlog) from combor_view)),
  END CASE;
  return new;
end

I'm not sure where to put the boolean condition: the options are in the
function above or in the trigger below.

CREATE TRIGGER update_actuals_tg
  AFTER INSERT
  ON combor
  FOR EACH ROW
  EXECUTE PROCEDURE sumlast_func();

And then, is the boolean condition wrong?

Thanks for any tips
Re: [GENERAL] Fwd: function for trigger
On 11/07/2014 07:07 AM, avpro avpro wrote:
> Hi,
>
> I have this function
>
> CREATE OR REPLACE FUNCTION sumlast_func()
> RETURNS trigger AS
> begin
>   CASE
>     WHEN idcar = (select idcar FROM selectedcar) THEN
>       JOIN selectedcar ON selectedcar.idcar = actuals.idcar)
>       update actuals
>       set
>         comb = comb + (select combor from combor_view
>                        where idlog = (select max(idlog) from combor_view)),
>   END CASE;
>   return new;
> end
>
> I'm not sure where to create the boolean condition; the options in the
> function above or in the trigger below
>
> CREATE TRIGGER update_actuals_tg
>   AFTER INSERT
>   ON combor
>   FOR EACH ROW
>   EXECUTE PROCEDURE sumlast_func();
>
> and then, is the boolean condition wrong?
> thanks for any tips

The function you show above will not run, for several reasons:

1) I do not see a language specified. I am assuming plpgsql.
2) The statements in your CASE are not ended with ;
3) end does not have a ;

I am also not sure that you can use SQL statements in the WHEN. Then there
is this:

THEN
  JOIN selectedcar ON selectedcar.idcar = actuals.idcar)
  update actuals
  set
    comb = comb + (select combor from combor_view
                   where idlog = (select max(idlog) from combor_view)),

I am not quite sure what is going on there; in particular, where is
actuals.idcar coming from?

I would suggest looking at the plpgsql trigger docs:

http://www.postgresql.org/docs/9.3/interactive/plpgsql-trigger.html

I believe you want to declare some variables, SELECT INTO them, and then
use those to do the boolean testing. FYI, plpgsql SELECT INTO is not the
same as the SQL version. For more information see:

http://www.postgresql.org/docs/9.3/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW

--
Adrian Klaver
adrian.kla...@aklaver.com
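Following that advice, a corrected version might look something like the
sketch below. It assumes plpgsql, integer/numeric column types, and that
combor rows carry an idcar column; the matching logic is a guess at the
original intent, not a drop-in fix:

    CREATE OR REPLACE FUNCTION sumlast_func()
    RETURNS trigger AS $$
    DECLARE
        v_selected_idcar integer;  -- assumed type
        v_last_combor    numeric;  -- assumed type
    BEGIN
        -- Fetch the currently selected car; LIMIT 1 guards against
        -- selectedcar holding more than one row.
        SELECT idcar INTO v_selected_idcar FROM selectedcar LIMIT 1;

        -- The boolean test lives here, on the row that fired the trigger
        -- (assumes combor has an idcar column).
        IF NEW.idcar = v_selected_idcar THEN
            SELECT combor INTO v_last_combor
            FROM combor_view
            WHERE idlog = (SELECT max(idlog) FROM combor_view);

            UPDATE actuals
            SET comb = comb + v_last_combor
            WHERE actuals.idcar = v_selected_idcar;
        END IF;

        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

The CREATE TRIGGER statement from the original post can then be used
unchanged.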
Re: [GENERAL] Fwd: function for trigger
avpro avpro wrote
> Hi,
>
> I have this function
>
> CREATE OR REPLACE FUNCTION sumlast_func()
> RETURNS trigger AS
> begin
>   CASE
>     WHEN idcar = (select idcar FROM selectedcar) THEN
>       JOIN selectedcar ON selectedcar.idcar = actuals.idcar)
>       update actuals
>       set
>         comb = comb + (select combor from combor_view
>                        where idlog = (select max(idlog) from combor_view)),
>   END CASE;
>   return new;
> end
>
> I'm not sure where to create the boolean condition; the options in the
> function above or in the trigger below
>
> CREATE TRIGGER update_actuals_tg
>   AFTER INSERT
>   ON combor
>   FOR EACH ROW
>   EXECUTE PROCEDURE sumlast_func();
>
> and then, is the boolean condition wrong?
> thanks for any tips

What you wrote makes no sense.

David J.
Re: [GENERAL] Testing on Power 8 systems
Hi,

On Mon, 2014-11-03 at 10:35 -0800, John R Pierce wrote:
> oooh, did someone donate a build host? cool!

It is not finalized, but we are close to that.

> what about AIX ?

No idea. I just build RPMs :)

Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR
Re: [GENERAL] DATA corruption after promoting slave to master
Krit,

It sounds like you might be hitting edge-case statistics problems on
recent data. We've had issues with this before ourselves.

I'd personally recommend increasing your default_statistics_target to 400
for your entire database. But it's possible that won't help here: recent
data tends to have no stats at all, and if it comes in with enough volume,
the planner will be wrong a good majority of the time. If that's the data
you're trying to grab, well...

Based on that, you might also want to consider lowering your
autovacuum_analyze_scale_factor setting so you get relatively frequent
automatic analyzes and the stats stay current through the day.

Otherwise, your approach should work just fine. Good luck!
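In configuration terms, the two suggestions above look roughly like this;
mydb and t1 are placeholder names, and 400 and 0.02 are illustrative
values in the spirit of the advice rather than tested recommendations:

    -- Raise the database-wide statistics target (the default is 100).
    ALTER DATABASE mydb SET default_statistics_target = 400;

    -- Have autoanalyze fire after ~2% of the table changes instead of
    -- the default 10%, so stats on recent data stay fresher all day.
    ALTER TABLE t1 SET (autovacuum_analyze_scale_factor = 0.02);

    -- Re-analyze once so the new target takes effect immediately.
    ANALYZE t1;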
Re: [GENERAL] DATA corruption after promoting slave to master
Hey Shaun,

Thanks for the quick reply.

We have not changed default_statistics_target, so it should still be at
its default value of 100.

I would like to share our observation, in case you can infer anything from
it: the planner started using query plan 1 after our scheduled vacuum run
and kept using it for a day, until the next day's scheduled vacuum run; it
then switched to the optimal query plan 2 and the slowness disappeared.

However, this is what we are thinking of doing in case the issue
resurfaces:

1. Run VACUUM ANALYZE (assuming it will help the query planner choose the
   best possible path, since it updates the stats in pg_statistic).
2. If that does not work, increase the default_statistics_target for the
   column (see the sketch after this message).

Let us know your thoughts.

On Thu, Nov 6, 2014 at 7:39 PM, Shaun Thomas wrote:
> Hi Krit,
>
> It looks like your actual problem is here:
>
> > Index Scan using t1_orderid_creationtime_idx on t1
> >   (cost=0.43..1181104.36 rows=9879754 width=158)
> >   (actual time=0.021..60830.724 rows=2416614 loops=1)
>
> This index scan estimates 9.8M rows, and had to touch 2.4M. The issue is
> that your LIMIT clause makes the planner overly optimistic. The
> worst-case cost estimate for this part of the query is about 1.2M, which
> is much higher than the SEQ SCAN variation you posted. The planner must
> think it can get the rows without incurring the full cost, otherwise I
> can't see how the 1.2M cost estimate wasn't rolled into the total
> estimate.
>
> Unfortunately behavior like this is pretty common when using LIMIT
> clauses. Sometimes the planner thinks it can get results much faster
> than it actually can, and it ends up reading a much larger portion of
> the data than it assumed would be necessary.
>
> Just out of curiosity, can you tell me what your
> default_statistics_target is?

--
Kirit Parmar
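A sketch of step 2 above: the table and column names are inferred from the
index name in the quoted plan (t1_orderid_creationtime_idx) and are
assumptions about the real schema:

    -- Raise the per-column statistics target, then rebuild the stats.
    ALTER TABLE t1 ALTER COLUMN creationtime SET STATISTICS 400;
    ANALYZE t1;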