Re: [PERFORM] improvement suggestions for performance design
Hi all!

2007/7/18, Thomas Finneid <[EMAIL PROTECTED]>:
> Hi
>
> I have tested your COPY patch (actually I tested
> postgresql-jdbc-8.2-505-copy-20070716.jdbc3.jar) and it is really fast,
> actually just as fast as server-side COPY (both tests were performed on a
> local machine).

Happy to hear there's interest toward this solution.

> This means I am interested in using it in my project, but I have some
> concerns that need to be addressed (and I am prepared to help in any way
> I can). These are my concerns:
>
> - While testing I got some errors, which need to be fixed (detailed below)
> - The patch must be of production-grade quality
> - I would like the patch to be part of the official pg JDBC driver.

Definitely agreed; those are my requirements as well. We can discuss bug
fixing among ourselves; new versions I'll announce on the pgsql-jdbc list.

-- 
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/
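[For context, the server-side COPY used as the speed baseline here looks
something like the following sketch; the table and file path are
hypothetical, and server-side COPY resolves the path on the database server,
so it requires superuser rights and a file readable by the backend:

    -- Hypothetical example: bulk-load a CSV file server-side.
    -- The path is resolved on the server, not the client.
    COPY sensor_readings (probe_id, reading, recorded_at)
    FROM '/var/lib/pgsql/import/readings.csv'
    WITH CSV;

The patched JDBC driver streams the same data from the client side, which
avoids the server-filesystem and superuser requirements.]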
Re: [GENERAL] [PERFORM] Parallel query execution for UNION ALL Queries
Hi,

On Wednesday 18 July 2007, Jonah H. Harris wrote:
> On 7/18/07, Benjamin Arai <[EMAIL PROTECTED]> wrote:
> > But I want to parallelize searches if possible to reduce
> > the performance loss of having multiple tables.
>
> PostgreSQL does not support parallel query. Parallel query on top of
> PostgreSQL is provided by ExtenDB and PGPool-II.

Seems to me that:
- Greenplum provides a commercial parallel query engine on top of
  PostgreSQL,
- plproxy could be a solution to the given problem.
  https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

Hope this helps,
-- 
dim
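[For readers who haven't seen PL/Proxy: the function body only declares
where the call runs, and a same-named function is executed on each
partition. A minimal sketch, with a hypothetical cluster and function:

    -- Run search_titles() on every partition of the (hypothetical)
    -- 'searchcluster' cluster and stream back the merged result set.
    CREATE FUNCTION search_titles(term text)
    RETURNS SETOF text AS $$
        CLUSTER 'searchcluster';
        RUN ON ALL;
    $$ LANGUAGE plproxy;

Replacing RUN ON ALL with RUN ON hashtext(term) would instead route each
call to a single partition.]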
Re: [GENERAL] [PERFORM] Parallel query execution for UNION ALL Queries
Dimitri,

> Seems to me that:
> - Greenplum provides a commercial parallel query engine on top of
>   PostgreSQL,

I certainly think so, and so do our customers in production with 100s of
terabytes :-)

> - plproxy could be a solution to the given problem.
>   https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

This is solving real-world problems at Skype of a different kind than
Greenplum; well worth checking out.

- Luke
[PERFORM] User concurrency thresholding: where do I look?
Folks,

I've run into this a number of times with various PostgreSQL users, so we
tested it at Sun. What seems to be happening is that at some specific number
of connections average throughput drops 30% and response time quadruples or
worse. The threshold seems to vary per machine; I've seen it variously at
95, 1050, 1700 or 2800 connections. Tinkering with postgresql.conf
parameters doesn't seem to affect this threshold.

As an example of this behavior:

Users   Txn/User   Resp. Time
50      105.38     0.01
100     113.05     0.01
150     114.05     0.01
200     113.51     0.01
250     113.38     0.01
300     112.14     0.01
350     112.26     0.01
400     111.43     0.01
450     110.72     0.01
500     110.44     0.01
550     109.36     0.01
600     107.01     0.02
650     105.71     0.02
700     106.95     0.02
750     107.69     0.02
800     106.78     0.02
850     108.59     0.02
900     106.03     0.02
950     106.13     0.02
1000    64.58      0.15
1050    52.32      0.23
1100    49.79      0.25

Tinkering with shared_buffers has had no effect on this thresholding (the
above was with 3 GB to 6 GB of shared_buffers). Any ideas on where we should
look for the source of the bottleneck?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus wrote:
> Folks,
>
> 650     105.71     0.02
> 700     106.95     0.02
> 750     107.69     0.02
> 800     106.78     0.02
> 850     108.59     0.02
> 900     106.03     0.02
> 950     106.13     0.02
> 1000    64.58      0.15
> 1050    52.32      0.23
> 1100    49.79      0.25
>
> Tinkering with shared_buffers has had no effect on this thresholding (the
> above was with 3 GB to 6 GB of shared_buffers). Any ideas on where we
> should look for the source of the bottleneck?

I have seen this as well. I always chalked it up to PG having to manage so
many connections, but there is some interesting evidence to review:

- The amount of memory each connection takes up. Figure 4-11 MB per
  connection, depending on various things like the number of prepared
  queries.

- The number of CPUs. Obviously 500 connections over 4 CPUs isn't the same
  as 500 connections over 8 CPUs.

- That number of connections generally means a higher transaction velocity,
  and a higher velocity means more checkpoint segments. Wrong settings for
  checkpoint_segments, the bgwriter and checkpointing will cause performance
  to fall over.

I would also note that in our experience PG falls down a little higher, more
toward 2500 connections last time I checked, but that was likely on
different hardware.

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus wrote:
> Folks,
>
> I've run into this a number of times with various PostgreSQL users, so we
> tested it at Sun. What seems to be happening is that at some specific
> number of connections average throughput drops 30% and response time
> quadruples or worse. The threshold seems to vary per machine; I've seen it
> variously at 95, 1050, 1700 or 2800 connections. Tinkering with
> postgresql.conf parameters doesn't seem to affect this threshold.
>
> As an example of this behavior:
>
> Users   Txn/User   Resp. Time
> 50      105.38     0.01
> 100     113.05     0.01
> 150     114.05     0.01
> 200     113.51     0.01
> 250     113.38     0.01
> 300     112.14     0.01
> 350     112.26     0.01
> 400     111.43     0.01
> 450     110.72     0.01
> 500     110.44     0.01
> 550     109.36     0.01
> 600     107.01     0.02
> 650     105.71     0.02
> 700     106.95     0.02
> 750     107.69     0.02
> 800     106.78     0.02
> 850     108.59     0.02
> 900     106.03     0.02
> 950     106.13     0.02
> 1000    64.58      0.15
> 1050    52.32      0.23
> 1100    49.79      0.25
>
> Tinkering with shared_buffers has had no effect on this thresholding (the
> above was with 3 GB to 6 GB of shared_buffers). Any ideas on where we
> should look for the source of the bottleneck?

Have you messed with max_connections and/or max_locks_per_transaction while
testing this? The lock table is sized to max_locks_per_xact times
max_connections, and shared memory hash tables get slower when they are
full. Of course, the saturation point would depend on the average number of
locks acquired per user, which would explain why you are seeing a lower
number for some users and higher for others (simpler/more complex queries).

This is just a guess though. No profiling or measuring at all, really.

-- 
Alvaro Herrera              http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"How amazing is that? I call it a night and come back to find that a bug
has been identified and patched while I sleep."          (Robert Davidson)
http://archives.postgresql.org/pgsql-sql/2006-03/msg00378.php
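[As a quick check of this theory while the benchmark runs, something along
these lines shows how full the lock table is relative to its sizing; a
sketch, best run as superuser so all locks are visible:

    -- How many lock-table entries exist right now, by lock mode.
    SELECT mode, count(*) AS held
    FROM pg_locks
    GROUP BY mode
    ORDER BY held DESC;

    -- The table is sized to roughly
    -- max_connections * max_locks_per_transaction entries.
    SHOW max_connections;
    SHOW max_locks_per_transaction;

If the held total approaches the product of those two settings near the
1000-user mark, the hash-table theory gains weight.]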
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro,

> Have you messed with max_connections and/or max_locks_per_transaction
> while testing this? The lock table is sized to max_locks_per_xact times
> max_connections, and shared memory hash tables get slower when they are
> full. Of course, the saturation point would depend on the average number
> of locks acquired per user, which would explain why you are seeing a lower
> number for some users and higher for others (simpler/more complex
> queries).

That's an interesting thought. Let me check lock counts and see if this is
possibly the case.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
[PERFORM] Is it possible to know where the "deadlock" is
Hi all,

I have a serious problem with a server. This server holds several DBs; the
problem is that the CPUs spend most of their time waiting:

Cpu0:  4.0% us,  2.3% sy,  0.0% ni, 61.5% id, 32.1% wa,  0.0% hi,  0.0% si
Cpu1:  2.3% us,  0.3% sy,  0.0% ni, 84.1% id, 13.3% wa,  0.0% hi,  0.0% si
Cpu2:  1.3% us,  0.3% sy,  0.0% ni, 68.6% id, 29.8% wa,  0.0% hi,  0.0% si
Cpu3:  4.6% us,  3.3% sy,  0.0% ni,  2.6% id, 88.4% wa,  0.3% hi,  0.7% si

iostat -c says about 8% of time is spent waiting for IO. I'm afraid this is
due to locks between concurrent queries; is there any way to get more
information about it?

Thanks all
-- 
Arnau
Re: [PERFORM] Is it possible to know where the "deadlock" is
In response to Arnau <[EMAIL PROTECTED]>:
> Hi all,
>
> I have a serious problem with a server. This server holds several DBs;
> the problem is that the CPUs spend most of their time waiting:
>
> Cpu0:  4.0% us,  2.3% sy,  0.0% ni, 61.5% id, 32.1% wa,  0.0% hi,  0.0% si
> Cpu1:  2.3% us,  0.3% sy,  0.0% ni, 84.1% id, 13.3% wa,  0.0% hi,  0.0% si
> Cpu2:  1.3% us,  0.3% sy,  0.0% ni, 68.6% id, 29.8% wa,  0.0% hi,  0.0% si
> Cpu3:  4.6% us,  3.3% sy,  0.0% ni,  2.6% id, 88.4% wa,  0.3% hi,  0.7% si
>
> iostat -c says about 8% of time is spent waiting for IO. I'm afraid this
> is due to locks between concurrent queries; is there any way to get more
> information about it?

This looks perfectly normal for a medium-load server.

Although you don't state your problem (you state what you think is a
symptom and call it the problem), I'm guessing you have queries that are
executing slower than you would like? If that's the case, I would suggest
investigating the slow queries directly. Check for indexes and ensure your
vacuum/analyze schedule is acceptable. If you get stumped, post details of
the queries here asking for help.

Another thing that (I'm guessing) may be confusing you: on a system with
multiple CPUs, each query can only execute on a single CPU. So a single
query at full throttle on an 8-way system will only use 12.5% max.

If you have reason to believe that locks are an issue, the pg_locks view
can help you prove/disprove that theory:
http://www.postgresql.org/docs/8.2/interactive/view-pg-locks.html

If none of those are the case, then please describe the actual problem you
are having.

HTH.

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/
[EMAIL PROTECTED]
Phone: 412-422-3463x4023
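[If the lock theory needs testing, a query along these lines lists backends
that are waiting on ungranted locks together with what they are executing.
A sketch against the 8.2 catalogs, where pg_stat_activity exposes the
backend PID as procpid:

    -- Backends currently blocked waiting for a lock, and their queries.
    SELECT l.pid,
           l.relation::regclass AS locked_relation,
           l.mode,
           a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
    WHERE NOT l.granted;

An empty result while the symptoms occur is good evidence that locking is
not the bottleneck.]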
Re: [PERFORM] Is it possible to know where the "deadlock" is
> iostat -c says about 8% of time is spent waiting for IO. I'm afraid this
> is due to locks between concurrent queries; is there any way to get more
> information about it?

I do believe that if you told us what OS you're running, what PG version
you're running, and what kind of SQL statements you perform, the list could
provide some help.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.
                                 Shakespeare
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus <[EMAIL PROTECTED]> writes:
> Alvaro,
>> Have you messed with max_connections and/or max_locks_per_transaction
>> while testing this? The lock table is sized to max_locks_per_xact times
>> max_connections, and shared memory hash tables get slower when they are
>> full. Of course, the saturation point would depend on the avg number of
>> locks acquired per user, which would explain why you are seeing a lower
>> number for some users and higher for others (simpler/more complex
>> queries).

> That's an interesting thought. Let me check lock counts and see if this
> is possibly the case.

AFAIK you'd get hard failures, not slowdowns, if you ran out of lock space
entirely; and the fact that you can continue the curve upwards says that
you're not on the edge of running out. However, I agree that it's worth
experimenting with those two parameters to see if the curve moves around
at all.

Another resource that might be interesting is the number of open files.

Also, have you tried watching vmstat or the local equivalent to confirm
that the machine's not starting to swap?

            regards, tom lane
[PERFORM] Trying to tune postgres, how is this config?
I'd like any advice you have on my postgresql.conf. The machine in question
is a 2.4 GHz Xeon with 2 GB of RAM running FreeBSD 6.2 and PostgreSQL 8.2.4.
There are 16 concurrent users. This machine is used only for the database.
Usage is split pretty evenly between reads and writes.

Thanks,
Pat

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the server.
#
# Any option can also be given as a command line switch to the server,
# e.g., 'postgres -c log_connections=on'. Some options can be changed at
# run-time with the 'SET' SQL command.
#
# This file is read on server startup and when the server receives a
# SIGHUP. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload". Some
# settings, which are marked below, require a server shutdown and restart
# to take effect.
#
# Memory units:  kB = kilobytes  MB = megabytes  GB = gigabytes
# Time units:    ms = milliseconds  s = seconds  min = minutes  h = hours  d = days

#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'           # use data in another directory
                                        # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'     # host-based authentication file
                                        # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
                                        # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'           # write an extra PID file
                                        # (change requires restart)

#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = 'localhost'          # what IP address(es) to listen on;
                                        # comma-separated list of addresses;
                                        # defaults to 'localhost', '*' = all
                                        # (change requires restart)
#port = 5432                            # (change requires restart)
max_connections = 20                    # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction). You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 3     # (change requires restart)
#unix_socket_directory = ''             # (change requires restart)
#unix_socket_group = ''                 # (change requires restart)
#unix_socket_permissions = 0777         # octal
                                        # (change requires restart)
#bonjour_name = ''                      # defaults to the computer name
                                        # (change requires restart)

# - Security & Authentication -

#authentication_timeout = 1min          # 1s-600s
#ssl = off                              # (change requires restart)
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''                # (change requires restart)
#krb_srvname = 'postgres'               # (change requires restart)
#krb_server_hostname = ''               # empty string matches any keytab entry
                                        # (change requires restart)
#krb_caseins_users = off                # (change requires restart)

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0                # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -
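[For a machine like this (dedicated, 2 GB RAM, ~16 users, mixed read/write),
commonly suggested starting points look roughly like the following. These
values are assumptions to benchmark against your own workload, not drop-in
answers, and on FreeBSD the SysV shared memory limits (kern.ipc.shmmax and
friends) usually need raising before a shared_buffers this large will start:

    shared_buffers = 512MB          # ~25% of RAM on a dedicated database box
    effective_cache_size = 1536MB   # rough size of the OS filesystem cache
    work_mem = 16MB                 # per sort/hash, per backend: keep modest
    maintenance_work_mem = 128MB    # speeds up VACUUM and index builds
    checkpoint_segments = 16        # fewer, larger checkpoints under write load
    wal_buffers = 1MB               # cheap insurance for write-heavy use
]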
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote:
> Josh Berkus <[EMAIL PROTECTED]> writes:
> > Alvaro,
> >> Have you messed with max_connections and/or max_locks_per_transaction
> >> while testing this? The lock table is sized to max_locks_per_xact times
> >> max_connections, and shared memory hash tables get slower when they are
> >> full. Of course, the saturation point would depend on the avg number of
> >> locks acquired per user, which would explain why you are seeing a lower
> >> number for some users and higher for others (simpler/more complex
> >> queries).
>
> > That's an interesting thought. Let me check lock counts and see if this
> > is possibly the case.
>
> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
> space entirely;

Well, if there still is shared memory available, the lock hash can continue
to grow, but it would slow down according to this comment in ShmemInitHash:

 * max_size is the estimated maximum number of hashtable entries.  This is
 * not a hard limit, but the access efficiency will degrade if it is
 * exceeded substantially (since it's used to compute directory size and
 * the hash table buckets will get overfull).

For the lock hash tables this max_size is
(MaxBackends+max_prepared_xacts) * max_locks_per_xact.

So maybe this does not make much sense in normal operation, thus not
applicable to what Josh Berkus is reporting.

However, I was talking to Josh Drake yesterday and he told me that pg_dump
was spending some significant amount of time in LOCK TABLE when there are
lots of tables (say 300k).

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
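[That sizing can be read off a running server directly; a small sketch:

    -- Estimated shared lock-table capacity:
    -- (MaxBackends + max_prepared_xacts) * max_locks_per_xact.
    SELECT (current_setting('max_connections')::int
          + current_setting('max_prepared_transactions')::int)
         * current_setting('max_locks_per_transaction')::int
         AS approx_lock_table_slots;

Comparing this number with a count(*) from pg_locks under load shows how
close the hash table is to the point where the comment above applies.]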
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
>> space entirely;

> Well, if there still is shared memory available, the lock hash can
> continue to grow, but it would slow down according to this comment in
> ShmemInitHash:

Right, but there's not an enormous amount of headroom in shared memory
beyond the intended size of the hash tables. I'd think that you'd start
seeing hard failures not very far beyond the point at which performance
impacts became visible. Of course this is all speculation; I quite agree
with varying the table-size parameters to see if it makes a difference.

Josh, what sort of workload is being tested here --- read-mostly,
write-mostly, a mixture?

> However I was talking to Josh Drake yesterday and he told me that
> pg_dump was spending some significant amount of time in LOCK TABLE when
> there are lots of tables (say 300k).

I wouldn't be too surprised if there are some O(N^2) effects when a single
transaction holds that many locks, because of the linked-list proclock data
structures. This would not be relevant to Josh's case though.

            regards, tom lane
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro Herrera wrote:
> So maybe this does not make much sense in normal operation, thus not
> applicable to what Josh Berkus is reporting.
>
> However I was talking to Josh Drake yesterday and he told me that pg_dump
> was spending some significant amount of time in LOCK TABLE when there are
> lots of tables (say 300k).

Less, 128k.

Joshua D. Drake
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 19 Jul 2007, Josh Berkus wrote:
> What seems to be happening is that at some specific number of connections
> average throughput drops 30% and response time quadruples or worse.

Could you characterize what each connection is doing and how you're
generating the load? I don't know how productive speculating about the
cause here will be until there's a test script available so other people
can see where the tipping point is on their system.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
[PERFORM] Improving select performance
Hi,

One of our end users was complaining about a report that was taking too
much time to execute, and I've discovered that the following SQL statement
was responsible for it. I would appreciate any suggestions to improve its
performance.

Thank you very much in advance!
_________________________________________________________________

explain analyze
select (VEN.DOCUME)::varchar(13) as COLUNA0,
       (VENCODPGT.APEPGT)::varchar(9) as COLUNA1,
       (COALESCE(COALESCE(VEN.VLRLIQ,0) * (CASE VEN.VLRNOT WHEN 0 THEN 0
            ELSE IVE.VLRMOV / VEN.VLRNOT END),0)) as COLUNA2,
       (COALESCE(IVE.QTDMOV,0)) as COLUNA3,
       (VIPR.NOMPRO)::varchar(83) as COLUNA4,
       (VIPR.REFPRO)::varchar(20) as COLUNA5
from TV_VEN VEN
     inner join TT_IVE IVE ON IVE.SEQUEN = VEN.SEQUEN
                          and IVE.CODFIL = VEN.CODFIL
     inner join TV_IPR VIPR ON VIPR.FILMAT = IVE.FILMAT
                           and VIPR.CODMAT = IVE.CODMAT
                           and VIPR.CODCOR = IVE.CODCOR
                           and VIPR.CODTAM = IVE.CODTAM
     left join TT_PLA VENCODPGT ON VEN.FILPGT = VENCODPGT.FILPGT
                               AND VEN.CODPGT = VENCODPGT.CODPGT
where ('001' = VEN.CODFIL)
  and VEN.DATHOR between '07/12/2007 00:00:00' and '07/12/2007 23:59:59'
  and (VEN.CODNAT = '-3')
  and IVE.SITMOV <> 'C'
  and ('1' = VIPR.DEPART);
-----------------------------------------------------------------------------
Nested Loop Left Join  (cost=995.52..75661.01 rows=1 width=195) (actual time=4488.166..1747121.374 rows=256 loops=1)
  ->  Nested Loop  (cost=995.52..75660.62 rows=1 width=199) (actual time=4481.323..1747105.903 rows=256 loops=1)
        Join Filter: ((gra.filmat = ive.filmat) AND (gra.codmat = ive.codmat) AND (gra.codcor = ive.codcor) AND (gra.codtam = ive.codtam))
        ->  Nested Loop  (cost=1.11..3906.12 rows=1 width=151) (actual time=15.626..128.934 rows=414 loops=1)
              Join Filter: (div.coddiv = ddiv.codtab)
              ->  Nested Loop  (cost=1.11..3905.05 rows=1 width=160) (actual time=15.611..121.455 rows=414 loops=1)
                    Join Filter: (sub.codsub = dsub.codtab)
                    ->  Nested Loop  (cost=1.11..3903.99 rows=1 width=169) (actual time=15.593..113.866 rows=414 loops=1)
                          Join Filter: ((gra.codcor)::text = ((div.codite)::text || ''::text))
                          ->  Hash Join  (cost=1.11..3888.04 rows=11 width=146) (actual time=15.560..85.376 rows=414 loops=1)
                                Hash Cond: ((gra.codtam)::text = ((sub.codite)::text || ''::text))
                                ->  Nested Loop  (cost=0.00..3883.64 rows=423 width=123) (actual time=15.376..81.482 rows=414 loops=1)
                                      ->  Index Scan using i_fk_pro_ddep on tt_pro pro  (cost=0.00..149.65 rows=516 width=77) (actual time=15.244..30.586 rows=414 loops=1)
                                            Index Cond: (1::numeric = depart)
                                      ->  Index Scan using pk_gra on tt_gra gra  (cost=0.00..7.22 rows=1 width=46) (actual time=0.104..0.110 rows=1 loops=414)
                                            Index Cond: ((pro.filmat = gra.filmat) AND (pro.codmat = gra.codmat))
                                ->  Hash  (cost=1.05..1.05 rows=5 width=32) (actual time=0.048..0.048 rows=5 loops=1)
                                      ->  Seq Scan on tt_sub sub  (cost=0.00..1.05 rows=5 width=32) (actual time=0.016..0.024 rows=5 loops=1)
                          ->  Seq Scan on tt_div div  (cost=0.00..1.15 rows=15 width=32) (actual time=0.004..0.022 rows=15 loops=414)
                    ->  Seq Scan on td_sub dsub  (cost=0.00..1.03 rows=3 width=9) (actual time=0.003..0.007 rows=3 loops=414)
              ->  Seq Scan on td_div ddiv  (cost=0.00..1.03 rows=3 width=9) (actual time=0.002..0.007 rows=3 loops=414)
        ->  Hash Join  (cost=994.41..71746.74 rows=388 width=114) (actual time=5.298..4218.486 rows=857 loops=414)
              Hash Cond: (ive.sequen = ven.sequen)
              ->  Nested Loop  (cost=0.00..68318.52 rows=647982 width=85) (actual time=0.026..3406.170 rows=643739 loops=414)
                    ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1 width=9) (actual time=0.004..0.014 rows=1 loops=414)
                          Filter: (-3::numeric = codtab)
                    ->  Seq Scan on tt_ive ive  (cost=0.00..61837.46 rows=647982 width=76) (actual time=0.017..1926.983 rows=643739 loops=414)
                          Filter: ((sitmov <> 'C'::bpchar) AND ('001'::bpchar = codfil))
              ->  Hash  (cost=992.08..992.08 rows=186 width=89) (actual time=33.234..33.234 rows=394 loops=1)
                    ->  Hash Left Join  (cost=3.48..992.08 rows=186 width=89) (a
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom, all:

> Also, have you tried watching vmstat or the local equivalent to confirm
> that the machine's not starting to swap?

We're not swapping.

> Josh, what sort of workload is being tested here --- read-mostly,
> write-mostly, a mixture?

It's a TPCC-like workload, so heavy single-row updates, and the
updates/inserts are what's being measured. For that matter, when I've seen
this before it was with heavy-write workloads and we were measuring the
number of updates/inserts and not the number of reads.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
[PERFORM] Problems with posting
Hi,

I'm trying to post the following message to the performance group but the
message does not appear in the list. Can someone help to solve this issue?

Thanks in advance!
___________________________________________________________

Hi,

One of our end users was complaining about a report that was taking too
much time to execute, and I've discovered that the following SQL statement
was responsible for it. I would appreciate any suggestions to improve its
performance.

Thank you very much in advance!

[same query and EXPLAIN ANALYZE output as in "Improving select performance"
above]
Re: [PERFORM] Improving select performance
"Carlos H. Reimer" <[EMAIL PROTECTED]> writes: > One of our end users was complaining about a report that was taking too much > time to execute and I´ve discovered that the following SQL statement was the > responsible for it. Here's part of the problem: >Join Filter: ((gra.codcor)::text = > ((div.codite)::text || ''::text)) >-> Hash Join (cost=1.11..3888.04 rows=11 > width=146) (actual time=15.560..85.376 rows=414 loops=1) > Hash Cond: ((gra.codtam)::text = > ((sub.codite)::text || ''::text)) Why such bizarre join conditions? Why don't you lose the useless concatenations of empty strings and have just a plain equality comparison? This technique completely destroys any chance of the planner making good estimates of the join result sizes (and the bad estimates it's coming out with are part of the problem). >-> Nested Loop (cost=0.00..68318.52 rows=647982 width=85) > (actual time=0.026..3406.170 rows=643739 loops=414) > -> Seq Scan on td_nat nat (cost=0.00..1.24 rows=1 > width=9) (actual time=0.004..0.014 rows=1 loops=414) >Filter: (-3::numeric = codtab) > -> Seq Scan on tt_ive ive (cost=0.00..61837.46 > rows=647982 width=76) (actual time=0.017..1926.983 rows=643739 loops=414) >Filter: ((sitmov <> 'C'::bpchar) AND > ('001'::bpchar = codfil)) The other big problem seems to be that it's choosing to do this unconstrained join first. I'm not sure about the cause of that, but maybe you need to increase join_collapse_limit. What PG version is this anyway? A more general comment, if you are open to schema changes, is that you should change all the "numeric(n,0)" fields to integer (or possibly smallint or bigint as needed). Particularly the ones that are used as join keys, primary keys, foreign keys. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus <[EMAIL PROTECTED]> writes:
>> Josh, what sort of workload is being tested here --- read-mostly,
>> write-mostly, a mixture?

> It's a TPCC-like workload, so heavy single-row updates, and the
> updates/inserts are what's being measured. For that matter, when I've
> seen this before it was with heavy-write workloads and we were measuring
> the number of updates/inserts and not the number of reads.

Well, if the load is a lot of short writing transactions then you'd expect
the throughput to depend on how fast stuff can be pushed down to WAL. What
have you got wal_buffers set to? Are you using a commit delay? What's the
I/O system anyway (any BB write cache on the WAL disk?) and what wal sync
method are you using?

While I'm asking questions: exactly what were the data columns you
presented? Txn/User doesn't make much sense to me, and I'm not sure what
"response time" you were measuring either.

            regards, tom lane
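[The settings Tom is asking about can be collected in one query; this works
as-is on 8.2:

    -- Report the WAL/commit settings relevant to short write transactions.
    SELECT name, setting
    FROM pg_settings
    WHERE name IN ('wal_buffers', 'wal_sync_method', 'commit_delay',
                   'commit_siblings', 'fsync');
]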
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom,

> Well, if the load is a lot of short writing transactions then you'd
> expect the throughput to depend on how fast stuff can be pushed down to
> WAL. What have you got wal_buffers set to? Are you using a commit
> delay? What's the I/O system anyway (any BB write cache on the WAL
> disk?) and what wal sync method are you using?

You know, I think Jignesh needs to be on this list so I can stop relaying
questions on a workload I didn't design. Let me get him.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
[PERFORM] 8.2 -> 8.3 performance numbers
Sorry for the cross-post, but this is performance and advocacy related...

Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use in my
OSCon lightning talk. Numbers for both with and without HOT would be even
better (I know we've got HOT-specific benchmarks, but I want complete
8.2 -> 8.3 numbers).

--
Jim Nasby                  [EMAIL PROTECTED]
EnterpriseDB               http://enterprisedb.com
512.569.9461 (cell)