Re: [PERFORM] improvement suggestions for performance design

2007-07-19 Thread Kalle Hallivuori

Hi all!

2007/7/18, Thomas Finneid <[EMAIL PROTECTED]>:

> Hi, I have tested your COPY patch (actually I tested
> postgresql-jdbc-8.2-505-copy-20070716.jdbc3.jar) and it is really fast,
> actually just as fast as server-side COPY (both tests were performed on a
> local machine).


Happy to hear there's interest in this solution.


> This means I am interested in using it in my project, but I have some
> concerns that need to be addressed (and I am prepared to help in any
> way I can). The following are my concerns:
>
> - While testing I got some errors, which need to be fixed (detailed below)
> - The patch must be of production-grade quality
> - I would like the patch to become part of the official pg JDBC driver.


Definitely agreed; those are my requirements as well. We can discuss
bug fixing among ourselves; new versions I'll announce on the pgsql-jdbc
list.

--
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [GENERAL] [PERFORM] Parallel query execution for UNION ALL Queries

2007-07-19 Thread Dimitri Fontaine
Hi,

Le mercredi 18 juillet 2007, Jonah H. Harris a écrit :
> On 7/18/07, Benjamin Arai <[EMAIL PROTECTED]> wrote:
> > But I want to parallelize searches if possible to reduce
> > the performance loss of having multiple tables.
>
> PostgreSQL does not support parallel query.  Parallel query on top of
> PostgreSQL is provided by ExtenDB and PGPool-II.

Seems to me that:
 - GreenPlum provides a commercial parallel query engine on top of
   PostgreSQL,

 - plproxy could be a solution to the given problem (see the sketch
   after this list):
   https://developer.skype.com/SkypeGarage/DbProjects/PlProxy
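
As an illustration of the plproxy approach, here is a minimal sketch of a
function that fans a call out to every partition in parallel. The cluster
name and function signature are hypothetical, and a matching function must
exist on each node:

CREATE FUNCTION search_all(q text)
RETURNS SETOF text AS $$
    CLUSTER 'partcluster';  -- hypothetical cluster, configured separately
    RUN ON ALL;             -- execute on all partitions concurrently
$$ LANGUAGE plproxy;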

Hope this helps,
-- 
dim




Re: [GENERAL] [PERFORM] Parallel query execution for UNION ALL Queries

2007-07-19 Thread Luke Lonergan
Dimitri,

> Seems to me that:
>  - GreenPlum provides a commercial parallel query engine on top of
>    PostgreSQL,

I certainly think so, and so do our customers in production with 100s of
terabytes :-)
  
>  - plproxy could be a solution to the given problem.
>    https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

This is solving real-world problems at Skype of a different kind than
Greenplum's; well worth checking out.

- Luke


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Folks,

I've run into this a number of times with various PostgreSQL users, so we 
tested it at Sun.  What seems to be happening is that at some specific number 
of connections average throughput drops 30% and response time quadruples or 
worse.  The threshold seems to vary per machine; I've seen it variously at 95, 
1050, 1700 or 2800 connections.  Tinkering with postgresql.conf parameters 
doesn't seem to affect this threshold.

As an example of this behavior:

Users   Txn/User  Resp. Time
50  105.38  0.01
100 113.05  0.01
150 114.05  0.01
200 113.51  0.01
250 113.38  0.01
300 112.14  0.01
350 112.26  0.01
400 111.43  0.01
450 110.72  0.01
500 110.44  0.01
550 109.36  0.01
600 107.01  0.02
650 105.71  0.02
700 106.95  0.02
750 107.69  0.02
800 106.78  0.02
850 108.59  0.02
900 106.03  0.02
950 106.13  0.02
1000    64.58   0.15
1050    52.32   0.23
1100    49.79   0.25

Tinkering with shared_buffers has had no effect on this thresholding (the above 
was with 3 GB to 6 GB of shared_buffers).  Any ideas on where we should look 
for the source of the bottleneck?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Joshua D. Drake

Josh Berkus wrote:

> Folks,
>
> [...]
>
> 650     105.71  0.02
> 700     106.95  0.02
> 750     107.69  0.02
> 800     106.78  0.02
> 850     108.59  0.02
> 900     106.03  0.02
> 950     106.13  0.02
> 1000    64.58   0.15
> 1050    52.32   0.23
> 1100    49.79   0.25
>
> Tinkering with shared_buffers has had no effect on this thresholding (the
> above was with 3 GB to 6 GB of shared_buffers).  Any ideas on where we
> should look for the source of the bottleneck?


I have seen this as well. I always chalked it up to PG having to 
manage so many connections, but there is some interesting evidence to 
review.


The amount of memory "each" connection takes up: consider 4-11 MB per 
connection, depending on various things like the number of prepared queries.


Number of CPUs. Obviously 500 connections over 4 CPUs isn't the same as 
500 connections over 8 CPUs.


That number of connections generally means a higher transaction velocity, 
and a higher velocity means more checkpoint segments. Wrong settings for 
checkpoint_segments, the bgwriter and checkpoints will cause you to start 
falling down.
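
For reference, the checkpoint- and bgwriter-related settings mentioned
above can be inspected with a query along these lines (a sketch using
8.2-era parameter names):

SELECT name, setting
FROM pg_settings
WHERE name IN ('checkpoint_segments', 'checkpoint_timeout',
               'bgwriter_delay', 'bgwriter_lru_maxpages',
               'bgwriter_all_maxpages');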


I would also note that in our experience PG falls down a little 
higher, more toward 2500 connections the last time I checked, but that 
was likely on different hardware.


Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Alvaro Herrera
Josh Berkus wrote:
> Folks,
> 
> I've run into this a number of times with various PostgreSQL users, so we 
> tested it at Sun.  What seems to be happening is that at some specific number 
> of connections average throughput drops 30% and response time quadruples or 
> worse.  The threshold seems to vary per machine; I've seen it variously at 95, 
> 1050, 1700 or 2800 connections.  Tinkering with postgresql.conf parameters 
> doesn't seem to affect this threshold.
> 
> As an example of this behavior:
> 
> Users Txn/User  Resp. Time
> 50    105.38  0.01
> 100   113.05  0.01
> 150   114.05  0.01
> 200   113.51  0.01
> 250   113.38  0.01
> 300   112.14  0.01
> 350   112.26  0.01
> 400   111.43  0.01
> 450   110.72  0.01
> 500   110.44  0.01
> 550   109.36  0.01
> 600   107.01  0.02
> 650   105.71  0.02
> 700   106.95  0.02
> 750   107.69  0.02
> 800   106.78  0.02
> 850   108.59  0.02
> 900   106.03  0.02
> 950   106.13  0.02
> 1000  64.58   0.15
> 1050  52.32   0.23
> 1100  49.79   0.25
> 
> Tinkering with shared_buffers has had no effect on this thresholding (the 
> above 
> was with 3 GB to 6 GB of shared_buffers).   Any ideas on where we should look 
> for the source of the bottleneck?

Have you messed with max_connections and/or max_locks_per_transaction
while testing this?  The lock table is sized to max_locks_per_xact times
max_connections, and shared memory hash tables get slower when they are
full.  Of course, the saturation point would depend on the avg number of
locks acquired per user, which would explain why you are seeing a lower
number for some users and higher for others (simpler/more complex
queries).
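
To put numbers on that sizing rule, something along these lines shows the
approximate capacity of the lock table (a sketch; add
max_prepared_transactions to the multiplier if it is nonzero):

SELECT (SELECT setting::int FROM pg_settings
        WHERE name = 'max_locks_per_transaction')
     * (SELECT setting::int FROM pg_settings
        WHERE name = 'max_connections') AS approx_lock_table_slots;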

This is just a guess though.  No profiling or measuring at all, really.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"How amazing is that? I call it a night and come back to find that a bug has
been identified and patched while I sleep."(Robert Davidson)
   http://archives.postgresql.org/pgsql-sql/2006-03/msg00378.php

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Alvaro,

> Have you messed with max_connections and/or max_locks_per_transaction
> while testing this?  The lock table is sized to max_locks_per_xact times
> max_connections, and shared memory hash tables get slower when they are
> full.  Of course, the saturation point would depend on the avg number of
> locks acquired per user, which would explain why you are seeing a lower
> number for some users and higher for others (simpler/more complex
> queries).

That's an interesting thought.  Let me check lock counts and see if this is 
possibly the case.
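
Something like this should show how full the lock table is getting while
the test runs (a sketch):

SELECT mode, count(*) AS locks
FROM pg_locks
GROUP BY mode
ORDER BY locks DESC;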

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] Is it possible to know where the "deadlock" is

2007-07-19 Thread Arnau

Hi all,

  I have a serious problem with a server. This server holds several 
DBs; the problem is that the CPUs spend most of the time waiting:


Cpu0: 4.0% us, 2.3% sy, 0.0% ni, 61.5% id, 32.1% wa, 0.0% hi, 0.0% si
Cpu1: 2.3% us, 0.3% sy, 0.0% ni, 84.1% id, 13.3% wa, 0.0% hi, 0.0% si
Cpu2: 1.3% us, 0.3% sy, 0.0% ni, 68.6% id, 29.8% wa, 0.0% hi, 0.0% si
Cpu3: 4.6% us, 3.3% sy, 0.0% ni,  2.6% id, 88.4% wa, 0.3% hi, 0.7% si

iostat -c says about 8% of the time is spent waiting for IO. I'm afraid 
this is due to locks between concurrent queries; is there any way to get 
more info about this?


Thanks all
--
Arnau

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Is it possible to know where the "deadlock" is

2007-07-19 Thread Bill Moran
In response to Arnau <[EMAIL PROTECTED]>:

> Hi all,
> 
>    I have a serious problem with a server. This server holds several 
> DBs; the problem is that the CPUs spend most of the time waiting:
> 
> Cpu0: 4.0% us, 2.3% sy, 0.0% ni, 61.5% id, 32.1% wa, 0.0% hi, 0.0% si
> Cpu1: 2.3% us, 0.3% sy, 0.0% ni, 84.1% id, 13.3% wa, 0.0% hi, 0.0% si
> Cpu2: 1.3% us, 0.3% sy, 0.0% ni, 68.6% id, 29.8% wa, 0.0% hi, 0.0% si
> Cpu3: 4.6% us, 3.3% sy, 0.0% ni,  2.6% id, 88.4% wa, 0.3% hi, 0.7% si
> 
>    iostat -c says about 8% of the time is spent waiting for IO. I'm afraid 
> this is due to locks between concurrent queries; is there any way to get 
> more info about this?

This looks perfectly normal for a medium-load server.

Although you don't state your problem (you state what you think is a
symptom and call it the problem), I'm guessing you have queries that
are executing slower than you would like?  If that's the case, I would
suggest investigating the slow queries directly.  Check for indexes and
ensure your vacuum/analyze schedule is acceptable.  If you get
stumped, post details of the queries here asking for help.

Another thing that (I'm guessing) may be confusing you: if this 
system has multiple CPUs, each query can only execute on a single
CPU.  So a single query at full throttle on an 8-way system will
only use 12.5% max.

If you have reason to believe that locks are an issue, the pg_locks
view can help you prove/disprove that theory:
http://www.postgresql.org/docs/8.2/interactive/view-pg-locks.html
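
For instance, a quick sketch that lists sessions stuck waiting on
ungranted locks (column names as in the 8.2 view):

SELECT pid, locktype, relation::regclass AS relation, mode
FROM pg_locks
WHERE NOT granted;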

If none of those are the case, then please describe the actual problem
you are having.

HTH.

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Is it possible to know where the "deadlock" is

2007-07-19 Thread Claus Guttesen

> iostat -c says about 8% of the time is spent waiting for IO. I'm afraid
> this is due to locks between concurrent queries; is there any way to get
> more info about this?


I believe that if you tell us what OS you're running, what pg version
you're running and what type of SQL statements you perform, the list
can provide some help.

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> Alvaro,
>> Have you messed with max_connections and/or max_locks_per_transaction
>> while testing this?  The lock table is sized to max_locks_per_xact times
>> max_connections, and shared memory hash tables get slower when they are
>> full.  Of course, the saturation point would depend on the avg number of
>> locks acquired per user, which would explain why you are seeing a lower
>> number for some users and higher for others (simpler/more complex
>> queries).

> That's an interesting thought.  Let me check lock counts and see if this is 
> possibly the case.

AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
space entirely; and the fact that you can continue the curve upwards
says that you're not on the edge of running out.  However I agree that
it's worth experimenting with those two parameters to see if the curve
moves around at all.

Another resource that might be interesting is the number of open files.

Also, have you tried watching vmstat or local equivalent to confirm that
the machine's not starting to swap?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] Trying to tune postgres, how is this config?

2007-07-19 Thread Pat Maddox

I'd like any advice you have on my postgres.conf.  The machine in
question is a 2.4 GHz Xeon with 2 GB of RAM running FreeBSD 6.2 and
PostgreSQL 8.2.4.  There are 16 concurrent users.  This machine is used
only for the database.  Usage is split pretty evenly between reads
and writes.

Thanks,
Pat




# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.)  White space may be used.  Comments are introduced
# with '#' anywhere on a line.  The complete list of option names and
# allowed values can be found in the PostgreSQL documentation.  The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the server.
#
# Any option can also be given as a command line switch to the server,
# e.g., 'postgres -c log_connections=on'.  Some options can be changed at
# run-time with the 'SET' SQL command.
#
# This file is read on server startup and when the server receives a
# SIGHUP.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload". Some
# settings, which are marked below, require a server shutdown and restart
# to take effect.
#
# Memory units:  kB = kilobytes MB = megabytes GB = gigabytes
# Time units:    ms = milliseconds s = seconds min = minutes h = hours d = days


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'   # use data in another directory
  # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
  # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
  # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'   # write an extra PID file
  # (change requires restart)


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = 'localhost'   # what IP address(es) to listen on;
  # comma-separated list of addresses;
  # defaults to 'localhost', '*' = all
  # (change requires restart)
#port = 5432                            # (change requires restart)
max_connections = 20                    # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directory = '' # (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # octal
  # (change requires restart)
#bonjour_name = ''  # defaults to the computer name
  # (change requires restart)

# - Security & Authentication -

#authentication_timeout = 1min  # 1s-600s
#ssl = off  # (change requires restart)
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''# (change requires restart)
#krb_srvname = 'postgres'   # (change requires restart)
#krb_server_hostname = ''   # empty string matches any keytab entry
  # (change requires restart)
#krb_caseins_users = off# (change requires restart)

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0# TCP_KEEPIDLE, in seconds;
  # 0 selects the system default
#tcp_keepalives_interval = 0# TCP_KEEPINTVL, in seconds;
  # 0 selects the system default
#tcp_keepalives_count = 0   # TCP_KEEPCNT;
  # 0 selects the system default


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memo

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Alvaro Herrera
Tom Lane wrote:
> Josh Berkus <[EMAIL PROTECTED]> writes:
> > Alvaro,
> >> Have you messed with max_connections and/or max_locks_per_transaction
> >> while testing this?  The lock table is sized to max_locks_per_xact times
> >> max_connections, and shared memory hash tables get slower when they are
> >> full.  Of course, the saturation point would depend on the avg number of
> >> locks acquired per user, which would explain why you are seeing a lower
> >> number for some users and higher for others (simpler/more complex
> >> queries).
> 
> > That's an interesting thought.  Let me check lock counts and see if this is 
> > possibly the case.
> 
> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
> space entirely;

Well, if there still is shared memory available, the lock hash can
continue to grow, but it would slow down according to this comment in
ShmemInitHash:

 * max_size is the estimated maximum number of hashtable entries.  This is
 * not a hard limit, but the access efficiency will degrade if it is
 * exceeded substantially (since it's used to compute directory size and
 * the hash table buckets will get overfull).

For the lock hash tables this max_size is
(MaxBackends+max_prepared_xacts) * max_locks_per_xact.

So maybe this does not come up much in normal operation, and is thus not
applicable to what Josh Berkus is reporting.

However, I was talking to Josh Drake yesterday and he told me that
pg_dump was spending a significant amount of time in LOCK TABLE when
there are lots of tables (say 300k).

-- 
Alvaro Herrera                            http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
>> space entirely;

> Well, if there still is shared memory available, the lock hash can
> continue to grow, but it would slow down according to this comment in
> ShmemInitHash:

Right, but there's not an enormous amount of headroom in shared memory
beyond the intended size of the hash tables.  I'd think that you'd start
seeing hard failures not very far beyond the point at which performance
impacts became visible.  Of course this is all speculation; I quite
agree with varying the table-size parameters to see if it makes a
difference.

Josh, what sort of workload is being tested here --- read-mostly,
write-mostly, a mixture?

> However I was talking to Josh Drake yesterday and he told me that
> pg_dump was spending some significant amount of time in LOCK TABLE when
> there are lots of tables (say 300k).

I wouldn't be too surprised if there's some O(N^2) effects when a single
transaction holds that many locks, because of the linked-list proclock
data structures.  This would not be relevant to Josh's case though.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Joshua D. Drake

Alvaro Herrera wrote:

> So maybe this does not come up much in normal operation, and is thus not
> applicable to what Josh Berkus is reporting.
>
> However, I was talking to Josh Drake yesterday and he told me that
> pg_dump was spending a significant amount of time in LOCK TABLE when
> there are lots of tables (say 300k).

Less, actually: 128k.

Joshua D. Drake

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Greg Smith

On Thu, 19 Jul 2007, Josh Berkus wrote:

> What seems to be happening is that at some specific number of
> connections average throughput drops 30% and response time quadruples
> or worse.


Could you characterize what each connection is doing and how you're 
generating the load?  I don't know how productive speculating about the 
cause here will be until there's a test script available so other people 
can see where the tipping point is on their system.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Improving select performance

2007-07-19 Thread Carlos H. Reimer
Hi,

One of our end users was complaining about a report that was taking too much
time to execute, and I've discovered that the following SQL statement was
responsible for it.

I would appreciate any suggestions to improve its performance.

Thank you very much in advance!


_

explain analyze select (VEN.DOCUME)::varchar(13) as COLUNA0,
   (VENCODPGT.APEPGT)::varchar(9) as COLUNA1,
   (COALESCE(COALESCE(VEN.VLRLIQ,0) * (CASE  VEN.VLRNOT  WHEN 0
THEN  0 ELSE  IVE.VLRMOV / VEN.VLRNOT  END),0)) as COLUNA2,
   (COALESCE(IVE.QTDMOV,0)) as COLUNA3,
   (VIPR.NOMPRO)::varchar(83) as COLUNA4,
   (VIPR.REFPRO)::varchar(20) as COLUNA5
from TV_VEN VEN
  inner join TT_IVE IVE ON IVE.SEQUEN = VEN.SEQUEN and
IVE.CODFIL = VEN.CODFIL
  inner join TV_IPR VIPR ON VIPR.FILMAT = IVE.FILMAT and
VIPR.CODMAT = IVE.CODMAT and
VIPR.CODCOR = IVE.CODCOR and
VIPR.CODTAM = IVE.CODTAM

 left join TT_PLA VENCODPGT ON VEN.FILPGT = VENCODPGT.FILPGT AND
VEN.CODPGT = VENCODPGT.CODPGT
where ('001' = VEN.CODFIL)
and VEN.DATHOR between '07/12/2007 00:00:00' and '07/12/2007
23:59:59'
and (VEN.CODNAT = '-3')
and IVE.SITMOV <> 'C'
and ('1' = VIPR.DEPART) ;



---
 Nested Loop Left Join  (cost=995.52..75661.01 rows=1 width=195) (actual
time=4488.166..1747121.374 rows=256 loops=1)
   ->  Nested Loop  (cost=995.52..75660.62 rows=1 width=199) (actual
time=4481.323..1747105.903 rows=256 loops=1)
 Join Filter: ((gra.filmat = ive.filmat) AND (gra.codmat =
ive.codmat) AND (gra.codcor = ive.codcor) AND (gra.codtam = ive.codtam))
 ->  Nested Loop  (cost=1.11..3906.12 rows=1 width=151) (actual
time=15.626..128.934 rows=414 loops=1)
   Join Filter: (div.coddiv = ddiv.codtab)
   ->  Nested Loop  (cost=1.11..3905.05 rows=1 width=160)
(actual time=15.611..121.455 rows=414 loops=1)
 Join Filter: (sub.codsub = dsub.codtab)
 ->  Nested Loop  (cost=1.11..3903.99 rows=1 width=169)
(actual time=15.593..113.866 rows=414 loops=1)
   Join Filter: ((gra.codcor)::text =
((div.codite)::text || ''::text))
   ->  Hash Join  (cost=1.11..3888.04 rows=11
width=146) (actual time=15.560..85.376 rows=414 loops=1)
 Hash Cond: ((gra.codtam)::text =
((sub.codite)::text || ''::text))
 ->  Nested Loop  (cost=0.00..3883.64
rows=423 width=123) (actual time=15.376..81.482 rows=414 loops=1)
   ->  Index Scan using i_fk_pro_ddep on
tt_pro pro  (cost=0.00..149.65 rows=516 width=77) (actual
time=15.244..30.586 rows=414 loops=1)
 Index Cond: (1::numeric =
depart)
   ->  Index Scan using pk_gra on tt_gra
gra  (cost=0.00..7.22 rows=1 width=46) (actual time=0.104..0.110 rows=1
loops=414)
 Index Cond: ((pro.filmat =
gra.filmat) AND (pro.codmat = gra.codmat))
 ->  Hash  (cost=1.05..1.05 rows=5 width=32)
(actual time=0.048..0.048 rows=5 loops=1)
   ->  Seq Scan on tt_sub sub
(cost=0.00..1.05 rows=5 width=32) (actual time=0.016..0.024 rows=5 loops=1)
   ->  Seq Scan on tt_div div  (cost=0.00..1.15
rows=15 width=32) (actual time=0.004..0.022 rows=15 loops=414)
 ->  Seq Scan on td_sub dsub  (cost=0.00..1.03 rows=3
width=9) (actual time=0.003..0.007 rows=3 loops=414)
   ->  Seq Scan on td_div ddiv  (cost=0.00..1.03 rows=3 width=9)
(actual time=0.002..0.007 rows=3 loops=414)
 ->  Hash Join  (cost=994.41..71746.74 rows=388 width=114) (actual
time=5.298..4218.486 rows=857 loops=414)
   Hash Cond: (ive.sequen = ven.sequen)
   ->  Nested Loop  (cost=0.00..68318.52 rows=647982 width=85)
(actual time=0.026..3406.170 rows=643739 loops=414)
 ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1
width=9) (actual time=0.004..0.014 rows=1 loops=414)
   Filter: (-3::numeric = codtab)
 ->  Seq Scan on tt_ive ive  (cost=0.00..61837.46
rows=647982 width=76) (actual time=0.017..1926.983 rows=643739 loops=414)
   Filter: ((sitmov <> 'C'::bpchar) AND
('001'::bpchar = codfil))
   ->  Hash  (cost=992.08..992.08 rows=186 width=89) (actual
time=33.234..33.234 rows=394 loops=1)
 ->  Hash Left Join  (cost=3.48..992.08 rows=186
width=89) (a

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Tom, all:

> Also, have you tried watching vmstat or local equivalent to confirm that
> the machine's not starting to swap?

We're not swapping.

> Josh, what sort of workload is being tested here --- read-mostly,
> write-mostly, a mixture?

It's a TPCC-like workload, so heavy single-row updates, and the 
updates/inserts are what's being measured.  For that matter, when I've seen 
this before it was with heavy-write workloads and we were measuring the 
number of updates/inserts and not the number of reads.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Problems with posting

2007-07-19 Thread Carlos H. Reimer
Hi, I'm trying to post the following message to the performance group but
the message does not appear in the list.

Can someone help to solve this issue?

Thanks in advance!


___

Hi,

One of our end users was complaining about a report that was taking too much
time to execute, and I've discovered that the following SQL statement was
responsible for it.

I would appreciate any suggestions to improve its performance.

Thank you very much in advance!


_

explain analyze select (VEN.DOCUME)::varchar(13) as COLUNA0,
   (VENCODPGT.APEPGT)::varchar(9) as COLUNA1,
   (COALESCE(COALESCE(VEN.VLRLIQ,0) * (CASE  VEN.VLRNOT  WHEN 0
THEN  0 ELSE  IVE.VLRMOV / VEN.VLRNOT  END),0)) as COLUNA2,
   (COALESCE(IVE.QTDMOV,0)) as COLUNA3,
   (VIPR.NOMPRO)::varchar(83) as COLUNA4,
   (VIPR.REFPRO)::varchar(20) as COLUNA5
from TV_VEN VEN
  inner join TT_IVE IVE ON IVE.SEQUEN = VEN.SEQUEN and
IVE.CODFIL = VEN.CODFIL
  inner join TV_IPR VIPR ON VIPR.FILMAT = IVE.FILMAT and
VIPR.CODMAT = IVE.CODMAT and
VIPR.CODCOR = IVE.CODCOR and
VIPR.CODTAM = IVE.CODTAM

 left join TT_PLA VENCODPGT ON VEN.FILPGT = VENCODPGT.FILPGT AND
VEN.CODPGT = VENCODPGT.CODPGT
where ('001' = VEN.CODFIL)
and VEN.DATHOR between '07/12/2007 00:00:00' and '07/12/2007
23:59:59'
and (VEN.CODNAT = '-3')
and IVE.SITMOV <> 'C'
and ('1' = VIPR.DEPART) ;



---
 Nested Loop Left Join  (cost=995.52..75661.01 rows=1 width=195) (actual
time=4488.166..1747121.374 rows=256 loops=1)
   ->  Nested Loop  (cost=995.52..75660.62 rows=1 width=199) (actual
time=4481.323..1747105.903 rows=256 loops=1)
 Join Filter: ((gra.filmat = ive.filmat) AND (gra.codmat =
ive.codmat) AND (gra.codcor = ive.codcor) AND (gra.codtam = ive.codtam))
 ->  Nested Loop  (cost=1.11..3906.12 rows=1 width=151) (actual
time=15.626..128.934 rows=414 loops=1)
   Join Filter: (div.coddiv = ddiv.codtab)
   ->  Nested Loop  (cost=1.11..3905.05 rows=1 width=160)
(actual time=15.611..121.455 rows=414 loops=1)
 Join Filter: (sub.codsub = dsub.codtab)
 ->  Nested Loop  (cost=1.11..3903.99 rows=1 width=169)
(actual time=15.593..113.866 rows=414 loops=1)
   Join Filter: ((gra.codcor)::text =
((div.codite)::text || ''::text))
   ->  Hash Join  (cost=1.11..3888.04 rows=11
width=146) (actual time=15.560..85.376 rows=414 loops=1)
 Hash Cond: ((gra.codtam)::text =
((sub.codite)::text || ''::text))
 ->  Nested Loop  (cost=0.00..3883.64
rows=423 width=123) (actual time=15.376..81.482 rows=414 loops=1)
   ->  Index Scan using i_fk_pro_ddep on
tt_pro pro  (cost=0.00..149.65 rows=516 width=77) (actual
time=15.244..30.586 rows=414 loops=1)
 Index Cond: (1::numeric =
depart)
   ->  Index Scan using pk_gra on tt_gra
gra  (cost=0.00..7.22 rows=1 width=46) (actual time=0.104..0.110 rows=1
loops=414)
 Index Cond: ((pro.filmat =
gra.filmat) AND (pro.codmat = gra.codmat))
 ->  Hash  (cost=1.05..1.05 rows=5 width=32)
(actual time=0.048..0.048 rows=5 loops=1)
   ->  Seq Scan on tt_sub sub
(cost=0.00..1.05 rows=5 width=32) (actual time=0.016..0.024 rows=5 loops=1)
   ->  Seq Scan on tt_div div  (cost=0.00..1.15
rows=15 width=32) (actual time=0.004..0.022 rows=15 loops=414)
 ->  Seq Scan on td_sub dsub  (cost=0.00..1.03 rows=3
width=9) (actual time=0.003..0.007 rows=3 loops=414)
   ->  Seq Scan on td_div ddiv  (cost=0.00..1.03 rows=3 width=9)
(actual time=0.002..0.007 rows=3 loops=414)
 ->  Hash Join  (cost=994.41..71746.74 rows=388 width=114) (actual
time=5.298..4218.486 rows=857 loops=414)
   Hash Cond: (ive.sequen = ven.sequen)
   ->  Nested Loop  (cost=0.00..68318.52 rows=647982 width=85)
(actual time=0.026..3406.170 rows=643739 loops=414)
 ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1
width=9) (actual time=0.004..0.014 rows=1 loops=414)
   Filter: (-3::numeric = codtab)
 ->  Seq Scan on tt_ive ive  (cost=0.00..61837.46
rows=647982 width=76) (actual time=0.017..1926.983

Re: [PERFORM] Improving select performance

2007-07-19 Thread Tom Lane
"Carlos H. Reimer" <[EMAIL PROTECTED]> writes:
> One of our end users was complaining about a report that was taking too much
> time to execute, and I've discovered that the following SQL statement was
> responsible for it.

Here's part of the problem:

>Join Filter: ((gra.codcor)::text =
> ((div.codite)::text || ''::text))
>->  Hash Join  (cost=1.11..3888.04 rows=11
> width=146) (actual time=15.560..85.376 rows=414 loops=1)
>  Hash Cond: ((gra.codtam)::text =
> ((sub.codite)::text || ''::text))

Why such bizarre join conditions?  Why don't you lose the useless
concatenations of empty strings and have just a plain equality
comparison?  This technique completely destroys any chance of the
planner making good estimates of the join result sizes (and the bad
estimates it's coming out with are part of the problem).
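
For example, the first condition quoted above rewritten as plain equality
the planner can estimate (table and column names taken from the plan;
illustrative only):

SELECT count(*)
FROM tt_gra gra
JOIN tt_div div
  -- was: (gra.codcor)::text = ((div.codite)::text || ''::text)
  ON gra.codcor = div.codite;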

>->  Nested Loop  (cost=0.00..68318.52 rows=647982 width=85)
> (actual time=0.026..3406.170 rows=643739 loops=414)
>  ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1
> width=9) (actual time=0.004..0.014 rows=1 loops=414)
>Filter: (-3::numeric = codtab)
>  ->  Seq Scan on tt_ive ive  (cost=0.00..61837.46
> rows=647982 width=76) (actual time=0.017..1926.983 rows=643739 loops=414)
>Filter: ((sitmov <> 'C'::bpchar) AND
> ('001'::bpchar = codfil))

The other big problem seems to be that it's choosing to do this
unconstrained join first.  I'm not sure about the cause of that,
but maybe you need to increase join_collapse_limit.  What PG version
is this anyway?

A more general comment, if you are open to schema changes, is that you
should change all the "numeric(n,0)" fields to integer (or possibly
smallint or bigint as needed).  Particularly the ones that are used as
join keys, primary keys, foreign keys.
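
A sketch of that change, using two columns that appear as numeric in the
plan above (illustrative only; dependent views, indexes and foreign keys
would need to be handled as well):

ALTER TABLE td_nat ALTER COLUMN codtab TYPE integer;
ALTER TABLE tt_pro ALTER COLUMN depart TYPE integer;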

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
>> Josh, what sort of workload is being tested here --- read-mostly,
>> write-mostly, a mixture?

> It's a TPCC-like workload, so heavy single-row updates, and the 
> updates/inserts are what's being measured.  For that matter, when I've seen 
> this before it was with heavy-write workloads and we were measuring the 
> number of updates/inserts and not the number of reads.

Well, if the load is a lot of short writing transactions then you'd
expect the throughput to depend on how fast stuff can be pushed down to
WAL.  What have you got wal_buffers set to?  Are you using a commit
delay?  What's the I/O system anyway (any BB write cache on the WAL
disk?) and what wal sync method are you using?
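
Those settings can all be read back in one go with something like this
(a sketch):

SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_buffers', 'wal_sync_method',
               'commit_delay', 'commit_siblings', 'fsync');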

While I'm asking questions, exactly what were the data columns you
presented?  Txn/User doesn't make much sense to me, and I'm not sure
what "response time" you were measuring either.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Tom,

> Well, if the load is a lot of short writing transactions then you'd
> expect the throughput to depend on how fast stuff can be pushed down to
> WAL.  What have you got wal_buffers set to?  Are you using a commit
> delay?  What's the I/O system anyway (any BB write cache on the WAL
> disk?) and what wal sync method are you using?

You know, I think Jignesh needs to be on this list so I can stop relaying 
questions about a workload I didn't design.  Let me go get him.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] 8.2 -> 8.3 performance numbers

2007-07-19 Thread Jim Nasby
Sorry for the cross-post, but this is performance- and advocacy-related...


Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use
in my OSCon lightning talk. Numbers both with and without HOT
would be even better (I know we've got HOT-specific benchmarks, but I
want complete 8.2 -> 8.3 numbers).

--
Jim Nasby                               [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings