Re: [PERFORM] Why the planner is not using the INDEX .

2005-07-05 Thread Bruno Wolff III
On Mon, Jul 04, 2005 at 20:29:50 -0400,
  David Gagnon <[EMAIL PROTECTED]> wrote:
> Thanks .. I miss that FK don't create indexed ...  since Primary key 
> implicitly does ...
> 
> I'm a bit surprised of that behavior thought, since it means that if we 
> delete a row from table A all tables (B,C,D) with FK pointing to this 
> table (A) must be scanned. 

But in some applications you don't ever do that, so you don't save
anything by having the index for deletes but have to pay the cost to
update it when modifying the referencing table.

If you think an index will help in your case, just create one.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Why the planner is not using the INDEX .

2005-07-05 Thread Stephan Szabo
On Mon, 4 Jul 2005, David Gagnon wrote:

> Thanks .. I miss that FK don't create indexed ...  since Primary key
> implicitly does ...
>
> I'm a bit surprised of that behavior thought, since it means that if we
> delete a row from table A all tables (B,C,D) with FK pointing to this
> table (A) must be scanned.
> If there is no index on those tables it means we gone do all Sequantial
> scans. Than can cause significant performance problem!!!.
>
> Is there a reason why implicit index aren't created when FK are
> declared.  I looked into the documentation and I haven't found a way to

The reason is that it's not always useful to have an index for that
purpose.  You could either have low selectivity (in which case the index
wouldn't be used) or low/batch changes to the referenced table (in which
case the cost of maintaining the index may be greater than the value of
having the index) or other such cases.  In primary key and unique, we
currently have no choice but to make an index because that's how the
constraint is currently implemented.

> tell postgresql to automatically create an index when creating la FK.
> Does it means I need to manage it EXPLICITLY with create index statement
> ?

Yeah.

>Is there another way ?

Not that I can think of without changing the source.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Why the planner is not using the INDEX .

2005-07-05 Thread Tom Lane
David Gagnon <[EMAIL PROTECTED]> writes:
>  explain analyse SELECT IRNUM FROM IR
> INNER JOIN IT ON  IT.ITIRNUM = ANY ('{1000, 2000}') AND 
> IT.ITYPNUM = 'M' AND IR.IRYPNUM = IT.ITYPNUM AND IR.IRNUM = IT.ITIRNUM 
> WHERE IRNUM = ANY ('{1000, 2000}') and IRYPNUM = 'M'

Those =ANY constructs are not currently optimizable at all.  You might
get better results with "IT.ITIRNUM IN (1000, 2000)" etc.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PERFORM] Heavy virtual memory usage on production system

2005-07-05 Thread Alexander Stanier
We are having terrible performance issues with a production instance of 
PostgreSQL version 7.4.5, but are struggling with which parameters in 
the postgresql.conf to change.


Our database server is an Apple G5 (2 x 2GHz CPU, 2GB RAM). The 
operating system is Mac OS X 10.3.


The database seems to fine to start with, but then as the load increases 
it seems to reach a threshold where the number of non-idle queries in 
pg_stat_activity grows heavily and we appear to get something similar to 
a motorway tail back with up to perhaps 140 queries awaiting processing. 
At the same time the virtual memory usage (reported by the OS) appears 
to grow heavily too (sometimes up to 50GB). The CPUs do not seems to be 
working overly hard nor do the disks and the memory monitor reports 
about 600MB of inactive memory. Once in this situation, the database 
never catches up with itself and the only way to put it back on an even 
keel is to stop the application and restart database.


The system memory settings are:

kern.sysv.shmmax: 536870912
kern.sysv.shmmin: 1
kern.sysv.shmmni: 4096
kern.sysv.shmseg: 4096
kern.sysv.shmall: 131072

We have unlimited the number of processes and open files for the user 
running PostgreSQL (therefore max 2048 processes and max 12288 open files).


Non default postgresql parameters are:

tcpip_socket = true
max_connections = 500
unix_socket_directory = '/Local/PostgreSQL'
shared_buffers = 8192   # min 16, at least max_connections*2, 
8KB each

sort_mem = 2048 # min 64, size in KB
wal_buffers = 32# min 4, 8KB each
effective_cache_size = 10   # typically 8KB each
random_page_cost = 2# units are one sequential page fetch cost
log_min_error_statement = info # Values in order of increasing severity:
log_duration = true
log_pid = true
log_statement = true
log_timestamp = true
stats_command_string = true

although on the last restart I changed the following (since the current 
config clearly isn't working):


shared_buffers = 16384   # min 16, at least max_connections*2, 
8KB each

effective_cache_size = 1   # typically 8KB each

We don't know whether these have helped yet - but we should get a good 
idea around 10am tomorrow morning.


We currently have the application limited to a maximum of 40 concurrent 
connections to the database.


Our application produces a fairly varied mix of queries, some quite 
complex and plenty of them. We seem to average about 400,000 queries per 
hour. At first I thought it might be one or two inefficient queries 
blocking the CPUs but the CPUs don't seem to be very stretched. My guess 
is that we have our postgresql memory settings wrong, however, the is 
lots of conflicting advice about what to set (from 1000 to 10 shared 
buffers).


Does this heavy use of VM and query tail back indicate which memory 
settings are wrong? Presumably if there are 140 queries in 
pg_stat_activity then postgresql will be trying to service all these 
queries at once? I also presume that if VM usage is high then we are 
paging a vast amount to disk. But I am not sure why.


Has anyone seen this behaviour before and can anyone point me in the 
right direction?


Regards,

Alexander Stanier


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Heavy virtual memory usage on production system

2005-07-05 Thread Tom Lane
Alexander Stanier <[EMAIL PROTECTED]> writes:
> The database seems to fine to start with, but then as the load increases 
> it seems to reach a threshold where the number of non-idle queries in 
> pg_stat_activity grows heavily and we appear to get something similar to 
> a motorway tail back with up to perhaps 140 queries awaiting processing. 
> At the same time the virtual memory usage (reported by the OS) appears 
> to grow heavily too (sometimes up to 50GB). The CPUs do not seems to be 
> working overly hard nor do the disks and the memory monitor reports 
> about 600MB of inactive memory.

You shouldn't be putting a lot of credence in the virtual memory usage
then, methinks.  Some versions of top count the Postgres shared memory
against *each* backend process, leading to a wildly inflated figure for
total memory used.  I'd suggest watching the output of "vmstat 1" (or
local equivalent) to observe whether there's any significant amount of
swapping going on; if not, excessive memory usage isn't the problem.

Are you sure that the problem isn't at the level of some query taking an
exclusive lock and then sitting on it?  I would expect either CPU or
disk bandwidth or both to be saturated if you were having a conventional
resource limitation problem.  Again, comparing vmstat readings during
normal and slow response conditions would be instructive.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Why the planner is not using the INDEX .

2005-07-05 Thread David Gagnon

Tom Lane wrote:


David Gagnon <[EMAIL PROTECTED]> writes:
 


explain analyse SELECT IRNUM FROM IR
   INNER JOIN IT ON  IT.ITIRNUM = ANY ('{1000, 2000}') AND 
IT.ITYPNUM = 'M' AND IR.IRYPNUM = IT.ITYPNUM AND IR.IRNUM = IT.ITIRNUM 
   WHERE IRNUM = ANY ('{1000, 2000}') and IRYPNUM = 'M'
   



Those =ANY constructs are not currently optimizable at all.  You might
get better results with "IT.ITIRNUM IN (1000, 2000)" etc.

regards, tom lane

 

I already tried this construct.  But the statement comes from a stored 
procedure where the {1000, 2000} is an array variable (requestIds).  I 
tried to use


IT.ITIRNUM IN (requestIds) or several other variant without success.

Is there a way to make it work?  Here is the statement the statement from the 
store procedure.  Remenber requestIds is an array of int.


FOR inventoryTransaction IN
   SELECT DISTINCT IRNUM, IRAENUM, IRSTATUT, IRSENS, IRSOURCE, 
IRDATE, IRQTE

   FROM IR
   WHERE IRNUM = ANY (requestIds) and IRYPNUM = companyId
   LOOP

Thank for your help 
/David

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Planner constants for RAM resident databases

2005-07-05 Thread Josh Berkus
Emil,

>   ->  Merge Left Join  (cost=9707.71..13993.52 rows=1276 width=161)
> (actual time=164.423..361.477 rows=49 loops=1)

That would indicate that you need to either increase your statistical 
sampling (SET STATISTICS) or your frequency of running ANALYZE, or both.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]