Re: [PERFORM] Optimising "in" queries

2007-08-23 Thread Russell Smith

Michael Glaesemann wrote:


On Aug 22, 2007, at 5:58 , Russell Smith wrote:


Stephen Davies wrote:
select count(rdate),rdate from reading where sensor_id in 
(1137,1138,1139,1140) group by rdate order by rdate desc limit 1;



It would have been helpful to see the table definition here.  I can 
say up front that array processing in postgres is SLOW.


Um, what array processing are you seeing here? IN (a, b, b) is not an 
array construct.


Filter: (sensor_id = ANY ('{1137,1138,1139,1140}'::integer[]))

I've never seen this plan item except when arrays are involved.  I 
could be wrong.  I'd like to know how this is generated when you don't 
have an array.


Regards

Russell Smith

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Optimising "in" queries

2007-08-23 Thread Russell Smith

Russell Smith wrote:


Filter: (sensor_id = ANY ('{1137,1138,1139,1140}'::integer[]))

I've never seen this plan item except when arrays are involved.  I 
could be wrong.  I'd like to know how this is generated when you don't 
have an array.




I have just discovered that PG 8.2 will turn an IN clause into an array 
search instead of an OR list.  Sorry all.
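(For anyone following along, a sketch of how to see the rewrite: EXPLAIN the query from Stephen's earlier post. The exact plan text will depend on the table and indexes, so treat the comments below as illustrative rather than exact output.)

```sql
EXPLAIN
SELECT count(rdate), rdate
  FROM reading
 WHERE sensor_id IN (1137, 1138, 1139, 1140)
 GROUP BY rdate
 ORDER BY rdate DESC
 LIMIT 1;
-- On 8.2 the filter appears as an array comparison:
--   Filter: (sensor_id = ANY ('{1137,1138,1139,1140}'::integer[]))
-- On 8.1 and earlier the same IN list was expanded to an OR chain:
--   Filter: ((sensor_id = 1137) OR (sensor_id = 1138) OR ...)
```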


Regards

Russell Smith



[PERFORM] asynchronous commit feature

2007-08-23 Thread Merlin Moncure
I'm testing the new async commit feature on various raid
configurations and my early finding is that it reduces the impact of
keeping wal and data on the same volume.  I have 10 disks to play
with, and am finding that it's faster to do a 10 drive raid 10 rather
than an 8 drive raid 10 plus a two drive wal.

anybody curious about the results, feel free to drop a line.  I think
this will be a popular feature.
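(A sketch of how the feature is used, for anyone who wants to reproduce the test; the table name is hypothetical, and note the trade-off is a small window of possible transaction loss on a crash, not corruption:)

```sql
-- Turn off synchronous commit for this session only:
SET synchronous_commit = off;

BEGIN;
INSERT INTO log_table VALUES (1, now());  -- hypothetical table
COMMIT;  -- returns without waiting for the WAL flush to disk
```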

merlin



Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-23 Thread Dmitry Potapov
2007/8/23, Greg Smith <[EMAIL PROTECTED]>:
>
> On Wed, 22 Aug 2007, Dmitry Potapov wrote:
>
> If you do end up following up with this via the Linux kernel mailing list,
> please pass that link along.  I've been meaning to submit it to them and
> wait for the flood of e-mail telling me what I screwed up, that will go
> better if you tell them about it instead of me.


I'm planning to do so, but first I need to take a look at the postgresql
source and dev documentation to find out exactly how IO is done, so I can
explain the issue to the linux kernel people.  That will take some time;
I'll post a link here when I'm done.


> > looks to me as an elegant solution. Is there some other way to fix this
> > issue without disabling pagecache and the IO smoothing it was designed
> > to perform?
>
> I spent a couple of months trying and decided it was impossible.  Your
> analysis of the issue is completely accurate; lowering
> dirty_background_ratio to 0 makes the system much less efficient, but it's
> the only way to make the stalls go completely away.


By the way, does postgresql have a similar stall problem on freebsd/other
OSes? It would be interesting to study their approach to io smoothing if it
doesn't.

> I contributed some help toward fixing the issue in the upcoming 8.3
> instead; there's a new checkpoint writing process aimed to ease the exact
> problem you're running into there, see the new
> checkpoint_completion_target tunable at
> http://developer.postgresql.org/pgdocs/postgres/wal-configuration.html
>
> If you could figure out how to run some tests to see if the problem clears
> up for you using the new technique, that would be valuable feedback for
> the development team for the upcoming 8.3 beta.  Probably more productive
> use of your time than going crazy trying to fix the issue in 8.2.4.

We have a tool here to record and replay the exact workload we have on a
real production system; the only problem is getting a spare 16Gb box. I can
get a server with 8Gb ram and nearly the same storage setup for testing
purposes. I hope it will be able to carry the production load, so I can
compare 8.2.4 and 8.3devel on the same box, in the same situation. Are there
any other changes in 8.3devel that could affect the results of such a test?
I didn't really follow the 8.3 development process :(

-- 
Regards,
Dmitry


[PERFORM] Installing PostgreSQL

2007-08-23 Thread Campbell, Lance
Should installation questions be sent here or to the admin listserv?

 

OS: redhat linux

Version of PostgreSQL: 8.2.4

 

I had the group that now manages our server set up a directory/partition
for us to put postgreSQL into.  The directory is called pgsql_data.  The
directory is more than a regular directory: it contains a subdirectory
called "lost+found", so I assume it is a logical partition.  I
tried installing postgreSQL directly into this directory but it failed
since there is a file in this directory, "lost+found".  Is there a way
around this?  Worst case, I will create a subdirectory called
data and put the install in there.  I would have preferred to put it
directly into pgsql_data, since no files other than postgreSQL's would
have gone into the directory/partition.  Would it
be possible for me to install postgreSQL into a subdirectory of
pgsql_data and then move the files up a directory into pgsql_data?

 

Thanks,

 

 

Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu

 



Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-23 Thread Greg Smith

On Thu, 23 Aug 2007, Dmitry Potapov wrote:


I'm planning to do so, but before I need to take a look at postgresql source
and dev documentation to find how exactly IO is done, to be able to explain
the issue to linux kernel people.


I can speed that up for you. 
http://developer.postgresql.org/index.php/Buffer_Cache%2C_Checkpoints%2C_and_the_BGW 
outlines all the source code involved.  Easiest way to browse through the 
code is via http://doxygen.postgresql.org/ ; eventually I want to 
update the page so it points right into the appropriate doxygen spots, but 
I haven't gotten to that yet.



By the way, does postgresql has a similar stall problem on freebsd/other
OS'es? It would be interesting to study their approach to io smoothing if it
doesn't.


There's some evidence that something about Linux aggravates the problem; 
check out 
http://archives.postgresql.org/pgsql-hackers/2007-07/msg00261.php and the 
rest of the messages in that thread.  I haven't heard a report of this 
problem from someone who isn't running Linux, but as it requires a certain 
level of hardware and a specific type of work load I'm not sure if this is 
coincidence or a cause/effect relationship.


Are there any other changes in 8.3devel that could affect the results of 
such a test?


The "all" component of the background writer was removed as it proved not 
to be useful once checkpoint_completion_target was introduced.  And the 
LRU background writer keeps going while checkpoints are being trickled 
out, in earlier versions that didn't happen.


The test I'd like to see more people run is to simulate their workloads 
with checkpoint_completion_target set to 0.5, 0.7, and 0.9 and see how 
each of those settings works relative to the 8.2 behavior.
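(A sketch of the settings involved in such a test run; the segment and timeout values below are hypothetical placeholders and should match whatever the 8.2 baseline used:)

```
# postgresql.conf fragment for one test run; repeat the same
# workload with 0.5, 0.7 and 0.9 and compare checkpoint I/O spikes:
checkpoint_completion_target = 0.5
checkpoint_segments = 64        # hypothetical; keep equal to the 8.2 setup
checkpoint_timeout = 5min       # hypothetical; keep equal to the 8.2 setup
```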


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Installing PostgreSQL

2007-08-23 Thread Richard Huxton

Campbell, Lance wrote:

Should installation questions be sent here or to the admin listserv?


Probably the pgsql-general/admin/novice lists


OS: redhat linux


RHES?


Version of PostgreSQL: 8.2.4


OK


I had a group that now manages our server set up a directory/partition
for us to put postgreSQL into.  The directory is called pgsql_data.  The
directory is more than a regular directory.  It contains a subdirectory
called "lost+found".  I would assume this is a logical partition. 


No - if you get filesystem corruption any recovered disk-blocks are put 
into files here. All your disk partitions will have such a directory.


> I

tried installing postgreSQL directly into this directory but it failed
since there is a file in this directory, "lost+found".  Is there a way
around this?  Worst case scenario I will create a subdirectory called
data and put the install in there.


That's what you want to do. Apart from anything else it lets you set 
ownership & permission of the directory.


>  I would have preferred to put it

directly into the pgsql_data.  There would be no other files that would
have gone into the directory/partition other than postgreSQL.  Would it
be possible for me to install postgreSQL into a sub directory of
pgsql_data and then move the files up a directory into pgsql_data?


Just symlink your directory to the correct place if that's what you want.

Partition at: /mnt/pg_disk
Directory is: /mnt/pg_disk/data
symlink to:   /var/db/data

--
  Richard Huxton
  Archonet Ltd



[PERFORM] long-running query - needs tuning

2007-08-23 Thread Kevin Kempter
Hi List;

I've just started working with a new client and, among other issues with 
their databases, they have a particular update that basically locks out users.

The below query was running for over 6 hours this morning and the CPU load had 
climbed to a point where new connections simply hung waiting to connect. We 
had to kill the query to allow business users to connect to the applications 
that connect to the database, so I could not post an explain analyze.

In any case the query looks like this:

update dat_customer_mailbox_counts
   set total_contacts = contacts.ct,
       total_contact_users = contacts.dct
  from ( select customer_id,
                count(*) as ct,
                count( distinct con.user_id ) as dct
           from dat_user_contacts con
          group by customer_id ) contacts
 where contacts.customer_id = dat_customer_mailbox_counts.customer_id

Here's the latest counts from the system catalogs:

dat_customer_mailbox_counts:   423
dat_user_contacts   59,469,476



And here's an explain plan:

 QUERY PLAN
--------------------------------------------------------------------------------
 Merge Join  (cost=17118858.51..17727442.30 rows=155 width=90)
   Merge Cond: ("outer".customer_id = "inner".customer_id)
   ->  GroupAggregate  (cost=17118772.93..17727347.34 rows=155 width=8)
         ->  Sort  (cost=17118772.93..17270915.95 rows=60857208 width=8)
               Sort Key: con.customer_id
               ->  Seq Scan on dat_user_contacts con  (cost=0.00..7332483.08 rows=60857208 width=8)
   ->  Sort  (cost=85.57..88.14 rows=1026 width=74)
         Sort Key: dat_customer_mailbox_counts.customer_id
         ->  Seq Scan on dat_customer_mailbox_counts  (cost=0.00..34.26 rows=1026 width=74)
(9 rows)


Any thoughts, comments, Ideas for debugging, etc would be way helpful...

Thanks in advance.

/Kevin







Re: [PERFORM] Installing PostgreSQL

2007-08-23 Thread Campbell, Lance
Richard,
So what you are saying is that if you install PostgreSQL into a data
directory /abc/data you could then stop the database, move the files
into /def/data, and then start the database making sure to point to the
new data directory.  PostgreSQL is therefore referencing its files
relative to the "data" directory the files are in.

Is this a correct observation?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
 

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 23, 2007 12:08 PM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Installing PostgreSQL

Campbell, Lance wrote:
> Should installation questions be sent here or to the admin listserv?

Probably the pgsql-general/admin/novice lists

> OS: redhat linux

RHES?

> Version of PostgreSQL: 8.2.4

OK

> I had a group that now manages our server set up a directory/partition
> for us to put postgreSQL into.  The directory is called pgsql_data.  The
> directory is more than a regular directory.  It contains a subdirectory
> called "lost+found".  I would assume this is a logical partition.

No - if you get filesystem corruption any recovered disk-blocks are put 
into files here. All your disk partitions will have such a directory.

> I tried installing postgreSQL directly into this directory but it failed
> since there is a file in this directory, "lost+found".  Is there a way
> around this?  Worst case scenario I will create a subdirectory called
> data and put the install in there.

That's what you want to do. Apart from anything else it lets you set 
ownership & permission of the directory.

> I would have preferred to put it
> directly into the pgsql_data.  There would be no other files that would
> have gone into the directory/partition other than postgreSQL.  Would it
> be possible for me to install postgreSQL into a sub directory of
> pgsql_data and then move the files up a directory into pgsql_data?

Just symlink your directory to the correct place if that's what you
want.

Partition at: /mnt/pg_disk
Directory is: /mnt/pg_disk/data
symlink to:   /var/db/data

-- 
   Richard Huxton
   Archonet Ltd



Re: [PERFORM] Installing PostgreSQL

2007-08-23 Thread Greg Smith

On Thu, 23 Aug 2007, Campbell, Lance wrote:


Should installation questions be sent here or to the admin listserv?


admin or general would be more appropriate for this type of question.

The directory is called pgsql_data.  The directory is more than a 
regular directory.  It contains a subdirectory called "lost+found".  I 
would assume this is a logical partition.


It's a partition of some sort.  lost+found shows up in the root directory 
of any partition you create; it's where damaged files found by fsck go. 
See http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/lostfound.html for 
more information.


I tried installing postgreSQL directly into this directory but it 
failed since there is a file in this directory, "lost+found".  Is there 
a way around this?  Worst case scenario I will create a subdirectory 
called data and put the install in there.


You will have to create a subdirectory in this new partition in order for 
initdb to have a place it can work in.  What you should probably do here 
is have your administrator rename the mount point to something more 
generic, like "data" or "postgres"; then you'd have PGDATA pointing to 
data/pgsql_data or postgres/pgsql_data, which won't be as confusing.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Installing PostgreSQL

2007-08-23 Thread Richard Huxton

Campbell, Lance wrote:

Richard,
So what you are saying is that if you install PostgreSQL into a data
directory /abc/data you could then stop the database, move the files
into /def/data, and then start the database making sure to point to the
new data directory.  PostgreSQL is therefore referencing its files
relative to the "data" directory the files are in.

Is this a correct observation?


Yes - provided:
1. Ownership and permissions on the destination directory are correct
2. You remember to stop the server when copying


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Installing PostgreSQL

2007-08-23 Thread Campbell, Lance
Richard,
I was able to prove that it works.  Thanks for your time.

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
 

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 23, 2007 12:26 PM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Installing PostgreSQL

Campbell, Lance wrote:
> Richard,
> So what you are saying is that if you install PostgreSQL into a data
> directory /abc/data you could then stop the database, move the files
> into /def/data, and then start the database making sure to point to
the
> new data directory.  PostgreSQL is therefore referencing its files
> relative to the "data" directory the files are in.
> 
> Is this a correct observation?

Yes - provided:
1. Ownership and permissions on the destination directory are correct
2. You remember to stop the server when copying


-- 
   Richard Huxton
   Archonet Ltd



Re: [PERFORM] Optimising "in" queries

2007-08-23 Thread Stephen Davies
Interesting semantics. I have never seen the IN syntax referred to as 
"array processing" before.

I have always thought of array processing as the thing that vector 
processors such as Cray and ETA do/did.

While superficially equivalent, I have always believed that IN (a,b,c) 
executed faster than =a or =b or =c. Am I wrong for PostgreSQL?

Stephen

 On Wednesday 22 August 2007 22:55, Michael Glaesemann wrote:
> On Aug 22, 2007, at 5:58 , Russell Smith wrote:
> > Stephen Davies wrote:
> >> select count(rdate),rdate from reading where sensor_id in
> >> (1137,1138,1139,1140) group by rdate order by rdate desc limit 1;
> >
> > It would have been helpful to see the table definition here.  I can
> > say up front that array processing in postgres is SLOW.
>
> Um, what array processing are you seeing here? IN (a, b, b) is not an
> array construct.
>
> Michael Glaesemann
> grzm seespotcode net

-- 


Stephen Davies Consulting            Voice: 08-8177 1595
Adelaide, South Australia.           Fax:   08-8177 0133
Computing & Network solutions.       Mobile: 0403 0405 83



[PERFORM] deadlock_timeout parameter in Postgresql.cof

2007-08-23 Thread Sachchida Ojha
I am having some deadlocking problems with my app system. Our devs are
debugging the app to find out the cause of the problem. In the meantime
I looked at the postgresql.conf file and found a parameter,
deadlock_timeout, which was set to 1000 (ms).  Normally I see deadlocks
at night or when autovacuum has been running for a long time.

 

My question is

What is the significance of this parameter, and will updating its value
make any difference?

 

Thanks

Regards

Sachchida N Ojha

[EMAIL PROTECTED]  

Secure Elements Incorporated

198 Van Buren Street, Suite 110

Herndon Virginia 20170-5338 USA

 

http://www.secure-elements.com/  

 

800-709-5011 Main

703-709-2168 Direct

703-709-2180 Fax


 




Re: [PERFORM] deadlock_timeout parameter in Postgresql.cof

2007-08-23 Thread Bill Moran
In response to "Sachchida Ojha" <[EMAIL PROTECTED]>:

> I am having some dead locking problem with my app system. Our dev are
> debugging the app to find out the cause of the problem. In the mean time
> I looked at postgresql.conf file. I found that there is a parameter in
> postgresql.conf file deadlock_timeout which was set 1000 (ms).  Normally
> I see deadlock in the night or when auto vacuum is running for a long
> time.
> 
> My question is 
> 
> What is the significance of this parameter and updating this parameter
> value will make any difference ? 

Does the documentation leave anything unanswered?
http://www.postgresql.org/docs/8.2/static/runtime-config-locks.html
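(For reference, a sketch of checking the setting; note that deadlock_timeout only controls how long a backend waits on a lock before running the deadlock detector — it does not cause or prevent deadlocks themselves:)

```sql
SHOW deadlock_timeout;   -- the default is 1s (1000 ms)

-- Raising it reduces the cost of needless deadlock checks on systems
-- with many long, legitimate lock waits.  In postgresql.conf:
--   deadlock_timeout = 2s
-- followed by a configuration reload.
```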

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



Re: [PERFORM] long-running query - needs tuning

2007-08-23 Thread Tom Lane
Kevin Kempter <[EMAIL PROTECTED]> writes:
>  Merge Join  (cost=17118858.51..17727442.30 rows=155 width=90)
>Merge Cond: ("outer".customer_id = "inner".customer_id)
>->  GroupAggregate  (cost=17118772.93..17727347.34 rows=155 width=8)
>  ->  Sort  (cost=17118772.93..17270915.95 rows=60857208 width=8)
>Sort Key: con.customer_id
>->  Seq Scan on dat_user_contacts con  (cost=0.00..7332483.08 
> rows=60857208 width=8)
>->  Sort  (cost=85.57..88.14 rows=1026 width=74)
>  Sort Key: dat_customer_mailbox_counts.customer_id
>  ->  Seq Scan on dat_customer_mailbox_counts  (cost=0.00..34.26 
> rows=1026 width=74)

The planner, at least, thinks that all the time will go into the sort
step.  Sorting 60M rows is gonna take awhile :-(.  What PG version is
this?  (8.2 has noticeably faster sort code than prior releases...)
What have you got work_mem set to?

Bad as the sort is, I suspect that the real problem is the
count(distinct) operator, which is going to require *another*
sort-and-uniq step for each customer_id group --- and judging by
the rowcount estimates, at least some of those groups must be
pretty large.  (AFAIR this time is not counted in the planner
estimates.)  Again, work_mem would have an effect on how fast
that goes.
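(A sketch of the kind of experiment Tom is suggesting; the work_mem value is purely illustrative, not a recommendation, and should be sized to the machine's RAM:)

```sql
-- Give the big sort more memory for this session only, then re-plan:
SET work_mem = '512MB';   -- 8.2 accepts unit suffixes; pick what RAM allows

EXPLAIN
UPDATE dat_customer_mailbox_counts
   SET total_contacts = contacts.ct,
       total_contact_users = contacts.dct
  FROM (SELECT customer_id,
               count(*) AS ct,
               count(DISTINCT con.user_id) AS dct
          FROM dat_user_contacts con
         GROUP BY customer_id) contacts
 WHERE contacts.customer_id = dat_customer_mailbox_counts.customer_id;
```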

regards, tom lane



Re: [PERFORM] Optimising "in" queries

2007-08-23 Thread Alvaro Herrera
Stephen Davies wrote:
> Interesting semantics. I have never seen  the IN syntax referred to as 
> "array processing" before.
> 
> I have always thought of array processing as the thing that vector 
> processors such as Cray and ETA do/did.
> 
> While superficially equivalent, I have always believed that IN (a,b,c) 
> executed faster than =a or =b or =c. Am I wrong for PostgreSQL?

Older versions of Postgres translated IN (a, b, c) into an OR'ed list of
equalities.  Nowadays it is treated as an array; I think it's translated
to = ANY ({a,b,c}), as you can see in the message you posted at the
start of this thread.

I don't think you showed us the EXPLAIN ANALYZE results that Scott
requested.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Raid Configurations

2007-08-23 Thread Decibel!

On Aug 17, 2007, at 6:55 PM, Merlin Moncure wrote:

So, I'd be looking at a large raid 10 and 1-2 drives for the WAL...on
a raid 1.  If your system supports two controllers (in either
active/active or active/passive), you should look at second controller
as well.


If you only have one controller, and it can cache writes (it has a  
BBU), I'd actually lean towards putting all 12 drives into one raid  
10. A good controller will be able to handle WAL fsyncs plenty fast  
enough, so having a separate WAL mirror would likely hurt more than  
help.

--
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [PERFORM] When/if to Reindex

2007-08-23 Thread Steven Flatt
On 8/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:

> postgres=# create table test (i integer);
> CREATE TABLE
> postgres=# insert into test select generate_series(1,1000);
> INSERT 0 1000
> postgres=# create or replace function slow(integer) returns integer as
> 'begin perform pg_sleep(0); return $1; end' language plpgsql immutable
> strict;
> CREATE FUNCTION
> postgres=# create index slowi on test (slow(i));
> CREATE INDEX
> postgres=# create or replace function slow(integer) returns integer as
> 'begin perform pg_sleep(1); return $1; end' language plpgsql immutable
> strict;
> CREATE FUNCTION
> postgres=# reindex index slowi;
>
> While that's running I ran:
>
> postgres=# select count(*) from test;
> count
> ---
> 1000
> (1 row)


Interestingly enough, the example you've given does not work for me either.
The select count(*) from test blocks until the reindex completes.  Are we
using the same pg version?

# select version();
                                  version
---------------------------------------------------------------------------
 PostgreSQL 8.2.4 on i386-portbld-freebsd6.1, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518
(1 row)

Looking at the pg_locks table, I see:


# select locktype,relation,mode,granted from pg_locks where not granted;
 locktype | relation |  mode   | granted
--+--+-+-
 relation |69293 | AccessShareLock | f
(1 row)

# select relname from pg_class where oid = 69293;
 relname
-
 slowi
(1 row)

# select locktype,relation,mode,granted from pg_locks where relation =
69293;
 locktype | relation |mode | granted
--+--+-+-
 relation |69293 | AccessShareLock | f
 relation |69293 | AccessExclusiveLock | t
(2 rows)
So the reindex statement has an AccessExclusiveLock on the index, which
seems right, and this blocks the select count(*) from getting an
AccessShareLock on the index.  Why does the select count(*) need a lock on
the index?  Is there some Postgres setting that could cause this behaviour?
I can't even do an "explain select count(*) from test" without blocking.

Any ideas?

Steve


Re: [PERFORM] Optimising "in" queries

2007-08-23 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Stephen Davies wrote:
>> While superficially equivalent, I have always believed that IN (a,b,c) 
>> executed faster than =a or =b or =c. Am I wrong for PostgreSQL?

> Older versions of Postgres translated IN (a, b, c) into an OR'ed list of
> equalities.  Nowadays it is treated as an array; I think it's translated
> to = ANY ({a,b,c}), as you can see in the message you posted at the
> start of this thread.

If you're dealing with tables large enough that the index search work is
the dominant cost, all these variants ought to be exactly the same.
However, for smaller tables the planning time and executor startup time
are interesting, and on those measures the = ANY(array) formulation
should win because there's less "stuff" for the system to look at.
With "x=a OR x=b OR x=c" the planner actually has to deduce three times
that an indexscan on x is possible; with "x = ANY(ARRAY[a,b,c])" it
does that only once.  That's why I changed IN to expand to an array
construct instead of an OR tree.  I have to confess not having tried to
measure the consequences carefully, though.  I suspect it's not all
that interesting at only three items ... it's lists of hundreds or
thousands of items where this becomes a big deal.
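(The difference is easy to see in the plans; a sketch against a hypothetical indexed table — the exact plan shapes will vary with table size and statistics:)

```sql
CREATE TABLE t (x integer);
CREATE INDEX t_x_idx ON t (x);

EXPLAIN SELECT * FROM t WHERE x IN (1, 2, 3);
-- 8.2 plans this as a single condition:
--   x = ANY ('{1,2,3}'::integer[])

EXPLAIN SELECT * FROM t WHERE x = 1 OR x = 2 OR x = 3;
-- Typically a BitmapOr over three index scans; the planner proves
-- index usability once per OR arm, which is the extra planning work
-- Tom describes for long lists.
```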

regards, tom lane



Re: [PERFORM] When/if to Reindex

2007-08-23 Thread Tom Lane
"Steven Flatt" <[EMAIL PROTECTED]> writes:
> Interestingly enough, the example you've given does not work for me either.
> The select count(*) from test blocks until the reindex completes.  Are we
> using the same pg version?

Seems like a fair question, because Greg's example blocks for me too,
in plancat.c where the planner is trying to acquire information on each
index.  This seems to be an unwanted side effect of this 8.2-era patch
http://archives.postgresql.org/pgsql-committers/2006-07/msg00356.php
specifically, note here
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/util/plancat.c.diff?r1=1.121;r2=1.122;f=h
how the new planner coding takes at least AccessShareLock on each index,
where the old coding took no lock at all.

I think that the new coding rule of "you *must* take some lock when
opening the relation" is essential for tables, but it might not be
necessary for indexes if you've got a lock on the parent table.
We don't allow any schema changes on an index to be made without holding
exclusive lock on the parent, so plancat.c's basic purpose of finding
out the properties of the index could be done safely without any index
lock.

The fly in the ointment is that after collecting the pg_index definition
of the index, plancat.c also wants to know how big it is --- it calls
RelationGetNumberOfBlocks.  And that absolutely does look at the
physical storage, which means it absolutely is unsafe to do in parallel
with a REINDEX that will be dropping the old physical storage at some
point.

So maybe we are stuck and we have to say "that doesn't work anymore".
But it feels like we might not be too far away from letting it still
work.  Thoughts, ideas?

regards, tom lane



Re: [PERFORM] When/if to Reindex

2007-08-23 Thread Mark Kirkwood

Tom Lane wrote:


The fly in the ointment is that after collecting the pg_index definition
of the index, plancat.c also wants to know how big it is --- it calls
RelationGetNumberOfBlocks.  And that absolutely does look at the
physical storage, which means it absolutely is unsafe to do in parallel
with a REINDEX that will be dropping the old physical storage at some
point.

So maybe we are stuck and we have to say "that doesn't work anymore".
But it feels like we might not be too far away from letting it still
work.  Thoughts, ideas?
  


A suggestion that seems a bit like a leap backwards in time - maybe just 
use the pg_class.relpages entry for the index size?


I'm punting that with autovacuum being enabled by default now, the 
relpages entries for all relations will be more representative than they 
used to be in previous releases.
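(For reference, the estimate being suggested here is already maintained by the system; a sketch, using the index name from Greg's example upthread:)

```sql
-- relpages and reltuples are refreshed by VACUUM, ANALYZE and
-- CREATE INDEX, so with autovacuum on they are usually fairly fresh:
SELECT relname, relpages, reltuples
  FROM pg_class
 WHERE relname = 'slowi';
```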


Cheers

Mark




Re: [PERFORM] When/if to Reindex

2007-08-23 Thread Gregory Stark
"Steven Flatt" <[EMAIL PROTECTED]> writes:

> On 8/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> Interestingly enough, the example you've given does not work for me either.
> The select count(*) from test blocks until the reindex completes.  Are we
> using the same pg version?

I was using CVS head but given Tom's explanation I wouldn't expect to see any
different behaviour here.

I just retried it and it did block. I can't think of anything I could have
done wrong last time to make it appear not to block. If I had missed an error
at some point along the way I would have expected the reindex to complete
quickly or fail or something but it was definitely just blocked. I remember
noting (much) later that it had finished.

Strange.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
