[PERFORM] Performance issue

2003-09-24 Thread peter
Hello,

I have been trying to get my Postgres database to do faster inserts.

The environment is basically a single user situation.

The part that I would like to speed up is when a user copies a Project.
A Project consists of a number of Rooms (say 60), and each Room contains a
number of Items.
A Project will contain, say, 20,000 records.

Anyway, the copying process gets slower and slower as more Projects are
added to the database.
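For reference, a rough sketch of the kind of per-room copy being described
(every table, column, sequence and id below is hypothetical, since the real
schema isn't shown):

    -- Copy one Room of an existing Project into a new Project, together
    -- with its Items; committed once per Room.
    BEGIN;

    INSERT INTO room (id, project_id, name)
    SELECT nextval('room_id_seq'), 123, name   -- 123 = id of the new Project
      FROM room
     WHERE id = 42;                            -- 42 = id of the Room being copied

    INSERT INTO item (id, room_id, description)
    SELECT nextval('item_id_seq'), currval('room_id_seq'), description
      FROM item
     WHERE room_id = 42;

    COMMIT;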

My statistics (Athlon 1.8 GHz):

    20,000 items      takes on average 0.078 seconds/room
   385,000 items      takes on average 0.11  seconds/room
   690,000 items      takes on average 0.270 seconds/room
 1,028,000 items      takes on average 0.475 seconds/room
As can be seen, the time taken to process each room increases. A commit
occurs once a room has been copied.
The hard drive is not being driven very hard: the drive light only flashes
about twice a second when there are a million records in the database.

I thought that the problem could have been my PL/pgSQL procedure, because
I assume that code is interpreted.
However, I have just rewritten the code using straight SQL (with some temp
fields), and the times turn out to be almost exactly the same as for the
PL/pgSQL version.

The read speed for the application is fine, and the SQL planner seems to be
doing a good job. There has been only one problem that I have found, with one
huge SELECT, which was fixed by a cross join.

 I am running Red Hat 8. Some of the conf entries that I have changed
follow:
shared_buffers = 3700
effective_cache_size = 4000
sort_mem = 32168

Are the increasing times reasonable?
The times themselves might look slow, but that's because there are a
number of tables involved in a copy.

I can increase the shared buffer size above 32 MB (3700 buffers at 8 kB each
is about 29 MB), but would this really help?

TIA

peter Mcgregor



[PERFORM] Battery Backed Cache for RAID

2005-09-14 Thread Peter Darley
Folks,
I'm getting a new server for our database, and I have a quick question
about RAID controllers with a battery-backed cache.  I understand that the
battery lets the cache contents survive a power failure, which is what allows
the controller to safely report a write as committed when it hasn't actually
reached the disks yet.
My question is: if the power goes off and the drives stop, how does the
battery-backed cache save its contents out to the dead drives?  Is there
another implied component that would provide power to the drives that I
should be looking into as well?
Thanks,
Peter Darley




Re: [PERFORM] Inefficient escape codes.

2005-10-19 Thread Peter Childs
On 18/10/05, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> [Please copy the mailing list on replies so others can participate
> in and learn from the discussion.]
>
> On Tue, Oct 18, 2005 at 07:09:08PM +, Rodrigo Madera wrote:
> > > What language and API are you using?
> >
> > I'm using libpqxx. A nice STL-style library for C++ (I am 101% C++).
>
> I've only dabbled with libpqxx; I don't know if or how you can make
> it send data in binary instead of text.  See the documentation or
> ask in a mailing list like libpqxx-general or pgsql-interfaces.
>
> > > Binary transfer sends data in binary, not by automatically converting
> > > to and from text.
> >
> > Uh, I'm sorry I didn't get that... If I send: insert into foo
> > values('\\001\\002') will libpq send 0x01, 0x02 or "001002"??
>
> If you do it that way libpq will send the string as text with escape
> sequences; you can use a sniffer like tcpdump or ethereal to see this
> for yourself.  To send the data in binary you'd call PQexecParams()
> with a query like "INSERT INTO foo VALUES ($1)".  The $1 is a
> placeholder; the other arguments to PQexecParams() provide the data
> itself, the data type and length, and specify whether the data is in
> text format or binary.  See the libpq documentation for details.
>

You could base64-encode your data, admittedly increasing its size by about
a third, but it does at least convert it to text, which makes it more
understandable. base64 is also pretty standard, being what e-mail uses for
MIME attachments.
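
On the SQL side that approach could look like the sketch below, assuming a
hypothetical table foo with a bytea column; PostgreSQL's built-in encode()
and decode() functions handle the base64 conversion:

    -- Hypothetical table with a bytea column.
    CREATE TABLE foo (data bytea);

    -- The client sends the payload as plain base64 text;
    -- decode() turns it back into bytea ('AQID' is bytes 0x01 0x02 0x03).
    INSERT INTO foo (data) VALUES (decode('AQID', 'base64'));

    -- Reading it back out as base64 text:
    SELECT encode(data, 'base64') FROM foo;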

Peter



[PERFORM] help tuning queries on large database

2006-01-06 Thread peter royal

Howdy.

I'm running into scaling problems when testing with a 16 GB (data + indexes)
database.


I can run a query, and it returns in a few seconds. If I run it  
again, it returns in a few milliseconds. I realize this is because  
during subsequent runs, the necessary disk pages have been cached by  
the OS.


I have experimented with having all 8 disks in a single RAID0 set, a
single RAID10 set, and currently 4 RAID0 sets of 2 disks each. There
hasn't been an appreciable difference in the overall performance of
my test suite (which randomly generates queries like the samples
below, as well as a few other types; the problem manifests itself on
other queries in the test suite as well).


So, my question is: is there anything I can do to boost performance
with what I've got, or am I in a position where the only 'fix' is
more, faster disks? I can't think of any schema/index changes that
would help, since everything looks pretty optimal in the 'explain
analyze' output. I'd like to get a 10x improvement when querying from
the 'cold' state.


Thanks for any assistance. The advice from reading this list has been
invaluable in getting to where I am now.

-peter


Configuration:

PostgreSQL 8.1.1

shared_buffers = 1  # (It was higher, 50k, but didn't help any,  
so brought down to free ram for disk cache)

work_mem = 8196
random_page_cost = 3
effective_cache_size = 25


Hardware:

CentOS 4.2 (Linux 2.6.9-22.0.1.ELsmp)
Areca ARC-1220 8-port PCI-E controller
8 x Hitachi Deskstar 7K80 (SATA2) (7200rpm)
2 x Opteron 242 @ 1.6ghz
3 GB RAM (should be 4 GB, but a separate Linux issue is preventing us from
getting it to see all of it)

Tyan Thunder K8WE


RAID Layout:

4 2-disk RAID0 sets created

Each RAID set is a tablespace, formatted ext3. The majority of the
database is in the primary tablespace, and the popular object_data
table is in its own tablespace.
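
A minimal sketch of that layout (the mount points and tablespace names are
hypothetical, since the actual paths aren't given):

    -- One tablespace per 2-disk RAID0 set (paths hypothetical).
    CREATE TABLESPACE raid_a LOCATION '/opt/raid-a/pgdata';
    CREATE TABLESPACE raid_b LOCATION '/opt/raid-b/pgdata';

    -- Keep the busy object_data table on its own set of spindles.
    ALTER TABLE object_data SET TABLESPACE raid_b;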



Sample 1:

triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM  
object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND  
O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag =  
'transmitter\'s' LIMIT 1000;

QUERY PLAN
 
 
-
Limit  (cost=1245.07..1245.55 rows=97 width=4) (actual  
time=3702.697..3704.665 rows=206 loops=1)
   ->  Unique  (cost=1245.07..1245.55 rows=97 width=4) (actual  
time=3702.691..3703.900 rows=206 loops=1)
 ->  Sort  (cost=1245.07..1245.31 rows=97 width=4) (actual  
time=3702.686..3703.056 rows=206 loops=1)

   Sort Key: o.subject
   ->  Nested Loop  (cost=2.82..1241.87 rows=97 width=4)  
(actual time=97.166..3701.970 rows=206 loops=1)
 ->  Nested Loop  (cost=2.82..678.57 rows=186  
width=4) (actual time=59.903..1213.170 rows=446 loops=1)
   ->  Index Scan using tags_tag_key on tags  
t2  (cost=0.00..5.01 rows=1 width=4) (actual time=13.139..13.143  
rows=1 loops=1)
 Index Cond: (tag =  
'transmitter''s'::text)
   ->  Bitmap Heap Scan on object_tags t1   
(cost=2.82..670.65 rows=233 width=8) (actual time=46.751..1198.198  
rows=446 loops=1)
 Recheck Cond: (t1.tag_id =  
"outer".tag_id)
 ->  Bitmap Index Scan on  
object_tags_tag_id_object_id  (cost=0.00..2.82 rows=233 width=0)  
(actual time=31.571..31.571 rows=446 loops=1)
   Index Cond: (t1.tag_id =  
"outer".tag_id)
 ->  Index Scan using object_data_pkey on  
object_data o  (cost=0.00..3.02 rows=1 width=4) (actual  
time=5.573..5.574 rows=0 loops=446)

   Index Cond: (o.subject = "outer".object_id)
   Filter: ("type" = 179)
Total runtime: 3705.166 ms
(16 rows)

triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM  
object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND  
O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag =  
'transmitter\'s' LIMIT 1000;

   QUERY PLAN
 
 
---
Limit  (cost=1245.07..1245.55 rows=97 width=4) (actual  
time=11.037..12.923 rows=206 loops=1)
   ->  Unique  (cost=1245.07..1245.55 rows=97 width=4) (actual  
time=11.031..12.190 rows=206 loops=1)
 ->  Sort  (cost=1245.07..1245.31 rows=97 width=4) (actual  
time=1

Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread peter royal

On Jan 8, 2006, at 1:42 PM, Luke Lonergan wrote:
Have you tested the underlying filesystem for its performance?
Run this:

  time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k
count= && sync'


This is a 2-disk RAID0

[EMAIL PROTECTED] /opt/alt-2]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/ 
bigfile bs=8k count=100 && sync'

100+0 records in
100+0 records out

real    1m27.143s
user    0m0.276s
sys     0m37.338s

'iostat -x' showed writes peaking at ~100MB/s



Then run this:
  time dd if=/my_file_system/bigfile bs=8k of=/dev/null


[EMAIL PROTECTED] /opt/alt-2]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/ 
null

100+0 records in
100+0 records out

real    1m9.846s
user    0m0.189s
sys     0m11.099s

'iostat -x' showed reads peaking at ~116MB/s


Again with kernel 2.6.15:

[EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile  
bs=8k count=100 && sync'

100+0 records in
100+0 records out

real    1m29.144s
user    0m0.204s
sys     0m48.415s

[EMAIL PROTECTED] ~]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null
100+0 records in
100+0 records out

real    1m9.701s
user    0m0.168s
sys     0m11.933s


And report the times here please.  With your 8 disks in any of the  
RAID0
configurations you describe, you should be getting 480MB/s.  In the  
RAID10

configuration you should get 240.


Not anywhere near that. I'm scouring the 'net looking to see what  
needs to be tuned at the HW level.



You should also experiment with using larger readahead, which you can
implement like this:
  blockdev --setra 16384 /dev/

E.g. "blockdev --setra 16384 /dev/sda"


wow, this helped nicely. Without using the updated kernel, it took  
28% off my testcase time.



From what you describe, one of these is likely:
- hardware isn't configured properly or a driver problem.


Using the latest Areca driver, looking to see if there is some  
configuration that was missed.



- you need to use xfs and tune your Linux readahead


Will try XFS soon, concentrating on the 'dd' speed issue first.


On Jan 8, 2006, at 4:35 PM, Ron wrote:

Areca ARC-1220 8-port PCI-E controller


Make sure you have 1GB or 2GB of cache.  Get the battery backup and  
set the cache for write back rather than write through.


The card we've got doesn't have a SODIMM socket, since it's only an 8-port
card.  My understanding was that the cache is used when writing?


A 2.6.12 or later based Linux distro should have NO problems using
more than 4GB of RAM.


Upgraded the kernel to 2.6.15, then we were able to set the BIOS  
option for the 'Memory Hole' to 'Software' and it saw all 4G (under  
2.6.11 we got a kernel panic with that set)



RAID Layout:

4 2-disk RAID0 sets created
You do know that a RAID 0 set provides _worse_ data protection than  
a single HD?  Don't use RAID 0 for any data you want kept reliably.


Yup, aware of that. I was planning on RAID10 for production, but just
broke it out into RAID0 sets for testing (from what I read, I
gathered that the read performance of RAID0 and RAID10 is comparable).



thanks for all the suggestions, I'll report back as I continue testing.

-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi





[PERFORM] Massive delete of rows, how to proceed?

2006-11-25 Thread Peter Childs

On 24/11/06, Arnau <[EMAIL PROTECTED]> wrote:

Hi all,

   I have a table with statistics with more than 15 million rows. I'd
like to delete the oldest statistics, which can be about 7 million
rows. Which method would you recommend to do this? I'd also be
interested in calculating some statistics about these deleted
rows, like how many rows have been deleted per date. I was thinking of
creating a function; any recommendations?



Copy the rows you want to keep into a new table and drop the old one. If you
delete, you will have a massive problem with a bloated table, and plain
vacuum will not help unless you expect the table to grow back to this size
regularly; vacuum full, on the other hand, will take ages.
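
A minimal sketch of that approach, including the per-date counts of what is
being discarded (table names, column names and the cutoff date are all
hypothetical, since the real schema isn't shown):

    BEGIN;

    -- Per-date counts of the rows about to be discarded.
    CREATE TABLE deleted_stats AS
    SELECT date_trunc('day', stat_time) AS day, count(*) AS deleted_rows
      FROM stats
     WHERE stat_time < '2006-06-01'
     GROUP BY date_trunc('day', stat_time);

    -- Keep only the recent rows in a fresh, unbloated table.
    CREATE TABLE stats_new AS
    SELECT * FROM stats WHERE stat_time >= '2006-06-01';

    DROP TABLE stats;
    ALTER TABLE stats_new RENAME TO stats;
    -- Recreate indexes, constraints and grants on the new table here.

    COMMIT;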

Peter.



Re: [PERFORM] GROUP BY vs DISTINCT

2006-12-20 Thread Peter Childs

On 20/12/06, Steinar H. Gunderson <[EMAIL PROTECTED]> wrote:

On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote:
> Actually, I think I answered my own question already.  But I want to
> confirm - Is the GROUP BY faster because it doesn't have to sort results,
> whereas DISTINCT must produce sorted results?  This wasn't clear to me from
> the documentation.  If it's true, then I could save considerable time by
> using GROUP BY where I have been using DISTINCT in the past.  Usually I
> simply want a count of the distinct values, and there is no need to sort
> for that.

You are right; at the moment, GROUP BY is more intelligent than DISTINCT,
even if they have to compare the same columns. This is, as always, something
that could be improved in a future release, TTBOMK.

/* Steinar */


Oh, so that's why GROUP BY is nearly always quicker than DISTINCT. I
always thought DISTINCT was just shorthand for "GROUP BY the same columns
as I've just selected".
Does the SQL spec actually require DISTINCT to sort, or could we just
get the parser to rewrite DISTINCT into GROUP BY and hence remove the
extra code that a different way of doing it must mean?
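
A minimal sketch of the rewrite being discussed (table and column names are
hypothetical):

    -- These two return the same set of rows, but on the releases discussed
    -- here the GROUP BY form can choose a faster plan (e.g. a hash aggregate).
    SELECT DISTINCT customer_id FROM orders;

    SELECT customer_id FROM orders GROUP BY customer_id;

    -- Counting the distinct values without needing sorted output:
    SELECT count(*)
      FROM (SELECT customer_id FROM orders GROUP BY customer_id) AS s;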

Peter.



Re: [PERFORM] Planner statistics, correlations

2007-01-12 Thread Peter Childs

On 12/01/07, Tobias Brox <[EMAIL PROTECTED]> wrote:

We have a table with a timestamp attribute (event_time) and a state flag
which usually changes value around the event_time (it goes to 4).  Now
we have more than two years of events in the database, and around 5k of
future events.

It is important to frequently pick out "overdue events", say:

  select * from events where state<>4 and event_time < now()

Only a small fraction of the rows with state<>4 has event_time in the past.

Can you say what state might be rather than what it is not? I'm guessing
that state is an int but there is only a limited list of possible
states; if you can say what it might be rather than what it is not, the
index is more likely to be used.
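
A minimal sketch of that rewrite, assuming hypothetical state values 1, 2
and 3 for everything that isn't state 4:

    SELECT *
      FROM events
     WHERE state IN (1, 2, 3)     -- say what it may be, not what it is not
       AND event_time < now();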

Peter.



Re: [PERFORM] Two hard drives --- what to do with them?

2007-02-25 Thread Peter Kovacs

A related question:
Is it sufficient to disable write cache only on the disk where pg_xlog
is located? Or should write cache be disabled on both disks?

Thanks
Peter

On 2/25/07, Tom Lane <[EMAIL PROTECTED]> wrote:

Carlos Moreno <[EMAIL PROTECTED]> writes:
> The question is: does PostgreSQL have separate, independent areas that
> require storage such that performance would be noticeably boosted if
> the multiple storage operations could be done simultaneously?

The standard advice in this area is to put pg_xlog on a separate
spindle; although that probably is only important for update-intensive
applications.  You did not tell us anything about your application...

regards, tom lane






Re: [PERFORM] Vacuumdb - Max_FSM_Pages Problem.

2007-02-26 Thread Peter Childs

On 26/02/07, Pallav Kalva <[EMAIL PROTECTED]> wrote:

Hi,

 I am in the process of cleaning up one of our big tables; this table
has 187 million records and we need to delete around 100 million of them.

 I am deleting around 4-5 million of them daily in order to keep up
with vacuum and also with the archive log space. So far I have deleted
around 15 million in the past few days.

 max_fsm_pages value is set to 120. Vacuumdb runs once daily,
here is the output from last night's vacuum job


===
 INFO:  free space map: 999 relations, 798572 pages stored; 755424
total pages needed
 DETAIL:  Allocated FSM size: 1000 relations + 120 pages = 7096
kB shared memory.
 VACUUM



 From the output it says 755424 total pages needed; this number
keeps growing daily even though vacuums are run daily. It was around
350K pages before the delete process started.

 I am afraid that this number will reach the max_fsm_pages limit
soon, and vacuums thereafter will never catch up.

 Can anyone please explain this behavior? What should I do to catch
up with vacuumdb daily?



Vacuum adds free pages to the FSM so that they can be reused. If
you don't fill up those free pages, the FSM itself will fill up. Once the FSM
is full, no more pages can be added to it. If you start writing to
the free pages via inserts, then when vacuum next runs, the free pages that
previously did not fit into the free space map (because it was full) can be
added.

If you are really deleting that many records you may be better off copying
the rows you want to keep to a new table and dropping the old one. To
actually recover disk space you need to run either vacuum full or cluster.
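
A minimal sketch of those two space-recovery options (the table and index
names are hypothetical):

    -- Rewrites the table and gives the freed space back to the operating
    -- system; takes an exclusive lock for the duration.
    VACUUM FULL VERBOSE bigtable;

    -- Alternative: rewrite the table in the order of one of its indexes,
    -- which also reclaims the dead space.
    CLUSTER bigtable_pkey ON bigtable;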

This ought to be in the manual somewhere as this question gets asked
about once a week.

Peter.



Re: [PERFORM] Two hard drives --- what to do with them?

2007-02-26 Thread Peter Kovacs

On 2/26/07, Jeff Davis <[EMAIL PROTECTED]> wrote:

On Sun, 2007-02-25 at 23:11 +0100, Peter Kovacs wrote:
> A related question:
> Is it sufficient to disable write cache only on the disk where pg_xlog
> is located? Or should write cache be disabled on both disks?
>

When PostgreSQL does a checkpoint, it thinks the data pages before the
checkpoint have successfully made it to disk.

If the write cache holds those data pages, and then loses them, there's
no way for PostgreSQL to recover. So use a battery backed cache or turn
off the write cache.


Sorry for not being familiar with storage technologies... Does
"battery" here mean battery in the common sense of the word - some
kind of independent power supply? Shouldn't the disk itself be backed
by a battery? As should the entire storage subsystem?

Thanks
Peter



Regards,
Jeff Davis






Re: [PERFORM] Two hard drives --- what to do with them?

2007-02-27 Thread Peter Kovacs

On 2/27/07, Shane Ambler <[EMAIL PROTECTED]> wrote:

Jeff Davis wrote:

>> Sorry for not being familiar with storage technologies... Does
>> "battery" here mean battery in the common sense of the word - some
>> kind of independent power supply? Shouldn't the disk itself be backed
>> by a battery? As should the entire storage subsystem?
>>
>
> Yes, a battery that can hold power to keep data alive in the write cache
> in case of power failure, etc., for a long enough time to recover and
> commit the data to disk.

Just to expand a bit - the battery backup options are available on some
raid cards - that is where you would be looking for it. I don't know of
any hard drives that have it built in.

Of course, another reason to have a UPS for the server is to keep it running
long enough after the clients have gone down that it can ensure
everything is on disk and shut down properly.

> So, a write cache is OK (even for pg_xlog) if it is durable (i.e. on
> permanent storage or backed by enough power to make sure it gets there).
> However, if PostgreSQL has no way to know whether a write is durable or
> not, it can't guarantee the data is safe.
>
> The reason this becomes an issue is that many consumer-grade disks have
> write cache enabled by default and no way to make sure the cached data
> actually gets written. So, essentially, these disks "lie" and say they
> wrote the data, when in reality, it's in volatile memory. It's
> recommended that you disable write cache on such a device.

 From all that I have heard this is another advantage of SCSI disks -
they honor these settings as you would expect - many IDE/SATA disks
often say "sure I'll disable the cache" but continue to use it or don't
retain the setting after restart.


As far as I know, SCSI drives also have "write cache" which is turned
off by default, but can be turned on (e.g. with the sdparm utility on
Linux). The reason I am so interested in how the write cache is
typically used (on or off) is that I recently ran our benchmarks on a
machine with SCSI disks, and the benchmarks with a high commit ratio
suffered significantly compared to our previous results
"traditionally" obtained on machines with IDE drives.

I wonder if running a machine on a UPS + 1 hot standby internal PS is
equivalent, in terms of data integrity, to using battery backed write
cache. Instinctively, I'd think that UPS + 1 hot standby internal PS
is better, since this setup also provides for the disk to actually
write out the content of the cache -- as you pointed out.

Thanks
Peter




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz





[PERFORM] Scaling SELECT:s with the number of disks on a stripe

2007-03-29 Thread Peter Schuller
Hello,

I am looking to use PostgreSQL for storing some very simple flat data
mostly in a single table. The amount of data will be in the hundreds
of gigabytes range. Each row is on the order of 100-300 bytes in size;
in other words, small enough that I am expecting disk I/O to be seek
bound (even if PostgreSQL reads a full pg page at a time, since a page
is significantly smaller than the stripe size of the volume).

The only important performance characteristics are insertion/deletion
performance, and the performance of trivial SELECT queries whose WHERE
clause tests equality on one of the columns.

Other than absolute performance, an important goal is to be able to
scale fairly linearly with the number of underlying disk drives. We
are fully willing to take a disk seek per item selected, as long as it
scales.

To this end I have been doing some benchmarking to see whether the
plan is going to be feasible. On a 12 disk hardware stripe, insertion
performance does scale somewhat with concurrent inserters. However, I
am seeing surprising effects with SELECT:s: a single selector
generates the same amount of disk activity as two concurrent selectors
(I was expecting roughly twice as much).

The query is simple:

SELECT * FROM test WHERE value = 'xxx' LIMIT 1000;

No ordering, no joins, no nothing. Selecting concurrently with two
different values of 'xxx' yields the same amount of disk activity
(never any significant CPU activity). Note that the total amount of
data is too large to fit in RAM (> 500 million rows), and the number
of distinct values in the value column is 1. The column in the
WHERE clause is indexed.

So my first question is - why am I not seeing this scaling? The
absolute amount of disk activity with a single selecter is consistent
with what I would expect from a SINGLE disk, which is completely
expected since I never thought PostgreSQL would introduce disk I/O
concurrency on its own. But this means that adding additional readers
doing random-access reads *should* scale very well with 12 underlying
disks in a stripe.

(Note that I have seen fairly similar results on other RAID variants
too, including software RAID5 (yes yes I know), in addition to the
hardware stripe.)

These tests have been done Linux 2.6.19.3 and PostgreSQL 8.1.

Secondly, I am seeing a query plan switch after a certain
threshold. Observe:

perftest=# explain select * from test where val='7433' limit 1000; 
   QUERY PLAN   
 
-
 Limit  (cost=0.00..4016.50 rows=1000 width=143)
   ->  Index Scan using test_val_ix on test  (cost=0.00..206620.88 rows=51443 
width=143)
 Index Cond: ((val)::text = '7433'::text)
(3 rows)

Now increasing to a limit of 1:

perftest=# explain select * from test where val='7433' limit 1;
  QUERY PLAN
  
--
 Limit  (cost=360.05..38393.36 rows=1 width=143)
   ->  Bitmap Heap Scan on test  (cost=360.05..196014.82 rows=51443 width=143)
 Recheck Cond: ((val)::text = '7433'::text)
 ->  Bitmap Index Scan on test_val_ix  (cost=0.00..360.05 rows=51443 
width=0)
   Index Cond: ((val)::text = '7433'::text)
(5 rows)

The interesting part is that the latter query is entirely CPU bound
(no disk I/O at all) for an extended period of time before even
beginning to read data from disk. And when it *does* start performing
disk I/O, the performance is about the same as for the other case. In
other words, the change in query plan seems to do nothing but add
overhead.

What is the bitmap heap scan supposed to be doing that would increase
performance above a "seek once per matching row" plan? I haven't been
able to Google my way to what the intended benefit is of a heap scan
vs. a plain index scan.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe

2007-04-02 Thread Peter Schuller
Hello,

> If you are dealing with timed data or similar, you may consider to
> partition your table(s).

Unfortunately this is not the case; the insertion is more or less
random (not quite, but for the purpose of this problem it is).

Thanks for the pointers though. That is sure to be useful in some
other context down the road.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe

2007-04-02 Thread Peter Schuller
Hello,

> > SELECT * FROM test WHERE value = 'xxx' LIMIT 1000;
> 
> I tested this on a 14-way software raid10 on freebsd, using pg 8.1.6, and
> couldn't reproduce anything like it. With one client I get about 200 disk
> requests per second, scaling almost exactly linearly for the first 5 or so
> clients, as expected. At 14 clients it was down to about 150 reqs/sec per
> client, but the total throughput continued to increase with additional
> concurrency up to about 60 clients, giving about 3600 reqs/sec (260 per
> disk, which is about right for 10krpm scsi disks under highly concurrent
> random loads).

Ok. That is very interesting; so there is definitely nothing
fundamental in PG that prevents the scaling (even if on FreeBSD).

> A good question. Have you tried testing the disks directly? e.g. create
> some huge files, and run a few concurrent random readers on them? That
> would test the array and the filesystem without involving postgres.

I have confirmed that I am seeing expected performance for random
short and highly concurrent reads in one large (> 200 GB) file. The
I/O is done using libaio however, so depending on implementation I
suppose the I/O scheduling behavior of the fs/raid driver might be
affected compared to having a number of concurrent threads doing
synchronous reads. I will try to confirm performance in a way that
will more closely match PostgreSQL's behavior.

I have to say though that I will be pretty surprised if the
performance is not matched in that test.

Is there any chance there is some operating-system-conditional code in
pg itself that might affect this behavior? Some kind of purposeful
serialization of I/O, for example (even if that sounds like an
extremely strange thing to do)?

> This is entirely expected. With the larger row count, it is more likely
> (or so the planner estimates) that rows will need to be fetched from
> adjacent or at least nearby blocks, thus a plan which fetches rows in
> physical table order rather than index order would be expected to be
> superior. The planner takes into account the estimated startup cost and
> per-row cost when planning LIMIT queries; therefore it is no surprise
> that for larger limits, it switches to a plan with a higher startup cost
> but lower per-row cost.

Roger that, makes sense. I had misunderstood the meaning of the heap
scan.

> Most likely your index is small enough that large parts of it will be
> cached in RAM, so that the scan of the index to build the bitmap does
> not need to hit the disk much if at all.

Even so however, several seconds of CPU activity to scan the index for
a few tens of thousands of entries sounds a bit excessive. Or does it
not? Because at that level, the CPU bound period alone is approaching
the time it would take to seek for each entry instead. But then I
presume the amount of work is similar/the same for the other case,
except it's being done at the beginning of the query instead of before
each seek.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe

2007-04-03 Thread Peter Schuller
Hello,

> The next question then is whether anything in your postgres configuration
> is preventing it getting useful performance from the OS. What settings
> have you changed in postgresql.conf?

The only options not commented out are the following (it's not even
tweaked for buffer sizes and such, since in this case I am not
interested in things like sort performance and cache locality other
than as an afterthought):

hba_file = '/etc/postgresql/8.1/main/pg_hba.conf'
ident_file = '/etc/postgresql/8.1/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/8.1-main.pid'
listen_addresses = '*'
port = 5432
max_connections = 100
unix_socket_directory = '/var/run/postgresql'
ssl = true
shared_buffers = 1000
log_line_prefix = '%t '
stats_command_string = on
stats_row_level = on
autovacuum = on
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

> Are you using any unusual settings within the OS itself?

No. It's a pretty standard kernel. The only local tweaking done is
enabling/disabling various things; there are no special patches used
or attempts to create a minimalistic kernel or anything like that.

> You're forgetting the LIMIT clause. For the straight index scan, the
> query aborts when the LIMIT is reached having scanned only the specified
> number of index rows (plus any index entries that turned out to be dead
> in the heap). For the bitmap scan case, the limit can be applied only after
> the heap scan is under way, therefore the index scan to build the bitmap
> will need to scan ~50k rows, not the 10k specified in the limit, so the
> amount of time spent scanning the index is 50 times larger than in the
> straight index scan case.

Ok - makes sense that it has to scan the entire subset of the index
for the value in question. I will have to tweak the CPU/disk costs
settings (which I have, on purpose, not yet done).

> However, I do suspect you have a problem here somewhere, because in my
> tests the time taken to do the bitmap index scan on 50k rows, with the
> index in cache, is on the order of 30ms (where the data is cached in
> shared_buffers) to 60ms (where the data is cached by the OS). That's on
> a 2.8GHz xeon.

This is on a machine with 2.33GHz xeons and I wasn't trying to
exaggerate. I timed it and it is CPU bound (in userspace; next to no
system CPU usage at all) for about 15 seconds for the case of
selecting with a limit of 1.

Given that there is no disk activity I can't imagine any buffer sizes
or such affecting this other than userspace vs. kernelspace CPU
concerns (since obviously the data being worked on is in RAM). Or am I
missing something?

It is worth noting that the SELECT of fewer entries is entirely disk
bound; there is almost no CPU usage whatsoever. Even taking the
cumulative CPU usage into account (gut feeling calculation, nothing
scientific) and multiplying by 50 you are nowhere near 15 seconds of
CPU boundness. So it is indeed strange.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] SCSI vs SATA

2007-04-04 Thread Peter Kovacs

This may be a silly question but: will not 3 times as many disk drives
mean 3 times higher probability for disk failure? Also rumor has it
that SATA drives are more prone to fail than SCSI drivers. More
failures will result, in turn, in more administration costs.

Thanks
Peter

On 4/4/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

On Tue, 3 Apr 2007, Geoff Tolley wrote:

>
> Ron wrote:
>>  At 07:07 PM 4/3/2007, Ron wrote:
>> >  For random IO, the 3ware cards are better than PERC
>> >
>> > >  Question: will 8*15k 73GB SCSI drives outperform 24*7K 320GB SATA II
>> >  drives?
>> >
>> >  Nope.  Not even if the 15K 73GB HDs were the brand new Savvio 15K
>> >  screamers.
>> >
>> >  Example assuming 3.5" HDs and RAID 10 => 4 15K 73GB vs 12 7.2K 320GB
>> >  The 15K's are 2x faster rpm, but they are only ~23% the density =>
>> >  advantage per HD to SATAs.
>> >  Then there's the fact that there are 1.5x as many 7.2K spindles as 15K
>> >  spindles...
>>  Oops make that =3x= as many 7.2K spindles as 15K spindles...
>
> I don't think the density difference will be quite as high as you seem to
> think: most 320GB SATA drives are going to be 3-4 platters, the most that a
> 73GB SCSI is going to have is 2, and more likely 1, which would make the
> SCSIs more like 50% the density of the SATAs. Note that this only really
> makes a difference to theoretical sequential speeds; if the seeks are random
> the SCSI drives could easily get there 50% faster (lower rotational latency
> and they certainly will have better actuators for the heads). Individual 15K
> SCSIs will trounce 7.2K SATAs in terms of i/os per second.

true, but with 3x as many drives (and 4x the capacity per drive) the SATA
system will have to do far less seeking

for that matter, with 20ish 320G drives, how large would a partition be
that only used the outer physical track of each drive? (almost certainly
multiple logical tracks) if you took the time to set this up you could
eliminate seeking entirely (at the cost of not using your capacity, but
since you are considering a 12x range in capacity, it's obviously not your
primary concern)

> If you care about how often you'll have to replace a failed drive, then the
> SCSI option no question, although check the cases for hot-swapability.

note that the CMU and Google studies both commented on being surprised at
the lack of difference between the reliability of SCSI and SATA drives.

David Lang






Re: [PERFORM] SCSI vs SATA

2007-04-04 Thread Peter Kovacs

But if an individual disk fails in a disk array, sooner rather than later you
would want to purchase a suitable new disk, walk/drive to the location
of the disk array, replace the broken disk in the array and activate
the new one. Is this correct?

Thanks
Peter

On 4/4/07, Alvaro Herrera <[EMAIL PROTECTED]> wrote:

Andreas Kostyrka wrote:
> * Peter Kovacs <[EMAIL PROTECTED]> [070404 14:40]:
> > This may be a silly question but: will not 3 times as many disk drives
> > mean 3 times higher probability for disk failure? Also rumor has it
> > that SATA drives are more prone to fail than SCSI drivers. More
> > failures will result, in turn, in more administration costs.
> Actually, the newest research papers show that all discs (be it
> desktops, or highend SCSI) have basically the same failure statistics.
>
> But yes, having 3 times the discs will increase the fault probability.

... of individual disks, which is quite different from failure of a disk
array (in case there is one).

--
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.





Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe

2007-04-04 Thread Peter Schuller
Hello,

> I'd always do benchmarks with a realistic value of shared_buffers (i.e.
> much higher than that).
> 
> Another thought that comes to mind is that the bitmap index scan does
> depend on the size of work_mem.
> 
> Try increasing your shared_buffers to a reasonable working value (say
> 10%-15% of RAM - I was testing on a machine with 4GB of RAM, using a
> shared_buffers setting of 5), and increase work_mem to 16364, and
> see if there are any noticable changes in behaviour.

Increasing the buffer size and work_mem did have a significant
effect. I can understand it in the case of the heap scan, but I am
still surprised at the index scan. Could pg be serializing the entire
query as a result of insufficient buffers/work_mem to satisfy multiple
concurrent queries?

With both turned up, not only is the heap scan no longer visibly CPU
bound, I am seeing some nice scaling in terms of disk I/O. I have not
yet benchmarked to the point of being able to say whether it's
entirely linear, but it certainly seems to at least be approaching the
ballpark.

Thank you for the help! I guess I made a bad call in not tweaking
this. My thinking was that I explicitly did not want to turn it up, so
that I could benchmark the raw performance of disk I/O rather than
having things be cached in memory more than they already would be. But
apparently it had other side effects I did not consider.

Thanks again,

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





[PERFORM] Nested loops overpriced

2007-05-08 Thread Peter Eisentraut
ime" < '2007-05-05 
18:01:59'::timestamp without time zone))
 Total runtime: 8160.442 ms

The estimates all look pretty good and reasonable.

A faster plan, however, is this:


QUERY PLAN  
   
---
 GroupAggregate  (cost=1920309.81..1920534.21 rows=11220 width=184) (actual 
time=5349.493..5587.536 rows=35000 loops=1)
   ->  Sort  (cost=1920309.81..1920337.86 rows=11220 width=184) (actual 
time=5349.427..5392.110 rows=35000 loops=1)
 Sort Key: eh_subj.header_body
 ->  Nested Loop  (cost=15576.58..1919555.05 rows=11220 width=184) 
(actual time=537.938..5094.377 rows=35000 loops=1)
   ->  Nested Loop  (cost=15576.58..475387.23 rows=11020 width=120) 
(actual time=537.858..4404.330 rows=35000 loops=1)
 ->  Nested Loop  (cost=15576.58..430265.44 rows=11092 
width=112) (actual time=537.768..4024.184 rows=35000 loops=1)
   ->  Bitmap Heap Scan on email_header eh_from  
(cost=15576.58..16041.55 rows=107156 width=104) (actual time=537.621..1801.032 
rows=280990 loops=1)
 Recheck Cond: ((mime_part_id = 0) AND 
(header_name = 'from'::text))
 ->  BitmapAnd  (cost=15576.58..15576.58 
rows=160 width=0) (actual time=500.006..500.006 rows=0 loops=1)
   ->  Bitmap Index Scan on dummy_index  
(cost=0.00..3724.22 rows=107156 width=0) (actual time=85.025..85.025 
rows=280990 loops=1)
   ->  Bitmap Index Scan on 
idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual 
time=173.006..173.006 rows=280990 loops=1)
   ->  Bitmap Index Scan on dummy2_index  
(cost=0.00..5992.25 rows=107156 width=0) (actual time=174.463..174.463 
rows=280990 loops=1)
   ->  Index Scan using email_pkey on email  
(cost=0.00..3.85 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
 Index Cond: (email.email_id = eh_from.email_id)
 Filter: (("time" >= '2007-05-05 
17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 
18:01:59'::timestamp without time zone))
 ->  Index Scan using mime_part_pkey on mime_part  
(cost=0.00..4.06 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000)
   Index Cond: ((email.email_id = mime_part.email_id) 
AND (mime_part.mime_part_id = 0))
   ->  Index Scan using idx__email_header__email_id__mime_part_id 
on email_header eh_subj  (cost=0.00..130.89 rows=13 width=104) (actual 
time=0.009..0.015 rows=1 loops=35000)
 Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = 
eh_subj.mime_part_id))
 Filter: (header_name = 'subject'::text)
 Total runtime: 5625.024 ms

Note how spectacularly overpriced this plan is.  The costs for the nested
loops are calculated approximately as number of outer tuples times cost of
the inner scan.  So slight overestimations of the inner scans such as 

Index Scan using email_pkey on email  (cost=0.00..3.85 rows=1 width=8) (actual 
time=0.005..0.005 rows=0 loops=280990)

kill this calculation.

Most likely, all of these database is cached, so I tried reducing
seq_page_cost and random_page_cost, but I needed to turn them all the way
down to 0.02 or 0.03, which is almost like cpu_tuple_cost.  Is that
reasonable?  Or what is wrong here?


PostgreSQL 8.2.1 on x86_64-unknown-linux-gnu
work_mem = 256MB
effective_cache_size = 384MB

The machine has 1GB of RAM.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
On Tuesday, 8 May 2007 at 17:53, Tom Lane wrote:
> Hmm, I'd have expected it to discount the repeated indexscans a lot more
> than it seems to be doing for you.  As an example in the regression
> database, note what happens to the inner indexscan cost estimate when
> the number of outer tuples grows:

I can reproduce your results in the regression test database. 8.2.1 and 8.2.4 
behave the same.

I checked the code around cost_index(), and the assumptions appear to be 
correct (at least this query doesn't produce wildly unusual data).  
Apparently, however, the caching effects are much more significant than the 
model takes into account.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[PERFORM] Apparently useless bitmap scans

2007-05-09 Thread Peter Eisentraut
AND mime_part_id = 0

from the query, but why does it need three of them to do it, when all
of them have the same predicate and none of them has an indexed
expression that appears in the query?

There are more partial indexes with the same predicate, but it appears
to always use three.  (The two "dummy" indexes are just leftovers from
these experiments.)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [PERFORM] Apparently useless bitmap scans

2007-05-09 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 16:29, Alvaro Herrera wrote:
> Peter Eisentraut wrote:
> > There's another odd thing about this plan from yesterday.
>
> Is this still 8.2.1?  The logic to choose bitmap indexes was rewritten
> just before 8.2.4,

OK, upgrading to 8.2.4 fixes this odd plan choice.  The query does run
a bit faster too, but the cost estimate has actually gone up!

8.2.1:


  QUERY PLAN
   
---
 GroupAggregate  (cost=87142.18..87366.58 rows=11220 width=184) (actual 
time=7883.541..8120.647 rows=35000 loops=1)
   ->  Sort  (cost=87142.18..87170.23 rows=11220 width=184) (actual 
time=7883.471..7926.031 rows=35000 loops=1)
 Sort Key: eh_subj.header_body
 ->  Hash Join  (cost=46283.30..86387.42 rows=11220 width=184) (actual 
time=5140.182..7635.615 rows=35000 loops=1)
   Hash Cond: (eh_subj.email_id = email.email_id)
   ->  Bitmap Heap Scan on email_header eh_subj  
(cost=11853.68..50142.87 rows=272434 width=104) (actual time=367.956..1719.736 
rows=280989 loops=1)
 Recheck Cond: ((mime_part_id = 0) AND (header_name = 
'subject'::text))
 ->  BitmapAnd  (cost=11853.68..11853.68 rows=27607 
width=0) (actual time=326.507..326.507 rows=0 loops=1)
   ->  Bitmap Index Scan on 
idx__email_header__header_body_subject  (cost=0.00..5836.24 rows=272434 
width=0) (actual time=178.041..178.041 rows=280989 loops=1)
   ->  Bitmap Index Scan on 
idx__email_header__header_name  (cost=0.00..5880.97 rows=281247 width=0) 
(actual time=114.574..114.574 rows=280989 loops=1)
 Index Cond: (header_name = 'subject'::text)
   ->  Hash  (cost=34291.87..34291.87 rows=11020 width=120) (actual 
time=4772.148..4772.148 rows=35000 loops=1)
 ->  Hash Join  (cost=24164.59..34291.87 rows=11020 
width=120) (actual time=3131.067..4706.997 rows=35000 loops=1)
   Hash Cond: (mime_part.email_id = email.email_id)
   ->  Seq Scan on mime_part  (cost=0.00..8355.81 
rows=265804 width=12) (actual time=0.038..514.291 rows=267890 loops=1)
 Filter: (mime_part_id = 0)
   ->  Hash  (cost=24025.94..24025.94 rows=11092 
width=112) (actual time=3130.982..3130.982 rows=35000 loops=1)
 ->  Hash Join  (cost=22244.54..24025.94 
rows=11092 width=112) (actual time=996.556..3069.280 rows=35000 loops=1)
   Hash Cond: (eh_from.email_id = 
email.email_id)
   ->  Bitmap Heap Scan on email_header 
eh_from  (cost=15576.58..16041.55 rows=107156 width=104) (actual 
time=569.762..1932.017 rows=280990 loops=1)
 Recheck Cond: ((mime_part_id = 0) 
AND (header_name = 'from'::text))
 ->  BitmapAnd  
(cost=15576.58..15576.58 rows=160 width=0) (actual time=532.217..532.217 rows=0 
loops=1)
   ->  Bitmap Index Scan on 
dummy_index  (cost=0.00..3724.22 rows=107156 width=0) (actual 
time=116.386..116.386 rows=280990 loops=1)
   ->  Bitmap Index Scan on 
idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual 
time=174.883..174.883 rows=280990 loops=1)
   ->  Bitmap Index Scan on 
dummy2_index  (cost=0.00..5992.25 rows=107156 width=0) (actual 
time=173.575..173.575 rows=280990 loops=1)
   ->  Hash  (cost=6321.79..6321.79 
rows=27694 width=8) (actual time=426.739..426.739 rows=35000 loops=1)
 ->  Index Scan using 
idx__email__time on email  (cost=0.00..6321.79 rows=27694 width=8) (actual 
time=50.000..375.021 rows=35000 loops=1)
   Index Cond: (("time" >= 
'2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 
18:01:59'::timestamp without time zone))
 Total runtime: 8160.442 ms


8.2.4:


QUERY PLAN  
  
-

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 16:11, Tom Lane wrote:
> Well, there's something funny going on here.  You've got for instance
>
>->  Index Scan using email_pkey on email  (cost=0.00..3.85
> rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond:
> (email.email_id = eh_from.email_id)
>  Filter: (("time" >= '2007-05-05 17:01:59'::timestamp
> without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without
> time zone))
>
> on the inside of a nestloop whose outer side is predicted to return
> 107156 rows.  That should've been discounted to *way* less than 3.85
> cost units per iteration.

This is the new plan with 8.2.4.  It's still got the same problem, though.


QUERY PLAN  
   
---
 GroupAggregate  (cost=5627064.21..5627718.73 rows=32726 width=184) (actual 
time=4904.834..5124.585 rows=35000 loops=1)
   ->  Sort  (cost=5627064.21..5627146.03 rows=32726 width=184) (actual 
time=4904.771..4947.892 rows=35000 loops=1)
 Sort Key: eh_subj.header_body
 ->  Nested Loop  (cost=0.00..5624610.06 rows=32726 width=184) (actual 
time=0.397..4628.141 rows=35000 loops=1)
   ->  Nested Loop  (cost=0.00..1193387.12 rows=28461 width=120) 
(actual time=0.322..3960.360 rows=35000 loops=1)
 ->  Nested Loop  (cost=0.00..1081957.26 rows=28648 
width=112) (actual time=0.238..3572.023 rows=35000 loops=1)
   ->  Index Scan using dummy_index on email_header 
eh_from  (cost=0.00..13389.15 rows=280662 width=104) (actual 
time=0.133..1310.248 rows=280990 loops=1)
   ->  Index Scan using email_pkey on email  
(cost=0.00..3.79 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
 Index Cond: (email.email_id = eh_from.email_id)
 Filter: (("time" >= '2007-05-05 
17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 
18:01:59'::timestamp without time zone))
 ->  Index Scan using mime_part_pkey on mime_part  
(cost=0.00..3.88 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000)
   Index Cond: ((email.email_id = mime_part.email_id) 
AND (mime_part.mime_part_id = 0))
   ->  Index Scan using idx__email_header__email_id__mime_part_id 
on email_header eh_subj  (cost=0.00..155.47 rows=18 width=104) (actual 
time=0.009..0.014 rows=1 loops=35000)
 Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = 
eh_subj.mime_part_id))
 Filter: (header_name = 'subject'::text)
 Total runtime: 5161.390 ms

> Are you using any nondefault planner settings?

random_page_cost = 3
effective_cache_size = 384MB

> How big are these tables, anyway?

email          35 MB
email_header  421 MB
mime_part      37 MB

Everything is analyzed, vacuumed, and reindexed.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [PERFORM] Nested loops overpriced

2007-05-10 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote:
> I remember having dithered about whether
> to try to avoid counting the same physical relation more than once in
> total_table_pages, but this example certainly suggests that we
> shouldn't.  Meanwhile, do the estimates get better if you set
> effective_cache_size to 1GB or so?

Yes, that makes the plan significantly cheaper (something like 500,000 instead 
of 5,000,000), but still a lot more expensive than the hash join (about 
100,000).

> To return to your original comment: if you're trying to model a
> situation with a fully cached database, I think it's sensible
> to set random_page_cost = seq_page_cost = 0.1 or so.  You had
> mentioned having to decrease them to 0.02, which seems unreasonably
> small to me too, but maybe with the larger effective_cache_size
> you won't have to go that far.

Heh, when I decrease these parameters, the hash join gets cheaper as well.  I 
can't actually get it to pick the nested-loop join.
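
A minimal sketch of the experiment being discussed, using the values from
the quoted advice (session-level settings, so nothing is changed permanently):

    SET effective_cache_size = '1GB';
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;
    -- ...then re-run EXPLAIN ANALYZE on the query and compare the estimated
    -- costs and the chosen plan against the earlier runs.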

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [PERFORM] Nested loops overpriced

2007-05-10 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote:
> Hmmm ... I see at least part of the problem, which is that email_header
> is joined twice in this query, which means that it's counted twice in
> figuring the total volume of pages competing for cache space.  So the
> thing thinks cache space is oversubscribed nearly 3X when in reality
> the database is fully cached.

I should add that other, similar queries in this database that do not involve 
joining the same table twice produce seemingly optimal plans.  (It picks hash 
joins which are actually faster than nested loops.)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Peter Schuller
> - Deferred Transactions: since adding a comment to a blog post
> doesn't need the same guarantees as submitting a paid order, it makes
> sense that the application could tell postgres which transactions we
> care about if power is lost. This will massively boost performance for
> websites, I believe.

This would be massively useful. Very often all I care about is that the
transaction is semantically committed; that is, that other transactions
starting from that moment will see the modifications done. As opposed to
actually persisting data to disk.

In particular I have a situation where I attempt to utilize available
hardware by using concurrency. The problem is that I have to either
hugely complicate my client code or COMMIT more often than I would like
in order to satisfy dependencies between different transactions. If a
deferred/delayed commit were possible I could get all the performance
benefit without the code complexity, and with no penalty (because in
this case persistence is not important).
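
For what it's worth, this is essentially what later PostgreSQL releases (8.3
and up) expose as asynchronous commit; a minimal sketch of how a
"don't-care" transaction can opt out of waiting for the WAL flush on such a
release (the table here is hypothetical):

    BEGIN;
    -- This transaction's commit may be lost if the server crashes, but as
    -- soon as COMMIT returns, other sessions see the changes.
    SET LOCAL synchronous_commit TO off;
    INSERT INTO blog_comment (post_id, body) VALUES (42, 'nice post!');
    COMMIT;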

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org






Re: [PERFORM] Key/Value reference table generation: INSERT/UPDATE performance

2007-05-22 Thread Peter Childs

On 22 May 2007 01:23:03 -0700, valgog <[EMAIL PROTECTED]> wrote:


I found several post about INSERT/UPDATE performance in this group,
but actually it was not really what I am searching an answer for...

I have a simple reference table WORD_COUNTS that contains the count of
words that appear in a word array storage in another table.

CREATE TABLE WORD_COUNTS
(
  word text NOT NULL,
  count integer,
  CONSTRAINT PK_WORD_COUNTS PRIMARY KEY (word)
)
WITHOUT OIDS;




Is there any reason why count is not declared NOT NULL? (That would simplify
your code by removing the COALESCE.)

insert is more efficient than update because update is always a delete
followed by an insert.

Oh, and GROUP BY is nearly always quicker than DISTINCT and can (always?) be
rewritten as such. I'm not 100% sure why it's different, but it is.
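
For comparison with the per-row loop quoted below, here is a minimal
set-based sketch of the whole job against the same word_storage/word_counts
schema; it uses the same array-subscript/generate_series idiom as the
original code, just without looping over rows one at a time:

    BEGIN;

    -- Unnest every array once and count how many stored rows contain each
    -- word (the DISTINCT mirrors the per-row DISTINCT in the original loop).
    CREATE TEMP TABLE new_counts AS
    SELECT word, count(*) AS cnt
      FROM ( SELECT DISTINCT id,
                    (array_of_words)[generate_series(1, array_upper(array_of_words, 1))] AS word
               FROM word_storage ) AS d
     GROUP BY word;

    -- One set-based update for the words that already exist...
    UPDATE word_counts wc
       SET count = COALESCE(wc.count, 0) + nc.cnt
      FROM new_counts nc
     WHERE wc.word = nc.word;

    -- ...and one set-based insert for the words that do not exist yet.
    INSERT INTO word_counts (word, count)
    SELECT nc.word, nc.cnt
      FROM new_counts nc
     WHERE NOT EXISTS (SELECT 1 FROM word_counts wc WHERE wc.word = nc.word);

    COMMIT;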

Peter.



I have some PL/pgSQL code in a stored procedure like


  FOR r IN select id, array_of_words
             from word_storage
  LOOP
    begin
      -- insert the missing words
      insert into WORD_COUNTS
             ( word, count )
             ( select word, 0
                 from ( select distinct (r.array_of_words)[s.index] as d_word
                          from generate_series(1, array_upper( r.array_of_words, 1 ) ) as s(index)
                      ) as distinct_words
                where word not in ( select d_word from WORD_COUNTS ) );
      -- update the counts
      update WORD_COUNTS
         set count = COALESCE( count, 0 ) + 1
       where word in ( select distinct (r.array_of_words)[s.index] as word
                         from generate_series(1, array_upper( r.array_of_words, 1 ) ) as s(index) );
    exception when others then
      error_count := error_count + 1;
    end;
    record_count := record_count + 1;
  END LOOP;

This code runs extremely slowly. It takes about 10 minutes to process
1 records, and the word storage has more than 2 million records to
be processed.

Does anybody have any know-how about populating such reference
tables, and what can be optimized in this situation?

Maybe the generate_series() procedure used to unnest the array is the place
where I lose the performance?

Are set-based updates/inserts more efficient than single inserts/updates
run in smaller loops?

Thanks for your help,

Valentine Gogichashvili





Re: [PERFORM] max_fsm_pages, shared_buffers and checkpoint_segments

2007-05-23 Thread Peter Schuller
> increasing checkpoint_segments, which is also a disk thing. However, setting
> it to 25 and then increasing either of the other 2 variables, the postgresql
> daemon stops working, meaning it does not start upon reboot. When I bring

Sounds like you need to increase your shared memory limits.
Unfortunately this will require a reboot on FreeBSD :(

See:

   http://www.postgresql.org/docs/8.2/static/kernel-resources.html

Last time I checked, PostgreSQL should complain about the shared
memory on startup rather than fail silently. Check your logs,
perhaps. Though I believe the RC script will cause the message to be
printed interactively at the console too, if you run it (assuming you
have it installed from ports).

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org






Re: [PERFORM] setting up raid10 with more than 4 drives

2007-05-30 Thread Peter Childs

On 30/05/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


On Wed, 30 May 2007, Jonah H. Harris wrote:

> On 5/29/07, Luke Lonergan <[EMAIL PROTECTED]> wrote:
>>  AFAIK you can't RAID1 more than two drives, so the above doesn't make
>>  sense
>>  to me.
>
> Yeah, I've never seen a way to RAID-1 more than 2 drives either.  It
> would have to be his first one:
>
> D1 + D2 = MD0 (RAID 1)
> D3 + D4 = MD1 ...
> D5 + D6 = MD2 ...
> MD0 + MD1 + MD2 = MDF (RAID 0)
>

I don't know what the failure mode ends up being, but on linux I had no
problems creating what appears to be a massively redundant (but small)
array

md0 : active raid1 sdo1[10](S) sdn1[8] sdm1[7] sdl1[6] sdk1[5] sdj1[4]
sdi1[3] sdh1[2] sdg1[9] sdf1[1] sde1[11](S) sdd1[0]
   896 blocks [10/10] [UU]

David Lang



Good point. Also, if you had RAID 1 with 3 drives and some bit errors, at
least you can take a vote on what's right, whereas if you only have 2 and they
disagree, how do you know which is right other than to pick one and hope?
But whatever the case, it will be slower to keep in sync on a heavy-write system.

Peter.


Re: [PERFORM] optimize query with a maximum(date) extraction

2007-09-05 Thread Peter Childs
On 05/09/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> "Gregory Stark" <[EMAIL PROTECTED]> writes:
>
> > "JS Ubei" <[EMAIL PROTECTED]> writes:
> >
> >> I need to improve a query like :
> >>
> >> SELECT id, min(the_date), max(the_date) FROM my_table GROUP BY id;
> >...
> > I don't think you'll find anything much faster for this particular
> query. You
> > could profile running these two (non-standard) queries:
> >
> > SELECT DISTINCT ON (id) id, the_date AS min_date FROM my_table ORDER BY
> id, the_date ASC
> > SELECT DISTINCT ON (id) id, the_date AS max_date FROM my_table ORDER BY
> id, the_date DESC
>
> Something else you might try:
>
> select id,
>(select min(the_date) from my_table where id=x.id) as min_date,
>(select max(the_date) from my_table where id=x.id) as max_date
>   from (select distinct id from my_table)
>
> Recent versions of Postgres do know how to use the index for a simple
> ungrouped min() or max() like these subqueries.
>
> This would be even better if you have a better source for the list of
> distinct
> ids you're interested in than my_table. If you have a source that just has
> one
> record for each id then you won't need an extra step to eliminate
> duplicates.
>
>
My personal reaction is why are you using distinct at all?

why not

select id,
   min(the_date) as min_date,
   max(the_date) as max_date
  from my_table group by id;

Since 8.0 (or was it earlier?) this will use an index, should a reasonable one
exist.
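If no such index exists yet, a sketch of one that would help (the index name is
just an example):

-- a composite index on (id, the_date) gives the planner the option of an
-- ordered scan for the GROUP BY, and also supports the DISTINCT ON /
-- correlated min()/max() variants quoted above
CREATE INDEX my_table_id_the_date_idx ON my_table (id, the_date);
ANALYZE my_table;

EXPLAIN ANALYZE
SELECT id, min(the_date) AS min_date, max(the_date) AS max_date
  FROM my_table
 GROUP BY id;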

Peter.


Re: [PERFORM] Long Running Commits - Not Checkpoints

2007-09-14 Thread Peter Childs
On 13/09/2007, Greg Smith <[EMAIL PROTECTED]> wrote:
>
>
> Every time the all scan writes a buffer that is frequently used, that
> write has a good chance that it was wasted because the block will be
> modified again before checkpoint time.  Your settings are beyond regular
> aggressive and into the hyperactive territory where I'd expect such
> redundant writes are happening often.  I'd suggest you try to move toward
> dropping bgwriter_all_percent dramatically from its current setting and
> see how far down you can go before it starts to introduce blocks at
> checkpoint time.  With bgwriter_delay set to 1/4 the default, I would
> expect that even 5% would be a high setting for you.  That may be a more
> dramatic change than you want to make at once though, so lowering it in
> that direction more slowly (perhaps drop 5% each day) and seeing whether
> things improve as that happens may make more sense.
>
>
Are you suggesting that reducing bgwriter_delay and bg_writer_percent would
reduce the time spent doing commits?

I get quite a few commits that take over 500ms (the point at which I start
logging queries). I always thought it was just one of those things, but if they
can be reduced by changing a few config variables, that would be great. I'm
just trying to work out what figures are worth trying, to see if I can reduce
them.

From time to time I get commits that take 6 or 7 seconds but not all the
time.

I'm currently working with the defaults.
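For reference, a quick way to see what the server is actually running with (a
sketch; pg_settings is available in this era of PostgreSQL):

SELECT name, setting
  FROM pg_settings
 WHERE name LIKE 'bgwriter%' OR name LIKE 'checkpoint%';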

Peter Childs


Re: [PERFORM] Long Running Commits - Not Checkpoints

2007-09-14 Thread Peter Childs
On 14/09/2007, Peter Childs <[EMAIL PROTECTED]> wrote:
>
>
>
> On 13/09/2007, Greg Smith <[EMAIL PROTECTED]> wrote:
> >
> >
> > Every time the all scan writes a buffer that is frequently used, that
> > write has a good chance that it was wasted because the block will be
> > modified again before checkpoint time.  Your settings are beyond regular
> >
> > aggressive and into the hyperactive territory where I'd expect such
> > redundant writes are happening often.  I'd suggest you try to move
> > toward
> > dropping bgwriter_all_percent dramatically from its current setting and
> > see how far down you can go before it starts to introduce blocks at
> > checkpoint time.  With bgwriter_delay set to 1/4 the default, I would
> > expect that even 5% would be a high setting for you.  That may be a more
> > dramatic change than you want to make at once though, so lowering it in
> > that direction more slowly (perhaps drop 5% each day) and seeing whether
> > things improve as that happens may make more sense.
> >
> >
> Are you suggesting that reducing bgwriter_delay and bg_writer_percent
> would reduce the time spent doing commits?
>
> I get quite a few commits that take over 500ms (the point when i start
> logging queries). I always thought oh just one of those things but if they
> can be reduced by changing a few config variables that would be great. I'm
> just trying to workout what figures are worth trying to see if I can reduce
> them.
>
> From time to time I get commits that take 6 or 7 seconds but not all the
> time.
>
> I'm currently working with the defaults.
>
> Peter Childs
>

Hmm, always read the manual: increase them from the defaults...

Peter.


Re: [PERFORM] Tablespaces and NFS

2007-09-19 Thread Peter Koczan
On 9/19/07, Carlos Moreno <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Anyone has tried a setup combining tablespaces with NFS-mounted partitions?
>
> I'm considering the idea as a performance-booster --- our problem is
> that we are
> renting our dedicated server from a hoster that does not offer much
> flexibility
> in terms of custom hardware configuration;  so, the *ideal* alternative
> to load
> the machine with 4 or 6 hard drives and use tablespaces is off the table
> (no pun
> intended).
>
> We could, however, set up a few additional servers where we could configure
> NFS shares, mount them on the main PostgreSQL server, and configure
> tablespaces to "load balance" the access to disk.
>
> Would you estimate that this will indeed boost performance??  (our system
> does lots of writing to DB --- in all forms:  inserts, updates, and deletes)
>
> As a corollary question:  what about the WALs and tablespaces??  Are the
> WALs "distributed" when we setup a tablespace and create tables in it?
> (that is, are the WALs corresponding to the tables in a tablespace stored
> in the directory corresponding to the tablespace?  Or is it only the
> data, and
> the WAL keeps being the one and only?)
>
> Thanks,
>
> Carlos

About 5 months ago, I did an experiment serving tablespaces out of
AFS, another shared file system.

You can read my full post at
http://archives.postgresql.org/pgsql-admin/2007-04/msg00188.php

On the whole, you're not going to see a performance improvement
running tablespaces on NFS (unless the disk system on the NFS server
is a lot faster) since you have to go through the network as well as
NFS, both of which add overhead.

Usually, locking mechanisms on shared file systems don't play nice
with databases. You're better off using something else to load balance
or replicate data.

Peter

P.S. Why not just set up those servers you're planning on using as NFS
shares as your postgres server(s)?

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Tablespaces and NFS

2007-09-20 Thread Peter Koczan
> Anyway...  One detail I don't understand --- why do you claim that
> "You can't take advantage of the shared file system because you can't
> share tablespaces among clusters or servers" ???

I say that because you can't set up two servers to point to the same
tablespace (i.e. you can't have server A and server B both point to
the tablespace in /mnt/nfs/postgres/), which basically defeats one of
the main purposes of using a shared file system: seeing, using, and
editing files from anywhere.

This is ill-advised and probably won't work for 2 reasons.

- Postgres tablespaces require empty directories for
initialization. If you create a tablespace on server A, it puts files
in the previously empty directory. If you then try to create a
tablespace on server B pointing to the same location, it won't work
since the directory is no longer empty. You can get around this, in
theory, but you'd either have to directly mess with system tables or
fool Postgres into thinking that each server independently created
that tablespace (to which anyone will say, NO). A minimal illustration
follows after these two points.

- If you do manage to fool postgres into having two servers pointing
at the same tablespace, the servers really, REALLY won't play nice
with these shared resources, since they have no knowledge of each
other (I mean, two clusters on the same server don't play nice with
memory). Basically, if they compete for the same file, either I/O will
be EXTREMELY slow because of file-locking mechanisms in the file
system, or you open things up to race conditions and data corruption.
In other words: BAD
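Here is the illustration of the empty-directory point above (the path and
tablespace name are examples only):

-- the first server to run this succeeds and populates the directory;
-- a second server pointing at the same path is refused because the
-- directory is no longer empty
CREATE TABLESPACE nfs_space LOCATION '/mnt/nfs/postgres';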

I know this doesn't fully apply to you, but I thought I should explain
my points better since you asked so nicely :-)

> This seems to be the killer point --- mainly because the network
> connection is a 100Mbps  (around 10 MB/sec --- less than 1/4 of
> the performance we'd expect from an internal hard drive).  If at
> least it was a Gigabit connection, I might still be tempted to
> retry the experiment.  I was thinking that *maybe* the latencies
> and contention due to heads movements (in the order of the millisec)
> would take precedence and thus, a network-distributed cluster of
> hard drives would end up winning.

If you get decently fast disks, or put some slower disks in RAID 10,
you'll easily get >100 MB/sec (and that's a conservative estimate).
Even with a Gbit network, you'll get, in theory 128 MB/sec, and that's
assuming that the NFS'd disks aren't a bottleneck.

> We're clear that that would be the *optimal* solution --- problem
> is, there's a lot of client-side software that we would have to
> change;  I'm first looking for a "transparent" solution in which
> I could distribute the load at a hardware level, seeing the DB
> server as a single entity --- the ideal solution, of course,
> being the use of tablespaces with 4 or 6 *internal* hard disks
> (but that's not an option with our current web hoster).

I sadly don't know enough networking to tell you how to tell the client
software "no really, I'm over here." However, one of the things I'm
fond of is using a module to store connection strings, and dynamically
loading said module on the client side. For instance, with Perl I
use...

use DBI;
use DBD::Pg;
use My::DBs;

my $dbh = DBI->connect($My::DBs::mydb);

Assuming that the module and its entries are kept up to date, it will
"just work." That way, there's only 1 module to change instead of n
client apps. I can have a new server with a new name up without
changing any client code.

> Anyway, I'll keep working on alternative solutions --- I think
> I have enough evidence to close this NFS door.

That's probably for the best.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] sequence query performance issues

2007-09-27 Thread Peter Koczan
Hello,

I have a weird performance issue with a query I'm testing. Basically,
I'm trying to port a function that generates user uids, and since
postgres offers a sequence generator function, I figure I'd take
advantage of that. Basically, I generate our uid range, filter out
those which are in use, and randomly pick however many I need.
However, when I run it, it takes forever (>10 minutes and I get nothing,
so I cancelled the query) and CPU usage on the server is maxed out.

Here's my query (I'll post the explain output later so as not to
obscure my question):
=> select a.uid from generate_series(1000, 32767) as a(uid) where
a.uid not in (select uid from people) order by random() limit 1;

I thought that nulls were a problem, so I tried:
=> select a.uid from generate_series(1000, 32767) as a(uid) where
a.uid not in (select coalesce(uid,0) from people) order by random()
limit 1;
And that finished in less than a second.

I then tried:
=> select a.uid from generate_series(1000, 32767) as a(uid) where
a.uid not in (select coalesce(uid,0) from people where uid is not
null) order by random() limit 1;
And we're back to taking forever.

So I have 2 questions:

- Is there a better query for this purpose? Mine works when coalesced,
but it seems a little brute-force, and the random() sorting, while
kinda nice, is slow. (One alternative sketch follows below.)

- Is this in any way expected? I know that nulls sometimes cause
problems, but why is it taking forever even when trying to filter
those out?
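One alternative formulation worth trying (a sketch; it uses the same people
table and uid range as above): an anti-join avoids the NOT IN subplan entirely
and is unaffected by NULL uids.

SELECT a.uid
  FROM generate_series(1000, 32767) AS a(uid)
  LEFT JOIN people p ON p.uid = a.uid
 WHERE p.uid IS NULL
 ORDER BY random()
 LIMIT 1;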

Thanks.

Peter

The gory details:
- There is a btree index on people(uid), and there are ~6300 rows, of
which ~1300 have null uids.

- EXPLAIN output (I couldn't get EXPLAIN ANALYZE output from the first
two queries since they took too long):
=> explain select a.uid from generate_series(1000, 32767) as a(uid)
where a.uid not in (select uid from people) order by random() limit 1;
QUERY PLAN
--
 Limit  (cost=40025.57..40025.60 rows=10 width=4)
   ->  Sort  (cost=40025.57..40026.82 rows=500 width=4)
 Sort Key: random()
 ->  Function Scan on generate_series a
(cost=693.16..40003.16 rows=500 width=4)
   Filter: (NOT (subplan))
   SubPlan
 ->  Materialize  (cost=693.16..756.03 rows=6287 width=2)
   ->  Seq Scan on people  (cost=0.00..686.87
rows=6287 width=2)
(8 rows)

=> explain select a.uid from generate_series(1000, 32767) as a(uid)
where a.uid not in (select uid from people where uid is not null)
order by random() limit 1;
QUERY PLAN
--
 Limit  (cost=31486.71..31486.73 rows=10 width=4)
   ->  Sort  (cost=31486.71..31487.96 rows=500 width=4)
 Sort Key: random()
 ->  Function Scan on generate_series a
(cost=691.79..31464.29 rows=500 width=4)
   Filter: (NOT (subplan))
   SubPlan
 ->  Materialize  (cost=691.79..741.00 rows=4921 width=2)
   ->  Seq Scan on people  (cost=0.00..686.87
rows=4921 width=2)
 Filter: (uid IS NOT NULL)
(9 rows)

=> explain select a.uid from generate_series(1000, 32767) as a(uid)
where a.uid not in (select coalesce(uid, 0) from people) order by
random() limit 1;
   QUERY PLAN

 Limit  (cost=756.97..756.99 rows=10 width=4)
   ->  Sort  (cost=756.97..758.22 rows=500 width=4)
 Sort Key: random()
 ->  Function Scan on generate_series a  (cost=718.30..734.55
rows=500 width=4)
   Filter: (NOT (hashed subplan))
   SubPlan
 ->  Seq Scan on people  (cost=0.00..702.59 rows=6287 width=2)
(7 rows)

=> explain analyze select a.uid from generate_series(1000, 32767) as
a(uid) where a.uid not in (select coalesce(uid, 0) from people) order
by random() limit 1;
   QUERY PLAN
-
 Limit  (cost=756.97..756.99 rows=10 width=4) (actual
time=370.444..370.554 rows=10 loops=1)
   ->  Sort  (cost=756.97..758.22 rows=500 width=4) (actual
time=370.434..370.472 rows=10 loops=1)
 Sort Key: random()
 ->  Function Scan on generate_series a  (cost=718.30..734.55
rows=500 width=4) (actual time=70.018..199.540 rows=26808 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
 ->  Seq Scan on people  (cost=0.00..702.59 rows=6287
width=2) (actual time=0.023..29.167 rows=6294 loops=1)
 Total runtime: 372.224 ms
(8 rows)

---

Re: [PERFORM] sequence query performance issues

2007-09-28 Thread Peter Koczan
> > Hmm - why is it doing that?
>
> I'm betting that the OP's people.uid column is not an integer.  Existing
> PG releases can't use hashed subplans for cross-data-type comparisons
> (8.3 will be a bit smarter).

*light bulb* Ahhh, that's it. So, I guess the solution is either
to cast the column or wait for 8.3 (which isn't a problem since the
port won't be done until 8.3 is released anyway).

Thanks again.

Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Non-blocking vacuum full

2007-09-28 Thread Peter Schuller
Hello,

I was wondering whether any thought has previously been given to
having a non-blocking "vacuum full", in the sense of space reclamation
and table compaction.

The motivation is that it is useful to be able to assume that
operations that span a table will *roughly* scale linearly with the
size of the table. But when you have a table that over an extended
period of time begins small, grows large, and grows small again (where
"large" might be, say, 200 GB), that assumption is most definitely
not correct when you're on the downward slope of that graph. Having
this assumption remain true simplifies things a lot for certain
workloads (= my particular work load ;)).

I have only looked very, very briefly at the PG code so I don't know
how far-fetched it is, but my thought was that it should be possible
to have a slow background process (similar to normal non-full vacuums
now) that would, instead of registering dead tuples in the FSM, move
live tuples around.

Combine that slow-moving operation with a new tuple space allocation
policy that prefers earlier locations on-disk, and it should in
time result in a situation where the physical on-disk file contains
only dead tuples after a certain percentage offset. At this point
the file can be truncated, giving space back to the OS as well as
eliminating all that dead space from having to be covered by
sequential scans on the table.

This does of course increase the total cost of all updates and
deletes, but would be very useful in some scenarios. It also has the
interesting property that the scan for live tuples to move need not
touch the entire table to be effective; it could by design be applied
to the last X percent of the table, where X would be scaled
appropriately with the frequency of the checks relative to
update/insert frequency.

Other benefits:

  * Never vacuum full - EVER. Not even after discovering too small
max_fsm_pages or too infrequent vacuums and needing to retroactively
shrink the table.
  * Increased locality in general, even if one does not care about
the disk space or sequential scanning. Particularly relevant for
low-update-frequency tables suffering from sudden shrinkage, where a
blocking VACUUM FULL is not acceptable.
  * Non-blocking CLUSTER is perhaps suddenly more trivial to implement?
Or at least SORTOFCLUSTER when you want it for reasons other than
perfect order ("mostly sorted").

Opinions/thoughts?

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] sequence query performance issues

2007-10-01 Thread Peter Koczan
> *light bulb* Ahhh, that's it. So, I guess the solution is either
> to cast the column or wait for 8.3 (which isn't a problem since the
> port won't be done until 8.3 is released anyway).

Just a quick bit of follow-up:

This query works and is equivalent to what I was trying to do (minus
the randomization and limiting):
=> select a.uid from generate_series(1000, 32000) as a(uid) where
a.uid::smallint not in (select uid from people where uid is not null);

It turns out that this and using coalesce are a wash in terms of
performance, usually coming within 10 ms of each other no matter what
limit and ordering constraints you put on the queries.

Peter

=> explain analyze select a.uid from generate_series(1000, 32767) as
a(uid) where a.uid not in (select coalesce(uid, 0) from people);
 QUERY PLAN
-
 Function Scan on generate_series a  (cost=718.41..733.41 rows=500
width=4) (actual time=68.742..186.340 rows=26808 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
 ->  Seq Scan on people  (cost=0.00..702.68 rows=6294 width=2)
(actual time=0.025..28.368 rows=6294 loops=1)
 Total runtime: 286.311 ms
(5 rows)

=> explain analyze select a.uid from generate_series(1000, 32767) as
a(uid) where a.uid::smallint not in (select uid from people where uid
is not null);
 QUERY PLAN
-
 Function Scan on generate_series a  (cost=699.34..716.84 rows=500
width=4) (actual time=58.508..177.683 rows=26808 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
 ->  Seq Scan on people  (cost=0.00..686.94 rows=4958 width=2)
(actual time=0.017..23.123 rows=4971 loops=1)
   Filter: (uid IS NOT NULL)
 Total runtime: 277.699 ms
(6 rows)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Memory Settings....

2007-10-22 Thread Peter Koczan
I recently tweaked some configs for performance, so I'll let you in on
what I changed.

For memory usage, you'll want to look at shared_buffers, work_mem, and
maintenance_work_mem. Postgres defaults to very low values of these,
and to get good performance and not a lot of disk paging, you'll want
to raise those values (you will need to restart the server and
possibly tweak some memory config for lots of shared_buffers; I had to
raise SHMMAX on Linux, but I don't know the Windows analogue). The
basic rule of thumb for shared_buffers is 25%-50% of main memory,
enough to use main memory but leaving some to allow work_mem to do its
thing and allow any other programs to run smoothly. Tweak this as
necessary.

The other big thing is the free space map, which tracks free space and
helps to prevent index bloat. A VACUUM VERBOSE in a database will tell
you what these values should be set to.
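As a quick check of the current values (a sketch; pg_settings and these
parameter names exist in 8.2):

SELECT name, setting
  FROM pg_settings
 WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
                'max_fsm_pages', 'max_fsm_relations');

-- and to see what the free space map actually needs, read the last few
-- lines of:
VACUUM VERBOSE;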

Go here for full details:
http://www.postgresql.org/docs/8.2/static/runtime-config.html, especially
http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html

Peter

On 10/22/07, Lee Keel <[EMAIL PROTECTED]> wrote:
>
>
>
> I have a client server that is dedicated to being a Postgres 8.2.4 database
> server for many websites.  This server will contain approximately 15
> databases each containing between 40-100 tables.  Each database will have
> approximately 7 web applications pulling data from it, but there will
> probably be no more than 50 simultaneous requests.  The majority of the
> tables will be very small tables around 1K in total size.  However, most of
> the queries will be going to the other 10-15 tables that are in each
> database that will contain postgis shapes.  These tables will range in size
> from 50 to 730K rows and each row will range in size from a 2K to 3MB.  The
> data will be truncated and reinserted as part of a nightly process but other
> than that, there won't be many writes during the day.  I am trying to tune
> this server to its maximum capacity.  I would appreciate any advice on any
> of the settings that I should look at.  I have not changed any of the
> settings before because I have never really needed to.  And even now, I have
> not experienced any bad performance, I am simply trying to turn the track
> before the train gets here.
>
> Server Specification:
>
> Windows 2003 Enterprise R2
>
> Dual-Quad Core 2.33GHz
>
> 8GB RAM
>
> 263 GB HD (I am not 100% on drive speed, but I think it is 15K)
>
>
> Thanks in advance,
>
> Lee Keel
>

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] pg_dump and pg_restore

2010-05-22 Thread Peter Koczan
On Mon, May 17, 2010 at 12:04 AM, Jayadevan M
 wrote:
> Hello all,
> I was testing how much time a pg_dump backup would take to get restored.
> Initially, I tried it with psql (on a backup taken with pg_dumpall). It took
> me about one hour. I felt that I should target for a recovery time of 15
> minutes to half an hour. So I went through the blogs/documentation etc and
> switched to pg_dump and pg_restore. I tested only the database with the
> maximum volume of data (about 1.5 GB). With
> pg_restore -U postgres -v -d PROFICIENT --clean -Fc proficient.dmp
> it took about 45 minutes. I tried it with
> pg_restore -U postgres -j8 -v -d PROFICIENT --clean -Fc proficient.dmp
> Not much improvement there either. Have I missed something or 1.5 GB data on
> a machine with the following configuration will take about 45 minutes? There
> is nothing else running on the machine consuming memory or CPU. Out of 300
> odd tables, about 10 tables have millions of records, rest are all having a
> few thousand records at most.
>
> Here are the specs  ( a pc class  machine)-
>
> PostgreSQL 8.4.3 on i686-pc-linux-gnu
> CentOS release 5.2
> Intel(R) Pentium(R) D CPU 2.80GHz
> 2 GB RAM
> Storage is local disk.
>
> Postgresql parameters (what I felt are relevant) -
> max_connections = 100
> shared_buffers = 64MB
> work_mem = 16MB
> maintenance_work_mem = 16MB
> synchronous_commit on

Do the big tables have lots of indexes? If so, you should raise
maintenance_work_mem.
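A sketch of how that could be done for the restore (the value is an example
only; ALTER ROLE ... SET affects new sessions, which is what pg_restore opens):

-- give the restoring role more memory for the CREATE INDEX steps
ALTER ROLE postgres SET maintenance_work_mem = '256MB';
-- after the restore is done:
ALTER ROLE postgres RESET maintenance_work_mem;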

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Add slowdown after conversion to UTF8

2010-06-17 Thread Peter Eisentraut
On tor, 2010-06-17 at 18:28 -0400, Brant Fitzsimmons wrote:
> Performance has dropped through the floor after converting my db from
> ASCII to UTF8.

Converting from ASCII to UTF8 is a noop.

If you did some configuration changes, you need to tell us which.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Questions on query planner, join types, and work_mem

2010-07-27 Thread Peter Hussey
I have spent the last couple of weeks digging into a Postgres performance
problem that ultimately boiled down to this:  the planner was choosing to
use hash joins on a set of join keys that were much larger than the
configured work_mem.  We found we could make the  performance much better by
either
1) increasing work_mem to 500MB or more, or
2) forcing the planner to choose index-backed nested loops by turning off
hash and merge joins as well as bitmap and sequential scans.

Now we are trying to decide which of these paths to choose, and asking why
the planner doesn't handle this for us.

Background:  LabKey builds an open source platform for biomedical research
data.  The platform consists of a tomcat web application and a relational
database.  we support two databases, Postgres and SQL Server.  We started
with SQL Server because we were very familiar with it.  Two of our technical
team came from the SQL Server development team.  We chose Postgres because
we assessed that it was the open source database most likely to be able to
handle our application  requirements for capacity and complex, nested,
generated SQL handling.  Postgres is now the default database for our
platform and most of our key customers use it.  In general we've been very
satisfied with Postgres' performance and compatibility, but our customers
are starting to hit situations where we really need to be able to understand
why a particular operation is slow.  We are currently recommending version
8.4 and using that ourselves.

The core of the problem query was

SELECT * INTO snapshot_table FROM
  (SELECT ... FROM  tableA A LEFT  OUTER JOIN tableB B ON (A.lsid = B.lsid)
and A.datasetid = ? )  query1

The join column, lsid, is a poor choice for a join column as it is a long
varchar value (avg length 101 characters) that only becomes unique far out
to the right of the string.  But we are stuck with this choice.  I can post the
SQL query and table definitions if it will help, but changes to either of
those would be risky and difficult, whereas setting the work_mem value or
forcing nested loop joins is less risky.

The performance curve looks something like this:

Join Type     work_mem (MB)   time to populate snapshot (min)
______________________________________________________________
Hash                50                85
Hash               200                38
Hash               400                21
Hash               500                12
Hash              1000                12
______________________________________________________________
NestedLoop          50                15
NestedLoop         200                11
NestedLoop         400                11
NestedLoop         500                10
NestedLoop        1000                10


Table A contains about 3.5 million rows, and table B contains about 4.4
million rows.  By looking at the EXPLAIN ANALYZE reports I concluded that
the planner seemed to be accurately determining the approximate number of
rows returned on each side of the join node.  I also noticed that at the
work_mem = 50 test, the hash join query execution was using over a GB of
space in the pgsql_tmp, space that grew and shrank slowly over the course of
the test.

Now for the questions:
1)  If we tell the customer to set his work_mem value to 500MB or 1GB in
postgresql.conf, what problems might they see?  The documentation and the
guidelines we received from Rupinder Singh in support suggest a much lower
value, e.g. a max work_mem of 10MB.  Other documentation, such as the "Guide
to Posting Slow Query Questions", suggests at least testing up to 1GB.  What
is a reasonable maximum to configure for all connections?

2) How is work_mem used by a query execution?  For example, does each hash
table in an execution get allocated a full work_mem's worth of memory?  Is
this memory released when the query is finished, or does it stay attached to
the connection or some other object?

3) Is there a reason why the planner doesn't seem to recognize the condition
when the hash table won't fit in the current work_mem, and choose a
low-memory plan instead?

Excuse the long-winded post; I was trying to give the facts and nothing but
the facts.

Thanks,
Peter Hussey
LabKey Software


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-02 Thread Peter Hussey
I already had effective_cache_size set to 500MB.

I experimented with lowering random_page_cost to 3, then 2.  It made no
difference in the choice of plan that I could see.  In the explain analyze
output the estimated costs of the nested loop were in fact lowered, but so were
the costs of the hash join plan, and the hash join remained the lowest
predicted cost in all tests I tried.

What seems wrong to me is that the hash join strategy shows almost no
difference in estimated costs as work_mem goes from 1MB to 500MB. The cost
function decreases by 1%, but the actual time for the query to execute
decreases by 86% as work_mem goes from 1MB to 500MB.

My questions are still
1)  Does the planner have any component of cost calculations based on the
size of work_mem, and if so why do those calculations  seem to have so
little effect here?

2) Why is the setting of work_mem something left to the admin and/or
developer?  Couldn't the optimizer say how much it thinks it needs to build
a hash table based on size of the keys and estimated number of rows?

It is difficult for a software development platform like ours to take
advantage of suggestions to set work_mem, or to change the cost function, or
to turn join strategies on/off for individual queries.  The SQL we issue is
formed by user interaction with the product and is rarely static.  How would we
know when to turn something on or off?  That's why I'm looking for a
configuration solution that I can set on a database-wide basis and have it
work well for all queries.
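One option short of a server-wide change (a sketch; the database and role names
here are examples only): work_mem can be attached to a database or a role, or
raised locally around the snapshot query.

ALTER DATABASE labkey SET work_mem = '500MB';
ALTER ROLE labkey_app SET work_mem = '500MB';

BEGIN;
SET LOCAL work_mem = '500MB';
-- run the snapshot SELECT ... INTO here
COMMIT;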

thanks
Peter


On Fri, Jul 30, 2010 at 7:03 AM, Tom Lane  wrote:

> Peter Hussey  writes:
> > Using the default of 1MB work_mem, the planner chooses a hash join plan :
> > "Hash Left Join  (cost=252641.82..11847353.87 rows=971572 width=111)
> (actual
> > time=124196.670..280461.604 rows=968080 loops=1)"
> > ...
> > For the same default 1MB work_mem, a nested loop plan is better
> > "Nested Loop Left Join  (cost=8.27..15275401.19 rows=971572 width=111)
> > (actual time=145.015..189957.023 rows=968080 loops=1)"
> > ...
>
> Hm.  A nestloop with nearly a million rows on the outside is pretty
> scary.  The fact that you aren't unhappy with that version of the plan,
> rather than the hash, indicates that the "object" table must be
> fully cached in memory, otherwise the repeated indexscans would be a
> lot slower than this:
>
> > "  ->  Index Scan using uq_object on object obj  (cost=0.00..3.51 rows=1
> > width=95) (actual time=0.168..0.170 rows=1 loops=968080)"
> > "Index Cond: ((sd.lsid)::text = (obj.objecturi)::text)"
>
> My take on it is that the estimate of the hash plan's cost isn't bad;
> what's bad is that the planner is mistakenly estimating the nestloop as
> being worse.  What you need to do is adjust the planner's cost
> parameters so that it has a better idea of the true cost of repeated
> index probes in your environment.  Crank up effective_cache_size if
> you didn't already, and experiment with lowering random_page_cost.
> See the list archives for more discussion of these parameters.
>
>regards, tom lane
>



-- 
Peter Hussey
LabKey Software
206-667-7193 (office)
206-291-5625 (cell)


Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread peter royal

On Jan 9, 2006, at 2:01 PM, Luke Lonergan wrote:

Peter,

On 1/9/06 9:23 AM, "peter royal" <[EMAIL PROTECTED]> wrote:


This is a 2-disk RAID0


Your 2-disk results look fine - what about your 8-disk results?


after some further research the 2-disk RAID0 numbers are not bad.

I have a single drive of the same type hooked up to the SATA2 port on  
the motherboard to boot from, and its performance numbers are (linux  
2.6.15, ext3):


[EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/tmp/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real4m55.032s
user0m0.256s
sys 0m47.299s
[EMAIL PROTECTED] ~]# time dd if=/tmp/bigfile bs=8k of=/dev/null
1000000+0 records in
1000000+0 records out

real3m27.229s
user0m0.156s
sys 0m13.377s

so, there is a clear advantage to RAID over a single drive.


now, some stats in a 8-disk configuration:

8-disk RAID0, ext3, 16k read-ahead

[EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/pgdata/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real0m53.030s
user0m0.204s
sys 0m42.015s

[EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/dev/null

1000000+0 records in
1000000+0 records out

real0m23.232s
user0m0.144s
sys 0m13.213s


8-disk RAID0, xfs, 16k read-ahead

[EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/pgdata/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real0m32.177s
user0m0.212s
sys 0m21.277s

[EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/dev/null

1000000+0 records in
1000000+0 records out

real0m21.814s
user0m0.172s
sys 0m13.881s


... WOW... highly impressed with the XFS write speed! Going to stick
with that!


Overall, I got a 50% boost in the overall speed of my test suite by  
using XFS and the 16k read-ahead.


Given that you want to run in production with RAID10, the most you  
should
expect is 2x the 2-disk results using all 8 of your disks.  If you  
want the

best rate for production while preserving data integrity, I recommend
running your Areca in RAID5, in which case you should expect 3.5x your
2-disk results (7 drives).  You can assume you'll get that if you  
use XFS +

readahead.  OTOH - I'd like to see your test results anyway :-)


I've been avoiding RAID5 after reading how performance drops when a  
drive is out/rebuilding. The performance benefit will outweigh the  
cost I think.


Thanks for the help!
-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi





Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-16 Thread Peter Childs
On 15/02/06, Jay Greenfield <[EMAIL PROTECTED]> wrote:
> I've been vacuuming between each test run. Not vacuuming results in times
> all the way up to 121 minutes. For a direct comparison with Access, the
> vacuuming time with Postgres should really be included as this is not
> required with Access.


Hmm, but then you would have to include Access's vacuum too. I think you
will find "Tools -> Database Utils -> Compact Database" performs
a similar purpose and is just as important, as I've seen many Access
databases bloat in my time.

Peter Childs 



Re: [PERFORM] Large Table With Only a Few Rows

2006-02-27 Thread Peter Childs
On 27/02/06, Chris Browne <[EMAIL PROTECTED]> wrote:
"Nik" <[EMAIL PROTECTED]> writes:> I have a table that has only a few records in it at the time, and they> get deleted every few seconds and new records are inserted. Table never
> has more than 5-10 records in it.>> However, I noticed a deteriorating performance in deletes and inserts> on it. So I performed vacuum analyze on it three times (twice in a row,> and once two days later). In the statistics it says that the table size
> is 863Mb, toast table size is 246Mb, and indexes size is 134Mb, even> though the table has only 5-10 rows in it it. I was wondering how can I> reclaim all this space and improve the performance?
You need to run VACUUM ANALYZE on this table very frequently.Based on what you describe, "very frequently" should be on the orderof at least once per minute.Schedule a cron job specifically to vacuum this table, with a cron
entry like the following:* * * * * /usr/local/bin/vacuumdb -z -t my_table -p 5432 my_databaseOf course, you need to bring it back down to size, first.You could run CLUSTER on the table to bring it back down to size;
that's probably the fastest way...   cluster my_table_pk on my_table;VACUUM FULL would also do the job, but probably not as quickly.--(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://cbbrowne.com/info/sgml.html"Now they can put you in jail if they *THINK* you're gonna commit acrime.  Let me say that again, because it sounds vaguely important"
--george carlin---(end of broadcast)---TIP 9: In versions below 8.0, the planner will ignore your desire to   choose an index scan if your joining column's datatypes do not
   match
You probably want to do one or two other things.

1> Switch on autovacuum.

2> Improve the setting of max_fsm_pages in your postgresql.conf; a restart will be required.

If you do a "vacuum verbose;", the last couple of lines should tell you
how much free space there is against how much free space the database
can actually remember to use.

INFO:  free space map contains 5464 pages in 303 relations
DETAIL:  A total of 9760 page slots are in use (including overhead).
9760 page slots are required to track all free space.
Current limits are:  40000 page slots, 1000 relations, using 299 KB.

If the required page slots (9760 in my case) go above the current
limit (40000 in my case) you will need to do a vacuum full to reclaim
the free space (a cluster of the relevant tables may work).

If you run vacuum verbose regularly you can check that you are vacuuming
often enough and that your free space map is big enough to hold your
free space.

Peter Childs
 


[PERFORM] Index scan startup time

2006-03-30 Thread Peter Eisentraut
[Apologies if this already went through.  I don't see it in the archives.]

Normally one expects that an index scan would have a startup time of nearly 
zero.  Can anyone explain this:

EXPLAIN ANALYZE select activity_id from activity where state in (1, 10001) 
order by activity_id limit 100;

QUERY PLAN

Limit  (cost=0.00..622.72 rows=100 width=8) (actual 
time=207356.054..207356.876 rows=100 loops=1)
  ->  Index Scan using activity_pk on activity  (cost=0.00..40717259.91 
rows=6538650 width=8) (actual time=207356.050..207356.722 rows=100 loops=1)
Filter: ((state = 1) OR (state = 10001))
Total runtime: 207357.000 ms

The table has seen VACUUM FULL and REINDEX before this.

The plan choice and the statistics look right, but why does it take 3 minutes 
before doing anything?  Or is the measurement of the actual start time 
inaccurate?  This is quite reproducible, so it's not just a case of a 
temporary I/O bottleneck, say.

(PostgreSQL 8.0.3)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Index scan startup time

2006-03-30 Thread Peter Eisentraut
On Thursday, 30 March 2006 14:02, Steinar H. Gunderson wrote:
> On Thu, Mar 30, 2006 at 01:59:10PM +0200, Peter Eisentraut wrote:
> > EXPLAIN ANALYZE select activity_id from activity where state in (1,
> > 10001) order by activity_id limit 100;
> >
> > QUERY PLAN
> >
> > Limit  (cost=0.00..622.72 rows=100 width=8) (actual
> > time=207356.054..207356.876 rows=100 loops=1)
> >   ->  Index Scan using activity_pk on activity  (cost=0.00..40717259.91
> > rows=6538650 width=8) (actual time=207356.050..207356.722 rows=100
> > loops=1) Filter: ((state = 1) OR (state = 10001))
> > Total runtime: 207357.000 ms
> >
> > The table has seen VACUUM FULL and REINDEX before this.
>
> The index scan is by activity_id, not by state. Do you have an index on
> state at all?

There is an index on state as well but the column is not selective enough.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Index scan startup time

2006-03-30 Thread Peter Eisentraut
On Thursday, 30 March 2006 14:06, Michael Stone wrote:
> On Thu, Mar 30, 2006 at 01:59:10PM +0200, Peter Eisentraut wrote:
> >The table has seen VACUUM FULL and REINDEX before this.
>
> But no analyze?

ANALYZE as well, but the plan choice is not the point anyway.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Index scan startup time

2006-03-30 Thread Peter Eisentraut
On Thursday, 30 March 2006 14:31, Steinar H. Gunderson wrote:
> Well, it's logical enough; it scans along activity_id until it finds one
> with state=1 or state=10001. You obviously have a _lot_ of records with
> low activity_id and state none of these two, so Postgres needs to scan all
> those records before it founds 100 it can output. This is the “startup
> cost” you're seeing.

The startup cost is the cost until the plan is set up to start outputting 
rows.  It is not the time until the first row is found.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Index scan startup time

2006-03-30 Thread Peter Eisentraut
Tom Lane wrote:
> The problem here appears to be a non-random correlation between state
> and activity, such that the desired state values are not randomly
> scattered in the activity sequence.  The planner doesn't know about
> that correlation and hence can't predict the poor startup time.

So from when to when is the startup time (the "x" in "x..y") actually 
measured?  When does the clock start ticking and when does it stop?  
That is what's confusing me.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Poor performance - fixed by restart

2006-06-21 Thread Peter Wilson

I've recently configured a new high-performance database server:
2xXeon 3.4G, 2G RAM, 4x15K SCSI disks in RAID 10, h/w RAID

This has been live for a couple of weeks.

The box is running Fedora Core 4.

The only thing running on this box is PostgreSQL 8.1.4 and some stub 
applications that handle the interface to Postgres (basically taking XML service 
requests, translating into SQL and using libpq). The database is a backend for a 
big web application. The web-server and processor intensive front-end run on a 
separate server.


Postgres has probably been running for 2 weeks now.

I've just uploaded a CSV file whose contents the web application turns
into multiple requests to the database. Each row in the CSV file causes a few
transactions to fire, basically adding rows into a couple of tables. The tables
at the moment aren't huge (20,000 rows in one, 150,000 in the other).


Performance was appalling - taking 85 seconds to upload the CSV file and create 
the records. A separate script to delete the rows took 45 seconds. While these 
activities were taking place the Postgres process was using 97% CPU on the 
server - nothing else much running.


For comparison, my test machine (750M Athlon, RedHat 8, 256M RAM, single IDE 
hard drive) created the records in 22 seconds and deleted them again in 17.


I had autovacuum ON - but to make sure, I first did a vacuum analyze (no 
difference) then a vacuum full (again no difference).


I'd tweaked a couple of parameters in postgres.conf - the significant one I 
thought being random_page_cost, so I changed this back to default and did a 
'service postgresql reload' - no difference, but I wasn't sure whether this 
could be changed via reload so I restarted Postgres.


The restart fixed the problem. The 85 second insert time dropped back down to 5 
seconds!!!


To check whether the random_page_cost was making the difference I restored the 
old postgres.conf, restarted postgres and redid the upload. Rather surprisingly, 
the upload time was still at 5 seconds.


Any thoughts? I find it hard to believe that Postgres performance could degrade 
over a couple of weeks. Read performance seemed to be fine. The postgres memory 
size didn't seem to be huge. What else am I overlooking? What could I have 
changed by simply restarting Postgres that could make such a drastic change in 
performance?


Pete

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] increment Rows in an SQL Result Set postgresql

2006-07-15 Thread Peter Eisentraut
Hassan Adekoya wrote:
> I will like to preserve ordering

Tables are inherently unordered.  If you want a particular order, you 
need to use the ORDER BY clause.  And you will need to have a column to 
sort by.  If you don't have one, the generate_series() function may 
help.

This has nothing to do with performance, I gather, so it might be more 
appropriate for the pgsql-sql list.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Forcing using index instead of sequential scan?

2006-07-21 Thread Peter Eisentraut
[EMAIL PROTECTED] wrote:
> What is the best way to force the use of indexes in these queries?

Well, the brute-force method is to use SET enable_seqscan TO off, but if 
you want to get to the bottom of this, you should look at or post the 
EXPLAIN ANALYZE output of the offending queries.
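The brute-force approach, scoped to the current session only (a sketch):

SET enable_seqscan TO off;
-- run EXPLAIN ANALYZE on the offending query here and compare the plan
RESET enable_seqscan;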

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-16 Thread Peter Hardman
I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user 
Paradox to a web based interface to either MySQL or PostgreSQL.
The database is a pedigree sheep breed society database recording sheep and 
flocks (amongst other things).

My current problem is with one table and an associated query which takes 10 
times longer to execute on PostgreSQL than BDE, which in turn takes 10 times 
longer than MySQL. The table links sheep to flocks and is created as follows:

CREATE TABLE SHEEP_FLOCK
(
  regn_no varchar(7) NOT NULL,
  flock_no varchar(6) NOT NULL,
  transfer_date date NOT NULL,
  last_changed date NOT NULL,
  CONSTRAINT SHEEP_FLOCK_pkey PRIMARY KEY (regn_no, flock_no, 
transfer_date)
) 
WITHOUT OIDS;
ALTER TABLE SHEEP_FLOCK OWNER TO postgres;

I then populate the table with 

COPY SHEEP_FLOCK
FROM 'e:/ssbg/devt/devt/export_data/sheep_flock.txt'
WITH CSV HEADER

The table then has about 82000 records

The query I run is:

/* Select all sheep whose most recent transfer was into the subject flock */
SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
FROM SHEEP_FLOCK f1 JOIN 
/* The last transfer date for each sheep */
(SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
FROM  SHEEP_FLOCK f
GROUP BY f.regn_no) f2 
ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
AND f1.transfer_date = f2.last_xfer_date

The sub-select on its own returns about 32000 rows.

Using identically structured tables and the same primary key, if I run this on 
Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about 3ms, 
and on PostgreSQL (8.1.3, local server) about 1290ms. All on the same 
Windows XP Pro machine with 512MB RAM, of which nearly half is free.

The query plan shows most of the time is spent sorting the 30000+ rows from the 
subquery, so I added a further subquery as follows:

/* Select all sheep whose most recent transfer was into the subject flock */
SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
FROM SHEEP_FLOCK f1 JOIN 
/* The last transfer date for each sheep */
(SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
FROM  SHEEP_FLOCK f
WHERE f.regn_no IN 
/* Limit the rows extracted by the outer sub-query to those relevant to 
the 
subject flock */
/* This typically reduces the time from 1297ms to 47ms - from 35000 
rows 
to 127 rows */
(SELECT s.regn_no FROM SHEEP_FLOCK s where s.flock_no = '1359')
GROUP BY f.regn_no) f2 
ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
AND f1.transfer_date = f2.last_xfer_date

then as the comment suggests I get a considerable improvement, but it's still 
an 
order of magnitude slower than MySQL.

Can anyone suggest why PostgreSQL performs the original query so much slower 
than even BDE?
 -- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-16 Thread Peter Hardman
On 16 Aug 2006 at 20:02, Arjen van der Meijden wrote:

> On 16-8-2006 18:48, Peter Hardman wrote:
> > Using identically structured tables and the same primary key, if I run this 
> > on 
> > Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about 
> > 3ms, 
> > and on PostgresSQL (8.1.3, local server) about 1290ms). All on the same 
> > Windows XP Pro machine with 512MB ram of which nearly half is free.  
> 
> Is that with or without query caching? I.e. can you test it with SELECT 
> SQL_NO_CACHE ... ?
> In a read-only environment it will still beat PostgreSQL, but as soon as 
> you'd get a read-write environment, MySQL's query cache is of less use. 
> So you should compare both the cached and non-cached version, if applicable.
It seems to make no difference - not surprising really as I'm just running the 
query 
from the command line interface.
> 
> Besides that, most advices on this list are impossible without the 
> result of 'explain analyze', so you should probably get that as well.
Here is the output of EXPLAIN ANALYZE for the slow query:

Unique  (cost=7201.65..8487.81 rows=1 width=13) (actual 
time=1649.733..1811.684 rows=32 loops=1)
  ->  Merge Join  (cost=7201.65..8487.80 rows=1 width=13) (actual 
time=1649.726..1811.528 rows=32 loops=1)
Merge Cond: ((("outer".regn_no)::text = "inner"."?column3?") AND 
("outer".transfer_date = "inner".last_xfer_date))
->  Index Scan using sheep_flock_pkey on sheep_flock f1  
(cost=0.00..1033.19 rows=77 width=13) (actual time=15.357..64.237 rows=127 
loops=1)
  Index Cond: ((flock_no)::text = '1359'::text)
->  Sort  (cost=7201.65..7285.84 rows=33676 width=15) (actual 
time=1580.198..1653.502 rows=38277 loops=1)
  Sort Key: (f2.regn_no)::text, f2.last_xfer_date
  ->  Subquery Scan f2  (cost=0.00..4261.67 rows=33676 width=15) 
(actual 
time=0.331..598.246 rows=38815 loops=1)
->  GroupAggregate  (cost=0.00..3924.91 rows=33676 
width=13) 
(actual time=0.324..473.131 rows=38815 loops=1)
  ->  Index Scan using sheep_flock_pkey on sheep_flock 
f  
(cost=0.00..3094.95 rows=81802 width=13) (actual time=0.295..232.156 
rows=81802 loops=1)
Total runtime: 1812.737 ms


> 
> I'm not sure whether this is the same query, but you might want to try:
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1
> WHERE
> f1.flock_no = '1359'
> AND f1.transfer_date = (SELECT MAX(f.transfer_date) FROM SHEEP_FLOCK f 
> WHERE regn_no = f1.regn_no)
> 
That's neat - I didn't know you could make a reference from a subselect to the 
outer select. Your query has the same performance as my very complex one on 
both MySQL and PostgreSQL. However I'm not entirely sure about the times for 
MySQL - every interface gives a different answer so I'll have to try them from 
a script so I know what's going on.
Interestingly BDE takes 7 seconds to run your query. Just as well I didn't 
start 
from there... 
> And you might need an index on (regn_no, transfer_date) and/or one 
> combined with that flock_no.
Explain says it only uses the primary key, so it seems there's no need for a 
separate index.

Thanks for the help
-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman

On 17 Aug 2006 at 10:00, Mario Weilguni wrote:

> not really sure if this is right without any testdata, but isn't that what 
> you 
> want?
> 
> CREATE index foo on sheep_flock (flock_no);
> 
> SELECT DISTINCT on (f1.transfer_date) f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1
> WHERE f1.flock_no = '1359'
> order by f1.transfer_date desc;
> 
> best regards, 
> mario weilguni
> 
> 
Mario, Thanks for the suggestion, but this query produces the wrong answer - 
but 
then I provided no data, nor properly explained what the data would be.
Each sheep will have multiple records, starting with one for when it's first 
registered, then one for each flock it's in (eg sold into) then one for when it 
dies 
and goes to the 'big flock in the sky'.

So first I need to find the most recent record for each sheep and then select 
the sheep whose most recent record matches the flock in question.

Your query finds all the sheep that have been in the flock in question, then 
selects 
the first one from each set of records with the same date. So it collects data 
on 
dead sheep, and only selects one sheep if several were bought or registered on 
the same day.

Forgive me for being verbose - I want to make sure I understand it properly 
myself!

regards, 
 -- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman
On 16 Aug 2006 at 18:51, Tom Lane wrote:

> "Peter Hardman" <[EMAIL PROTECTED]> writes:
> > I'm in the process of migrating a Paradox 7/BDE 5.01 database from 
> > single-user 


Arjen van der Meijden has proposed a very elegant query in another post. 

> What I find interesting though is that it sounds like both MSSQL and
> Paradox know something we don't about how to optimize it.  PG doesn't
> have any idea how to do the above query without forming the full output
> of the sub-select, but I suspect that the commercial DBs know a
> shortcut; perhaps they are able to automatically derive a restriction
> in the subquery similar to what you did by hand.  Does Paradox have
> anything comparable to EXPLAIN that would give a hint about the query
> plan they are using?

Sadly, no. In fact the ability to use SQL from Paradox at all is not well known
and not very visible in the documentation.

I wonder whether Paradox and MySQL are just not doing the sort (this seems to
be what eats up the time), since the output of the subquery is in fact already
in the proper order.

> 
> Also, just as in the other thread, I'm thinking that a seqscan+hash
> aggregate would be a better idea than this bit:
> 
> >->  GroupAggregate  (cost=0.00..3924.91 rows=33676 
> > width=13) (actual time=0.324..473.131 rows=38815 loops=1)
> >  ->  Index Scan using sheep_flock_pkey on 
> > sheep_flock f (cost=0.00..3094.95 rows=81802 width=13) (actual 
> > time=0.295..232.156)
> 
> Possibly you need to raise work_mem to get it to consider the hash
> aggregation method.
> 
> BTW, are you *sure* you are testing PG 8.1?  The "Subquery Scan f2" plan
> node looks unnecessary to me, and I'd have expected 8.1 to drop it out.
> 8.0 and before would have left it in the plan though.  This doesn't make
> all that much difference performance-wise in itself, but it does make me
> wonder what you are testing.

Yes, the executables all say version 8.1.3.6044
> 
Regards,-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman
On 17 Aug 2006 at 12:11, Markus Schaber wrote:

> Hi, Peter,
> 
> Peter Hardman wrote:
> 
> >> BTW, are you *sure* you are testing PG 8.1?  The "Subquery Scan f2" plan
> >> node looks unnecessary to me, and I'd have expected 8.1 to drop it out.
> >> 8.0 and before would have left it in the plan though.  This doesn't make
> >> all that much difference performance-wise in itself, but it does make me
> >> wonder what you are testing.
> > 
> > Yes, the executables all say version 8.1.3.6044
> 
> Would you mind to look at the output of "select version();", too?
> 
> I ask this because I stumbled over it myself, that I had installed the
> correct postgresql and psql versions, but accidentally connected to a
> different database installation due to strange environment and script
> settings...
select version() returns

PostgreSQL 8.1.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 
(mingw-special)

Cheers,-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman


On 16 Aug 2006 at 17:48, Peter Hardman wrote:

> I'm in the process of migrating a Paradox 7/BDE 5.01 database from 
> single-user 
> Paradox to a web based interface to either MySQL or PostgreSQL.
 

I've uploaded my data to www.shetland-sheep.org.uk/pgdata/sheep-flock.zip

The flock SSBXXX is the 'big flock in the sky' and thus there should never be 
any 
date for a sheep greater than this. 

Yes, the primary key is regn_no + flock_no + transfer_date.

Thanks again for all the help and advice.

Regards,-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman
On 17 Aug 2006 at 14:33, Tom Lane wrote:

> I wrote:
> > Anywy, your point about the sort being redundant is a good one, and
> > offhand I'd have expected PG to catch that; I'll have to look into
> > why it didn't.  But that's not going to explain a 10x speed
> > difference, because the sort isn't 90% of the runtime.
> 
> I dug into this using some made-up test data, and was able to reproduce
> the plan you got after changing the order of the pkey index columns
> to (regn_no, transfer_date, flock_no) ... are you sure you quoted that
> accurately before?

Yes. Maybe the data I've uploaded to www.shetland-sheep.org.uk/pgdata/sheep_flock.zip
will help reproduce the plan.

 
> I found a couple of minor planner problems, which I've repaired in CVS
> HEAD.  You might consider using TEXT columns instead of VARCHAR(n),
> because the only bug that actually seemed to change the chosen plan
> involved the planner getting confused by the difference between
> varchar_var and varchar_var::text (which is what gets generated for
> sorting purposes because varchar doesn't have a separate sort operator).

As someone else suggested, these fields ought really to be CHAR, not VARCHAR.
I chose VARCHAR because the data is mostly shorter than the maximum lengths
(although probably not enough to matter). I'd not really got into the
subtleties of the different behaviour of CHAR and VARCHAR.
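
If I do end up switching to TEXT as you suggest, I assume it would just be
something along these lines (a sketch only - column list from memory, and I
haven't tried it yet):

ALTER TABLE sheep_flock
    ALTER COLUMN regn_no TYPE text,
    ALTER COLUMN flock_no TYPE text;   -- plus any other varchar(n) columns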
 

Regards,-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT

== Breeder of Shetland Cattle and Shetland Sheep ==


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL

2006-08-17 Thread Peter Hardman



On 17 Aug 2006 at 20:58, Peter Hardman wrote:


> 
> 
> On 16 Aug 2006 at 17:48, Peter Hardman wrote:
> 
> > I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user 
> > Paradox to a web based interface to either MySQL or PostgreSQL.
>  
> 
> I've uploaded my data to www.shetland-sheep.org.uk/pgdata/sheep-flock.zip


Sorry - that should be www.shetland-sheep.org.uk/pgdata/sheep_flock.zip
> 
> The flock SSBXXX is the 'big flock in the sky' and thus there should never be any 
> date for a sheep greater than this. 
> 
> Yes, the primary key is regn_no + flock_no + transfer_date.
> 
> Thanks again for all the help and advice.
> 
> Regards,-- 
> Peter Hardman
> Acre Cottage, Horsebridge
> King's Somborne
> Stockbridge
> SO20 6PT
> 
> == Breeder of Shetland Cattle and Shetland Sheep ==
> 
> 
> -------(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster
> 


-- 
Peter Hardman
Acre Cottage, Horsebridge
King's Somborne
Stockbridge
SO20 6PT


== Breeder of Shetland Cattle and Shetland Sheep ==





Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Peter Childs

On 28/08/06, Michal Taborsky - Internet Mall <[EMAIL PROTECTED]> wrote:

Markus Schaber wrote:
> Hi, Michal,
>
> Michal Taborsky - Internet Mall wrote:
>
>> When using this view, you are interested in tables, which have the
>> "bloat" column higher that say 2.0 (in freshly dump/restored/analyzed
>> database they should all be around 1.0).
>
> I just noticed some columns in pg_catalog with a bloat value <1 and a
> negative "wasted space" - is this due to the pseudo nature of them?

It is more likely due to the fact, that these numbers are just
estimates, based on collected table statistics, so for small or
non-standard tables the statistical error is greater than the actual
value. You are usually not interested in tables, which have wasted space
of 1000kB or -1000kB. Also the database must be ANALYZEd properly for
these numbers to carry any significance.



I was just playing around with this and noticed it performs badly on
tables with very small record sizes. This seems to be because it ignores
the per-row system overhead (oid, xmin, ctid, etc.), which seems to be
about 28 bytes per record. This can be quite significant in small-record
tables and can cause trouble even with a small number of records. Hence
I've got a table that's static and freshly "vacuum full"ed which reads
with a bloat of 4.

The problem is easy to recreate:

Create table regionpostcode (area varchar(4), regionid int);

then insert 12 records.
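
For example, populate it with something like this (hypothetical sample values;
any dozen small rows will do):

INSERT INTO regionpostcode (area, regionid)
SELECT 'AB' || i::text, i            -- 12 tiny rows, far less than one page
FROM generate_series(1, 12) AS i;

then ANALYZE it and check the bloat figure the view reports for regionpostcode.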

Peter.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] View columns calculated

2004-04-13 Thread Peter Darley
Folks,
I have a question about views:  I want to have a fairly wide view (lots of
columns) where most of the columns have some heavyish calculations in them,
but I'm concerned that it will have to calculate every column even when I'm
not selecting them.  So, the question is, if I have 5 columns in a view but
only select 1 column, is the system smart enough to not calculate the unused
columns, or am I taking a performance hit over a smaller view that doesn't
have the extra 4 columns?
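
A concrete way to check would be to compare plans on a toy view (made-up names,
just a sketch of the test, not our real schema):

CREATE VIEW wide_view AS
SELECT id,
       expensive_calc_a(raw_data) AS col_a,   -- hypothetical heavy expressions
       expensive_calc_b(raw_data) AS col_b
FROM base_table;

EXPLAIN ANALYZE SELECT col_a FROM wide_view;
EXPLAIN ANALYZE SELECT col_a, col_b FROM wide_view;

and see whether the runtimes differ, but I'd like to know what behaviour is
expected in general.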
Thanks,
Peter Darley


---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] Mysterious performance of query because of plsql function in where condition

2004-07-02 Thread Peter Alberer

Hi there,

I have a problem with a query that uses the result of a plpgsql function
in the WHERE clause:

SELECT
   assignments.assignment_id,
   assignments.package_id AS package_id,
   assignments.title AS title,
   COUNT(*) AS Count
FROM
   assignments INNER JOIN submissions ON
   (assignments.assignment_id=submissions.assignment_id)
WHERE
   package_id=949589 AND
   submission_status(submissions.submission_id)='closed'
GROUP BY
   assignments.assignment_id, assignments.package_id, assignments.title
ORDER BY
   assignments.title;

Postgres seems to execute the function "submission_status" for every row
of the submissions table (~1500 rows). The query therefore takes quite a
lot of time, although in fact no row is returned from the assignments
table when the condition package_id=949589 is used.

  QUERY PLAN

---
---
 Sort  (cost=41.21..41.21 rows=1 width=35) (actual
time=4276.978..4276.978
rows=0 loops=1)
   Sort Key: assignments.title
   ->  HashAggregate  (cost=41.19..41.20 rows=1 width=35) (actual
time=4276.970..4276.970 rows=0 loops=1)
 ->  Hash Join  (cost=2.40..41.18 rows=1 width=35) (actual
time=4276.966..4276.966 rows=0 loops=1)
   Hash Cond: ("outer".assignment_id =
"inner".assignment_id)
   ->  Seq Scan on submissions  (cost=0.00..38.73 rows=9
width=4) (actual time=10.902..4276.745 rows=38 loops=1)
 Filter: (submission_status(submission_id) =
'closed'::text)
   ->  Hash  (cost=2.40..2.40 rows=2 width=35) (actual
time=0.058..0.058 rows=0 loops=1)
 ->  Seq Scan on assignments  (cost=0.00..2.40
rows=2
width=35) (actual time=0.015..0.052 rows=2 loops=1)
   Filter: (package_id = 949589)
 Total runtime: 4277.078 ms
(11 rows)

I therefore tried to rephrase the query, to make sure that the function is
only used for the rows returned by the join, but not even the following does
help (the subselect t1 does not return a single row):

select * from (
SELECT
  a.assignment_id, a.package_id, a.title, s.submission_id,
  COUNT(*) AS Count
FROM
  assignments a INNER JOIN submissions s ON
(a.assignment_id=s.assignment_id)
WHERE
a.package_id=949589
GROUP BY
a.assignment_id, a.package_id, a.title, s.submission_id
) t1
where
   submission_status(t1.submission_id)='closed'
order by
   title;

  QUERY PLAN

---
---
 Sort  (cost=41.21..41.22 rows=1 width=188) (actual
time=4114.251..4114.251
rows=0 loops=1)
   Sort Key: title
   ->  Subquery Scan t1  (cost=41.20..41.20 rows=1 width=188) (actual
time=4114.242..4114.242 rows=0 loops=1)
 ->  HashAggregate  (cost=41.20..41.20 rows=1 width=39) (actual
time=4114.238..4114.238 rows=0 loops=1)
   ->  Hash Join  (cost=2.40..41.18 rows=1 width=39) (actual
time=4114.235..4114.235 rows=0 loops=1)
 Hash Cond: ("outer".assignment_id =
"inner".assignment_id)
 ->  Seq Scan on submissions s  (cost=0.00..38.73
rows=9 width=8) (actual time=7.179..4113.984 rows=38 loops=1)
   Filter: (submission_status(submission_id) =
'closed'::text)
 ->  Hash  (cost=2.40..2.40 rows=2 width=35) (actual
time=0.100..0.100 rows=0 loops=1)
   ->  Seq Scan on assignments a
(cost=0.00..2.40
rows=2 width=35) (actual time=0.045..0.094 rows=2 loops=1)
 Filter: (package_id = 949589)
 Total runtime: 4114.356 ms
(12 rows)

The function is nevertheless executed for every row in the submissions
table. A simple "select *, submission_status(submission_id) from
submissions" takes about the same time as the 2 queries stated above.

The whole database has been vacuum analysed right before the explain
analyse output has been captured.

What can I do to reduce the time this query takes? And why is the function
executed although there is no row in the result set of t1 in my rephrased
query?
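
One rewrite I have been wondering about - just a sketch, and I don't know
whether the planner will actually honour it - is to put an optimization
barrier such as OFFSET 0 into the subselect, so that the outer function call
cannot be pushed down into it:

select * from (
    SELECT a.assignment_id, a.package_id, a.title, s.submission_id
    FROM assignments a
         INNER JOIN submissions s ON (a.assignment_id = s.assignment_id)
    WHERE a.package_id = 949589
    OFFSET 0       -- barrier: quals are not supposed to be pushed past LIMIT/OFFSET
) t1
WHERE submission_status(t1.submission_id) = 'closed'
ORDER BY t1.title;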

TIA, peter

Ps: table definitions:

  Table "public.assignments"
Column |Type |   Modifiers
---+-+
 assignment_id | integer | not null
 title | character varying(100)  | not null
 max_grade | smallint| not null
 start_date| timestamp without time zone | not null default now()
 end_date  | timestamp without time zone | not

Re: [HACKERS] [PERFORM] Reiser4

2004-08-14 Thread Peter Eisentraut
Bruce Momjian wrote:
> Pierre-Frédéric Caillaud wrote:
> > Is there also a possibility to tell Postgres : "I don't care if I
> > lose 30 seconds of transactions on this table if the power goes
> > out, I just want to be sure it's still ACID et al. compliant but
> > you can fsync less often and thus be faster" (with a possibility of
> > setting that on a per-table basis) ?

Then it would be "ACI" compliant.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Data type to use for primary key

2004-11-23 Thread Peter Darley
All,
Well, you should still escape any strings you're getting from a web page so
you can ensure you're not subject to a SQL injection attack, even if you're
expecting integers.
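
Or, where the interface supports it, keep the value out of the query string
altogether and pass it as a parameter - a hedged sketch at the SQL level, with
a made-up table name:

PREPARE product_by_id (integer) AS
    SELECT * FROM products WHERE product_id = $1;   -- hypothetical table/column

EXECUTE product_by_id (42);

In application code the same idea is usually expressed through the driver's
placeholder/bind mechanism, so the web-form value is only ever treated as a
parameter, never as SQL text.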
Thanks,
Peter Darley

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of
Pierre-Frédéric Caillaud
Sent: Monday, November 22, 2004 3:06 PM
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Data type to use for primary key



> What is the common approach? Should I use directly the product_code as
> my ID, or use a sequantial number for speed? (I did the same for the
> company_id, this is a 'serial' and not the shor name of the customer.
> I just don't know what is usually done.

Use a serial :
- you can change product_code for a product easily
- you can pass around integers easier around, in web forms for instance,
you don't have to ask 'should I escape this string ?'
- it's faster
- it uses less space
- if one day you must manage products from another source whose
product_code overlap yours, you won't have problems
- you can generate them with a serial uniquely and easily

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-21 Thread Peter Darley
Tatsuo,
What would happen with SELECT queries that, through a function or some
other mechanism, update data in the database?  Would those need to be
passed to pgpool in some special way?
Thanks,
Peter Darley

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Tatsuo Ishii
Sent: Thursday, January 20, 2005 5:40 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
pgsql-performance@postgresql.org
Subject: Re: [PERFORM] PostgreSQL clustering VS MySQL clustering


> On January 20, 2005 06:49 am, Joshua D. Drake wrote:
> > Stephen Frost wrote:
> > >* Hervé Piedvache ([EMAIL PROTECTED]) wrote:
> > >>On Thursday 20 January 2005 15:30, Stephen Frost wrote:
> > >>>* Hervé Piedvache ([EMAIL PROTECTED]) wrote:
> > >>>>Is there any solution with PostgreSQL matching these needs ... ?
> > >>>
> > >>>You might look into pg_pool.  Another possibility would be slony,
though
> > >>>I'm not sure it's to the point you need it at yet, depends on if you
can
> > >>>handle some delay before an insert makes it to the slave select
systems.
> > >>
> > >>I think not ... pgpool or slony are replication solutions ... but as I
> > >> have said to Christopher Kings-Lynne how I'll manage the scalabilty
of
> > >> the database ? I'll need several servers able to load a database
growing
> > >> and growing to get good speed performance ...
> > >
> > >They're both replication solutions, but they also help distribute the
> > >load.  For example:
> > >
> > >pg_pool will distribute the select queries amoung the servers.  They'll
> > >all get the inserts, so that hurts, but at least the select queries are
> > >distributed.
> > >
> > >slony is similar, but your application level does the load distribution
> > >of select statements instead of pg_pool.  Your application needs to
know
> > >to send insert statements to the 'main' server, and select from the
> > >others.
> >
> > You can put pgpool in front of replicator or slony to get load
> > balancing for reads.
>
> Last time I checked load ballanced reads was only available in pgpool if
you
> were using pgpools's internal replication.  Has something changed
recently?

Yes. However it would be pretty easy to modify pgpool so that it could
cope with Slony-I. I.e.

1) pgpool does the load balance and sends query to Slony-I's slave and
   master if the query is SELECT.

2) pgpool sends query only to the master if the query is other than
   SELECT.

Remaining problem is that Slony-I is not a sync replication
solution. Thus you need to prepare that the load balanced query
results might differ among servers.

If there's enough demand, I would do such that enhancements to pgpool.
--
Tatsuo Ishii

> > >>>>Is there any other solution than a Cluster for our problem ?
> > >>>
> > >>>Bigger server, more CPUs/disks in one box.  Try to partition up your
> > >>>data some way such that it can be spread across multiple machines,
then
> > >>>if you need to combine the data have it be replicated using slony to
a
> > >>>big box that has a view which joins all the tables and do your big
> > >>>queries against that.
> > >>
> > >>But I'll arrive to limitation of a box size quickly I thing a 4
> > >> processors with 64 Gb of RAM ... and after ?
> >
> > Opteron.
>
> IBM Z-series, or other big iron.
>
> >
> > >Go to non-x86 hardware after if you're going to continue to increase
the
> > >size of the server.  Personally I think your better bet might be to
> > >figure out a way to partition up your data (isn't that what google
> > >does anyway?).
> > >
> > >   Stephen
>
> --
> Darcy Buskermolen
> Wavefire Technologies Corp.
> ph: 250.717.0200
> fx:  250.763.1759
> http://www.wavefire.com
>
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
>

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS MySQL

2005-01-25 Thread Peter Darley
Josh,

Please excuse how my client quotes things...

> Are there ones that you use which might use several different connections to 
> send a series of queries from a single web-user, less than 5 seconds apart?

Using Apache/Perl I often have a situation where we're sending several 
queries from the same user (web client) within seconds, or even simultaneously, 
that use different connections.

When someone logs in to our system they get a frameset that has 5 
windows, each of which is filled with data from queries.  Since the pages in 
the frames are requested separately by the client, the system doesn't ensure
that they go to the same process, and consequently they may not be served by
the same db connection.

Session information is stored in the database (so it's easily 
persistent across server processes), so it would be bad if a request for a page 
was served by a db server that didn't yet have information about the user (such 
as that they're logged in, etc.).

If we ever have enough traffic to warrant it, we're going to go to a 
load balancer that passes requests to different identical web servers, at which 
point we won't even be getting requests from the same machine, much less the 
same connection.

Thanks,
Peter Darley

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Josh Berkus
Sent: Monday, January 24, 2005 3:46 PM
To: Ragnar Hafstað
Cc: pgsql-performance@postgresql.org; Tatsuo Ishii
Subject: Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS
MySQL


Ragnar,

> note that these sometimes do not provide connection pooling as such,
> just persistent connections (Apache::DBI)

Yes, right.

> no. you can only count on web-server-process==connection, but not
> web-user==connection, unless you can garantee that the same user
> client always connects to same web-server process.

Are there ones that you use which might use several different connections to 
send a series of queries from a single web-user, less than 5 seconds apart?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Possibly slow query

2005-01-25 Thread Peter Darley
Folks,

I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out whether a
query I have is going to be slow when I have more information in my tables.
Both tables involved will likely have ~500K rows within a year or so.

Specifically, I can't tell if I'm causing myself future problems with the
subquery, and should maybe re-write the query to use a join.  The reason I
went with the subquery is that I don't know whether a row in Assignments
will have a corresponding row in Assignment_Settings.

The query is:
SELECT User_ID
FROM Assignments A
WHERE A.User_ID IS NOT NULL
AND (SELECT Value FROM Assignment_Settings WHERE Setting='Status' AND
Assignment_ID=A.Assignment_ID) IS NULL
GROUP BY User_ID;

The tables and an explain analyze of the query are as follows:

neo=# \d assignments;
   Table "shopper.assignments"
Column |  Type  |
Modifiers
---++---
--
 assignment_id | integer| not null default
nextval('shopper.assignments_assignment_id_seq'::text)
 sample_id | integer| not null
 user_id   | integer|
 time  | timestamp(0) without time zone | not null default now()
 address_id| integer|
Indexes:
"assignments_pkey" primary key, btree (assignment_id)
"assignments_sample_id" unique, btree (sample_id)
"assignments_address_id" btree (address_id)
"assignments_user_id" btree (user_id)
Triggers:
assignments_check_assignment BEFORE INSERT ON assignments FOR EACH ROW
EXECUTE PROCEDURE check_assignment()

neo=# \d assignment_settings
   Table
"shopper.assignment_settings"
Column |  Type  |
Modifiers
---++---
--
 assignment_setting_id | integer| not null default
nextval('shopper.assignment_settings_assignment_setting_id_seq'::text)
 assignment_id | integer| not null
 setting   | character varying(250) | not null
 value | text   |
Indexes:
"assignment_settings_pkey" primary key, btree (assignment_setting_id)
"assignment_settings_assignment_id_setting" unique, btree
(assignment_id, setting)

neo=# explain analyze SELECT User_ID FROM Assignments A WHERE A.User_ID IS
NOT NULL AND (SELECT Value FROM Assignment_Settings WHERE Setti
ng='Status' AND Assignment_ID=A.Assignment_ID) IS NULL GROUP BY User_ID;
 QUERY PLAN


 HashAggregate  (cost=1.01..1.01 rows=1 width=4) (actual time=0.057..0.058
rows=1 loops=1)
   ->  Seq Scan on assignments a  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.033..0.040 rows=2 loops=1)
 Filter: ((user_id IS NOT NULL) AND ((subplan) IS NULL))
 SubPlan
   ->  Seq Scan on assignment_settings  (cost=0.00..0.00 rows=1
width=13) (actual time=0.001..0.001 rows=0 loops=2)
 Filter: (((setting)::text = 'Status'::text) AND
(assignment_id = $0))
 Total runtime: 0.159 ms
(7 rows)


Thanks in advance for any help!

Thanks,
Peter Darley


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Possibly slow query

2005-01-26 Thread Peter Darley
Richard,
I tried a left join, which has to be a little weird, because there may or
may not be a corresponding row in Assignment_Settings for each Assignment,
and they may or may not have Setting='Status', so I came up with:

SELECT User_ID
FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings
WHERE Setting='Status') ASet
WHERE A.User_ID IS NOT NULL
AND ASet.Assignment_ID IS NULL
GROUP BY User_ID;

Which explain analyze says takes 0.816 ms as compared to 0.163 ms for
my other query.  So, I'm not sure that I'm writing the best LEFT JOIN that I
can.  Also, I suspect that these ratios wouldn't hold as the data got bigger
and started using indexes, etc.  I'll mock up a couple of tables with a
bunch of data and see how things go.  It would be nice to understand WHY I
get the results I get, which I'm not sure I will.

I'm not sure what you mean by selecting a distinct User_ID first.  Since
I'm joining the tables on Assignment_ID, I'm not sure how I'd do a distinct
before the join (because I'd lose Assignment_ID).  I was also under the
impression that group by was likely to be faster than a distinct, tho I
can't really recall where I got that idea from.

Thanks for your suggestions!
Peter Darley

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 26, 2005 1:36 AM
To: Peter Darley
Cc: Pgsql-Performance
Subject: Re: [PERFORM] Possibly slow query


Peter Darley wrote:
> Folks,
>
>   I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out 
> weather
a
> query I have is going to be slow when I have more information in my
tables.
> both tables involved will likely have ~500K rows within a year or so.
>
>   Specifically I can't tell if I'm causing myself future problems with the
> subquery, and should maybe re-write the query to use a join.  The reason I
> went with the subquery is that I don't know weather a row in Assignments
> will have a corresponding row in Assignment_Settings
>
>   The query is:
> SELECT User_ID
> FROM Assignments A
> WHERE A.User_ID IS NOT NULL
>   AND (SELECT Value FROM Assignment_Settings WHERE Setting='Status' AND
> Assignment_ID=A.Assignment_ID) IS NULL
> GROUP BY User_ID;

You could always use a LEFT JOIN instead, like you say. I'd personally
be tempted to select distinct user_id's then join, but it depends on how
many of each.

You're not going to know for sure whether you'll have problems without
testing. Generate 500k rows of plausible looking test-data and give it a
try.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Possibly slow query

2005-01-31 Thread Peter Darley
Manfred,
Yeah, that was a typo.  It should have been ASet.Value IS NULL.
I have considered storing the setting names by key, since I do have a
separate table with the names and a key as you suggest, but since my
application is only ~75% finished, it's still pretty important to have human
readable/editable tables.
Thanks,
Peter Darley

-Original Message-
From: Manfred Koizar [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 3:06 AM
To: Peter Darley
Cc: Richard Huxton; Pgsql-Performance
Subject: Re: [PERFORM] Possibly slow query


On Wed, 26 Jan 2005 07:16:25 -0800, "Peter Darley"
<[EMAIL PROTECTED]> wrote:
>SELECT User_ID
>FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings
>WHERE Setting='Status') ASet
>WHERE A.User_ID IS NOT NULL
>   AND ASet.Assignment_ID IS NULL
>GROUP BY User_ID;

"ASet.Assignment_ID IS NULL" and "value IS NULL" as you had in your
original post don't necessarily result in the same set of rows.

SELECT DISTINCT a.User_ID
  FROM Assignments a
   LEFT JOIN Assignment_Settings s
  ON (a.Assignment_ID=s.Assignment_ID
  AND s.Setting='Status')
 WHERE a.User_ID IS NOT NULL
   AND s.Value IS NULL;

Note how the join condition can contain subexpressions that only depend
on columns from one table.

BTW,
|neo=# \d assignment_settings
| [...]
| setting   | character varying(250) | not null
| [...]
|Indexes:
|[...]
|"assignment_settings_assignment_id_setting" unique, btree
(assignment_id, setting)

storing the setting names in their own table and referencing them by id
might speed up some queries (and slow down others).  Certainly worth a
try ...

Servus
 Manfred


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] doubt with pg_dump and high concurrent used databases

2007-11-25 Thread Peter Childs
On 25/11/2007, Erik Jones <[EMAIL PROTECTED]> wrote:
>
> On Nov 25, 2007, at 10:46 AM, Pablo Alcaraz wrote:
>
> > Hi all,
> >
> > I read that pg_dump can run while the database is being used and makes
> > "consistent backups".
> >
> > I have a huge and *heavy* selected, inserted and updated database.
> > Currently I have a cron task that disconnect the database users,
> > make a
> > backup using pg_dump and put the database online again. The problem
> > is,
> > now there are too much information and everyday the database store
> > more
> > and more data, the backup process needs more and more time to run
> > and I
> > am thinking about to do the backup using a process that let me to
> > do it
> > with the minimal interruptions for the users.
> >
> > I do not need a last second backup. I could the a backup with "almost
> > all" the data but I need the information on it to be coherent. For
> > example, if the backup store information about an invoice it *must* to
> > store both header and items invoice information. I could live if the
> > backup does not store some invoices information when is ran, because
> > they ll be backuped the next time the backup process run. But I can
> > not
> > store only a part of the invoices. That is I call a coherent backup.
> >
> > The best for me is that the cron tab does a concurrent backup with all
> > the information until the time it starts to run while the clients are
> > using the database. Example: if the cron launch the backup process at
> > 12:30 AM, the backup moust be builded with all the information *until*
> > 12:30AM. So if I need to restore it I get a database coherent with the
> > same information like it was at 12:30AM. it does not matter if the
> > process needs 4 hours to run.
> >
> > Does the pg_dump create this kind of "consistent backups"? Or do I
> > need
> > to do the backups using another program?
>
> Yes, that is exactly what pg_dump does.
>
>
Yes, so long as you are using transactions correctly, i.e. doing a BEGIN before
each invoice and a COMMIT afterwards. If you're not bothering and are using
autocommit you *may* have problems. pg_dump will show a consistent state as of
the time when the backup was started. If your database was not "consistent" at
that time you may have issues, but it will be consistent from a database
point of view, i.e. foreign keys, primary keys, check constraints, triggers,
etc.

It all depends what you mean by consistent.
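
To illustrate what I mean by using transactions correctly - just a sketch with
made-up invoice tables and columns:

BEGIN;
INSERT INTO invoice_header (invoice_no, customer_id) VALUES (1001, 42);        -- hypothetical schema
INSERT INTO invoice_item (invoice_no, line_no, amount) VALUES (1001, 1, 9.99);
INSERT INTO invoice_item (invoice_no, line_no, amount) VALUES (1001, 2, 4.50);
COMMIT;

With the header and its items committed together like this, pg_dump's snapshot
will contain either the whole invoice or none of it.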

Peter.


Re: [PERFORM] doubt with pg_dump and high concurrent used databases

2007-11-26 Thread Peter Childs
On 25/11/2007, Pablo Alcaraz <[EMAIL PROTECTED]> wrote:
>
> Tom Lane wrote:
> > "Peter Childs" <[EMAIL PROTECTED]> writes:
> >
> >> On 25/11/2007, Erik Jones <[EMAIL PROTECTED]> wrote:
> >>
> >>>> Does the pg_dump create this kind of "consistent backups"? Or do I
> >>>> need to do the backups using another program?
> >>>>
> >>> Yes, that is exactly what pg_dump does.
> >>>
> >>>
> >> Yes so long as you are using transactions correctly. Ie doing a begin
> before
> >> each invoice and a commit afterwards if your not bothering and using
> auto
> >> commit you *may* have problems.
> >>
> >
> > I think you need to qualify that a bit more.  What you're saying is that
> > if an application has consistency requirements that are momentarily
> > violated during multi-statement updates, and it fails to wrap such
> > updates into a single transaction, then pg_dump could capture one of the
> > intermediate states.  That's true, but it's hardly pg_dump's fault.
> > If there were a system crash partway through such a sequence, the
> > consistency requirements would be violated afterwards, too.
> >
> >
>
> Agree. In my case I define "consistent database state" like the state
> the database has when the program that use it is stopped normally and
> without errors. In this "state" the program starts without troubles and
> "everything looks fine". I believe this behavior is because all the
> inserts and updates are made using transactions. Another things will be
> a bug, it ll be fixed and it ll not be pg_dump fault.
>
> So if pg_dump can capture a "consistent state" with all the data until
> the start time, without all the pending open transaction updates/inserts
> in the same way that I did when I stopped the program before start
> pg_dump, for me is usefull and enough to solve my problem.
>
> Thanks to all!
>
> Pablo
>
>
Given your long description of what you thought was "consistent", I thought it
important that the answer "yes, but" was given rather than just a plain yes.
I've met quite a few apps that create inconsistent databases when the
database itself is actually consistent.

Peter


[PERFORM] TB-sized databases

2007-11-26 Thread Peter Koczan
Hi all,

I have a user who is looking to store 500+ GB of data in a database
(and when all the indexes and metadata are factored in, it's going to
be more like 3-4 TB). He is wondering how well PostgreSQL scales with
TB-sized databases and what can be done to help optimize them (mostly
hardware and config parameters, maybe a little advocacy). I can't
speak on that since I don't have any DBs approaching that size.

The other part of this puzzle is that he's torn between MS SQL Server
(running on Windows and unsupported by us) and PostgreSQL (running on
Linux...which we would fully support). If any of you have ideas of how
well PostgreSQL compares to SQL Server, especially in TB-sized
databases, that would be much appreciated.

We're running PG 8.2.5, by the way.

Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] TB-sized databases

2007-11-27 Thread Peter Koczan
Thanks all. This is just what I needed.

On Nov 26, 2007 1:16 PM, Stephen Cook <[EMAIL PROTECTED]> wrote:
> I think either would work; both PostgreSQL and MS SQL Server have
> success stories out there running VLDBs.  It really depends on what you
> know and what you have.  If you have a lot of experience with Postgres
> running on Linux, and not much with SQL Server on Windows, of course the
> former would be a better choice for you.  You stand a much better chance
> working with tools you know.
>
>
>
> Pablo Alcaraz wrote:
> > I had a client that tried to use Ms Sql Server to run a 500Gb+ database.
> > The database simply colapsed. They switched to Teradata and it is
> > running good. This database has now 1.5Tb+.
> >
> > Currently I have clients using postgresql huge databases and they are
> > happy. In one client's database the biggest table has 237Gb+ (only 1
> > table!) and postgresql run the database without problem using
> > partitioning, triggers and rules (using postgresql 8.2.5).
> >
> > Pablo
> >
> > Peter Koczan wrote:
> >> Hi all,
> >>
> >> I have a user who is looking to store 500+ GB of data in a database
> >> (and when all the indexes and metadata are factored in, it's going to
> >> be more like 3-4 TB). He is wondering how well PostgreSQL scales with
> >> TB-sized databases and what can be done to help optimize them (mostly
> >> hardware and config parameters, maybe a little advocacy). I can't
> >> speak on that since I don't have any DBs approaching that size.
> >>
> >> The other part of this puzzle is that he's torn between MS SQL Server
> >> (running on Windows and unsupported by us) and PostgreSQL (running on
> >> Linux...which we would fully support). If any of you have ideas of how
> >> well PostgreSQL compares to SQL Server, especially in TB-sized
> >> databases, that would be much appreciated.
> >>
> >> We're running PG 8.2.5, by the way.
> >>
> >> Peter
> >>
> >> ---(end of broadcast)---
> >> TIP 4: Have you searched our list archives?
> >>
> >>http://archives.postgresql.org
> >>
> >>
> >
> >
> > ---(end of broadcast)---
> > TIP 5: don't forget to increase your free space map settings
> >
>
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings
>

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Commit takes a long time.

2008-01-03 Thread Peter Childs
Using PostgreSQL 8.1.10, every so often I get a transaction that takes a
while to commit.

I log everything that takes over 500ms and quite regularly it says things
like

707.036 ms statement: COMMIT

Is there any way to speed this up?

Peter Childs


Re: [PERFORM] Commit takes a long time.

2008-01-04 Thread Peter Childs
On 03/01/2008, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> "Peter Childs" <[EMAIL PROTECTED]> writes:
> > Using Postgresql 8.1.10 every so often I get a transaction that takes a
> > while to commit.
>
> > I log everything that takes over 500ms and quite reguallly it says
> things
> > like
>
> > 707.036 ms statement: COMMIT
>
> AFAIK there are only two likely explanations for that:
>
> 1. You have a lot of deferred triggers that have to run at COMMIT time.
>
> 2. The disk system gets so bottlenecked that fsync'ing the commit record
> takes a long time.
>
> If it's #2 you could probably correlate the problem with spikes in I/O
> activity as seen in iostat or vmstat.
>
> If it is a disk usage spike then I would make the further guess that
> what causes it might be a Postgres checkpoint.  You might be able to
> dampen the spike a bit by playing with the checkpoint parameters, but
> the only real fix will be 8.3's spread-out-checkpoints feature.
>
> regards, tom lane
>


2 seems most likely, as they seem to occur more often when large queries are
running (they are often followed by a record for a very, very long query in a
different transaction) or at particularly busy periods when quite a lot of
other short queries are also taking place.

I'm planning an upgrade to 8.3 once it's out anyway, so that might increase
speed anyway.
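
In the meantime I'll have a play with the checkpoint settings - roughly along
these lines in postgresql.conf, with values that are only a starting point for
experimentation rather than a recommendation:

checkpoint_segments = 16       # default is 3; fewer, larger checkpoints
checkpoint_timeout = 600       # seconds between forced checkpoints
checkpoint_warning = 60        # log if checkpoints come closer together than this

and keep an eye on iostat/vmstat when the slow COMMITs show up.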

Peter.


Re: [PERFORM] Join Query Perfomance Issue

2008-02-12 Thread Peter Koczan
> I have serious performance problems with the following type of queries:
>
> Doesnt looks too bad to me, but i'm not that deep into sql query
> optimization. However, these type of query is used in a function to
> access a normalized, partitioned database, so better performance in this
> queries would speed up the whole database system big times.
> Any suggestions here would be great. I allready tested some things,
> using inner join, rearranging the order of the tables, but but only
> minor changes in the runtime, the version above seemed to get us the
> best performance.

Can you send the table definitions of the tables involved in the
query, including index information? Might be if we look hard enough we
can find something.

Peter

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Anyone using a SAN?

2008-02-13 Thread Peter Koczan
Hi all,

We're considering setting up a SAN where I work. Is there anyone using
a SAN, for postgres or other purposes? If so I have a few questions
for you.

- Are there any vendors to avoid or ones that are particularly good?

- What performance or reliability implications exist when using SANs?

- Are there any killer features with SANs compared to local storage?

Any other comments are certainly welcome.

Peter

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Anyone using a SAN?

2008-02-13 Thread Peter Koczan
Thanks for all your input, it is very helpful. A SAN for our postgres
deployment is probably sufficient in terms of performance, because we
just don't have that much data. I'm a little concerned about needs for
user and research databases, but if a project needs a big, fast
database, it might be wise to have them shell out for DAS.

My co-workers and I are meeting with a vendor in two weeks (3Par,
specifically), and I think I have a better idea of what I should be
looking at. I'll keep you all up on the situation. Keep the ideas
coming as I still would like to know of any other important factors.

Thanks again.

Peter

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine

2008-02-15 Thread Peter Schuller
Hello,

my impression has been that in the past, there has been a general
semi-consensus that upping shared_buffers to use the majority of RAM
has not generally been recommended, with reliance on the buffer cache
instead being the recommendation.

Given the changes that have gone into 8.3, in particular with regards
to minimizing the impact of large sequential scans, would it be
correct to say that given that

  - enough memory is left for other PG bits (sort mems and whatnot else)
  - only PG is running on the machine
  - you're on 64 bit so do not run into address space issues
  - the database working set is larger than RAM

it would be generally advisable to pump up shared_buffers pretty much
as far as possible instead of relying on the buffer cache?
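
Purely to make the question concrete - on a hypothetical dedicated 16 GB box,
with the numbers invented for illustration only - the choice is essentially
between something like

shared_buffers = 12GB      # most of RAM in PostgreSQL's own cache

and the traditional advice of roughly

shared_buffers = 4GB       # ~25% of RAM, leaving the rest to the OS page cache

with work_mem and friends budgeted separately in either case.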

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



pgpRe9ntcHta3.pgp
Description: PGP signature


Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine

2008-02-15 Thread Peter Schuller
> PostgreSQL still depends on the OS for file access and caching. I
> think that the current recommendation is to have up to 25% of your
> RAM in the shared buffer cache.

This feels strange. Given a reasonable amount of RAM (let's say 8 GB
in this case), I cannot imagine why 75% of that would be efficiently
used for anything but the buffer cache (ignoring work_mem, stacks,
etc). Obviously the OS will need memory to do it's usual stuff
(buffering to do efficient I/O, and so on). But the need for that
should not increase with the amount of RAM in the machine, all else
being equal.

What type of file I/O, other than reading pages of PostgreSQL data
which are eligable for the PostgreSQL buffer cache, does PostgreSQL do
that would take advantage of the operating system caching so much
data?

(Assuming the database is not extreme to the point of file system meta
data being huge.)

If the 25% rule still holds true, even under circumstances where the
assumption is that the PostgreSQL buffer cache is more efficient (in
terms of hit ratio) at caching PostgreSQL database data pages, it
would be useful to understand why in order to understand the
trade-offs involved and make appropriate decisions.

Or is it a matter of PostgreSQL doing non-direct I/O, such that
anything cached in shared_buffers will also be cached by the OS?

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



pgpQMYjRMfywD.pgp
Description: PGP signature


Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine

2008-02-17 Thread Peter Schuller
> PostgreSQL only uses direct I/O for writing to the WAL; everything else
> goes through the regular OS buffer cache unless you force it to do
> otherwise at the OS level (like some Solaris setups do with
> forcedirectio).  This is one reason it still make not make sense to give
> an extremely high percentage of RAM to PostgreSQL even with improvements
> in managing it.  

Ok - thank you for the input (that goes for everyone).

> Another is that shared_buffers memory has to be 
> reconciled with disk at every checkpoint, where OS buffers do not.

Hmm. Am I interpreting that correctly in that dirty buffers need to be flushed 
to disk at checkpoints? That makes perfect sense - but why would that not be 
the case with OS buffers? My understanding is that the point of the 
checkpoint is to essentially obsolete old WAL data in order to recycle the 
space, which would require flushing the data in question first (i.e.,  
normally you just fsync the WAL, but when you want to recycle space you need 
fsync() for the barrier and are then free to nuke the old WAL).

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



signature.asc
Description: This is a digitally signed message part.


Re: [PERFORM] Anyone using a SAN?

2008-02-18 Thread Peter Koczan
> That's true about SANs in general. You don't buy a SAN because it'll
> cost less than just buying the disks and a controller. You buy a SAN
> because it'll let you make managing it easier. The break-even point has
> more to do with how many servers you're able to put on the SAN and how
> often you need to do tricky backup and upgrade procedures than it
> doeswith the hardware.

One big reason we're really looking into a SAN option is that we have
a lot of unused disk space. A typical disk usage scheme for us is 6 GB
for a clean Linux install, and 20 GB for a Windows install. Our disks
are typically 80GB, and even after decent amounts of usage we're not
even approaching half that. We install a lot of software in AFS, our
networked file system, and users' home directories and project
directories are in AFS as well. Local disk space is relegated to the
OS and vendor software, servers that need it, and seldom-used scratch
space. There might very well be a break-even point for us in terms of
cost.

One of the other things I was interested in was the "hidden costs" of
a SAN. For instance, we'd probably have to invest in more UPS capacity
to protect our data. Are there any other similar points that people
don't initially consider regarding a SAN?

Again, thanks for all your help.

Peter

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PERFORM] disabling an index without deleting it?

2008-02-26 Thread Peter Koczan
This might be a weird question...is there any way to disable a
particular index without dropping it?

There are a few queries I run where I'd like to test out the effects
of having (and not having) different indexes on particular query plans
and performance. I'd really prefer not to have to drop and ultimately
recreate a particular index, as some of the data sets are quite large.

So, is there any way to do this, or at least mimic this sort of behavior?
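
The closest workaround I can think of - just a sketch, and note that it takes
an exclusive lock on the table for as long as the transaction is open - is to
drop the index inside a transaction, test, and roll back:

BEGIN;
DROP INDEX some_index;                            -- hypothetical index name
EXPLAIN ANALYZE SELECT count(*) FROM some_table
WHERE some_col = 42;                              -- hypothetical query to test
ROLLBACK;                                         -- the index comes back untouched

but I'm hoping there's something less heavy-handed.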

Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Anyone using a SAN?

2008-03-14 Thread Peter Koczan
Hi all,

I had a few meetings with SAN vendors and I thought I'd give you some
follow-up on points of potential interest.

- Dell/EMC
The representative was like the Dell dude grown up. The sales pitch
mentioned "price point" about twenty times (to the point where it was
annoying), and the pitch ultimately boiled down to "Dude, you're
getting a SAN." My apologies in advance to bringing back repressed
memories of the Dell dude. As far as technical stuff goes, it's about
what you'd expect from a low-level SAN. The cost for a SAN was in the
$2-3 per GB range if you went with the cheap option...not terrible,
but not great either, especially since you'd have to buy lots of GB.
Performance numbers weren't bad, but they weren't great either.

- 3par
The sales pitch was more focused on technical aspects and only
mentioned "price point" twice...which is a win in my books, at least
compared to Dell. Their real place to shine was in the technical
aspect. Whereas Dell just wanted to sell you a storage system that you
put on a network, 3par wanted to sell you a storage system
specifically designed for a network, and change the very way you think
about storage. They had a bunch of cool management concepts, and very
advanced failover, power outage, and backup techniques and tools.
Performance wasn't shabby, either, for instance a RAID 5 set could get
about 90% the IOPS and transfer rate that a RAID 10 set could. How
exactly this compares to DAS they didn't say. The main stumbling block
with 3par is price. While they didn't give any specific numbers, best
estimates put a SAN in the $5-7 per GB range. The extra features just
might be worth it though.

- Lefthand
This is going to be an upcoming meeting, so I don't have as good of an
opinion. Looking at their website, they seem more to the Dell end in
terms of price and functionality. I'll keep you in touch as I have
more info. They seem good for entry-level SANs, though.

Luckily, almost everything here works with Linux (at least the major
distros), including the management tools, in case people were worried
about that. One of the key points to consider going forward is that
the competition of iSCSI and Fibre Channel techs will likely bring
price down in the future. While SANs are certainly more expensive than
their DAS counterparts, the gap appears to be closing.

However, to paraphrase a discussion between a few of my co-workers,
you can buy toilet paper or kitty litter in huge quantities because
you know you'll eventually use it...and it doesn't change in
performance or basic functionality. Storage is just something that you
don't always want to buy a lot of in one go. It will get bigger, and
cheaper, and probably faster in a relatively short amount of time. The
other thing is that you can't really get a small SAN. The minimum is
usually in the multiple TB range (and usually >10 TB). I'd love to be
able to put together a proof of concept and a test using 3par's
technology and commodity 80GB slow disks, but I really can't. You're
stuck with going all-in right away, and enough people have had
problems being married to specific techs or vendors that it's really
hard to break that uneasiness.

Thanks for reading, hopefully you found it slightly informative.

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] best way to run maintenance script

2008-03-16 Thread Peter Eisentraut
Vinubalaji Gopal wrote:
> I tried using the nice command (Linux system) on the maintenance script
> - it did not have any effect - guess it does not change the niceness of
> the postgresql vacuum process.

You are probably looking for the command ionice.  nice only affects the CPU 
priority, and that is usually not the primary problem for vacuum.  (And yes, 
you need to nice the server process, not the client script.)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] What is the best way to storage music files in Postgresql

2008-03-17 Thread Peter Koczan
>  > I am going to embark on building a music library using apache,
>  > postgresql and php.  What is the best way to store the music files?
>
>  Your options are either to use a BLOB within the database or to store
>  paths to normal files in the file system in the database. I suspect
>  using normal files will make backup and management a great deal easier
>  than using in-database BLOBs, so personally I'd do it that way.

I discussed something like this with some co-workers recently, and
here's what I had to say. Not all of these apply to the original
message, but they are things to consider when marrying a database to a
file storage system.

Storing the files in the database as BLOBs:
Pros:
- The files can always be seen by the database system as long as it's
up (there's no dependence on an external file system).
- There is one set of locking mechanisms, meaning that the file
operations can be atomic with the database operations.
- There is one set of permissions to deal with.
Cons:
- There is almost no way to access files outside of the database. If
the database goes down, you are screwed.
- If you don't make good use of tablespaces and put blobs on a
separate disk system, the disk could thrash going between data and
blobs, affecting performance.
- There are stricter limits for PostgreSQL blobs (1 GB size limits, I've read).

Storing files externally, storing pathnames in the database:
Pros:
- You can access and manage files from outside the database and
possibly using different interfaces.
- There's a lot less to store directly in the database.
- You can use existing file-system permissions, mechanisms, and limits.
Cons:
- You are dealing with two storage systems and two different locking
systems which are unlikely to play nice with each other. Transactions
are not guaranteed to be atomic (e.g. a database rollback will not
rollback a file system operation, a commit will not guarantee that
data in a file will stay).
- The file system has to be seen by the database system and any remote
clients that wish to use your application, meaning that a networked FS
is likely to be used (depending on how many clients you have and how
you like to separate services), with all the fun that comes from
administering one of those. Note that this one in particular really
only applies to enterprise-level installations, not smaller
installations like the original poster's.
- If you don't put files on a separate disk-system or networked FS,
you can get poor performance from the disk thrashing between the
database and the files.

There are a couple main points:
1. The favorite answer in computing, "it depends", applies here. What
you decide depends on your storage system, your service and
installation policies, and how important fully atomic transactions are
to you.
2. If you want optimal performance out of either of these basic
models, you should make proper use of separate disk systems. I have no
idea which one is faster (it depends, I'm sure) nor do I have much of
an idea of how to benchmark this properly.
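
As a concrete (if simplified) sketch of the second approach, with made-up
names - a table storing metadata plus a pathname, leaving the bytes on the
file system:

CREATE TABLE songs (
    song_id   serial PRIMARY KEY,
    title     text NOT NULL,
    artist    text,
    file_path text NOT NULL UNIQUE    -- e.g. '/srv/music/artist/album/track01.ogg'
);

-- the application stores the file first, then records where it put it
INSERT INTO songs (title, artist, file_path)
VALUES ('Some Song', 'Some Artist', '/srv/music/some_artist/some_song.ogg');

For the in-database approach you would have something like a bytea column (or
a large object) in place of file_path, with the trade-offs listed above.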

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] What is the best way to storage music files in Postgresql

2008-03-17 Thread Peter Koczan
>  It seems to me as such a database gets larger, it will become much harder to 
> manage with the 2 systems.  I am talking mostly about music.  So each song 
> should not get too large.

I was just talking about points to consider in general. Getting to
your specific situation...

As far as BLOBs vs. file pointers go: test it out and use what you're most
comfortable using.

I would not set up a networked file system for the sole purpose of
managing and storing files a database will point to. If you already
have access to a networked file system, consider that as an option,
but don't add more work for yourself if you don't have to. Many
applications I work on use the database to store pathnames while the
files themselves are stored in a networked file system. It's honestly
not a huge pain to manage this if it's already available, but as I
mentioned before, there are caveats.

Also, in my experience, the amount of management you do in a database
doesn't directly depend on the amount of data you put in. In other
words, your database shouldn't become much more difficult to manage
over time if all you are doing is adding more rows to tables.

> I have read alot on this list and on other resources and there seems to be 
> leanings toward 1+0 raids for storage.  It seems to the most flexible when it 
> comes to speed, redundancy and recovery time.  I do want my database to be 
> fully atomic.  I think that is important as this database grows.  Are my 
> assumptions wrong?
>

As far as RAID levels go, RAID 10 is usually optimal for databases, so
your assumptions are correct. The extra cost for disks, I believe, is
paid off by the advantages you mentioned, at least for typical
database-related workloads. RAID 0 doesn't allow for any disaster
recovery, RAID 1 is ok as long as you can handle having only 2 disks
available, and RAID 5 and RAID 6 are just huge pains and terribly slow
for writes.

Note that you should go for a battery-backed cache if you use hardware RAID.

Hope this helps.

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Anyone using a SAN?

2008-03-19 Thread Peter Koczan
>  Dell acquired Equallogic last November/December.
>
>  I noticed your Dell meeting was a Dell/EMC meeting. Have you talked to them
> or anyone else about Equallogic?

Now that you mention it, I do recall a bit about Equallogic in the Dell
pitch. It didn't really stand out, and a lot of the technical details
were similar enough to the EMC details that the two just melded
together in my mind.

>  When I was looking at iSCSI solutions, the Equallogic was really slick. Of
> course, I needed high-end performance, which of course came at a steep
> price, and the project got canned. Oh well. Still, the EL solution claimed
> near linear scalability when additional capacity/shelves were added. And,
> they have a lot of really nice technologies for managing the system.

If you think Equallogic is slick, check out 3par. They've got a lot of
very cool features and concepts. Unfortunately, this comes at a higher
price. To each his own, I guess.

Our meetings didn't focus a lot on scalability of capacity, as we just
didn't think to ask. I think the basic pitch was "it scales well"
without any real hard data.

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] postgresql is slow with larger table even it is in RAM

2008-03-26 Thread Peter Koczan
On Tue, Mar 25, 2008 at 3:35 AM, sathiya psql <[EMAIL PROTECTED]> wrote:
> Dear Friends,
>  I have a table with 32 lakh record in it. Table size is nearly 700 MB,
> and my machine had a 1 GB + 256 MB RAM, i had created the table space in
> RAM, and then created this table in this RAM.
>
> So now everything is in RAM, if i do a count(*) on this table it returns
> 327600 in 3 seconds, why it is taking 3 seconds ? because am sure that
> no Disk I/O is happening. ( using vmstat i had confirmed, no disk I/O is
> happening, swap is also not used )
>
> Any Idea on this ???
>
> I searched a lot in newsgroups ... can't find relevant things ( because
> everywhere they are speaking about disk access speed, here i don't want to
> worry about disk access )
>
>  If required i will give more information on this.

Two things:

- Are you VACUUM'ing regularly? It could be that you have a lot of
dead rows and the table is spread out over a lot of pages of mostly
dead space. That would cause *very* slow seq scans.

- What is your shared_buffers set to? If it's really low then postgres
could be constantly swapping from ram-disk to memory. Not much would
be cached, and performance would suffer.
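
A couple of quick checks along those lines (a minimal sketch; "mytable"
stands in for your actual table name):

  -- Current shared_buffers setting.
  SHOW shared_buffers;

  -- Reports how many dead row versions and pages a plain VACUUM finds.
  VACUUM VERBOSE mytable;

  -- Keep planner statistics current as well.
  ANALYZE mytable;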

FWIW, I did a select count(*) on a table with just over 30 rows,
and it only took 0.28 sec.

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread Peter Eisentraut
Laszlo Nagy wrote:
> Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs
> say that it is better to use FreeBSD because it can alter the I/O
> priority of processes dynamically.

Where does it say that?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Replication Syatem

2008-04-28 Thread Peter Childs
2008/4/28 Gauri Kanekar <[EMAIL PROTECTED]>:

> All,
>
> We have a table "table1" which get insert and updates daily in high
> numbers, bcoz of which its size is increasing and we have to vacuum it every
> alternate day. Vacuuming "table1" take almost 30min and during that time the
> site is down.
>
> We need to cut down on this downtime.So thought of having a replication
> system, for which the replicated DB will be up during the master is getting
> vacuumed.
>
> Can anybody guide which will be the best suited replication solution for
> this.
>
> Thanx for any help
> ~ Gauri
>

I hope you're not using VACUUM FULL... (the standard reply for this type of
question).

What version of PostgreSQL are you using?

Have you tried autovacuum?

Run a plain VACUUM on this table even more often (say, every half
hour); it should not take as long, and it will keep the space under control.

If you still have trouble, run "vacuum verbose analyse table1;" and see what it
says.
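
To spell that out, a quick sketch of the commands I mean (plain VACUUM,
unlike VACUUM FULL, does not hold an exclusive lock for the whole run):

  -- Plain VACUUM marks dead rows reusable while the site stays up.
  VACUUM table1;

  -- VERBOSE plus ANALYZE also reports dead-row counts and refreshes
  -- planner statistics (VERBOSE goes before ANALYZE in this grammar).
  VACUUM VERBOSE ANALYZE table1;

  -- VACUUM FULL, by contrast, compacts the table under an exclusive
  -- lock, which is the usual cause of this kind of downtime.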

If you're doing it right, you should be able to vacuum with the database up.

It sounds like you might be happier with a fix for the actual problem rather
than a complex workaround that solves a completely different one.

Regards

Peter.


[PERFORM] VACUUM ANALYZE blocking both reads and writes to a table

2008-06-30 Thread Peter Schuller
Hello,

my understanding, and generally my experience, has been that VACUUM
and VACUUM ANALYZE (but not VACUUM FULL) are never supposed to block
either SELECTs or UPDATEs/INSERTs/DELETEs on a table.

This is seemingly confirmed by reading the "explicit locking"
documentation, in terms of the locks acquired by various forms of
vacuuming, and with which other lock modes they conflict.

I have now seen it happen twice that a VACUUM ANALYZE has seemingly
been the trigger for blocked queries.

In the first instance, we had two particularly interesting things
going on:

  VACUUM ANALYZE thetable
  LOCK TABLE thetable IN ACCESS SHARE MODE

In addition there was one SELECT from the table and a bunch of
INSERTs (this is based on pg_stat_activity).

While I am unsure why there is an explicit LOCK in ACCESS SHARE MODE
(the application never does any explicit locking on this table), that
is the lock mode normally used for SELECTs. I suspect it may be a
referential-integrity-related acquisition generated by PG.

The second time it happened, there was again a single SELECT, a bunch
of INSERTs, and then:

  VACUUM ANALYZE thetable

This time there was no explicit LOCK visible.

In both cases, activity was completely blocked until the VACUUM
ANALYZE completed.

Does anyone have input on why this could be happening? The PostgreSQL
version is 8.2.4[1]. Am I correct in that it *should* not be possible
for this to happen?

For the next time this happens I will try to have a query prepared
that will dump as much relevant information as possible regarding
acquired locks. 
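
Along those lines, roughly the kind of query I have in mind (written
against the 8.2 catalogs, where pg_stat_activity still exposes
procpid):

  SELECT l.pid, l.mode, l.granted, c.relname,
         a.current_query, a.query_start
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
  LEFT JOIN pg_stat_activity a ON a.procpid = l.pid
  WHERE c.relname = 'thetable'
  ORDER BY l.granted, l.pid;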

If it makes a difference, the SELECT does have a subselect that also
selects from the same table - a MAX(column) on an indexed column.

[1] I did check the ChangeLog for 8.2.x releases above .4, and the 8.3
releases, but did not see anything that indicated locking/conflict
related fixes in relation to vacuums.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table

2008-06-30 Thread Peter Schuller
Hello,

> No.  VACUUM takes an exclusive lock at the end of the operation to
> truncate empty pages.  (If it cannot get the lock then it'll just skip
> this step.)  In 8.2.4 there was a bug that caused it to sleep
> according to vacuum_delay during the scan to identify possibly empty
> pages.  This was fixed in 8.2.5:

[snip revision log]

Thank you very much! This does indeed seem to be the likely
culprit. I will try to either upgrade or, if that is not possible in
time for the next occurrence, confirm that this is what is happening
based on pg_locks.

Thanks again for the very informative response.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table

2008-06-30 Thread Peter Schuller
Actually, while on the topic:

> date: 2007-09-10 13:58:50 -0400;  author: alvherre;  state: Exp;  lines: 
> +6 -2;
> Remove the vacuum_delay_point call in count_nondeletable_pages, because 
> we hold
> an exclusive lock on the table at this point, which we want to release as 
> soon
> as possible.  This is called in the phase of lazy vacuum where we 
> truncate the
> empty pages at the end of the table.

Even with the fix, the lock is held. Is the operation expected to be
"fast" (for some definition of "fast") and in-memory, or is it
something that causes significant disk I/O and/or scales badly with
table size or similar?

I.e., is this enough that, even without the .4 bug, one should not
really consider VACUUM ANALYZE non-blocking with respect to other
transactions?

(I realize various exclusive locks are taken for short periods of time
even for things that are officially declared non-blocking; the
question is whether this falls into this category.)

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table

2008-07-01 Thread Peter Schuller
> > (2) If it's autovacuum we're talking about, it will get kicked off the
> > table if anyone else comes along and wants a conflicting lock.
> 
> Not on 8.2 though.

That is also nice to know. One more reason to upgrade to 8.3.

Thank you very much, both Alvaro and Tom, for the very insightful
discussion!

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org





Re: [PERFORM] The state of PG replication in 2008/Q2?

2008-08-22 Thread Peter Eisentraut
Dan Harris wrote:
> My desire would be to have a parallel server that could act as a hot
> standby system with automatic fail over in a multi-master role.

I will add my "me too" for DRBD + Heartbeat.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

