Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Tambet Matiisen
> --
> 
> Date: Mon, 14 Mar 2005 09:41:30 +0800
> From: "Qingqing Zhou" <[EMAIL PROTECTED]>
> To: pgsql-performance@postgresql.org
> Subject: Re: One tuple per transaction
> Message-ID: <[EMAIL PROTECTED]>
> 
> ""Tambet Matiisen"" <[EMAIL PROTECTED]> writes
...
> > If I'm correct, the dead tuples must be scanned also during table and
> > index scan, so a lot of dead tuples slows down queries considerably,
> > especially when the table doesn't fit into shared buffers any more.
> > And as I'm in a transaction, I can't VACUUM to get rid of those tuples.
> > On one occasion the page count for a table went from 400 to 22000 at
> > the end.
> 
> Not exactly. The dead tuple in the index will be scanned the 
> first time (and its pointed heap tuple as well), then we will 
> mark it dead, then next time we come here, we will know that 
> the index tuple actually points to a useless tuple, so we 
> will not scan its pointed heap tuple.
> 

But the dead index tuple will still be read from disk next time? Maybe
the performance loss is really negligible, but if most of the tuples in
your table/index are dead, then it might be significant.

Consider the often suggested solution for speeding up "select count(*)
from table" queries: make another table, rowcounts, and for each of the
original tables add insert and delete triggers to update the row count in
the rowcounts table. Actually this is a standard denormalization technique,
which I use often. For example, to ensure that order.total =
sum(order_line.total).

Now, if typical inserts into your most active table occur in batches of
3 rows in one transaction, then the row count for this table is updated 3
times during the transaction. 3 updates generate 3 tuples, and 2 of them
are dead from the very start. You effectively commit 2 useless tuples.
After millions of inserts you end up with a rowcounts table that is 2/3
dead tuples, and queries start to slow down.
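
(For illustration, a minimal sketch of this setup; the table, trigger and
names below are invented for the example and assume plpgsql is installed:)

CREATE TABLE orders (id serial PRIMARY KEY, customer_id integer);
CREATE TABLE rowcounts (table_name text PRIMARY KEY, n_rows bigint NOT NULL);
INSERT INTO rowcounts VALUES ('orders', 0);

-- per-row trigger that maintains the count for "orders"
CREATE FUNCTION orders_rowcount() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE rowcounts SET n_rows = n_rows + 1 WHERE table_name = ''orders'';
    ELSE  -- DELETE
        UPDATE rowcounts SET n_rows = n_rows - 1 WHERE table_name = ''orders'';
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
' LANGUAGE plpgsql;

CREATE TRIGGER orders_rowcount_trg
    AFTER INSERT OR DELETE ON orders
    FOR EACH ROW EXECUTE PROCEDURE orders_rowcount();

-- a batch of three inserts in one transaction updates the same rowcounts
-- row three times; two of those row versions are dead at commit
BEGIN;
INSERT INTO orders (customer_id) VALUES (1);
INSERT INTO orders (customer_id) VALUES (2);
INSERT INTO orders (customer_id) VALUES (3);
COMMIT;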

The current solution is to vacuum often. My proposal was to create a new
tuple only on the first update. The next updates in the same transaction
would update the existing tuple, not create a new one.

But as I'm writing this, I'm starting to see some of the associated
implementation problems. The updated tuple might not be the same size as
the previous tuple. In-place tuple updates are probably not implemented
anyway, and for a reason: a disk write takes the same time whether you
update in place or write new data. And there are tons of other problems,
which the developers are probably more aware of than I am.

But one thing still bothers me. Why is a new index tuple generated when I
update a non-indexed column? OK, I get it again. The index tuple points to
the heap tuple, so after the update it would point to a dead tuple. And as
it takes the same time to update the pointer or to write a new tuple, it's
easier to write a new one.

Case closed.

  Tambet



Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Richard Huxton
Tambet Matiisen wrote:
Not exactly. The dead tuple in the index will be scanned the 
first time (and its pointed heap tuple as well), then we will 
mark it dead, then next time we come here, we will know that 
the index tuple actually points to a useless tuple, so we 
will not scan its pointed heap tuple.


But the dead index tuple will still be read from disk next time? Maybe
the performance loss is really negligible, but if most of the tuples in
your table/index are dead, then it might be significant.
When a block is read from disk, any dead tuples in that block will be 
read in. Vacuum recovers these.

Consider the often suggested solution for speeding up "select count(*)
from table" query: make another table rowcounts and for each of the
original tables add insert and delete triggers to update row count in
rowcounts table. Actually this is standard denormalization technique,
which I use often. For example to ensure that order.total =
sum(order_line.total).
This does of course completely destroy concurrency. Since you need to 
lock the summary table, other clients have to wait until you are done.

Now, if typical inserts into your most active table occur in batches of
3 rows, in one transaction, then row count for this table is updated 3
times during transaction. 3 updates generate 3 tuples, while 2 of them
are dead from the very start. You effectively commit 2 useless tuples.
After millions of inserts you end up with rowcounts table having 2/3 of
dead tuples and queries start to slow down.
Current solution is to vacuum often. My proposal was to create new tuple
only with first update. The next updates in the same transaction would
update the existing tuple, not create a new. 
How do you roll back to a savepoint with this model?
--
  Richard Huxton
  Archonet Ltd


Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Tambet Matiisen


> -Original Message-
> From: Richard Huxton [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, March 15, 2005 11:38 AM
> To: Tambet Matiisen
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] One tuple per transaction
> 
...
> 
> > Consider the often suggested solution for speeding up "select count(*)
> > from table" query: make another table rowcounts and for each of the
> > original tables add insert and delete triggers to update row count in
> > rowcounts table. Actually this is standard denormalization technique,
> > which I use often. For example to ensure that order.total =
> > sum(order_line.total).
> 
> This does of course completely destroy concurrency. Since you need to 
> lock the summary table, other clients have to wait until you are done.
> 

Yes, it does for the rowcounts table. But consider the orders example - it
only locks the order to which I am adding lines. As there is mostly one
client dealing with one order, but possibly thousands dealing with different
orders, it should not pose any concurrency restrictions.

> > Now, if typical inserts into your most active table occur in batches
> > of 3 rows, in one transaction, then row count for this table is
> > updated 3 times during transaction. 3 updates generate 3 tuples, while
> > 2 of them are dead from the very start. You effectively commit 2
> > useless tuples. After millions of inserts you end up with rowcounts
> > table having 2/3 of dead tuples and queries start to slow down.
> > 
> > Current solution is to vacuum often. My proposal was to create new
> > tuple only with first update. The next updates in the same transaction
> > would update the existing tuple, not create a new.
> 
> How do you roll back to a savepoint with this model?
> 

Every savepoint initiates a new (sub)transaction.
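
(With the rowcounts example from earlier in the thread, the sequence in
question would look like this; savepoint syntax as in 8.0:)

BEGIN;
UPDATE rowcounts SET n_rows = n_rows + 1 WHERE table_name = 'orders';
SAVEPOINT s1;                -- starts a subtransaction
UPDATE rowcounts SET n_rows = n_rows + 1 WHERE table_name = 'orders';
ROLLBACK TO SAVEPOINT s1;    -- discards only the work done after s1
COMMIT;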

  Tambet



Re: [PERFORM] Changing the random_page_cost default (was: cpu_tuple_cost)

2005-03-15 Thread Greg Sabino Mullane



Josh Berkus wrote:
> I don't agree.  The defaults are there for people who aren't going to read
> enough of the documentation to set them.  As such, conservative for the
> defaults is appropriate.

Sure, but I would argue that 4 is *too* conservative. We've certainly changed
other values over the years. As I see it, those most affected by this change
are those least likely to have the know-how to change the default, and they
are also the majority of our users. I've often had to reduce this, working
on many databases, on many versions of PostgreSQL. Granted, I don't work on
any huge, complex, hundreds-of-gigabytes databases, but that supports my point -
if you are really better off with a /higher/ (than 3) random_page_cost, you
already should be tweaking a lot of stuff yourself anyway. Tom Lane has a
good point about tweaking other default parameters as well, and that's a
worthy goal, but I don't think extended searching for a "sweet spot" should
prevent us from making a small yet important (IMO!) change in the default
of this one variable.

N.B. My own personal starting default is 2, but I thought 3 was a nice
middle ground more likely to reach consensus here. :)

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200503141727
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [PERFORM] Changing the random_page_cost default (was: cpu_tuple_cost)

2005-03-15 Thread Jeff Hoffmann
On Mar 15, 2005, at 6:35 AM, Greg Sabino Mullane wrote:
Granted, I don't work on
any huge, complex, hundreds of gig databases, but that supports my 
point -
if you are really better off with a /higher/ (than 3) 
random_page_cost, you
already should be tweaking a lot of stuff yourself anyway.
I think this is a good point.  The people that tend to benefit from the 
lower cost are precisely the people least likely to know to change it.  
It's the "install & go" crowd with smaller databases and only a few 
users/low concurrency that expect it to "just work".  The bigger 
installations are more likely to have dedicated DB admins who understand 
tuning.

Wasn't there an idea on the table once to ship with several different 
configuration files with different defaults for small, medium, large, 
etc. installs?  Wouldn't it make sense to ask the user during initdb to 
pick from one of the default config files?  Or even have a few simple 
questions like "How much memory do you expect to be available to 
PostgreSQL?" and "How many concurrent users do you expect to have?".  
It's one thing to know how much memory is in a machine; it's quite 
another to know how much of it the user wants dedicated to PostgreSQL.  
A couple of questions like that can go a long way to coming up with 
better ballpark figures.

--
Jeff Hoffmann
[EMAIL PROTECTED]


[PERFORM] interesting benchmarks PG/Firebird Linux/Windows fsync/nofsync

2005-03-15 Thread Chris Mair
Hello,

just recently I held a short course on PG.

One course attendant, Robert Dollinger, got 
interested in benchmarking single inserts (since
he currently maintains an application that does
exactly that on Firebird and speed is an issue
there).

He came up with a table that I think is interesting
for other people so I asked permission to publish
it on this list.

Here it is:
http://1006.org/pg/postgresql_firebird_win_linux.pdf

Note: some German words are in there; I can't change
the PDF, so here's a short explanation:

He tested the speed of 4000 inserts through a Delphi
application with Zeos components.

The 3 parameters are:

* transaction
  - single: all 4000 inserts inside 1 transaction
  - multi: 4000 inserts with 4000 commits

* fsync (for PG) or forced writes (for FB)
  - true/false

* "Verbindung" = connection
  - local
  - LAN
  - wireless

  notes: the server ran either on a Windows desktop
  machine or a Linux laptop; the client always ran
  on the Windows desktop

Timings are in msec. Note that you cannot directly
compare Windows and Linux performance, since the machines
were different.

You can, however, compare PG to Firebird, and you
can see the effect of the 3 varied parameters.
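
(Roughly, the two transaction modes correspond to the following; the table
and values are invented for illustration:)

-- "single": all 4000 inserts inside one transaction, one commit
CREATE TABLE t (val integer);
BEGIN;
INSERT INTO t VALUES (1);
-- ... 3999 more inserts ...
COMMIT;

-- "multi": autocommit, 4000 separate transactions, so with fsync
-- enabled every single statement waits for its own flush to disk
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);
-- ... and so on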

One thing that stands out is how terribly
bad Windows performed with many small single
transactions and fsync=true.

Apparently fsync on Windows is a very costly
operation.

Another (good) thing is that PG beats FB on all
other tests :-)


Bye, Chris.








Re: [PERFORM] interesting benchmarks PG/Firebird Linux/Windows fsync/nofsync

2005-03-15 Thread Magnus Hagander

> One thing that stands out is how terribly bad Windows 
> performed with many small single transactions and fsync=true.
> 
> Apparently fsync on Windows is a very costly operation.

What's the hardware? If you're running on disks with the write cache
enabled, fsync on Windows will write through the write cache *no matter
what*. I don't know of any other OS where it will do that.

If you don't have a battery-backed write cache, then all other
configurations are considered very dangerous in case your machine
crashes.

If you have a battery-backed write cache, then yes, pg on Windows will
perform poorly indeed.


There is a patch in the queue for 8.0.2, and already applied to 8.1
IIRC, that will fix the bad performance with write-cache on win32.

(I can't read the PDF; it crashes my Adobe Reader for some reason. Perhaps
it contains the information above...)

//Magnus



Re: [PERFORM] interesting benchmarks PG/Firebird Linux/Windows fsync/nofsync

2005-03-15 Thread Bruce Momjian
Chris Mair wrote:
> Timings are in msec, note that you cannot directly
> compare Windows and Linux Performance, since machines
> were different.
> 
> You can, however, compare PG to Firebird, and you
> can see the effect of the 3 varied parameters.
> 
> One thing that stands out is how terribly
> bad Windows performed with many small single
> transactions and fsync=true.
> 
> Apparently fsync on Windows is a very costly
> operation.

Yes, we now enable FILE_FLAG_WRITE_THROUGH on Win32 for open_sync, and I
am about to open a discussion about whether this should be the default for
Win32, and whether we should backpatch this to 8.0.X.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



[PERFORM] Slow loads when indexes added.

2005-03-15 Thread Stef
Hi all,

I suspect this problem/bug has been dealt with already, but I couldn't
find anything in the mail archives.

I'm using postgres 7.3, and I managed to recreate the problem using the
attached files.

The database structure is in slow_structure.sql

After creating the database using this script, I ran run_before_load__fast.sql

Then I created a load file using create_loadfile.sh (it creates a file called
load.sql)

I timed the loading of this file, and it loaded in 1 min 11.567 sec

Then I recreated the database from slow_structure.sql, ran
run_before_load__slow.sql, and then loaded the same load.sql. It took
3 min 51.293 sec, which is about 3 times slower.

I tried the same thing on postgres 8.0.0 to see if it does the same thing,
but there it was consistently slow: 3 min 31.367 sec

The only way I got the load.sql to load fast on postgres 8.0.0 was by not
creating any of the foreign key constraints that point to the "main" table,
and then enabling them afterwards. This gave me the fastest time overall:
1 min 4.911 sec
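
(In other words, something along these lines; the constraint names are
placeholders, since the real ones are defined in slow_structure.sql:)

-- load with the foreign keys that point to "main" left out ...
\i load.sql

-- ... and add them afterwards
ALTER TABLE a ADD CONSTRAINT a_id_fkey FOREIGN KEY (id) REFERENCES main (id);
ALTER TABLE b ADD CONSTRAINT b_id_fkey FOREIGN KEY (id) REFERENCES main (id);
-- ... likewise for e, f, g and h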

My problem is that on the postgres 7.3.4 database I'm working with, a load
process that used to take 40 minutes now takes 4 hours, because 3 rows of
data were loaded into a table (similar in setup to the "main" table in the
example) before the indexes were created. (This happens automatically when
you dump and re-import the database on 7.3.4.)

Is there a way to get it to load fast again on the 7.3 database without
dropping the foreign key constraints (after running run_before_load__slow.sql)?

And, if someone knows off-hand, what's happening here?

TIA
Kind Regards
Stefan

#!/bin/bash
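# Generates load.sql: for each (x, y, z) combination below, one insert into
# "main" followed by one insert into each of the tables referencing it.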

for x in one two three four five six seven eight nine ten
do 
   for y in eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty
   do
  for z in `seq 1 100`
  do
echo "insert into main (c_text,d_text) values ('${x}','${y}');" >> load.sql
echo "insert into a(id) values (currval('public.main_id_seq'::text));" >> load.sql
echo "insert into b(id) values (currval('public.main_id_seq'::text));" >> load.sql
echo "insert into e(id) values (currval('public.main_id_seq'::text));" >> load.sql
echo "insert into f(id) values (currval('public.main_id_seq'::text));" >> load.sql
echo "insert into g(id) values (currval('public.main_id_seq'::text));" >> load.sql
echo "insert into h(id) values (currval('public.main_id_seq'::text));" >> load.sql
  done
   done
done


run_before_load__fast.sql
Description: Binary data


run_before_load__slow.sql
Description: Binary data


slow_structure.sql
Description: Binary data



Re: [PERFORM] Changing the random_page_cost default (was: cpu_tuple_cost)

2005-03-15 Thread Tom Lane
"Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
> N.B. My own personal starting default is 2, but I thought 3 was a nice
> middle ground more likely to reach consensus here. :)

Your argument seems to be "this produces nice results for me", not
"I have done experiments to measure the actual value of the parameter
and it is X".  I *have* done experiments of that sort, which is where
the default of 4 came from.  I remain of the opinion that reducing
random_page_cost is a band-aid that compensates (but only partially)
for problems elsewhere.  We can see that it's not a real fix from
the not-infrequent report that people have to reduce random_page_cost
below 1.0 to get results anywhere near local reality.  That doesn't say
that the parameter value is wrong, it says that the model it's feeding
into is wrong.

regards, tom lane



Re: [PERFORM] Avoiding tuple construction/deconstruction during joining

2005-03-15 Thread PFC
	I have asked him for the data and played with his queries, and obtained  
massive speedups with the following queries :

http://boutiquenumerique.com/pf/miroslav/query.sql
http://boutiquenumerique.com/pf/miroslav/query2.sql
http://boutiquenumerique.com/pf/miroslav/materialize.sql
	Note that my optimized version of the Big Joins is not much faster than  
the materialized view without an index (hash joins are damn fast in postgres)  
but of course using an index...



Re: [PERFORM] Avoiding tuple construction/deconstruction during joining

2005-03-15 Thread PFC
	On my machine (Laptop with Pentium-M 1.6 GHz and 512MB DDR333) I get the  
following timings :

	Big Joins Query with all the fields and no order by (I just put a SELECT  
* on the first table) yielding about 6k rows :
	=> 12136.338 ms

	Replacing the SELECT * from the table with many fields by just a SELECT  
of the foreign key columns :
	=> 1874.612 ms

	I felt like playing a bit so I implemented a hash join in python  
(download the file, it works on Miroslav's data) :
	All timings do not include time to fetch the data from the database.  
Fetching all the tables takes about 1.1 secs.

	* With something that looks like the current implementation (copying  
tuples around) and fetching all the fields from the big table :
	=> Fetching all the tables : 1.1 secs.
	=> Joining : 4.3 secs

* Fetching only the integer fields
=> Fetching all the tables : 0.4 secs.
=> Joining : 1.7 secs
	* A smarter join which copies nothing and updates the rows as they are  
processed, adding fields :
	=> Fetching all the tables :  1.1 secs.
	=> Joining : 0.4 secs
	With the just-in-time compiler activated, it goes down to about 0.25  
seconds.

	First thing, this confirms what Tom said.
	It also means that doing this query in the application can be a lot  
faster than doing it in postgres including fetching all of the tables.  
There's a problem somewhere ! It should be the other way around ! The  
python mappings (dictionaries : { key : value } ) are optimized like crazy  
but they store column names for each row. And it's a dynamic script  
language ! Argh.

Note : run the program like this :
python test.py |less -S
	So that the time spent scrolling your terminal does not spoil the  
measurements.

Download test program :
http://boutiquenumerique.com/pf/miroslav/test.py


[PERFORM] Performance problem on delete from for 10k rows. May takes 20 minutes through JDBC interface

2005-03-15 Thread David Gagnon
Hi all,
 I get this strange problem when deleting rows from a Java program.  
Sometimes (from what I've noticed, it's not all the time) the server takes 
almost forever to delete rows from a table.

Here it takes 20 minutes to delete from the IC table.
Java logs:
INFO  [Thread-386] (Dao.java:227)  2005-03-15 15:38:34,754 : Execution 
SQL file: resources/ukConfiguration/reset_application.sql
DELETE FROM YR
INFO  [Thread-386] (Dao.java:227)  2005-03-15 15:38:34,964 : Execution 
SQL file: resources/inventory/item/reset_application.sql
DELETE FROM IC
INFO  [Thread-386] (Dao.java:227)  2005-03-15 15:58:45,072 : Execution 
SQL file: resources/ukResource/reset_application.sql
DELETE FROM RA

I get this problem in my dev (Windows/7.4/Cygwin) environment.  But now 
I see that I also have this problem in my production env.  Yes, I 
thought it was maybe just a Cygwin/Windows problem .. apparently not :-

On my dev machine I can see the PostgreSQL-related process running at almost 
50% CPU usage the whole time, so I suppose it's something inside 
PostgreSQL.  I remember having tried to delete the contents of my table 
(IC) from pgAdmin III and it took a couple of seconds!!! Not minutes.  So 
the process doesn't jam, it just takes time .. any idea what PostgreSQL is 
doing during this time??

If you have any idea what the problem could be... I'd really appreciate 
it. 

Thanks for any help!
/David




Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Robert Treat
On Tuesday 15 March 2005 04:37, Richard Huxton wrote:
> Tambet Matiisen wrote:
> > Now, if typical inserts into your most active table occur in batches of
> > 3 rows, in one transaction, then row count for this table is updated 3
> > times during transaction. 3 updates generate 3 tuples, while 2 of them
> > are dead from the very start. You effectively commit 2 useless tuples.
> > After millions of inserts you end up with rowcounts table having 2/3 of
> > dead tuples and queries start to slow down.
> >
> > Current solution is to vacuum often. My proposal was to create new tuple
> > only with first update. The next updates in the same transaction would
> > update the existing tuple, not create a new.
>
> How do you roll back to a savepoint with this model?
>

You can't, but you could add the caveat to just do this auto-reuse within any 
given nested transaction.  Then as long as you aren't using savepoints you 
get to reclaim all the space. 

 On a similar note, I was just wondering if it would be possible to mark any of 
these dead tuples as ready to be reused at transaction commit time, since we 
know that they are dead to any and all other transactions currently going on.  
This would save you from having to vacuum to get the tuples marked ready for 
reuse.  In the above scenario this could be a win; whether it would be a win 
overall is hard to say. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [PERFORM] Changing the random_page_cost default (was: cpu_tuple_cost)

2005-03-15 Thread Mark Kirkwood
Tom Lane wrote:
"Greg Sabino Mullane" <[EMAIL PROTECTED]> writes:
N.B. My own personal starting default is 2, but I thought 3 was a nice
middle ground more likely to reach consensus here. :)

Your argument seems to be "this produces nice results for me", not
"I have done experiments to measure the actual value of the parameter
and it is X".  I *have* done experiments of that sort, which is where
the default of 4 came from.  I remain of the opinion that reducing
random_page_cost is a band-aid that compensates (but only partially)
for problems elsewhere.  We can see that it's not a real fix from
the not-infrequent report that people have to reduce random_page_cost
below 1.0 to get results anywhere near local reality.  That doesn't say
that the parameter value is wrong, it says that the model it's feeding
into is wrong.
I would like to second that. A while back I performed a number of 
experiments on differing hardware and came to the conclusion that the *real* 
random_page_cost was often higher than 4 (like 10-15 for multi-disk RAID 
systems).

However I have frequently adjusted Pg's random_page_cost to be less than 
4 - if it helped queries perform better.

So yes, it looks like the model is the issue - not the value of the 
parameter!

regards
Mark


Re: [PERFORM] Performance problem on delete from for 10k rows. May takes 20 minutes through JDBC interface

2005-03-15 Thread Michael Fuhr
On Tue, Mar 15, 2005 at 04:24:17PM -0500, David Gagnon wrote:

>  Il get this strange problem when deleting rows from a Java program.  
> Sometime (For what I noticed it's not all the time) the server take 
> almost forever to delete rows from table.

Do other tables have foreign key references to the table you're
deleting from?  If so, are there indexes on the foreign key columns?

Do you have triggers or rules on the table?

Have you queried pg_locks during the long-lasting deletes to see
if the deleting transaction is waiting for a lock on something?
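
For example (object names below are placeholders):

-- while a slow DELETE is running, look for ungranted locks
SELECT relation::regclass, pid, mode, granted
FROM pg_locks
WHERE NOT granted;

-- and make sure referencing columns are indexed, e.g. if RA references IC
CREATE INDEX ra_ic_idx ON ra (ic_id);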

> I remember having tried to delete the content of my table (IC) from
> PgAdminIII and it took a couple of seconds!!! Not minutes.

How many records did you delete in this case?  If there are foreign
key references, how many records were in the referencing tables?
How repeatable is the disparity in delete time?  A single test case
might have been done under different conditions, so it might not
mean much.  No offense intended, but "I remember" doesn't carry as
much weight as a documented example.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] cpu_tuple_cost

2005-03-15 Thread David Brown
Gregory Stark wrote:
The "this day and age" argument isn't very convincing. Hard drive capacity
growth has far outstripped hard drive seek time and bandwidth improvements.
Random access has more penalty than ever.
 

In point of fact, there haven't been noticeable seek time improvements 
for years. Transfer rates, on the other hand, have gone through the roof.

Which is why I would question the published tuning advice that 
recommends lowering it to 2 for arrays. Arrays increase the effective 
transfer rate more than they reduce random access times. Dropping from 4 
to 2 would reflect going from a typical single 7200rpm ATA drive to a 
15000rpm SCSI drive, but striping will move it back up again - probably 
even higher than 4 with a big array (at a guess, perhaps the 
relationship might be approximated as a square root after allowing for 
the array type?).

With default settings, I've seen the planner pick the wrong index unless 
random_page_cost was set to 2. But in testing on an ATA drive, I 
achieved slightly better plan costings by increasing cpu_tuple_cost 
(relative to cpu_index_tuple_cost - by default it's only a factor of 10) 
and actually *raising* random_page_cost to 5! So why pick on one 
parameter? It's all going to vary according to the query and the data.
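
(The sort of per-session experiment described above might look like this;
the table and query are placeholders:)

-- plan and timing under the defaults ...
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- ... versus a higher CPU cost per heap tuple and a *raised* random_page_cost
SET cpu_tuple_cost = 0.05;     -- default 0.01 (cpu_index_tuple_cost is 0.001)
SET random_page_cost = 5;      -- default 4
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;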

I agree with Tom 100%. Pulling levers on a wonky model is no solution.


Re: [PERFORM] Performance problem on delete from for 10k rows.

2005-03-15 Thread Rod Taylor
>  I get this problem on my dev (Windows/7.4/Cygwin) environment.  But now 
> I see that I also have this problem in my production env.  Yes, I 
> thought it was maybe just a Cygwin/Windows problem .. apparently not :-

Care to try again with logging enabled on the PostgreSQL side within the
development environment?

log_statement = true
log_duration = true
log_connections = on

Then run it via Java and from pgAdminIII and send us the two log
snippets as attachments?

Thanks.
-- 




Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Tom Lane
Robert Treat <[EMAIL PROTECTED]> writes:
>  On a similar note I was just wondering if it would be possible to
> mark any of these dead tuples as ready to be reused at transaction
> commit time, since we know that they are dead to any and all other
> transactions currently going on.

I believe VACUUM already knows that xmin = xmax implies the tuple
is dead to everyone.

> This would save you from having to vacuum to get the tuples marked
> ready for reuse.

No; you forgot about reclaiming associated index entries.

regards, tom lane



Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Alvaro Herrera
On Tue, Mar 15, 2005 at 06:51:19PM -0500, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> >  On a similar note I was just wondering if it would be possible to
> > mark any of these dead tuples as ready to be reused at transaction
> > commit time, since we know that they are dead to any and all other
> > transactions currently going on.
> 
> I believe VACUUM already knows that xmin = xmax implies the tuple
> is dead to everyone.

Huh, that is too simplistic in a world with subtransactions, isn't it?

One way to solve this would be for a transaction that kills a tuple to
check whether the tuple was created by itself (not necessarily the same Xid),
and somehow report it to the FSM right away.

That'd mean physically moving a lot of tuples in the page, so ISTM it's
too expensive an "optimization."  Oh, and it would also have to delete the
tuple from the indexes.

-- 
Alvaro Herrera (<[EMAIL PROTECTED]>)
"Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte" (Andre Breton)



Re: [PERFORM] Changing the random_page_cost default (was: cpu_tuple_cost)

2005-03-15 Thread Greg Sabino Mullane



> Your argument seems to be "this produces nice results for me", not
> "I have done experiments to measure the actual value of the parameter
> and it is X".  I *have* done experiments of that sort, which is where
> the default of 4 came from.  I remain of the opinion that reducing
> random_page_cost is a band-aid that compensates (but only partially)
> for problems elsewhere.  We can see that it's not a real fix from
> the not-infrequent report that people have to reduce random_page_cost
> below 1.0 to get results anywhere near local reality.  That doesn't say
> that the parameter value is wrong, it says that the model it's feeding
> into is wrong.

Good points: allow me to rephrase my question then:

When I install a new version of PostgreSQL and start testing my
applications, one of the most common problems is that many of my queries
are not hitting an index. I typically drop random_page_cost to 2 or
lower, and this speeds things up very significantly. How can I determine a
better way to speed up my queries, and why would that be advantageous
over simply dropping random_page_cost? And how can I use my particular
situation to help develop a better model and perhaps make the defaults
work better for my queries and for other people with databases like mine
(fairly simple schema, not too large (~2 GB total), SCSI, medium to
high complexity queries, a good amount of RAM available)?
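
(In practice the check looks something like this; the query is a placeholder,
and SET only affects the current session:)

EXPLAIN ANALYZE SELECT * FROM accounts WHERE owner_id = 42;  -- seq scan?
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT * FROM accounts WHERE owner_id = 42;  -- index scan now?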

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200503150600
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8






Re: [PERFORM] One tuple per transaction

2005-03-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> On Tue, Mar 15, 2005 at 06:51:19PM -0500, Tom Lane wrote:
>> I believe VACUUM already knows that xmin = xmax implies the tuple
>> is dead to everyone.

> Huh, that is too simplistic in a subtransactions' world, isn't it?

Well, it's still correct as a fast-path check.  There are extensions
you could imagine making ... but offhand I agree that it's not worth
the trouble.  Maybe in a few years when everyone and his sister is
using subtransactions constantly, we'll feel a need to optimize these
cases. 

regards, tom lane



Re: [PERFORM] cpu_tuple_cost

2005-03-15 Thread Greg Stark

David Brown <[EMAIL PROTECTED]> writes:

> Gregory Stark wrote:
> 
> >The "this day and age" argument isn't very convincing. Hard drive capacity
> >growth has far outstripped hard drive seek time and bandwidth improvements.
> >Random access has more penalty than ever.
>
> In point of fact, there haven't been noticeable seek time improvements for
> years. Transfer rates, on the other hand, have gone through the roof.

Er, yeah. I stated it wrong. The real ratio here is between seek time and
throughput.

Typical 7200RPM drives have average seek times in the area of 10ms.
Typical sustained transfer rates are in the range of 40MB/s. Postgres reads
8kB blocks at a time.

So 800kB/s for random access reads, and 40MB/s for sequential reads. That's a
factor of about 50. I don't think anyone wants random_page_cost to be set to 50
though.

For a high end 15k drive I see average seek times as low as 3ms, and
sustained transfer rates as high as 100MB/s. So about 2.7MB/s for random
access reads, or a random_page_cost of about 37. Still pretty extreme.

So what's going on with the empirically derived value of 4? Perhaps this is
because even though Postgres is reading an entire table sequentially it's
unlikely to be the only I/O consumer? The sequential reads would be
interleaved occasionally by some other I/O forcing a seek to continue.

In which case the true random_page_cost seems like it would be extremely
sensitive to the amount of readahead the OS does. To reach a random_page_cost
of 4 given the numbers above for a 7200RPM drive requires that just under 25%
of the I/O of a sequential table scan be random seeks [*]. That translates to
32kB of sequential reading, which actually does sound like a typical value for
OS readahead.

I wonder if those same empirical tests would show even higher values of
random_page_cost if the readahead were turned up to 64kB or 128kB.




[*] A bit of an algebraic diversion: 

1s/10ms = 100 random buffers/s. 
random_page_cost = 4, so net sequential buffers/s = 400.

solve:

rnd + seq = 400 buffers
0.2*seq + 10*rnd = 1000 ms

which gives rnd ~ 94 and seq ~ 306: just under 25% of the buffers read
are random seeks, with roughly 3-4 sequential 8kB buffers read between
seeks.


-- 
greg

