Re: [PERFORM] Determine dead tuples size

2007-03-16 Thread Heikki Linnakangas

Alexey Romanchuk wrote:

Thanks, I installed the contribs and tried to analyze the result of the
pgstattuple function, and found it strange.


Try "SELECT * FROM pgstattuple('foo')", that'll tell you what the 
columns are. Take a look at README.pgstattuple as well for more details.
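
Something along these lines, off the top of my head (the exact column list
may vary a bit by version, so check the README):

  -- pgstattuple returns a single row; its columns include table_len,
  -- tuple_count, tuple_len, tuple_percent, dead_tuple_count,
  -- dead_tuple_len, dead_tuple_percent, free_space and free_percent
  SELECT * FROM pgstattuple('foo');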


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Merlin Moncure

On 3/15/07, femski <[EMAIL PROTECTED]> wrote:


I tried maxpages = 0 and full_page_writes=off and it seemed to be taking
forever.
All other tricks I have already tried.

At this point I'm wondering if it's a JDBC client-side issue - I am using the
latest 8.1.
(As I said in an earlier post - I am using addBatch with a batch size of 100.)
But just in case, maybe I am missing something.

If 17k records/sec is right around what's expected, then I must say I am a
little disappointed in the "most advanced open source database".


Be careful...you are just testing one very specific thing, and it is
extremely possible that the Oracle JDBC batch insert is more optimized
than PostgreSQL's.  On my little Pentium 4 workstation, by inserting
10 rows per insert:
insert values ([...]), ([...]), [8 more rows];

I got a 5x speedup in insert performance using this feature (which is
unfortunately only available from 8.2 on).  Oracle is most likely pulling
similar tricks inside the driver.  PostgreSQL is much faster than you think...
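
Spelled out, that form looks roughly like this (table and column names
are invented here):

  -- a single INSERT carrying ten rows (multi-row VALUES, new in 8.2)
  INSERT INTO narrow_table (id, payload)
  VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'),
         (6, 'f'), (7, 'g'), (8, 'h'), (9, 'i'), (10, 'j');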

merlin



Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Merlin Moncure

On 3/16/07, Bob Dusek <[EMAIL PROTECTED]> wrote:

This may or may not be related to what you're seeing... but, when we
changed from Postgres 7.4.2 to 7.4.8, our batch processing slowed down
fairly significantly.

Here's what we were doing:

Step 1) Build a large file full of SQL insert statements.
Step 2) Feed the file directly to "psql" using "psql dbname <
insertfile".

The time of execution for step 2 seemed like it nearly doubled from
7.4.2 to 7.4.8, for whatever reason (could have been the way Suse
compiled the binaries).  Perhaps the slowdown was something we could
have/should have tweaked with config options.



At any rate, what we did to speed it up was to wrap the entire file in a
transaction, as such: "BEGIN; ..filecontents.. COMMIT;"
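
Roughly, the file we fed to psql ended up looking like this (the table
name here is just illustrative):

  BEGIN;
  INSERT INTO t VALUES (1, 'first row');
  INSERT INTO t VALUES (2, 'second row');
  -- ...thousands more INSERTs...
  COMMIT;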

Apparently the autocommit stuff in the version of 7.4.8 we were using
was just *doggedly* slow.

Perhaps you're already using a transaction for your batch, though.  Or,
maybe the problem isn't with Postgres.  Just thought I'd share.


If you are inserting records one by one without a transaction (and no
fsync), I/O is going to determine your insertion speed.  Not really
sure what was happening in your case...it looks like quite a different
type of issue from the OP's.

Anyway, to the OP: some quick googling regarding the PostgreSQL JDBC
driver showed that the batch insert case is just not as optimized (in
the driver) as it could be.  The driver could do multi-statement
inserts or use the libpq COPY API, either of which would result in a
huge performance gain.
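
At the SQL level the COPY route looks roughly like this (hypothetical
table, tab-separated data, as you would feed it through psql):

  -- bulk load via COPY instead of one INSERT per row
  COPY narrow_table (id, payload) FROM STDIN;
  1	a
  2	b
  \.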

merlin



Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Merlin Moncure

On 3/16/07, Merlin Moncure <[EMAIL PROTECTED]> wrote:

Be careful...you are just testing one very specific thing, and it is
extremely possible that the Oracle JDBC batch insert is more optimized
than PostgreSQL's.  On my little Pentium 4 workstation, by inserting
10 rows per insert:
insert values ([...]), ([...]), [8 more rows];


Small correction here: I actually went and looked at the JDBC API and
realized 'addBatch' means to run multiple statements at once, not batch
inserting.  femski, your best bet is to lobby the JDBC folks to build
support for 'copy' into the driver for faster bulk loads (or to help out
in that regard).

merlin



Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Joshua D. Drake
femski wrote:
> Folks !
> 
> I have a batch application that writes approx. 4 million rows into a narrow
> table. I am using JDBC addBatch/ExecuteBatch with auto commit turned off.
> Batch size is 100. So far I am seeing Postgres take roughly five times the
> time it takes to do this in Oracle.

If you are using 8.2, could you try the multi-value inserts?

insert into foo(bar) values (bang) (bong) (bing) ...?

> 
> I have played with many parameters. Only one that seems to have any effect
> is fsync - but that's only 10% or so.
> Initially I got the warning that checkpoints were happening too often, so I
> increased the segments to 24. The warnings stopped, but there was no real
> improvement in performance.
> 
> Is Postgres really that slow? What am I missing?
> 
> Here are the changes to my postgresql.conf file. 
> 
> shared_buffers = 768MB
> work_mem = 256MB 
> maintenance_work_mem = 128MB
> fsync = off 
> 
> checkpoint_segments = 24
> autovacuum = on 
> 
> Thank you,
> 
> -Sanjay


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] Determine dead tuples size

2007-03-16 Thread Tom Lane
Alexey Romanchuk <[EMAIL PROTECTED]> writes:
> When I try to sum all the sizes (live, dead and free), the sum is not equal
> to the total size. For this table, 206555428 + 3380295 + 13896816 = 223832539.
> The difference between the total and the sum is 9410085. It is near 5%.

pgstattuple is a bit simplistic: it doesn't count the page headers or
item pointers at all.  It looks to me like it also fails to consider
the effects of alignment padding --- if a tuple's length is shown as
63, that's what it counts, even though the effective length is 64.
(This might not be a problem in practice --- I'm not sure if the stored
t_len has always been maxaligned or not.)

regards, tom lane



[PERFORM] unsubscribe

2007-03-16 Thread cedric




Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Carlos Moreno

Joshua D. Drake wrote:


insert into foo(bar) values (bang) (bong) (bing) ...?

Nit pick  (with a "correct me if I'm wrong" disclaimer :-)) :

Wouldn't that be  (bang), (bong), (bing)   ??

Carlos
--




Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread Joshua D. Drake
Carlos Moreno wrote:
> Joshua D. Drake wrote:
> 
>> insert into foo(bar) values (bang) (bong) (bing) ...?
>>
>>  
>>
> 
> Nit pick  (with a "correct me if I'm wrong" disclaimer :-)) :
> 
> Wouldn't that be  (bang), (bong), (bing)   ??

Yes.

J


> 
> Carlos
> -- 
> 
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] PostgreSQL 8.2.3 VACUUM Timings/Performance

2007-03-16 Thread Bruce McAlister
Hi All,

Okay, I'm getting a little further now. I'm about to create entries in the
pg_autovacuum system table. However, I'm a little confused as to how I go
about finding out the OID values of the tables. The pg_autovacuum table
requires the OID of the table you want to create settings for (vacrelid).
Can anyone shed some light on how I can extract the OID of a table? Also,
what happens if you create a table WITHOUT OIDS - are you still able to add
its details to the pg_autovacuum table if there is no OID associated with the
table?

  Name              Type     References    Description
  ----              ----     ----------    -----------
  vacrelid          oid      pg_class.oid  The table this entry is for
  enabled           bool                   If false, this table is never autovacuumed
  vac_base_thresh   integer                Minimum number of modified tuples before vacuum
  vac_scale_factor  float4                 Multiplier for reltuples to add to vac_base_thresh
  anl_base_thresh   integer                Minimum number of modified tuples before analyze
  anl_scale_factor  float4                 Multiplier for reltuples to add to anl_base_thresh
  vac_cost_delay    integer                Custom vacuum_cost_delay parameter
  vac_cost_limit    integer                Custom vacuum_cost_limit parameter
  freeze_min_age    integer                Custom vacuum_freeze_min_age parameter
  freeze_max_age    integer                Custom autovacuum_freeze_max_age parameter
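
A sketch of what I think should work for the lookup, assuming vacrelid simply
wants the table's row OID from pg_class (table name below is hypothetical):

  -- presumably the OID wanted here is the table's pg_class OID, so a table
  -- created WITHOUT OIDS should still have one
  SELECT oid FROM pg_class WHERE relname = 'mytable';
  -- or, resolving a schema-qualified name:
  SELECT 'public.mytable'::regclass::oid;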


Thanks
Bruce


"Bruce McAlister" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi Tom,
>
> Thanks for the suggestion. It's been a while since I replied to this, as I 
> had to go and do some further investigation of the docs with regard to the 
> autovacuum daemon's configuration. According to the documentation, the 
> formulas for vacuum and analyze are as follows:
>
> Vacuum
>    vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
> Analyze
>    analyze threshold = analyze base threshold + analyze scale factor * number of tuples
>
> My current settings for autovacuum are as follows:
>
> # - Cost-Based Vacuum Delay -
>
> vacuum_cost_delay      = 200   # 0-1000 milliseconds
> vacuum_cost_page_hit   = 1     # 0-10000 credits
> vacuum_cost_page_miss  = 10    # 0-10000 credits
> vacuum_cost_page_dirty = 20    # 0-10000 credits
> vacuum_cost_limit      = 200   # 0-10000 credits
>
> #---
> # AUTOVACUUM PARAMETERS
> #---
>
> autovacuum                      = on        # enable autovacuum subprocess?
>                                             # 'on' requires stats_start_collector
>                                             # and stats_row_level to also be on
> autovacuum_naptime              = 1min      # time between autovacuum runs
> autovacuum_vacuum_threshold     = 500       # min # of tuple updates before vacuum
> autovacuum_analyze_threshold    = 250       # min # of tuple updates before analyze
> autovacuum_vacuum_scale_factor  = 0.2       # fraction of rel size before vacuum
> autovacuum_analyze_scale_factor = 0.1       # fraction of rel size before analyze
> autovacuum_freeze_max_age       = 200000000 # maximum XID age before forced vacuum
>                                             # (change requires restart)
> autovacuum_vacuum_cost_delay    = -1        # default vacuum cost delay for
>                                             # autovacuum, -1 means use vacuum_cost_delay
> autovacuum_vacuum_cost_limit    = -1        # default vacuum cost limit for
>                                             # autovacuum, -1 means use vacuum_cost_limit
>
> Thus to make the autovacuum more aggressive I am thinking along the lines 
> of changing the following parameters:
>
> autovacuum_vacuum_threshold = 250
> autovacuum_analyze_threshold = 125
>
> The documentation also mentions that when the autovacuum runs it selects a 
> single database to process on that run. This means that the particular 
> table that we are interested in will only be vacuumed once every 17 
> minutes, assuming we have 18 databases and the selection process is 
> sequential through the database list.
>
> From my understanding of the documentation, the only way to work around 
> this issue is to manually update the system catalog table pg_autovacuum 
> and set the pg_autovacuum.enabled field to false to skip the autovacuum on 
> tables that don't require such frequent vacuums. If I do enable this 
> feature, and I manually run a vacuumdb from the command line against that 
> particular disabled table, will the vacuum still process the table? I'm 
> assuming, too, that the best tables to disable autovacuum on will be ones 
> with a minimal amount of update/delete queries run against them. For 
> example, if we have a table that only has inserts applied to it, it is 
> safe to assume that that table can safely be ignored by autovacuum.
>
> Do you have any other s

Re: [PERFORM] Postgres batch write very slow - what to do

2007-03-16 Thread hubert depesz lubaczewski

On 3/13/07, femski <[EMAIL PROTECTED]> wrote:

I have a batch application that writes approx. 4 million rows into a narrow
table. I am using JDBC addBatch/ExecuteBatch with auto commit turned off.
Batch size is 100. So far I am seeing Postgres take roughly five times the
time it takes to do this in Oracle.


You can try to use pg_bulkload.
Since it is called as a standard function, you shouldn't have problems
with JDBC. And it's apparently fast.

depesz

http://pgfoundry.org/projects/pgbulkload/
