Re: [PERFORM] pl/pgsql faster than raw SQL?

2005-03-09 Thread Gaetano Mendola

John A Meinel wrote:
> That sounds more like you had bad statistics on the field1 column, which
> caused postgres to switch from a seqscan to an index scan, only there
> were so many rows with field1='New' that it actually would have been
> faster with a seqscan.

field1 was a calculated field, and with the "='New'" filter postgres was
executing that function on more rows than it did without the filter.
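If bad statistics do turn out to be the cause, the usual fix is to raise the
per-column statistics target and re-analyze. A minimal sketch, assuming a table
named tickets (the real table name is not shown in the thread):

-- Give the planner a finer histogram for the skewed column,
-- then refresh the statistics it plans from.
ALTER TABLE tickets ALTER COLUMN field1 SET STATISTICS 100;
ANALYZE tickets (field1);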



Regards
Gaetano Mendola







Re: [PERFORM] vacuum full, why multiple times ?

2005-03-09 Thread Gaetano Mendola
Michael Fuhr wrote:
> On Wed, Mar 09, 2005 at 02:02:13AM +0100, Gaetano Mendola wrote:
> 
> 
>>Basically I'm noticing that a simple vacuum full is not enough to
>>shrink the table completely:
>>
>># vacuum full verbose url;
>>INFO:  vacuuming "public.url"
>>INFO:  "url": found 268392 removable, 21286 nonremovable row versions in 8563 pages
>>DETAIL:  22 dead row versions cannot be removed yet.
> 
> 
> How busy is the database?  I'd guess that each time you run VACUUM,
> there are still open transactions that have visibility to the dead
> rows, so VACUUM doesn't touch them.  Those transactions eventually
> complete, and eventually VACUUM FULL does what you're expecting.
> I don't know if that's the only possible cause, but I get results
> similar to yours if I have transactions open when I run VACUUM.
> 

That was my first thought, but it seems strange that 2 dead rows were
grabbing 7883 pages, don't you think?


# vacuum full verbose url;
INFO:  vacuuming "public.url"
INFO:  "url": found 74 removable, 21266 nonremovable row versions in 8550 pages
DETAIL:  2 dead row versions cannot be removed yet.
[SNIP]
INFO:  "url": moved 11 row versions, truncated 8550 to 8504 pages


and in the next run:


# vacuum full verbose url;
INFO:  vacuuming "public.url"
INFO:  "url": found 13 removable, 21264 nonremovable row versions in 8504 pages
DETAIL:  0 dead row versions cannot be removed yet.
[SNIP]
INFO:  "url": moved 5666 row versions, truncated 8504 to 621 pages




Regards
Gaetano Mendola







Re: [PERFORM] Help trying to tune query that executes 40x slower than in SqlServer

2005-03-09 Thread Hugo Ferreira
Hi,

Well, I think the problem is far more complex than just join
reordering... I've restructured the query so that it won't use any
explicit joins. Instead it now has a series of 'in (select ...)' and
'not exists (select ...)'. This actually got faster... sometimes!!!

select 1, 1168, C.contxt_id, C.contxt_elmt_ix, null, null, null, null, null, null, 1
from CONTXT as P, CONTXT_ELMT as C, MRS_REPLICATION_OUT as S, MRS_TRANSACTION as TRANS
where S.age = 0
  and S.trans_id = 1
  and S.trans_id = TRANS.trans_id
  and S.ent_id = 1029
  and (P.contxt_id = C.contxt_id) and (P.contxt_id = S.pk1)
  and (C.owner_id not in (select non_repl_data_owner_id from NON_REPL_DATA_OWNER))
  AND (C.owner_id not in (select repl_data_owner_id from REPL_DATA_OWNER_RSDNC
                          where rsdnc_node_id = TRANS.recv_node_id))
  AND (not exists (select pk1 from MRS_REPLICATION_OUT
                   where trans_id = 1
                     and ent_id = 1168
                     and C.contxt_id = pk1
                     AND C.contxt_elmt_ix = pk2))
  AND (not exists (select pk1 from MRS_TRANSACTION RED_TRANS, MRS_REPLICATION_OUT RED_OUT
                   where RED_TRANS.cat_code = 'OUT'
                     and RED_TRANS.trans_type in ('X01', 'X02')
                     and RED_TRANS.trans_id = RED_OUT.trans_id
                     and RED_TRANS.prov_node_id = TRANS.prov_node_id
                     and RED_TRANS.recv_node_id = TRANS.recv_node_id
                     and RED_OUT.ent_id = 1168
                     and C.contxt_id = pk1
                     AND C.contxt_elmt_ix = pk2))


For example... I run the query, it takes 122 seconds. Then I delete the
target tables, vacuum the database, re-run it again: 9s. But if I run
vacuum several times and then run it, it takes 122 seconds again. If I
stop this 122-second query at, say, second 3 and then run it again, it
will only take 9s. It simply doesn't make sense. Also, explain analyse
will give me different plans each time I run it... Unfortunately, this
is rendering PostgreSQL unusable for our goals. Any ideas?

By the way, I created the following indexes on MRS_REPLICATION_OUT, which
seem to speed things up:

CREATE INDEX ix_mrs_replication_out_all ON mrs_replication_out 
USING btree (ent_id, age, trans_id);

CREATE INDEX ix_mrs_replication_pks ON mrs_replication_out 
USING btree (trans_id, ent_id, pk1, pk2, pk3, pk4, pk5, pk6, pk7);

Note: pk2... pk7 are nullable columns. trans_id is the least variant
column and pk1 is the most variant column. Most of the time, the
execution plan includes an 'index scan' over the first index
(ix_mrs_replication_out_all), followed by a filter with columns from
the second index (trans_id, ent_id, pk1, pk2, pk3, pk4, pk5, pk6,
pk7), though the 'age' column is not used... Any guess why?
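One experiment that might be worth a try -- only a sketch built from the constant
filters in the query above, not something verified against this schema -- is a
partial index that bakes the age = 0 condition in, so the index covers exactly the
rows the query can ever match:

-- Partial index restricted to age = 0; column names are taken from the query above.
CREATE INDEX ix_mrs_replication_out_age0 ON mrs_replication_out
USING btree (trans_id, ent_id, pk1, pk2)
WHERE age = 0;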

Thanks in advance,

Hugo Ferreira

> It is possible but complicated to determine that reordering outer joins
> is safe in some cases.  We don't currently have such logic in PG.  It
> may be that SQL Server does have that capability and that's why it's
> finding a much better plan ... but for now you have to do that by hand
> in PG.

-- 
GPG Fingerprint: B0D7 1249 447D F5BB 22C5  5B9B 078C 2615 504B 7B85



Re: [PERFORM] vacuum full, why multiple times ?

2005-03-09 Thread Richard Huxton
Gaetano Mendola wrote:
> # vacuum full verbose url;
> INFO:  vacuuming "public.url"
> INFO:  "url": found 74 removable, 21266 nonremovable row versions in 8550 pages
> DETAIL:  2 dead row versions cannot be removed yet.
> [SNIP]
> INFO:  "url": moved 11 row versions, truncated 8550 to 8504 pages
>
> and in the next run:
>
> # vacuum full verbose url;
> INFO:  vacuuming "public.url"
> INFO:  "url": found 13 removable, 21264 nonremovable row versions in 8504 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> [SNIP]
> INFO:  "url": moved 5666 row versions, truncated 8504 to 621 pages
If page number 8549 was the one being held, I don't think vacuum can 
truncate the file. The empty space can be re-used, but the rows can't be 
moved to a lower page while a transaction is using them.

--
  Richard Huxton
  Archonet Ltd


Re: [PERFORM] vacuum full, why multiple times ?

2005-03-09 Thread Gaetano Mendola

Richard Huxton wrote:
> Gaetano Mendola wrote:
> 
>>
>> # vacuum full verbose url;
>> INFO:  vacuuming "public.url"
>> INFO:  "url": found 74 removable, 21266 nonremovable row versions in
>> 8550 pages
>> DETAIL:  2 dead row versions cannot be removed yet.
>> [SNIP]
>> INFO:  "url": moved 11 row versions, truncated 8550 to 8504 pages
>>
>>
>> and in the next run:
>>
>>
>> # vacuum full verbose url;
>> INFO:  vacuuming "public.url"
>> INFO:  "url": found 13 removable, 21264 nonremovable row versions in
>> 8504 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> [SNIP]
>> INFO:  "url": moved 5666 row versions, truncated 8504 to 621 pages
> 
> 
> If page number 8549 was the one being held, I don't think vacuum can
> truncate the file. The empty space can be re-used, but the rows can't be
> moved to a lower page while a transaction is using them.

It's clear now.


Regards
Gaetano Mendola




Re: [PERFORM] vacuum full, why multiple times ?

2005-03-09 Thread Tom Lane
Gaetano Mendola <[EMAIL PROTECTED]> writes:
> Richard Huxton wrote:
>> If page number 8549 was the one being held, I don't think vacuum can
>> truncate the file. The empty space can be re-used, but the rows can't be
>> moved to a lower page while a transaction is using them.

> It's clear now.

Not entirely.  VACUUM FULL doesn't really worry about whether anyone
else "is using" the table --- it knows no one else is, because it holds
exclusive lock on the table.  However it must preserve dead tuples that
would still be visible to any existing transaction, because that other
transaction could come along and look at the table after VACUUM
finishes and releases the lock.

What really drives the process is that VACUUM FULL moves tuples in order
to make the file shorter (release empty pages at the end) --- and not
for any other reason.  So it could stop when there is still plenty of
dead space in the table.  It stops when the last nonempty page contains
a tuple that it can't find room for in any earlier page.

What I suppose you saw was that page 8503 contained a tuple so large it
wouldn't fit in the free space on any earlier page.  By the time of the
second vacuum, either this tuple was deleted, or deletion of some other
tuples had made a hole big enough for it to fit in.

The extent of the truncation in the second vacuum says that you had
quite a lot of free space, so it's a bit surprising that there wasn't
enough room in any one page for such a tuple to be moved, but that seems
to be what happened.
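One way to watch this effect from outside the VERBOSE output -- a sketch; relpages
and reltuples are estimates that VACUUM itself refreshes -- is to compare the
table's page count in pg_class around each run:

-- relpages is updated by VACUUM, so it shows how far each VACUUM FULL
-- actually managed to truncate the file.
SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'url';
VACUUM FULL VERBOSE url;
SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'url';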

regards, tom lane



Re: [PERFORM] vacuum full, why multiple times ?

2005-03-09 Thread Gaetano Mendola
Tom Lane wrote:
> Gaetano Mendola <[EMAIL PROTECTED]> writes:
> 
>>Richard Huxton wrote:
>>
>>>If page number 8549 was the one being held, I don't think vacuum can
>>>truncate the file. The empty space can be re-used, but the rows can't be
>>>moved to a lower page while a transaction is using them.
> 
> 
>>It's clear now.
> 
> 
> Not entirely.  VACUUM FULL doesn't really worry about whether anyone
> else "is using" the table --- it knows no one else is, because it holds
> exclusive lock on the table.  However it must preserve dead tuples that
> would still be visible to any existing transaction, because that other
> transaction could come along and look at the table after VACUUM
> finishes and releases the lock.
> 
> What really drives the process is that VACUUM FULL moves tuples in order
> to make the file shorter (release empty pages at the end) --- and not
> for any other reason.  So it could stop when there is still plenty of
> dead space in the table.  It stops when the last nonempty page contains
> a tuple that it can't find room for in any earlier page.
> 
> What I suppose you saw was that page 8503 contained a tuple so large it
> wouldn't fit in the free space on any earlier page.  By the time of the
> second vacuum, either this tuple was deleted, or deletion of some other
> tuples had made a hole big enough for it to fit in.
> 
> The extent of the truncation in the second vacuum says that you had
> quite a lot of free space, so it's a bit surprising that there wasn't
> enough room in any one page for such a tuple to be moved, but that seems
> to be what happened.

All rows of that table are almost the same size, so that is not the
reason, and no rows were deleted either.
Maybe page 8503 contained a dead row?

I can send you off-list another vacuum full sequence if you need it; I sent it
to the list but apparently it was too large and none of you saw it.

Regards
Gaetano Mendola










Re: [PERFORM] index scan on =, but not < ?

2005-03-09 Thread Jim C. Nasby
On Tue, Mar 08, 2005 at 11:20:20PM -0600, Bruno Wolff III wrote:
> On Tue, Mar 08, 2005 at 22:55:19 -0600,
>   "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> > On Tue, Mar 08, 2005 at 10:38:21PM -0600, Bruno Wolff III wrote:
> > > Not exactly. If the number of rows to be examined is on the order of 5%
> > > of the table, an index scan will probably be slower than a sequential
> > > scan. The visibility issue makes index scans slower in the case that
> > 
> > Shouldn't that be 50%?
> 
> No. When you are doing an index scan of a significant part of the table,
> you will fetch some heap pages more than once. You will also be fetching
> blocks out of order, so you will lose out on read ahead optimization
> by the OS. This assumes that you don't get a lot of cache hits on the
> heap pages. If a significant portion of the table is cached, then the
> trade off point will be at a higher percentage of the table.

Ahh, I was thinking of a high correlation factor on the index. I still
question 5% though... that seems awfully low.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



[PERFORM] How to speed up tsearch2 indexing

2005-03-09 Thread Werner Bohl
All,

I have a table with ~ 3 million records. I'm indexing a field holding
names, no more than 200 bytes each. Indexing the resulting tsvector
takes forever. It's been running now for more than 40 hours on a Linux
box with PG 8.0.1, a single Xeon & 4GB RAM. My work_mem postgresql.conf
parameter is at 240960 and maintenance_work_mem at 96384, although the
index task is using at most 12MB. The task is 99% CPU bound. Is there any
way I may speed up the indexing?
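For reference, the setup being described looks roughly like this -- a sketch only,
with table and column names assumed since the actual schema isn't shown:

-- Fill the tsvector column using the stock tsearch2 configuration, then build the
-- GiST index over it; the index build is the step reported as CPU-bound here.
UPDATE names_table SET name_vector = to_tsvector('default', name);
CREATE INDEX ix_names_vector ON names_table USING gist (name_vector);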


TIA,


-- 
Werner Bohl <[EMAIL PROTECTED]>
IDS de Costa Rica S.A.




Re: [PERFORM] How to speed up tsearch2 indexing

2005-03-09 Thread Oleg Bartunov
On Wed, 9 Mar 2005, Werner Bohl wrote:
> All,
> I have a table with ~ 3 million records. I'm indexing a field holding
> names, no more than 200 bytes each. Indexing the resulting tsvector
> takes forever. It's been running now for more than 40 hours on a Linux
> box with PG 8.0.1, a single Xeon & 4GB RAM. My work_mem postgresql.conf
> parameter is at 240960 and maintenance_work_mem at 96384, although the
> index task is using at most 12MB. The task is 99% CPU bound. Is there any
> way I may speed up the indexing?

What's your tsearch2 configuration ? Do you use dictionaries ?
I wrote a brief explanation of tsearch2 internals:
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
Hope it could help you.

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [PERFORM] How to speed up tsearch2 indexing

2005-03-09 Thread Werner Bohl
On Wed, 2005-03-09 at 20:41 +0300, Oleg Bartunov wrote:

> What's your tsearch2 configuration ? Do you use dictionaries ?
> I wrote a brief explanation of tsearch2 internals
> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
> 
Tsearch2 is using the default english configuration. No dictionaries, just
added some more stop words (10) to english.stop.

> Hope, it could help you.
> 




Re: [PERFORM] How to speed up tsearch2 indexing

2005-03-09 Thread Oleg Bartunov
On Wed, 9 Mar 2005, Werner Bohl wrote:
> On Wed, 2005-03-09 at 20:41 +0300, Oleg Bartunov wrote:
>> What's your tsearch2 configuration ? Do you use dictionaries ?
>> I wrote a brief explanation of tsearch2 internals
>> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
> Tsearch2 is using the default english configuration. No dictionaries, just
> added some more stop words (10) to english.stop.
It's not good, because you probably have a lot of unique words.
Do you have some statistics? See the stat() function.
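For what it's worth, a sketch of that check, assuming the tsvector column is called
name_vector in a table called names_table (both names are made up here):

-- tsearch2's stat() scans the tsvector column and reports each lexeme together with
-- the number of documents (ndoc) and total entries (nentry) it appears in; a huge
-- number of distinct lexemes usually means no stemming/dictionaries are applied.
SELECT word, ndoc, nentry
FROM stat('SELECT name_vector FROM names_table')
ORDER BY ndoc DESC
LIMIT 20;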



Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [PERFORM] Help trying to tune query that executes 40x slower than in SqlServer

2005-03-09 Thread Russell Smith
On Wed, 9 Mar 2005 11:08 pm, Hugo Ferreira wrote:
> For example... I run the query, it takes 122 seconds. Then I delete the
> target tables, vacuum the database, re-run it again: 9s. But if I run
> vacuum several times and then run it, it takes 122 seconds again. If I
> stop this 122-second query at, say, second 3 and then run it again, it
> will only take 9s. It simply doesn't make sense. Also, explain analyse
> will give me different plans each time I run it... Unfortunately, this
> is rendering PostgreSQL unusable for our goals. Any ideas?
> 
The explain analyze output is still the best information if you want assistance with
what postgresql is doing, and how to change it.  If you could attach
explain analyzes for both the fast (9s) and slow (122s) runs, that would
help people get an idea of how the query is running.  At the moment
we don't know how postgresql is actually executing the query.

Regards

Russell Smith.



Re: [PERFORM] index scan on =, but not < ?

2005-03-09 Thread David Brown
Assuming your system isn't starved for memory, shouldn't repeated page 
fetches be hitting the cache?

I've also wondered about the conventional wisdom that read ahead doesn't 
help random reads. I may well be missing something, but *if* the OS has 
enough memory to cache most of the table, surely read ahead will still 
work to your advantage?

Bruno Wolff III wrote:
> No. When you are doing an index scan of a significant part of the table,
> you will fetch some heap pages more than once. You will also be fetching
> blocks out of order, so you will lose out on read ahead optimization
> by the OS. This assumes that you don't get a lot of cache hits on the
> heap pages. If a significant portion of the table is cached, then the
> trade off point will be at a higher percentage of the table.



Re: [PERFORM] index scan on =, but not < ?

2005-03-09 Thread David Brown
Jim C. Nasby wrote:
> Ahh, I was thinking of a high correlation factor on the index. I still
> question 5% though... that seems awfully low.

Not really. It all depends on how many records you're packing into each 
page. 1% may well be the threshold for small records.

Tom mentioned this in the last couple of months. He was citing a uniform 
distribution as an example and I thought that sounded a little 
pessimistic, but when I did the (possibly faulty) math with a random 
distribution, I discovered he wasn't far off.

It's not this simple, but if you can fit 50 randomly organized records 
into each page and you want to retrieve 2% of the rows, it's likely 
you'll have to fetch every page - believe it or not.
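That claim is easy to sanity-check with a back-of-the-envelope calculation,
assuming uniformly scattered rows and the 50 rows per page used above:

-- Fraction of heap pages holding at least one qualifying row:
-- 1 - (1 - selectivity)^rows_per_page.  Even a few percent of the rows lands on
-- most of the pages, and an index scan may revisit the same page more than once,
-- which pushes the real cost higher still.
SELECT 1 - power(1 - 0.02, 50) AS fraction_at_2_percent,  -- about 0.64
       1 - power(1 - 0.05, 50) AS fraction_at_5_percent;  -- about 0.92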

What concerns me is that this all depends on the correlation factor, and 
I suspect that the planner is not giving enough weight to this. 
Actually, I'm wondering if it's even looking at the statistic, but I 
haven't created a test to check. It might explain quite a few complaints 
about the planner not utilizing indexes.



Re: [PERFORM] Help trying to tune query that executes 40x slower

2005-03-09 Thread Jim Johannsen
Hugo,
   I think your problem is with the MRS_TRANSACTION TRANS table.  It is
not joined to anything when declared, but later it is joined through a LEFT
JOIN of the REPL_DATA_OWNER_RSDNC table.  In fact I'm not sure that this
table is really needed.  I would suggest rewriting your FROM clause.  It
appears a little busy and includes additional filters that are taken
care of in the WHERE clause.

   What are the table layouts and what fields are indexed? 


Hugo Ferreira wrote:
Hi there :-)
I'm really, really having trouble with this query... It is a part of,
hmmm... 200 similar queries that I dynamically build and run in a
stored procedure. This one, for example, takes 27 seconds to run. The
whole stored procedure executes in about 15 minutes. This is too much
when compared to the exact same database, with the same indexes and
same data, running under SqlServer 2000, which takes 21 seconds to run
the whole batch.
Any help would be extremely appreciated. I've also tried to tune up
the configuration
insert into MRS_REPLICATION_OUT
select 514, 1168, C.contxt_id, C.contxt_elmt_ix,
       CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)),
       CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), null, 1
from c2iedm.CONTXT as P
inner join c2iedm.CONTXT_ELMT as C on (P.contxt_id = C.contxt_id)
inner join MRS_REPLICATION_OUT as S
   on S.ent_id = 1029 and (CAST(P.contxt_id AS numeric(18)) = S.pk1)
inner join MRS_TRANSACTION TRANS on TRANS.trans_id = 514
left join NON_REPL_DATA_OWNER NRDO on NRDO.non_repl_data_owner_id = C.owner_id
left join REPL_DATA_OWNER_RSDNC RDOR
   on RDOR.owner_id = C.owner_id and RDOR.rsdnc_node_id = TRANS.recv_node_id
left join MRS_REPLICATION_OUT OUT
   on OUT.trans_id = 514 and OUT.ent_id = 1168
  and ((CAST(C.contxt_id AS numeric(18)) = OUT.pk1 AND CAST(C.contxt_elmt_ix AS numeric(18)) = OUT.pk2))
inner join MRS_TRANSACTION RED_TRANS
   on TRANS.prov_node_id = RED_TRANS.prov_node_id and TRANS.recv_node_id = RED_TRANS.recv_node_id
left join MRS_REPLICATION_OUT RED_OUT
   on RED_TRANS.cat_code = 'OUT' and RED_TRANS.trans_type in ('X01', 'X02')
  and RED_TRANS.trans_id = RED_OUT.trans_id
where S.age = 0 and S.trans_id = 514
  and (NRDO.non_repl_data_owner_id is null) AND (RDOR.repl_data_owner_id is null)
  AND (OUT.trans_id is null) AND (RED_OUT.trans_id is null);
This kind of insert generates few rows: between 8k and 15k for this particular
insert, and about 20k for the whole batch. If I try to run a batch
to generate about 50k rows, then I'll be stuck here for more than 45h.
Compare this to 12 minutes when running SqlServer 2000.
Here is the result of explain analyze:
"Merge Left Join  (cost=1338.32..1377.99 rows=45 width=32) (actual time=719.000..26437.000 rows=14862 loops=1)"
"  Merge Cond: ("outer".trans_id = "inner".trans_id)"
"  Join Filter: (("outer".cat_code = 'OUT'::bpchar) AND (("outer".trans_type = 'X01'::bpchar) OR ("outer".trans_type = 'X02'::bpchar)))"
"  Filter: ("inner".trans_id IS NULL)"
"  ->  Sort  (cost=1067.36..1067.47 rows=45 width=56) (actual time=719.000..735.000 rows=14862 loops=1)"
"Sort Key: red_trans.trans_id"
"->  Merge Join  (cost=851.66..1066.12 rows=45 width=56) (actual time=407.000..673.000 rows=14862 loops=1)"
"  Merge Cond: ("outer".recv_node_id = "inner".recv_node_id)"
"  Join Filter: ("outer".prov_node_id = "inner".prov_node_id)"
"  ->  Nested Loop Left Join  (cost=847.14..987.28 rows=3716 width=60) (actual time=407.000..610.000 rows=14862 loops=1)"
"Join Filter: ((("outer".contxt_id)::numeric(18,0) = "inner".pk1) AND (("outer".contxt_elmt_ix)::numeric(18,0) = "inner".pk2))"
"Filter: ("inner".trans_id IS NULL)"
"->  Merge Left Join  (cost=718.22..746.87 rows=3716 width=60) (actual time=407.000..563.000 rows=14862 loops=1)"
"  Merge Cond: (("outer".recv_node_id = "inner".rsdnc_node_id) AND ("outer".owner_id = "inner".owner_id))"
"  Filter: ("inner".repl_data_owner_id IS NULL)"
"  ->  Sort  (cost=717.19..726.48 rows=3716 width=74) (actual time=407.000..423.000 rows=14862 loops=1)"
"Sort Key: trans.recv_node_id, c.owner_id"
"->  Nested Loop Left Join  (cost=1.01..496.84 rows=3716 width=74) (actual time=0.000..312.000 rows=14862 loops=1)"
"  Join Filter: ("inner".non_repl_data_owner_id = "outer".owner_id)"
"  Filter: ("inner".non_repl_data_owner_id IS NULL)"
"  ->  Nested Loop  (cost=0.00..412.22 rows=3716 width=74) (actual time=0.000..186.000 rows=14862 loops=1)"
"->  Seq Scan on mrs_transaction trans  (cost=0.00..2.05 rows=1 width=28) (actual time=0.000..0.000 rows=1 loops=1)"
"  Filter: (trans_id = 514::numeric)"
"  

[PERFORM] Multi-line requests in COPY ... FROM STDIN

2005-03-09 Thread Mischa
I'm using a 7.4.6 Perl app that bulk-loads a table
by executing a "COPY TMP_Message FROM STDIN", then letting
$dbh->func($message_text."\n", "putline") send each line.

Speculation made me try concatenating several \n-terminated lines
together and making a single putline() call with that.
Lo and behold, all the lines went in as separate rows, as I hoped.

I haven't measured the performance difference from this multiline
batching yet. I'm hoping there will be one, since the app is really sucking
on a 500 msg/sec firehose, and the db side needs serious speeding up. The question
is, am I playing with a version-dependent anomaly, or should I expect this
to keep working in 8.x (until, eventually, silently, something causes
it to break)?

I'm presuming that this is not a Perl DBI/DBD::Pg question,
but rather depends on the underlying pq lib and fe protocol.

-- 
"Dreams come true, not free."




[PERFORM] What's better: Raid 0 or disk for seperate pg_xlog

2005-03-09 Thread Karim Nassar
From http://www.powerpostgresql.com/PerfList/

"even in a two-disk server, you can put the transaction log onto the
operating system disk and reap some benefits."

Context: I have a two disk server that is about to become dedicated to
postgresql (it's a sun v40z running gentoo linux).

What's "theoretically better"? 

1) OS and pg_xlog on one disk, rest of postgresql on the other? (if I
   understand the above correctly)
2) Everything striped Raid 0?
3) 

TIA,
-- 
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University,  Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221

