Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread kevin kempter
Also, I'm running version 8.3 on a centOS box with 2 dual core CPU's and 32Gig of ram On May 16, 2008, at 12:58 AM, kevin kempter wrote: Sorry I goofed on the query text Here's the correct query: select f14.xpublisher_dim_id, f14.xtime_dim_id, f14.xlocation_dim_id, f14.xreferrer_dim_id, f1

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Claus Guttesen
> I have a table with 9,961,914 rows in it (see the describe of > bigtab_stats_fact_tmp14 below) > > I also have a table with 7,785 rows in it (see the describe of xsegment_dim > below) > > I'm running the join shown below and it takes > 10 hours and eventually runs > out of disk space on a 1.4TB f

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread kevin kempter
Sorry I goofed on the query text Here's the correct query: select f14.xpublisher_dim_id, f14.xtime_dim_id, f14.xlocation_dim_id, f14.xreferrer_dim_id, f14.xsite_dim_id, f14.xsystem_cfg_dim_id, f14.xaffiliate_dim_id, f14.customer_id, f14.pf_dts_id, f14.episode_id, f14.sessionid, f14.bytes_received

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Simon Riggs
On Fri, 2008-05-16 at 00:31 -0600, kevin kempter wrote: > I'm running the join shown below and it takes > 10 hours and > eventually runs out of disk space on a 1.4TB file system Well, running in 10 hours doesn't mean there's a software problem, nor does running out of disk space. Please crunc

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Richard Huxton
kevin kempter wrote: Hi List; I have a table with 9,961,914 rows in it (see the describe of bigtab_stats_fact_tmp14 below) I also have a table with 7,785 rows in it (see the describe of xsegment_dim below) I'm running the join shown below and it takes > 10 hours and eventually runs out of

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread kevin kempter
I'm expecting 9,961,914 rows returned. Each row in the big table should have a corresponding key in the smaller tale, I want to basically "expand" the big table column list by one, via adding the appropriate key from the smaller table for each row in the big table. It's not a cartesion prod

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Richard Huxton
kevin kempter wrote: I'm expecting 9,961,914 rows returned. Each row in the big table should have a corresponding key in the smaller tale, I want to basically "expand" the big table column list by one, via adding the appropriate key from the smaller table for each row in the big table. It's not

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Richard Huxton
kevin kempter wrote: Hi List; I have a table with 9,961,914 rows in it (see the describe of bigtab_stats_fact_tmp14 below) I also have a table with 7,785 rows in it (see the describe of xsegment_dim below) Something else is puzzling me with this - you're joining over four fields. from bi

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread Luke Lonergan
Try 'set enable-mergejoin=false' and see if you get a hashjoin. - Luke - Original Message - From: [EMAIL PROTECTED] <[EMAIL PROTECTED]> To: Richard Huxton <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org Sent: Fri May 16 04:00:41 2008 Subject: Re: [PERFORM] Join runs for > 10 hou

Re: [PERFORM] Join runs for > 10 hours and then fills up >1.3TB of disk space

2008-05-16 Thread kevin kempter
On further investigation it turns out that I/we have a serious data issue in that my small table is full of 'UNKNOWN' tags so my query cannot associate the data correctly - thus I will end up with 2+ billion rows. Thanks everyone for your help On May 16, 2008, at 1:38 AM, Simon Riggs w

Re: [PERFORM] which ext3 fs type should I use for postgresql

2008-05-16 Thread david
On Thu, 15 May 2008, [EMAIL PROTECTED] wrote: On Thu, 15 May 2008, Matthew Wakeling wrote: On Thu, 15 May 2008, Philippe Amelant wrote: using mkfs.ext3 I can use "-T" to tune the filesytem mkfs.ext3 -T fs_type ... fs_type are in /etc/mke2fs.conf (on debian) If you look at that file, you'd

Re: [PERFORM] which ext3 fs type should I use for postgresql

2008-05-16 Thread Scott Marlowe
On Thu, May 15, 2008 at 9:38 AM, Joshua D. Drake <[EMAIL PROTECTED]> wrote: > Guillaume Cottenceau wrote: >> >> Matthew Wakeling writes: > >> It is still relevant, as with 5% margin, you can afford changing >> that to 0% with tune2fs, just the time for you to start PG and >> remove some data by SQ

Re: [PERFORM] which ext3 fs type should I use for postgresql

2008-05-16 Thread Joshua D. Drake
On Fri, 16 May 2008 11:07:17 -0600 "Scott Marlowe" <[EMAIL PROTECTED]> wrote: > Sorry, but that's like saying that open heart surgery isn't a fix for > clogged arteries because you should have been taking aspirin everyday > and exercising. It might not be the best answer, but sometimes it's > the

[PERFORM] very slow left join

2008-05-16 Thread Ben
I've inherited an Oracle database that I'm porting to Postgres, and this has been going quite well until now. Unfortunately, I've found one view (a largish left join) that runs several orders of magnitude slower on Postgres than it did on Oracle. => select version();

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Greg Smith
On Thu, 15 May 2008, Alvaro Herrera wrote: Starting a transaction does not write anything to pg_clog. For Matt and others, some details here are in src/backend/access/transam/README: "pg_clog records the commit status for each transaction that has been assigned an XID." "Transactions and

Re: [PERFORM] very slow left join

2008-05-16 Thread Scott Marlowe
On Fri, May 16, 2008 at 11:56 AM, Ben <[EMAIL PROTECTED]> wrote: > I've inherited an Oracle database that I'm porting to Postgres, and this has > been going quite well until now. Unfortunately, I've found one view (a > largish left join) that runs several orders of magnitude slower on Postgres > th

Re: [PERFORM] very slow left join

2008-05-16 Thread Ben
On Fri, 16 May 2008, Scott Marlowe wrote: Just for giggles, try running the query like so: set enable_nestloop = off; explain analyze ... and see what happens. I'm guessing that the nested loops are bad choices here. You guess correctly, sir! Doing so shaves 3 orders of magnitude off the r

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Simon Riggs
On Fri, 2008-05-16 at 14:05 -0400, Greg Smith wrote: > After reading the code and that documentation a bit, the part I'm > still not sure about is whether the CLOG entry is created when the XID > is assigned and then kept current as the state changes, or whether > that isn't even in CLOG until the

Re: [PERFORM] very slow left join

2008-05-16 Thread Scott Marlowe
On Fri, May 16, 2008 at 12:21 PM, Ben <[EMAIL PROTECTED]> wrote: > On Fri, 16 May 2008, Scott Marlowe wrote: > >> Just for giggles, try running the query like so: >> >> set enable_nestloop = off; >> explain analyze ... >> >> and see what happens. I'm guessing that the nested loops are bad choices

Re: [PERFORM] very slow left join

2008-05-16 Thread Ben
On Fri, 16 May 2008, Scott Marlowe wrote: Well, I'm guessing that you aren't in locale=C and that the text Correct, I am not. And my understanding is that by moving to the C locale, I would loose utf8 validation, so I don't want to go there. Though, it's news to me that I would get any kind

Re: [PERFORM] very slow left join

2008-05-16 Thread Craig Ringer
Ben wrote: On Fri, 16 May 2008, Scott Marlowe wrote: Well, I'm guessing that you aren't in locale=C and that the text Correct, I am not. And my understanding is that by moving to the C locale, I would loose utf8 validation, so I don't want to go there. Though, it's news to me that I would g

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Alvaro Herrera
Greg Smith wrote: > After reading the code and that documentation a bit, the part I'm still > not sure about is whether the CLOG entry is created when the XID is > assigned and then kept current as the state changes, or whether that > isn't even in CLOG until the transaction is committed. It

Re: [PERFORM] Regexps - never completing join.

2008-05-16 Thread Scott Marlowe
On Wed, May 14, 2008 at 9:33 AM, Rusty Conover <[EMAIL PROTECTED]> wrote: > Returning to this problem this morning, I made some more insight. > > One way I did find that worked to control the loop (but doesn't yield the > same results because its a left join) > > select wc_rule.id from wc_rule left

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Alvaro Herrera
Alvaro Herrera wrote: > pg_clog is allocated in pages of 8kB apiece(*). On allocation, pages are > zeroed, which is the bit pattern for "transaction in progress". So when > a transaction starts, it only needs to ensure that the pg_clog page that > corresponds to it is allocated, but it need not

Re: [PERFORM] Please ignore ...

2008-05-16 Thread Alvaro Herrera
Marc G. Fournier wrote: > Someone on this list has one of those 'confirm your email' filters on their > mailbox, which is bouncing back messages ... this is an attempt to try and > narrow down the address that is causing this ... So it seems you're still unable to determine the problematic addr

Re: [PERFORM] Regexps - never completing join.

2008-05-16 Thread Rusty Conover
On May 16, 2008, at 2:35 PM, Scott Marlowe wrote: On Wed, May 14, 2008 at 9:33 AM, Rusty Conover <[EMAIL PROTECTED]> wrote: Returning to this problem this morning, I made some more insight. One way I did find that worked to control the loop (but doesn't yield the same results because its

Re: [PERFORM] Regexps - never completing join.

2008-05-16 Thread Scott Marlowe
On Fri, May 16, 2008 at 3:37 PM, Rusty Conover <[EMAIL PROTECTED]> wrote: > > On May 16, 2008, at 2:35 PM, Scott Marlowe wrote: > >> On Wed, May 14, 2008 at 9:33 AM, Rusty Conover <[EMAIL PROTECTED]> >> wrote: >>> >>> Returning to this problem this morning, I made some more insight. >>> >>> One way

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Greg Smith wrote: >> After reading the code and that documentation a bit, the part I'm still >> not sure about is whether the CLOG entry is created when the XID is >> assigned and then kept current as the state changes, or whether that >> isn't even