[PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete

2010-05-07 Thread Mark Stosberg

Hello,

We've been a satisfied user of PostgreSQL for several years, and use it
to power a national pet adoption website: http://www.adoptapet.com/

Recently we've had a regularly-timed middle-of-the-night problem where
database handles are exhausted for a very brief period.

In tracking it down, I have found that the event seems to correspond to
a time when a cron script is deleting from a large logging table, but
I'm not certain if this is the cause or a correlation.

We are deleting about 5 million rows from a time-based logging table
that is replicated by Slony. We are currently using a single delete
statement, which takes about 15 minutes to run. There is no RI on the
table, but the use of Slony means that a trigger call and action is made
for every row deleted, which causes a corresponding insertion in another
table so the deletion can be replicated to the slave.

My questions:

- Could this kind of activity lead to an upward spiral in database
  handle usage?

- Would it be advisable to use several small DELETE statements instead,
  to delete rows in batches of 1,000? We could use the recipe for this
  that was posted earlier to this list:

  delete from table where pk in
    (select pk from table where delete_condition limit X);

Partitioning seems attractive here, but it isn't easy to use with Slony. Perhaps
once we migrate to PostgreSQL 9.0 and the hot standby feature we can
consider that.

Thanks for your help!

Mark

 . . . . . . . . . . . . . . . . . . . . . . . . . . .
   Mark Stosberg            Principal Developer
   m...@summersault.com     Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .





Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete

2010-05-07 Thread Kenneth Marshall
On Fri, May 07, 2010 at 09:37:42AM -0400, Mark Stosberg wrote:
> 
> Hello,
> 
> We've been a satisfied user of PostgreSQL for several years, and use it
> to power a national pet adoption website: http://www.adoptapet.com/
> 
> Recently we've had a regularly-timed middle-of-the-night problem where
> database handles are exhausted for a very brief period.
> 
> In tracking it down, I have found that the event seems to correspond to
> a time when a cron script is deleting from a large logging table, but
> I'm not certain if this is the cause or a correlation.
> 
> We are deleting about 5 million rows from a time-based logging table
> that is replicated by Slony. We are currently using a single delete
> statement, which takes about 15 minutes to run. There is no RI on the
> table, but the use of Slony means that a trigger call and action is made
> for every row deleted, which causes a corresponding insertion in another
> table so the deletion can be replicated to the slave.
> 
> My questions:
> 
> - Could this kind of activity lead to an upward spiral in database
>   handle usage?
Yes.
> 
> - Would it be advisable to use several small DELETE statements instead,
>   to delete rows in batches of 1,000? We could use the recipe for this
>   that was posted earlier to this list:
Yes, that is the method we use in several cases to avoid this behavior.
Deletion is a relatively intensive process in PostgreSQL, so batching it
keeps it from dragging down other queries, which is what leads to your
out-of-handles error.
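
For what it's worth, a minimal sketch of this kind of batched delete (the
table, column, and retention interval are made-up placeholders; adjust
them to your schema):

  DELETE FROM activity_log
   WHERE pk IN (SELECT pk
                  FROM activity_log
                 WHERE logged_at < now() - interval '30 days'
                 ORDER BY logged_at
                 LIMIT 1000);
  -- re-run this from the calling script until it reports "DELETE 0",
  -- optionally sleeping a second or two between batches so Slony and
  -- other sessions can keep up.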

Regards,
Ken



Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete

2010-05-07 Thread Marcos Ortiz

On 07/05/2010 15:37, Mark Stosberg wrote:

> We are deleting about 5 million rows from a time-based logging table
> that is replicated by Slony. We are currently using a single delete
> statement, which takes about 15 minutes to run. There is no RI on the
> table, but the use of Slony means that a trigger call and action is made
> for every row deleted, which causes a corresponding insertion in another
> table so the deletion can be replicated to the slave.

You can use TRUNCATE instead of DELETE. TRUNCATE is more efficient and
faster than DELETE.
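
For example (a sketch with a made-up table name; note that TRUNCATE
removes every row in the table and cannot take a WHERE clause):

  TRUNCATE TABLE activity_log;
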
Now, we need more information about your system to give you a more
specific answer:

Are you using a RAID controller for your data?
Have you separated the xlog directory from the data directory?
Which is your Operating System?
What is your architecture?

Regards



Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete

2010-05-07 Thread Mark Stosberg

> You can use TRUNCATE instead of DELETE. TRUNCATE is more efficient and
> faster than DELETE.

Thanks for the suggestion. However, TRUNCATE is not compatible with
Slony, and we also have some rows which remain in the table.

> Now, we need more information about your system to give you a more
> specific answer:
> Are you using a RAID controller for your data?

Yes.

> Have you separated the xlog directory from the data directory?

No.

> Which is your Operating System?

FreeBSD.

> What is your architecture?

i386.

Thanks for the feedback. I'm going to try batching the deletes for now,
an approach that has worked well for some of our other long-running
deletes.

Mark



Re: [PERFORM] Planner issue on sorting joining of two tables with limit

2010-05-07 Thread Alexander Korotkov
>
> Well, no, because that plan wouldn't produce the specified ordering;
> or at least it would be a lucky coincidence if it did.  It's only
> sorting on t1.value.
>
I just don't see why it would be a coincidence. I think that such a plan will
always produce a result ordered by the two columns, because such a nested
index scan always produces this result.
Let's consider a simple example in order to illustrate how this plan
works.

t1
id | value
---+--
1  | 0.1
2  | 0.3
3  | 0.2

t2
id | id1 | value
---+-+--
1  |  2  | 0.2
2  |  1  | 0.9
3  |  2  | 0.6
4  |  1  | 0.7
5  |  1  | 0.4
6  |  3  | 0.2
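
For reference, a minimal sketch of the schema this example assumes (the
table and index names are taken from the plan below; the column types and
the foreign key are assumptions):

  CREATE TABLE test1 (
      id    integer PRIMARY KEY,
      value float8
  );
  CREATE INDEX test1_value_idx ON test1 (value);

  CREATE TABLE test2 (
      id    integer PRIMARY KEY,
      id1   integer REFERENCES test1 (id),  -- the FK itself is an assumption
      value float8
  );
  CREATE INDEX test2_id1_value_idx ON test2 (id1, value);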

1) The outer index scan will find the row of t1 with the least value using
test1_value_idx. It will be row (1, 0.1).
2) The inner index scan will find all the rows in t2 where id1 = 1 using
test2_id1_value_idx. But the index test2_id1_value_idx has value as its
second column, so the index scan result will be ordered by value. That's why
the inner index scan will find rows (5, 1, 0.4), (4, 1, 0.7) and (2, 1, 0.9),
and the following query output will be produced:
value1  | value2
--------+--------
0.1     | 0.4
0.1     | 0.7
0.1     | 0.9
3) The outer index scan will find the row of t1 with the second-smallest value
using test1_value_idx. It will be row (3, 0.2).
4) The inner index scan will find all the rows in t2 where id1 = 3 using
test2_id1_value_idx. This row is (6, 3, 0.2). The following query output
will be produced:
value1  | value2
--------+--------
0.2     | 0.2
5) The outer index scan will find the row of t1 with the third-smallest value
using test1_value_idx. It will be row (2, 0.3).
6) The inner index scan will find all the rows in t2 where id1 = 2 using
test2_id1_value_idx. These rows are (1, 2, 0.2) and (3, 2, 0.6). The
following query output will be produced:
value1  | value2
--------+--------
0.3     | 0.2
0.3     | 0.6

And the whole query result is:
value1  | value2
--------+--------
0.1     | 0.4
0.1     | 0.7
0.1     | 0.9
0.2     | 0.2
0.3     | 0.2
0.3     | 0.6

And this result really is ordered by t1.value, t2.value.
I can't find the error in my reasoning :)

The query without a limit produces a similar plan.

EXPLAIN SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN
test2 t2 ON t2.id1 = t1.id ORDER BY t1.value

Nested Loop  (cost=0.00..62109.86 rows=995025 width=16)
  ->  Index Scan using test1_value_idx on test1 t1  (cost=0.00..19.19 rows=200 width=12)
  ->  Index Scan using test2_id1_value_idx on test2 t2  (cost=0.00..248.27 rows=4975 width=12)
        Index Cond: (t2.id1 = t1.id)

And I checked that the result is ordered by t1.value and t2.value.
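
(For completeness, a sketch of the query with the second ordering column
that this discussion is about; the LIMIT value is only illustrative:)

  SELECT t1.value AS value1, t2.value AS value2
    FROM test1 t1
    JOIN test2 t2 ON t2.id1 = t1.id
   ORDER BY t1.value, t2.value
   LIMIT 10;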

2010/4/26 Tom Lane 

> Alexander Korotkov  writes:
> > So PostgreSQL planner can produce the plan I need but it doesn't produce
> > this plan when I specify particular second ordering column.
>
> Well, no, because that plan wouldn't produce the specified ordering;
> or at least it would be a lucky coincidence if it did.  It's only
> sorting on t1.value.
>
> > So is there any
> > way to make planner produce desired plan when particular second ordering
> > column is specified?
>
> Not when the ordering columns come from two different tables.  (If they
> were in the same table then scanning a two-column index could produce
> the demanded sort order.)  I don't see any way to satisfy this query
> without an explicit sort step, which means it has to look at the whole
> join output.
>
> If you're willing to make assumptions like "the required 10 rows will be
> within the first 100 t1.value rows" then you could nest an ORDER BY
> t1.value LIMIT 100 query inside something that did an ORDER BY with both
> columns.  But this fails if you have too many duplicate t1.value values,
> and your test case suggests that you might have a lot of them.  In any
> case it would stop being fast if you make the inner LIMIT very large.
>
>regards, tom lane
>
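
One possible shape of the nested-LIMIT workaround Tom describes above (just
a sketch; the LIMIT values are arbitrary, and the inner limit has to be
large enough to cover duplicate t1.value rows):

  SELECT sub.value1, sub.value2
    FROM (SELECT t1.value AS value1, t2.value AS value2
            FROM test1 t1
            JOIN test2 t2 ON t2.id1 = t1.id
           ORDER BY t1.value
           LIMIT 100) sub
   ORDER BY sub.value1, sub.value2
   LIMIT 10;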


Re: [PERFORM] Planner issue on sorting joining of two tables with limit

2010-05-07 Thread Alexander Korotkov
I found my mistake. My supposition holds only if the value column in the t1
table is unique. But if I replace the index with a unique one, the plan is
still the same.



Re: [PERFORM] Planner issue on sorting joining of two tables with limit

2010-05-07 Thread Kevin Grittner
Alexander Korotkov  wrote:
> Alexander Korotkov  wrote:
 
>>> Well, no, because that plan wouldn't produce the specified
>>> ordering; or at least it would be a lucky coincidence if it did.
>>> It's only sorting on t1.value.
>>>
>> I just don't see why it would be a coincidence. I think that such a
>> plan will always produce a result ordered by the two columns, because
>> such a nested index scan always produces this result.
 
Assuming a nested index scan, or any particular plan, is unwise. 
New data or just the "luck of the draw" on your next ANALYZE could
result in a totally different plan which wouldn't produce the same
ordering unless specified.
 
> I found my mistake. My supposition holds only if the value column
> in the t1 table is unique. But if I replace the index with a unique
> one, the plan is still the same.
 
Yeah, maybe, for the moment.  When you have ten times the quantity
of data, a completely different plan may be chosen.  If you want a
particular order, ask for it.  The planner will even take the
requested ordering into account when choosing a plan, so the cutoff
for switching to an in-memory hash table or a bitmap index scan
might shift a bit based on the calculated cost of sorting data.
 
-Kevin



Re: [PERFORM] Planner issue on sorting joining of two tables with limit

2010-05-07 Thread Tom Lane
"Kevin Grittner"  writes:
> Alexander Korotkov  wrote:
>>> I just don't see why it would be a coincidence. I think that such a
>>> plan will always produce a result ordered by the two columns, because
>>> such a nested index scan always produces this result.
 
> Assuming a nested index scan, or any particular plan, is unwise. 

I think he's proposing that the planner should recognize that a plan
of this type produces a result sorted by the additional index columns.
I'm not convinced either that the sortedness property really holds,
or that it would be worth the extra planning effort to check for;
but it's not a fundamentally misguided proposal.

regards, tom lane



Re: [PERFORM] partioning tips?

2010-05-07 Thread Josh Berkus

On 05/05/2010 01:25 PM, Richard Yen wrote:

> Problem is, I have foreign keys that link almost all of our tables together
> (as a business requirement/IT policy).  However, I know (er, I have a gut
> feeling) that many people out there have successfully deployed table
> partitioning, so I'm hoping to solicit some advice with respect to this.
> I've looked at documentation, tried creating a prototype, etc...looks like
> foreign keys have to go.  But do they?  What have other people out there
> done to get their tables partitioned?


Well, it's possible to work around the limitation on FKs, but probably 
not worth it.  In general, the reasons you want to partition (being able 
to cheaply drop segments, no scans against the whole table, ever) are 
reasons why you wouldn't want an FK to a partition table in any case.


The specific cases where it works to have FKs anyway are:

1) if you're making FKs between two partitioned tables whose partition 
ranges match exactly.  In this case, you can just FK the individual 
partitions (there is a TODO, and some draft code from Aster, to make 
this happen automatically).
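
A minimal sketch of case 1, with made-up table names and month-based
ranges (the details will vary with your schema):

  -- parent tables (hold no rows themselves)
  CREATE TABLE orders      (order_id bigint, ordered_on date NOT NULL);
  CREATE TABLE order_items (item_id  bigint, order_id bigint NOT NULL,
                            ordered_on date NOT NULL);

  -- matching monthly partitions, FK'd partition-to-partition
  -- (re-declaring the key columns merges them with the inherited ones)
  CREATE TABLE orders_2010_05 (
      order_id bigint PRIMARY KEY,
      CHECK (ordered_on >= DATE '2010-05-01' AND ordered_on < DATE '2010-06-01')
  ) INHERITS (orders);

  CREATE TABLE order_items_2010_05 (
      item_id  bigint PRIMARY KEY,
      order_id bigint NOT NULL REFERENCES orders_2010_05 (order_id),
      CHECK (ordered_on >= DATE '2010-05-01' AND ordered_on < DATE '2010-06-01')
  ) INHERITS (order_items);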


2) If the partitioned table has very wide rows, and it's large for that
reason rather than because of having many rows.  In this case, you can
create a narrow FK join table containing only the surrogate keys for other
tables to reference, just like a many-to-many join table.
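
And a minimal sketch of case 2 (again, all names are made up): the narrow
key table is the FK target, while the wide table just carries the same key:

  -- narrow table holding just the surrogate keys; this is what gets FK'd
  CREATE TABLE document_keys (
      doc_id bigint PRIMARY KEY
  );

  -- the wide table (the partitioning candidate) shares the key but is
  -- never the target of foreign keys itself
  CREATE TABLE document_bodies (
      doc_id bigint NOT NULL REFERENCES document_keys (doc_id),
      body   text
  );

  -- other tables reference the narrow key table instead of the wide one
  CREATE TABLE annotations (
      annot_id bigint PRIMARY KEY,
      doc_id   bigint NOT NULL REFERENCES document_keys (doc_id),
      note     text
  );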


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete

2010-05-07 Thread Marcos Ortiz

On 07/05/2010 16:10, Mark Stosberg wrote:
>> Have you separated the xlog directory from the data directory?
>
> No.
>
>> Which is your Operating System?
>
> FreeBSD.
>
>> What is your architecture?
>
> i386.
>
> Thanks for the feedback. I'm going to try batching the deletes for now,
> an approach that has worked well for some of our other long-running
> deletes.


Have you considered using a 64-bit version of FreeBSD for that?
A 64-bit OS can help a great deal with large databases, because you can
actually use all of the RAM available on the server.

Many experts on this list recommend putting the xlog directory on a
RAID 1 array and the data directory on RAID 10 to obtain better
performance.
Filesystem choices vary widely, but I've seen that ZFS is very useful
in these cases.


Which version of Slony-I are you using?

Regards



[PERFORM] Dell Perc HX00 RAID controllers: What's inside?

2010-05-07 Thread Craig James

Now that it's time to buy a new computer, Dell has changed their RAID models 
from the Perc6 to Perc H200 and such.  Does anyone know what's inside these?  I 
would hope they've stuck with the Megaraid controller...

Also, I can't find any info on Dell's site about how these devices can be 
configured.  I was thinking of ten disks, as

  OS: RAID1
  WAL: RAID1
  Database: RAID10 using 6 disks

But it's not clear to me if these RAID controllers can handle multiple arrays,
or if you need a separate controller for each array.  We're a small shop and I 
only get to do this every year or so, and everything changes in between 
purchases!

Thanks,
Craig



[PERFORM] 8K recordsize bad on ZFS?

2010-05-07 Thread Josh Berkus
Jignesh, All:

Most of our Solaris users have been, I think, following Jignesh's advice
from his benchmark tests to set the ZFS recordsize to 8K for the data zpool.
However, I've discovered that this is sometimes a serious problem for
some hardware.

For example, having the recordsize set to 8K on a Sun 4170 with 8 drives
recently gave me these appalling Bonnie++ results:

Version 1.96      --Sequential Output--      --Sequential Input-   --Random-
Concurrency 4       --Block--   -Rewrite-        --Block--         --Seeks--
Machine     Size   K/sec  %CP  K/sec  %CP       K/sec  %CP         /sec  %CP
db111        24G  260044   33  62110   17       89914   15         1167   25
Latency           6549ms       4882ms           3395ms             107ms

I know that's hard to read.  What it's saying is:

Seq Writes: 260mb/s combined
Seq Reads: 89mb/s combined
Read Latency: 3.3s

Best guess is that this is a result of overloading the array/drives with
commands for all those small blocks; certainly the behavior observed
(stuttering I/O, latency) is in line with that issue.

Anyway, since this is a DW-like workload, we just bumped the recordsize
up to 128K and the performance issues went away ... reads up over 300mb/s.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com
