[PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
Hello,

We've been a satisfied user of PostgreSQL for several years, and use it to
power a national pet adoption website: http://www.adoptapet.com/

Recently we've had a regularly-timed middle-of-the-night problem where
database handles are exhausted for a very brief period.

In tracking it down, I have found that the event seems to correspond to a
time when a cron script is deleting from a large logging table, but I'm not
certain if this is the cause or just a correlation.

We are deleting about 5 million rows from a time-based logging table that is
replicated by Slony. We are currently using a single delete statement, which
takes about 15 minutes to run. There is no RI on the table, but the use of
Slony means that a trigger call and action is made for every row deleted,
which causes a corresponding insertion in another table so the deletion can
be replicated to the slave.

My questions:

- Could this kind of activity lead to an upward spiral in database handle
  usage?

- Would it be advisable to use several small DELETE statements instead, to
  delete rows in batches of 1,000? We could use the recipe for this that was
  posted earlier to this list (a worked sketch of it appears below):

    delete from table where pk in
        (select pk from table where delete_condition limit X);

Partitioning seems attractive here, but isn't easy to use with Slony.
Perhaps once we migrate to PostgreSQL 9.0 and the hot standby feature we can
consider that.

Thanks for your help!

    Mark

 . . . . . . . . . . . . . . . . . . . . . . . . . . .
   Mark Stosberg            Principal Developer
   m...@summersault.com     Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .
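A minimal sketch of how the batched form of that recipe might look. The table
and column names (app_log, pk, logged_at) and the 30-day cutoff are
placeholders, not the actual schema; substitute the real logging table and
delete_condition:

    -- Sketch only: one 1,000-row batch of the delete. Run this repeatedly
    -- from the cron script, each run in its own transaction, until psql
    -- reports "DELETE 0".
    DELETE FROM app_log
    WHERE pk IN (
        SELECT pk
        FROM app_log
        WHERE logged_at < now() - interval '30 days'   -- the delete_condition
        LIMIT 1000
    );

Because each batch commits separately, the Slony log-trigger inserts are
spread across many short transactions instead of a single 15-minute one.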
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
On Fri, May 07, 2010 at 09:37:42AM -0400, Mark Stosberg wrote:
> We are deleting about 5 million rows from a time-based logging table
> that is replicated by Slony. We are currently using a single delete
> statement, which takes about 15 minutes to run.
>
> My questions:
>
> - Could this kind of activity lead to an upward spiral in database
>   handle usage?

Yes.

> - Would it be advisable to use several small DELETE statements instead,
>   to delete rows in batches of 1,000? We could use the recipe for this
>   that was posted earlier to this list:

Yes, that is the method we use in several cases to avoid this behavior.
Deletion is a relatively intensive process in PostgreSQL, so batching it
will keep it from dragging down other queries, which is what results in
your out-of-handles error.

Regards,
Ken
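To confirm the correlation with handle usage, one option is to sample
connection counts while the delete runs. A sketch only, assuming a pre-9.6
server where pg_stat_activity still has the boolean "waiting" column:

    -- Sketch: sample every minute or so during the delete window and
    -- watch whether open/waiting connections climb with the delete.
    SELECT now() AS sampled_at,
           count(*) AS open_connections,
           sum(CASE WHEN waiting THEN 1 ELSE 0 END) AS waiting_connections
    FROM pg_stat_activity;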
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
On 07/05/2010 15:37, Mark Stosberg wrote:
> We are deleting about 5 million rows from a time-based logging table
> that is replicated by Slony. We are currently using a single delete
> statement, which takes about 15 minutes to run.

You can use TRUNCATE instead of DELETE. TRUNCATE is more efficient and
faster than DELETE.

Now, we need more information about your system to suggest a more
specific solution:

Are you using a RAID controller for your data?
Have you separated the xlog directory from the data directory?
What is your operating system?
What is your architecture?

Regards
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
> You can use TRUNCATE instead of DELETE. TRUNCATE is more efficient and
> faster than DELETE.

Thanks for the suggestion. However, TRUNCATE is not compatible with
Slony, and we also have some rows which need to remain in the table.

> Now, we need more information about your system to suggest a more
> specific solution:
>
> Are you using a RAID controller for your data?

Yes.

> Have you separated the xlog directory from the data directory?

No.

> What is your operating system?

FreeBSD.

> What is your architecture?

i386.

Thanks for the feedback. I'm going to try batching the deletes for now,
which is an approach that has worked well for some of our other
long-running deletes.

    Mark
Re: [PERFORM] Planner issue on sorting joining of two tables with limit
> Well, no, because that plan wouldn't produce the specified ordering;
> or at least it would be a lucky coincidence if it did.  It's only
> sorting on t1.value.

I just don't see why it is a coincidence. I think that such a plan will
always produce a result ordered by the two columns, because such a nested
index scan always produces this result.

Let's consider a simple example in order to illustrate how this plan works.

t1
 id | value
----+-------
  1 |   0.1
  2 |   0.3
  3 |   0.2

t2
 id | id1 | value
----+-----+-------
  1 |   2 |   0.2
  2 |   1 |   0.9
  3 |   2 |   0.6
  4 |   1 |   0.7
  5 |   1 |   0.4
  6 |   3 |   0.2

1) The outer index scan finds the row of t1 with the least value using
test1_value_idx. It is row (1, 0.1).

2) The inner index scan finds all the rows in t2 where id1 = 1 using
test2_id1_value_idx. But test2_id1_value_idx has value as its second
column, so the index scan result is ordered by value. That's why the inner
index scan finds rows (5, 1, 0.4), (4, 1, 0.7) and (2, 1, 0.9), and the
following query output is produced:

 value1 | value2
--------+--------
    0.1 |    0.4
    0.1 |    0.7
    0.1 |    0.9

3) The outer index scan finds the row of t1 with the second-least value
using test1_value_idx. It is row (3, 0.2).

4) The inner index scan finds all the rows in t2 where id1 = 3 using
test2_id1_value_idx. This row is (6, 3, 0.2). The following query output
is produced:

 value1 | value2
--------+--------
    0.2 |    0.2

5) The outer index scan finds the row of t1 with the third-least value
using test1_value_idx. It is row (2, 0.3).

6) The inner index scan finds all the rows in t2 where id1 = 2 using
test2_id1_value_idx. These rows are (1, 2, 0.2) and (3, 2, 0.6). The
following query output is produced:

 value1 | value2
--------+--------
    0.3 |    0.2
    0.3 |    0.6

And the whole query result is:

 value1 | value2
--------+--------
    0.1 |    0.4
    0.1 |    0.7
    0.1 |    0.9
    0.2 |    0.2
    0.3 |    0.2
    0.3 |    0.6

And this result really is ordered by t1.value, t2.value. I can't find the
error in my reasoning. :)

The query without the limit produces a similar plan:

EXPLAIN SELECT t1.value AS value1, t2.value AS value2
FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.value

 Nested Loop  (cost=0.00..62109.86 rows=995025 width=16)
   ->  Index Scan using test1_value_idx on test1 t1
         (cost=0.00..19.19 rows=200 width=12)
   ->  Index Scan using test2_id1_value_idx on test2 t2
         (cost=0.00..248.27 rows=4975 width=12)
         Index Cond: (t2.id1 = t1.id)

And I checked that the result is ordered by t1.value and t2.value.

2010/4/26 Tom Lane

> Alexander Korotkov writes:
>
> > So the PostgreSQL planner can produce the plan I need, but it doesn't
> > produce this plan when I specify a particular second ordering column.
>
> Well, no, because that plan wouldn't produce the specified ordering;
> or at least it would be a lucky coincidence if it did.  It's only
> sorting on t1.value.
>
> > So is there any way to make the planner produce the desired plan when
> > a particular second ordering column is specified?
>
> Not when the ordering columns come from two different tables.  (If they
> were in the same table then scanning a two-column index could produce
> the demanded sort order.)  I don't see any way to satisfy this query
> without an explicit sort step, which means it has to look at the whole
> join output.
>
> If you're willing to make assumptions like "the required 10 rows will be
> within the first 100 t1.value rows" then you could nest an ORDER BY
> t1.value LIMIT 100 query inside something that did an ORDER BY with both
> columns.  But this fails if you have too many duplicate t1.value values,
> and your test case suggests that you might have a lot of them.  In any
> case it would stop being fast if you make the inner LIMIT very large.
>
>                        regards, tom lane
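For reference, a sketch of the schema the walk-through appears to assume,
reconstructed from the index and column names in the plan; the original DDL
may differ (types, constraints, fill data):

    -- Sketch of the assumed test schema and the ordering in question.
    CREATE TABLE test1 (id integer PRIMARY KEY, value float8);
    CREATE TABLE test2 (id integer PRIMARY KEY, id1 integer, value float8);
    CREATE INDEX test1_value_idx ON test1 (value);
    CREATE INDEX test2_id1_value_idx ON test2 (id1, value);

    -- Sort the join by both value columns, keeping only the first rows.
    SELECT t1.value AS value1, t2.value AS value2
    FROM test1 t1
    JOIN test2 t2 ON t2.id1 = t1.id
    ORDER BY t1.value, t2.value
    LIMIT 10;

With test2_id1_value_idx on (id1, value), each inner scan for a fixed id1
does return its rows already ordered by value, which is what the walk-through
relies on.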
Re: [PERFORM] Planner issue on sorting joining of two tables with limit
I found my mistake. My supposition holds only if the value column in the
t1 table is unique. But even if I replace the index with a unique one, the
plan stays the same.

On Mon, May 3, 2010 at 5:57 PM, Alexander Korotkov wrote:

>> Well, no, because that plan wouldn't produce the specified ordering;
>> or at least it would be a lucky coincidence if it did.  It's only
>> sorting on t1.value.
>
> I just don't see why it is a coincidence. I think that such a plan will
> always produce a result ordered by the two columns, because such a
> nested index scan always produces this result.
>
> [...]
>
>> If you're willing to make assumptions like "the required 10 rows will
>> be within the first 100 t1.value rows" then you could nest an ORDER BY
>> t1.value LIMIT 100 query inside something that did an ORDER BY with
>> both columns.  But this fails if you have too many duplicate t1.value
>> values, and your test case suggests that you might have a lot of them.
>> In any case it would stop being fast if you make the inner LIMIT very
>> large.
>>
>>                        regards, tom lane
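A sketch of the nested-LIMIT workaround quoted above, under the same
assumption that the 10 wanted rows fall within the first 100 rows by
t1.value; this may not be the exact form intended, but illustrates the
shape of it:

    -- Sketch only: limit the join by t1.value first, then sort the small
    -- result by both columns.
    SELECT value1, value2
    FROM (SELECT t1.value AS value1, t2.value AS value2
          FROM test1 t1
          JOIN test2 t2 ON t2.id1 = t1.id
          ORDER BY t1.value
          LIMIT 100) AS top_by_value1
    ORDER BY value1, value2
    LIMIT 10;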
Re: [PERFORM] Planner issue on sorting joining of two tables with limit
Alexander Korotkov wrote:

>>> Well, no, because that plan wouldn't produce the specified
>>> ordering; or at least it would be a lucky coincidence if it did.
>>> It's only sorting on t1.value.
>>
>> I just don't see why it is a coincidence. I think that such a plan
>> will always produce a result ordered by the two columns, because
>> such a nested index scan always produces this result.

Assuming a nested index scan, or any particular plan, is unwise. New
data or just the "luck of the draw" on your next ANALYZE could result
in a totally different plan which wouldn't produce the same ordering
unless specified.

> I found my mistake. My supposition holds only if the value column in
> the t1 table is unique. But even if I replace the index with a unique
> one, the plan stays the same.

Yeah, maybe, for the moment. When you have ten times the quantity of
data, a completely different plan may be chosen. If you want a
particular order, ask for it. The planner will even take the requested
ordering into account when choosing a plan, so the cutoff for switching
to an in-memory hash table or a bitmap index scan might shift a bit
based on the calculated cost of sorting the data.

-Kevin
Re: [PERFORM] Planner issue on sorting joining of two tables with limit
"Kevin Grittner" writes:
> Alexander Korotkov wrote:
>>> I just don't see why it is a coincidence. I think that such a plan
>>> will always produce a result ordered by the two columns, because
>>> such a nested index scan always produces this result.
>
> Assuming a nested index scan, or any particular plan, is unwise.

I think he's proposing that the planner should recognize that a plan of
this type produces a result sorted by the additional index columns.
I'm not convinced either that the sortedness property really holds, or
that it would be worth the extra planning effort to check for; but it's
not a fundamentally misguided proposal.

                        regards, tom lane
Re: [PERFORM] partitioning tips?
On 05/05/2010 01:25 PM, Richard Yen wrote:
> Problem is, I have foreign keys that link almost all of our tables
> together (as a business requirement/IT policy). However, I know (er, I
> have a gut feeling) that many people out there have successfully
> deployed table partitioning, so I'm hoping to solicit some advice with
> respect to this.
>
> I've looked at documentation, tried creating a prototype, etc. ... it
> looks like foreign keys have to go. But do they? What have other people
> out there done to get their tables partitioned?

Well, it's possible to work around the limitation on FKs, but it's
probably not worth it. In general, the reasons you want to partition
(being able to cheaply drop segments, no scans against the whole table,
ever) are reasons why you wouldn't want an FK to a partitioned table in
any case.

The specific cases where it works to have FKs anyway are:

1) If you're making FKs between two partitioned tables whose partition
ranges match exactly. In this case, you can just FK the individual
partitions (there is a TODO, and some draft code from Aster, to make
this happen automatically). A sketch of this appears below.

2) If the partitioned table has very wide rows, and it's large for that
reason rather than because of having many rows. In this case, you can
create an FK join table containing only the SKs to create FKs against,
just like a many-to-many join table.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
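For case (1), a minimal sketch of what FKing matched partitions can look
like with inheritance-based partitioning. All table names, columns, and
ranges below are illustrative only, not from Richard's schema:

    -- Sketch only: parent tables plus one matching pair of monthly
    -- partitions, with the FK declared between the children.
    CREATE TABLE orders (
        order_id   integer,
        order_date date NOT NULL
    );

    CREATE TABLE order_items (
        item_id    integer,
        order_id   integer,
        order_date date NOT NULL
    );

    CREATE TABLE orders_2010_05 (
        CHECK (order_date >= DATE '2010-05-01' AND order_date < DATE '2010-06-01')
    ) INHERITS (orders);
    ALTER TABLE orders_2010_05 ADD PRIMARY KEY (order_id);

    CREATE TABLE order_items_2010_05 (
        CHECK (order_date >= DATE '2010-05-01' AND order_date < DATE '2010-06-01')
    ) INHERITS (order_items);
    ALTER TABLE order_items_2010_05
        ADD FOREIGN KEY (order_id) REFERENCES orders_2010_05 (order_id);

Because the FK is between the children, dropping a month means dropping the
matching pair of partitions together; the parents themselves carry no FK.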
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
On 07/05/2010 16:10, Mark Stosberg wrote:
> Thanks for the feedback. I'm going to try batching the deletes for now,
> which is an approach that has worked well for some of our other
> long-running deletes.

Have you considered using a 64-bit version of FreeBSD for this? A 64-bit
OS can help very much with large databases, because you can actually use
all of the RAM available on the server.

Many experts on this list recommend putting the xlog directory on a
RAID 1 configuration and the data directory on RAID 10 to obtain better
performance. There are many filesystems to choose from, but I've seen
that ZFS is very useful in these cases.

Which version of Slony-I are you using?

Regards
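If the xlog is separated later, the usual approach is to move pg_xlog onto
the other volume and leave a symlink behind, with the server stopped. A
rough sketch; the paths below are placeholders for the real data directory
and RAID 1 mount point:

    # Sketch only: relocate pg_xlog to a separate (RAID 1) volume.
    pg_ctl -D /usr/local/pgsql/data stop
    mv /usr/local/pgsql/data/pg_xlog /raid1/pg_xlog
    ln -s /raid1/pg_xlog /usr/local/pgsql/data/pg_xlog
    pg_ctl -D /usr/local/pgsql/data start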
[PERFORM] Dell Perc HX00 RAID controllers: What's inside?
Now that it's time to buy a new computer, Dell has changed their RAID
models from the Perc6 to the Perc H200 and such. Does anyone know what's
inside these? I would hope they've stuck with the Megaraid controller...

Also, I can't find any info on Dell's site about how these devices can be
configured. I was thinking of ten disks, as:

  OS:       RAID1
  WAL:      RAID1
  Database: RAID10 using 6 disks

But it's not clear to me if these RAID controllers can handle multiple
arrays, or if you need a separate controller for each array.

We're a small shop and I only get to do this every year or so, and
everything changes in between purchases!

Thanks,
Craig
[PERFORM] 8K recordsize bad on ZFS?
Jignesh, All:

Most of our Solaris users have been, I think, following Jignesh's advice
from his benchmark tests to set the ZFS recordsize to 8K for the data
zpool. However, I've discovered that this is sometimes a serious problem
for some hardware.

For example, having the recordsize set to 8K on a Sun 4170 with 8 drives
recently gave me these appalling Bonnie++ results:

Version  1.96      ------Sequential Output------ --Sequential Input- --Random-
Concurrency   4    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
db111          24G           260044 33 62110  17           89914  15  1167  25
Latency                      6549ms    4882ms              3395ms      107ms

I know that's hard to read. What it's saying is:

  Seq Writes:   260 MB/s combined
  Seq Reads:     89 MB/s combined
  Read Latency: 3.3s

Best guess is that this is a result of overloading the array/drives with
commands for all those small blocks; certainly the behavior observed
(stuttering I/O, latency) is in line with that issue.

Anyway, since this is a DW-like workload, we just bumped the recordsize
up to 128K and the performance issues went away ... reads are up over
300 MB/s.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
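The change itself is a one-liner. A sketch, where "tank/pgdata" is a
placeholder for the actual zpool/dataset holding the data directory:

    # Sketch only: inspect and raise the recordsize on the data dataset.
    # Note: a new recordsize applies only to files written afterward, so
    # existing data must be rewritten (or reloaded) to benefit.
    zfs get recordsize tank/pgdata
    zfs set recordsize=128K tank/pgdata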