Re: [PERFORM] Optimizing No matching record Queries
Dean Gibson (DB Administrator) wrote:
> The questions are:
> 1. Why is the planner scanning the entire idx_listing_entrydate, when I'd think it should be scanning the entire pk_listingstatus_listingstatusid?

It's looking at the ORDER BY and sees that the query needs the 10 most recent, so it tries searching by date. That's sensible where you are going to have a lot of matches for fklistingsourceid. Which suggests that the statistics for "fklistingsourceid" aren't high enough, as Greg suggested. If that doesn't help, the index on (fklistingsourceid, entrydate) that Stephen suggested might well do so.

> 2. Why is "Index Scan using pk_listingstatus_listingstatusid on listingstatus listingsta1_ (cost=0.00..0.27 rows=1 width=4) (never executed)"?

Because nothing comes out of the first index scan.

-- Richard Huxton, Archonet Ltd
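[Editorial sketch] A minimal example of the statistics change discussed above; the table name listing is an assumption (implied by idx_listing_entrydate, not stated in the thread):

-- Give the planner a larger sample for the column it is misestimating,
-- then re-analyze so the new target takes effect (assumed table name: listing).
ALTER TABLE listing ALTER COLUMN fklistingsourceid SET STATISTICS 1000;
ANALYZE listing;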
Re: [PERFORM] Dell Perc/6
Hello. I think I started that discussion. We ended up buying a Dell 2900 with PERC 6/i and 10 * 145GB SAS 3.5" 15kRPM discs. 6 of the SAS discs are in a RAID 10 for the database, 2 in a mirror for the WAL and the last 2 in a mirror for the OS. We get 350MB/s writing and 380MB/s reading to/from the RAID 10 area using dd. The OS is Ubuntu and the filesystem for the RAID 10 is ext3. The box is still under testing, but we plan to set it in production this week.

Regards, - Tore.

On Feb 12, 2008, at 17:32, Craig James wrote:
> Does anyone have performance info about the new Dell Perc/6 controllers? I found a long discussion ("Dell vs HP") about the Perc/5, but nothing about Perc/6. What's under the covers?
>
> Here is the (abbreviated) info from Dell on this machine:
>
> PowerEdge 1950 III: Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
> Additional Processors: Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
> Memory: 8GB 667MHz (4x2GB), Dual Ranked DIMMs
> Hard Drive Configuration: Integrated SAS/SATA RAID 5, PERC 6/i Integrated
>
> Thanks, Craig
Re: [PERFORM] Update with Subquery Performance
Yes, I also thought of this method and tested it before I got your mail, and this solution seems workable. Thanks for the help.

On Feb 12, 2008 9:18 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Linux Guru" <[EMAIL PROTECTED]> writes:
> > Analyzing did not help, here is the output of EXPLAIN ANALYZE of the update query:
> > "Seq Scan on dummy (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)"
> > "  SubPlan"
> > "    ->  Aggregate (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)"
> > "          ->  Seq Scan on dummy "temp" (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)"
> > "                Filter: ((product)::text = ($0)::text)"
> > "Total runtime: 578968.885 ms"
>
> Yeah, that's just not going to be fast. An index on the product column might help a bit, but the real issue is that you're repetitively calculating the same aggregates. I think you need a separate temp table, along the lines of
>
> create temp table dummy_agg as
>   select product,
>          (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end) as s
>   from dummy
>   group by product;
>
> create index dummy_agg_i on dummy_agg(product);  -- optional
>
> update dummy
>   set gp = (select s from dummy_agg where dummy_agg.product = dummy.product);
>
> The index would only be needed if you expect a lot of rows (a lot of different product values).
>
> regards, tom lane
Re: [PERFORM] Dell Perc/6
On 13-Feb-08, at 5:02 AM, Tore Halset wrote:
> Hello. I think I started that discussion. We ended up buying a Dell 2900 with PERC 6/i and 10 * 145GB SAS 3.5" 15kRPM discs. 6 of the SAS discs are in a RAID 10 for the database, 2 in a mirror for the WAL and the last 2 in a mirror for the OS. We get 350MB/s writing and 380MB/s reading to/from the RAID 10 area using dd. The OS is Ubuntu and the filesystem for the RAID 10 is ext3.

Wow, that's fantastic. Just to be sure, did you make sure that you read and wrote 2x memory to take the cache out of the measurement?

Dave
Re: [PERFORM] Join Query Perfomance Issue
> Can you send the table definitions of the tables involved in the query, including index information? Might be if we look hard enough we can find something.
>
> Peter

Table "messungen_v_dat_2007_11_12"
    Column     |     Type     | Modifiers | Description
---------------+--------------+-----------+-------------
 ganglinientyp | character(1) |           |
 minute_tag    | smallint     |           |
 messpunkt     | integer      |           |
Indexes:
    "messungen_v_dat_2007_11_12_ganglinientyp_key" UNIQUE, btree (ganglinientyp, minute_tag, messpunkt)
    "messungen_v_dat_2007_11_12_messpunkt_idx" btree (messpunkt)
    "messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx" btree (ganglinientyp, minute_tag)
Foreign-key constraints:
    "messungen_v_dat_2007_11_12_messpunkt_fkey" FOREIGN KEY (messpunkt) REFERENCES messpunkt(nr)
Inherits: messungen_v_dat
Has OIDs: no

Table "messpunkt"
 Column |  Type   |                        Modifiers                        | Description
--------+---------+---------------------------------------------------------+-------------
 nr     | integer | not null default nextval('messpunkt_nr_seq'::regclass)  |
 zs_nr  | integer |                                                         |
 mw_nr  | integer |                                                         |
Indexes:
    "messpunkt_pkey" PRIMARY KEY, btree (nr)
    "messpunkt_zs_nr_key" UNIQUE, btree (zs_nr, mw_nr)
    "messpunkt_mw_idx" btree (mw_nr)
    "messpunkt_nr_idx" btree (nr)
    "messpunkt_zs_idx" btree (zs_nr)
Foreign-key constraints:
    "messpunkt_mw_nr_fkey" FOREIGN KEY (mw_nr) REFERENCES de_mw(nr)
    "messpunkt_zs_nr_fkey" FOREIGN KEY (zs_nr) REFERENCES de_zs(zs)
Has OIDs: no

Table "de_mw"
 Column |   Type   |                     Modifiers                      | Description
--------+----------+-----------------------------------------------------+-------------
 nr     | integer  | not null default nextval('de_mw_nr_seq'::regclass)  |
 j_ges  | smallint |                                                     |
 mw_abh | integer  |                                                     |
Indexes:
    "de_mw_pkey" PRIMARY KEY, btree (nr)
    "de_mw_j_ges_key" UNIQUE, btree (j_ges, mw_abh)
    "de_nw_nr_idx" btree (nr)
Check constraints:
    "de_mw_check" CHECK (j_ges IS NOT NULL AND (j_ges = 0 AND (mw_abh = 0 OR mw_abh = 255 OR mw_abh IS NULL) OR j_ges > 0 AND j_ges <= 80 AND mw_abh <> 0))
Has OIDs: no
Re: [PERFORM] Join Query Perfomance Issue
We have tried some recoding now; using a materialized view we could reduce the query to a join over two tables without any functions inside the query, for example:

explain analyse
SELECT '12.11.2007 18:04:00 UTC' AS zeit,
       'M' AS ganglinientyp,
       zs_de, j_ges, j_lkw, v_pkw, v_lkw, p_bel
FROM messungen_v_dat_2007_11_12 m
LEFT JOIN messwerte_mv w ON w.nr = m.messpunkt
WHERE m.ganglinientyp = 'M'
  AND 992 = m.minute_tag;

 Nested Loop Left Join  (cost=0.00..32604.48 rows=3204 width=14) (actual time=11.991..2223.227 rows=2950 loops=1)
   ->  Index Scan using messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on messungen_v_dat_2007_11_12 m  (cost=0.00..5371.09 rows=3204 width=4) (actual time=0.152..12.385 rows=2950 loops=1)
         Index Cond: ((ganglinientyp = 'M'::bpchar) AND (992 = minute_tag))
   ->  Index Scan using messwerte_mv_nr_idx on messwerte_mv w  (cost=0.00..8.49 rows=1 width=18) (actual time=0.730..0.734 rows=1 loops=2950)
         Index Cond: (w.nr = m.messpunkt)
 Total runtime: 2234.143 ms
(6 rows)

To me this plan looks very clean and nearly optimal, BUT ~2 seconds for the nested loop can't be that good, can it? The behaviour of this query and the database is quite a mystery to me; yesterday I had it running in about 100ms, today I started testing with the same query and got 2000-3000ms :( Could this be some kind of PostgreSQL server/configuration problem? These queries are very performance dependent; they are called a lot of times in a complex physical real-time simulation of traffic systems. 200ms would be OK here, but >1 sec is perhaps not workable. The old version just used one big (partitioned) table without any joins, performing this query in 10-300ms, depending on the server load.
[PERFORM] Small DB Server Advice
We have a web application for which we intend to run the database on a dedicated server. We hope by the end of 2008 to have 10 companies accessing 10 instances of the database for this application. The dump file of each database is likely to be less than 100MB at the end of the year. The databases are read-heavy.

I'm thinking of something along the following lines (https://secure.dnuk.com/systems/r325hs-1u.php?configuration=7766):

4 x 147GB 15000 rpm SCSI in RAID 10 with 320-1 RAID CARD + 64MB cache BBU
2x Intel Xeon E5405 / 4x 2.00GHz / 1333MHz FSB / 12MB cache
6GB RAM

Cost around 2320 GBP -- it would be great to get it under 2000. Needs to be in the UK.

I would be grateful for any comments. I'm particularly out of date about the best processors to go for. DNUK also have Opteron as an option.

Rory
Re: [PERFORM] Small DB Server Advice
On Wed, 13 Feb 2008, Rory Campbell-Lange wrote:
> 4 x 147GB 15000 rpm SCSI in RAID 10 with 320-1 RAID CARD + 64MB cache BBU
> 2x Intel Xeon E5405 / 4x 2.00GHz / 1333MHz FSB / 12MB cache
> 6GB RAM
> Cost around 2320 GBP -- it would be great to get it under 2000. Needs to be in the UK.
>
> I would be grateful for any comments. I'm particularly out of date about the best processors to go for. DNUK also have Opteron as an option.

That sounds pretty good. It should run postgres fairly well, especially if you have quite a few parallel queries coming in. You won't need a bigger BBU cache if it's read-heavy. You'll have eight CPU cores, which is good. And RAID 10 is good. As for Intel/AMD, I think they're neck-and-neck at the moment. Both are fast.

Of course, we over here have no idea how much actual read traffic there will be, so you may be massively over-providing or it may be woefully inadequate, but this machine looks like a fairly good buy for the price.

Matthew

-- No trees were killed in the sending of this message. However a large number of electrons were terribly inconvenienced.
[PERFORM] Creating and updating table using function parameter reference
I want to create and update two tables in a function such as below, but using parameters as the table name is not allowed and gives an error. Is there any way I could achieve this?

CREATE OR REPLACE FUNCTION test ( t1 text, t2 text ) RETURNS numeric AS $$
declare temp1 text;
declare temp2 text;
begin
temp1=t1;
temp2=t2;
select
product,
(case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp
into temp2 from temp1 as dummy
group by dummy.product,dummy.totalclaimsgroup,dummy.avgmems,dummy.months;

update temp1 as t set
GPPP=(select gppp from temp2 as dummy where dummy.product=t.product),

end
$$ LANGUAGE plpgsql

--
ERROR:  syntax error at or near "$1"
LINE 1: ...en sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dum...
                                                             ^
QUERY:  select product, (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dummy group by dummy.product, dummy.totalclaimsgroup,dummy.avgmems,dummy.months
CONTEXT:  SQL statement in PL/PgSQL function "test" near line 10

** Error **

ERROR: syntax error at or near "$1"
SQL state: 42601
Context: SQL statement in PL/PgSQL function "test" near line 10
Re: [PERFORM] Creating and updating table using function parameter reference
On Wednesday 13 February 2008 15:25, Linux Guru wrote:
> I want to create and update two tables in a function such as below, but using parameters as the table name is not allowed and gives an error. Is there any way I could achieve this?

You're looking for EXECUTE:
http://www.postgresql.org/docs/8.3/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

> CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$
> declare temp1 text;
> declare temp2 text;
> begin
> temp1=t1;
> temp2=t2;
> select
> product,
> (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp
> into temp2 from temp1 as dummy
> group by dummy.product,dummy.totalclaimsgroup,dummy.avgmems,dummy.months;
>
> update temp1 as t set
> GPPP=(select gppp from temp2 as dummy where dummy.product=t.product),
>
> end
> $$ LANGUAGE plpgsql
>
> ERROR: syntax error at or near "$1"
> LINE 1: ...en sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dum...
> QUERY: select product, (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dummy group by dummy.product, dummy.totalclaimsgroup,dummy.avgmems,dummy.months
> CONTEXT: SQL statement in PL/PgSQL function "test" near line 10
>
> ** Error **
>
> ERROR: syntax error at or near "$1"
> SQL state: 42601
> Context: SQL statement in PL/PgSQL function "test" near line 10
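[Editorial sketch] A minimal illustration of the EXECUTE approach the reply points to, kept close to the original function but simplified: it groups by product only, returns void instead of numeric, and assumes the source table has product, pd, gd and gppp columns. The statement-building with quote_ident is one common way to do it, not code from this thread.

-- Hypothetical dynamic-SQL rewrite: t1 names the source table, t2 the temp aggregate table.
CREATE OR REPLACE FUNCTION test(t1 text, t2 text) RETURNS void AS $$
BEGIN
    -- Build the aggregate table named by t2 from the table named by t1.
    EXECUTE 'CREATE TEMP TABLE ' || quote_ident(t2) || ' AS '
         || 'SELECT product, '
         || '(CASE WHEN sum(pd) <> 0 THEN sum(gd)/sum(pd)*100 ELSE 0 END) AS gppp '
         || 'FROM ' || quote_ident(t1) || ' GROUP BY product';

    -- Copy the aggregate back into the source table.
    EXECUTE 'UPDATE ' || quote_ident(t1) || ' t '
         || 'SET gppp = (SELECT gppp FROM ' || quote_ident(t2)
         || ' d WHERE d.product = t.product)';
END;
$$ LANGUAGE plpgsql;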
Re: [PERFORM] Small DB Server Advice
On Wed, 13 Feb 2008, Magnus Hagander wrote:
> On Wed, 13 Feb 2008, Rory Campbell-Lange wrote:
>> 4 x 147GB 15000 rpm SCSI in RAID 10 with 320-1 RAID CARD + 64MB cache BBU
>
> In my experience, battery backed cache is always worth the money. Even if you're mostly select, you will have some updates. And it'll also pick up other write activity on the system...

Of course. My point was that 64MB should be quite sufficient if most accesses are reads. We have a few machines here with 2GB BBU caches as we do LOTS of writes - that sort of thing probably isn't necessary here.

Matthew

-- I suppose some of you have done a Continuous Maths course. Yes? Continuous Maths? Whoah, it was like that, was it! -- Computer Science Lecturer
Re: [PERFORM] Join Query Perfomance Issue
Thomas Zaksek <[EMAIL PROTECTED]> writes:
> Nested Loop Left Join  (cost=0.00..32604.48 rows=3204 width=14) (actual time=11.991..2223.227 rows=2950 loops=1)
>    ->  Index Scan using messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on messungen_v_dat_2007_11_12 m  (cost=0.00..5371.09 rows=3204 width=4) (actual time=0.152..12.385 rows=2950 loops=1)
>          Index Cond: ((ganglinientyp = 'M'::bpchar) AND (992 = minute_tag))
>    ->  Index Scan using messwerte_mv_nr_idx on messwerte_mv w  (cost=0.00..8.49 rows=1 width=18) (actual time=0.730..0.734 rows=1 loops=2950)
>          Index Cond: (w.nr = m.messpunkt)
> Total runtime: 2234.143 ms
> (6 rows)
>
> To me this plan looks very clean and nearly optimal,

For so many rows I'm surprised it's not using a bitmap indexscan. What PG version is this? How big are these tables?

			regards, tom lane
Re: [PERFORM] Dell Perc/6
On Feb 13, 2008 5:02 AM, Tore Halset <[EMAIL PROTECTED]> wrote:
> Hello. I think I started that discussion. We ended up buying a Dell 2900 with PERC 6/i and 10 * 145GB SAS 3.5" 15kRPM discs. 6 of the SAS discs are in a RAID 10 for the database, 2 in a mirror for the WAL and the last 2 in a mirror for the OS. We get 350MB/s writing and 380MB/s reading to/from the RAID 10 area using dd. The OS is Ubuntu and the filesystem for the RAID 10 is ext3.

Those are decent numbers. Can you do a bonnie++ run and post the results (specifically interested in seeks)?

merlin
Re: [PERFORM] Small DB Server Advice
Matthew wrote:
> On Wed, 13 Feb 2008, Rory Campbell-Lange wrote:
>> 4 x 147GB 15000 rpm SCSI in RAID 10 with 320-1 RAID CARD + 64MB cache BBU
>> 2x Intel Xeon E5405 / 4x 2.00GHz / 1333MHz FSB / 12MB cache
>> 6GB RAM
>> Cost around 2320 GBP -- it would be great to get it under 2000. Needs to be in the UK.
>> I would be grateful for any comments. I'm particularly out of date about the best processors to go for. DNUK also have Opteron as an option.
>
> That sounds pretty good. It should run postgres fairly well, especially if you have quite a few parallel queries coming in. You won't need a bigger BBU cache if it's read-heavy. You'll have eight CPU cores, which is good. And RAID 10 is good.

In my experience, battery backed cache is always worth the money. Even if you're mostly select, you will have some updates. And it'll also pick up other write activity on the system...

//Magnus
[PERFORM] Anyone using a SAN?
Hi all,

We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you.

- Are there any vendors to avoid or ones that are particularly good?
- What performance or reliability implications exist when using SANs?
- Are there any killer features with SANs compared to local storage?

Any other comments are certainly welcome.

Peter
Re: [PERFORM] Anyone using a SAN?
On Wed, Feb 13, 2008 at 10:56:54AM -0600, Peter Koczan wrote:
> Hi all,
>
> We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you.
>
> - Are there any vendors to avoid or ones that are particularly good?
> - What performance or reliability implications exist when using SANs?
> - Are there any killer features with SANs compared to local storage?
>
> Any other comments are certainly welcome.
>
> Peter

Peter,

The key is to understand your usage patterns, both I/O and query. SANs can be easily bandwidth limited, which can tank your database performance. There have been several threads in the mailing list about performance problems caused by the use of a SAN for storage.

Cheers,
Ken
Re: [PERFORM] Join Query Perfomance Issue
> For so many rows I'm surprised it's not using a bitmap indexscan. What PG version is this? How big are these tables?
>
> regards, tom lane

It's PG 8.2.6 on FreeBSD. messungen_v_dat_2007_11_12 is about 4 million rows and messwerte is about 10 million rows.
[PERFORM] HOT TOAST?
Folks,

Does anyone know if HOT is compatible with pg_toast tables, or do TOASTed rows simply get excluded from HOT? I can run some tests, but if someone knows this off the top of their heads it would save me some time.

-- Josh Berkus, PostgreSQL @ Sun, San Francisco
Re: [PERFORM] Anyone using a SAN?
On Feb 13, 2008 12:46 PM, Kenneth Marshall <[EMAIL PROTECTED]> wrote:
> On Wed, Feb 13, 2008 at 10:56:54AM -0600, Peter Koczan wrote:
> > Hi all,
> > We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you.
> > - Are there any vendors to avoid or ones that are particularly good?
> > - What performance or reliability implications exist when using SANs?
> > - Are there any killer features with SANs compared to local storage?
> > Any other comments are certainly welcome.
> > Peter
>
> Peter,
>
> The key is to understand your usage patterns, both I/O and query. SANs can be easily bandwidth limited which can tank your database performance. There have been several threads in the mailing list about performance problems caused by the use of a SAN for storage.

It's critical that you set up the SAN with a database in mind, otherwise the performance will be bad. I tested a DB on a SAN designed to maximize storage space and performance was terrible. I never had the time or resources to reconfigure the SAN to test a more suitable spindle setup, since the SAN was in heavy production use for file archiving.

Alex
Re: [PERFORM] Small DB Server Advice
On Wed, 13 Feb 2008, Rory Campbell-Lange wrote:
> (https://secure.dnuk.com/systems/r325hs-1u.php?configuration=7766)
> 4 x 147GB 15000 rpm SCSI in RAID 10 with 320-1 RAID CARD + 64MB cache BBU

That's running the LSI Megaraid SCSI controller. Those are solid but not the best performers in their class, particularly on writes. But given your application description (1GB data in a year and read-heavy), that card running a 4-spindle RAID10 should be overkill.

> I'm particularly out of date about the best processors to go for. DNUK also have Opteron as an option.

Current Intel chips benchmark better; occasionally you'll find people who claim the better multi-CPU memory model in the Opteron systems gives them better performance at high loads, but that's difficult to quantify. There's not a huge difference in any case. You're probably going to bottleneck on either disk or how fast DDR2 memory goes anyway, and both sets of products are competitive right now.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] HOT TOAST?
Josh Berkus <[EMAIL PROTECTED]> writes:
> Does anyone know if HOT is compatible with pg_toast tables, or do TOASTed rows simply get excluded from HOT?

The current TOAST code never does any updates, only inserts/deletes. But the HOT logic should be able to reclaim deleted rows early via pruning.

			regards, tom lane
Re: [PERFORM] Dell Perc/6
On Feb 13, 2008, at 12:06, Dave Cramer wrote:
> On 13-Feb-08, at 5:02 AM, Tore Halset wrote:
>> Hello. I think I started that discussion. We ended up buying a Dell 2900 with PERC 6/i and 10 * 145GB SAS 3.5" 15kRPM discs. 6 of the SAS discs are in a RAID 10 for the database, 2 in a mirror for the WAL and the last 2 in a mirror for the OS. We get 350MB/s writing and 380MB/s reading to/from the RAID 10 area using dd. The OS is Ubuntu and the filesystem for the RAID 10 is ext3.
>
> Wow that's fantastic. Just to be sure, did you make sure that you read and wrote 2x memory to take the cache out of the measurement?
>
> Dave

The box has 16GB of ram, but my original test file was only 25GB. Sorry. Going to 33GB lowered the numbers for writing. Here you have some samples.

% sh -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
32768000000 bytes (33 GB) copied, 103.722 seconds, 316 MB/s
32768000000 bytes (33 GB) copied, 99.669 seconds, 329 MB/s

% time dd if=bigfile of=/dev/null bs=8k
32768000000 bytes (33 GB) copied, 85.4235 seconds, 384 MB/s
32768000000 bytes (33 GB) copied, 85.4628 seconds, 383 MB/s

Regards, - Tore.
Re: [PERFORM] Optimizing No matching record Queries
Thanks for all your replies! I tried increasing the statistics on fklistingsourceid to 1000; it didn't make any difference. Then I created an index on (fklistingsourceid, entrydate); it helped and the query was fast.

This index would fix this problem, but in general I would like to know: what if there are queries where the planner does an "index scan backwards" and there is no "order by" clause and the query is still bad? Would there be a case like that, or does the planner use a backwards index scan only when you also use order by ... desc?

Richard Huxton wrote:
> Dean Gibson (DB Administrator) wrote:
>> The questions are:
>> 1. Why is the planner scanning the entire idx_listing_entrydate, when I'd think it should be scanning the entire pk_listingstatus_listingstatusid?
>
> It's looking at the ORDER BY and sees that the query needs the 10 most recent, so it tries searching by date. That's sensible where you are going to have a lot of matches for fklistingsourceid. Which suggests that the statistics for "fklistingsourceid" aren't high enough, as Greg suggested. If that doesn't help, the index on (fklistingsourceid, entrydate) that Stephen suggested might well do so.
>
>> 2. Why is "Index Scan using pk_listingstatus_listingstatusid on listingstatus listingsta1_ (cost=0.00..0.27 rows=1 width=4) (never executed)"?
>
> Because nothing comes out of the first index scan.
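[Editorial sketch] To make the trade-off discussed in this thread concrete, here is the shape of the query and the composite index being talked about; the table name listing and the literal source id are assumptions for illustration, not values from the thread:

-- With only idx_listing_entrydate the planner must either read all rows for
-- this source and sort (plan a), or walk the date index backwards hoping to
-- find ten matches (plan b).
SELECT *
FROM listing
WHERE fklistingsourceid = 42
ORDER BY entrydate DESC
LIMIT 10;

-- The composite index lets it read just this source's rows, already in
-- entrydate order, and stop after ten.
CREATE INDEX idx_listing_sourceid_entrydate
    ON listing (fklistingsourceid, entrydate);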
Re: [PERFORM] Dell Perc/6
On Feb 13, 2008, at 20:45, Tore Halset wrote:
> The box has 16GB of ram, but my original test file was only 25GB. Sorry. Going to 33GB lowered the numbers for writing. Here you have some samples.
>
> % sh -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
> 32768000000 bytes (33 GB) copied, 103.722 seconds, 316 MB/s
> 32768000000 bytes (33 GB) copied, 99.669 seconds, 329 MB/s
>
> % time dd if=bigfile of=/dev/null bs=8k
> 32768000000 bytes (33 GB) copied, 85.4235 seconds, 384 MB/s
> 32768000000 bytes (33 GB) copied, 85.4628 seconds, 383 MB/s
>
> Regards, - Tore.

And here are the bonnie++ numbers. I am a bonnie++ newbie so I ran it with no options.

Version  1.03c       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine         Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
harteigen     32136M 83983  97 221757  40 106483  19 89603  97 268787  22 886.1   1
                     ------Sequential Create------ --------Random Create--------
                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files   /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
harteigen,32136M,83983,97,221757,40,106483,19,89603,97,268787,22,886.1,1,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

Regards, - Tore.
Re: [PERFORM] Optimizing No matching record Queries
"Pallav Kalva" <[EMAIL PROTECTED]> writes:
> This index would fix this problem, but in general I would like to know: what if there are queries where the planner does an "index scan backwards" and there is no "order by" clause and the query is still bad? Would there be a case like that, or does the planner use a backwards index scan only when you also use order by ... desc?

I think you're oversimplifying. Basically you were asking the planner for the most recent record for a given user. The planner had the choice of either a) going through all the records for a given user and picking the most recent, or b) scanning the records from most recent to oldest and looking for the given user.

It was a choice between two evils. If there are a lot of records for the user then a) will be bad, since it has to scan all of them to find the most recent; and if there are no records for the user then b) will be bad, because it'll have to go through all of the records to the beginning of time.

The suggested index lets it scan the records for the given user from most recent to oldest without seeing any records for any other user.

--
Gregory Stark, EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
Re: [PERFORM] Anyone using a SAN?
[Peter Koczan - Wed at 10:56:54AM -0600]
> We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you.

Some time ago, my boss was planning to order more hardware - including a SAN - and coincidentally, SANs were discussed on this list as well. The consensus on this list seemed to be that running postgres on a SAN is not cost efficient - one would get better performance for a lower cost if the database host is connected directly to the disks - and also, buying the wrong SAN can cause quite some problems.

My boss (with good help from the local SAN-pusher) considered that the arguments against the SAN solution on this list were not really valid for an "enterprise" user. The SAN-pusher really insisted that through a state-of-the-art SAN it should theoretically be possible to achieve far better bandwidth as well as lower latency to the disks. Personally, I don't have a clue, but all my colleagues believe him, so I guess he is right ;-)

What I'm told is that a state-of-the-art SAN allows for an "insane amount" of hard disks to be installed, much more than what would fit into any decent database server. We've ended up buying a SAN; the physical installation was done last week, and I will be able to tell in some months if it was a good idea after all, or not.
Re: [PERFORM] Join Query Perfomance Issue
On Feb 12, 2008 4:11 AM, Thomas Zaksek <[EMAIL PROTECTED]> wrote:
> I tried turning off nestloop, but with terrible results:

Yeah, it didn't help. I was expecting the query planner to switch to a more efficient join plan.

> I also tried to increase work_mem, now the config is
> work_mem = 4MB

Try setting it higher for JUST THIS query, i.e. set work_mem=128M; explain analyze select ... and see how that runs. Then play with it til you've got it down to what helps. Note that work_mem in postgresql.conf being too large can be dangerous, so it might be something you set for just this query for safety reasons.
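[Editorial sketch] A minimal example of the per-session tuning suggested above, using the join query from earlier in this thread; SET affects only the current session, and RESET restores the server default afterwards:

-- Raise work_mem for this session only, leaving postgresql.conf untouched.
SET work_mem = '128MB';

EXPLAIN ANALYZE
SELECT '12.11.2007 18:04:00 UTC' AS zeit, 'M' AS ganglinientyp,
       zs_de, j_ges, j_lkw, v_pkw, v_lkw, p_bel
FROM messungen_v_dat_2007_11_12 m
LEFT JOIN messwerte_mv w ON w.nr = m.messpunkt
WHERE m.ganglinientyp = 'M' AND 992 = m.minute_tag;

-- Return to the server default once the experiment is done.
RESET work_mem;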
Re: [PERFORM] Anyone using a SAN?
On 13-2-2008 22:06 Tobias Brox wrote:
> What I'm told is that the state-of-the-art SAN allows for an "insane amount" of hard disks to be installed, much more than what would fit into any decent database server. We've ended up buying a SAN, the physical installation was done last week, and I will be able to tell in some months if it was a good idea after all, or not.

Your SAN-pusher should have a look at the HP submissions for TPC-C... The recent Xeon systems are all without SANs and still able to connect hundreds of SAS disks. This one has 2+28+600 hard drives connected to it:
http://tpc.org/results/individual_results/HP/hp_ml370g5_2p_X5460_tpcc_080107_es.pdf

Long story short, using SAS you can theoretically connect up to 64k disks to a single system. And in the HP example they connected 26 external enclosures (MSA70) to 8 internal drives with external SAS ports. I.e. they ended up with 28+600 harddrives spread out over 16 external 4-port SAS connectors with a bandwidth of 12Gbit per connector...

Obviously it's a bit difficult to share those 628 harddrives amongst several systems, but the argument your colleagues have for SAN isn't a very good one. All major hardware vendors nowadays have external SAS enclosures which can hold 12-25 external harddrives (and can often be stacked to two or three enclosures) and can be connected to normal internal PCI-e SAS raid cards. Those controllers commonly have two external ports and can be used with other controllers in the system to combine all those connected enclosures into one or more virtual images, or you could have your software LVM/raid on top of those controllers.

Anyway, the common physical limit of 6-16 disks in a single server enclosure isn't very relevant anymore in an argument against SAN.

Best regards,
Arjen
Re: [PERFORM] HOT TOAST?
Tom,

> The current TOAST code never does any updates, only inserts/deletes. But the HOT logic should be able to reclaim deleted rows early via pruning.

OK, so for a heavy update application we should still see a vacuum reduction, even if most of the rows are 40k large? Time to run some tests ...

-- --Josh

Josh Berkus, PostgreSQL @ Sun, San Francisco
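[Editorial sketch] A rough outline of the kind of test being contemplated here, assuming an 8.3 server; the table name, row count and payload are made up, and a repeat()-based payload is highly compressible so it may stay inline rather than going to external TOAST storage (a real test would want a less compressible payload):

-- Hypothetical test table whose text value is around 40kB before compression.
CREATE TABLE toast_hot_test (id int PRIMARY KEY, payload text);
INSERT INTO toast_hot_test
    SELECT g, repeat(md5(g::text), 1250)
    FROM generate_series(1, 1000) g;

-- Update only the wide, non-indexed column: the case where HOT can apply.
UPDATE toast_hot_test SET payload = payload || 'x';

-- Compare total vs. HOT updates and watch dead tuples accumulate
-- (stats are collected asynchronously, so allow a moment before checking).
SELECT relname, n_tup_upd, n_tup_hot_upd, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'toast_hot_test';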
Re: [PERFORM] Anyone using a SAN?
On Wed, 13 Feb 2008, Tobias Brox wrote:
> What I'm told is that the state-of-the-art SAN allows for an "insane amount" of hard disks to be installed, much more than what would fit into any decent database server.

You can attach a surprisingly large number of drives to a server nowadays, but in general it's easier to manage larger numbers of them on a SAN. Also, there are significant redundancy improvements using a SAN that are worth quite a bit in some enterprise environments. Being able to connect all the drives, no matter how many, to two or more machines at once trivially is typically easier to set up on a SAN than when you're using more direct storage.

Basically the performance breaks down like this:

1) Going through the SAN interface (fiber channel etc.) introduces some latency and a potential write bottleneck compared with direct storage, everything else being equal. This can really be a problem if you've got a poor SAN vendor or interface issues you can't sort out.

2) It can be easier to manage a large number of disks in the SAN, so for situations where aggregate disk throughput is the limiting factor the SAN solution might make sense.

3) At the high end, you can get SANs with more cache than any direct controller I'm aware of, which for some applications can lead to them having a more quantifiable lead over direct storage. It's easy (albeit expensive) to get an EMC array with 16GB worth of memory for caching on it, for example (and with 480 drives). And since they've got a more robust power setup than a typical server, you can even enable all the individual drive caches usefully (that's 16-32MB each nowadays, so at say 100 disks you've potentially got another 1.6GB of cache right there). If you've got a typical server, you can end up needing to turn off individual direct-attached drive caches for writes, because they may not survive a power cycle even with a UPS, and you have to just rely on the controller write cache.

There's no universal advantage on either side here, just a different set of trade-offs. Certainly you'll never come close to the performance/$ direct storage gets you if you buy that in SAN form instead, but at higher budgets or feature requirements they may make sense anyway.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Anyone using a SAN?
[Arjen van der Meijden]
> Your SAN-pusher should have a look at the HP submissions for TPC-C... The recent Xeon systems are all without SANs and still able to connect hundreds of SAS disks.

Yes, I had a feeling that the various alternative solutions for "direct connection" hadn't been investigated fully. I was pushing for it, but hardware is not my thing. Anyway, most likely the only harm done by choosing a SAN is that it's more expensive than an equivalent solution with directly connected disks. Well, not my money anyway. ;-)

> Obviously it's a bit difficult to share those 628 harddrives amongst several systems, but the argument your colleagues have for SAN isn't a very good one.

As far as I've heard, you cannot really benefit much from this with postgres; one cannot have two postgres servers on two hosts sharing the same data (i.e. using one for failover or for CPU/memory-bound read queries). Having the SAN connected to several hosts gives us two benefits: if the database host goes down but not the SAN, it will be quite fast to start up a new postgres instance on a different host - and it will also be possible to take backups in real time from the SAN without much of a performance hit.

Anyway, with a warm standby server as described at http://www.postgresql.org/docs/current/interactive/warm-standby.html one can achieve pretty much the same without a SAN.
Re: [PERFORM] Anyone using a SAN?
On Feb 13, 2008 5:02 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> On Wed, 13 Feb 2008, Tobias Brox wrote:
> > What I'm told is that the state-of-the-art SAN allows for an "insane amount" of hard disks to be installed, much more than what would fit into any decent database server.
>
> You can attach a surprisingly large number of drives to a server nowadays, but in general it's easier to manage larger numbers of them on a SAN. Also, there are significant redundancy improvements using a SAN that are worth quite a bit in some enterprise environments. Being able to connect all the drives, no matter how many, to two or more machines at once trivially is typically easier to set up on a SAN than when you're using more direct storage.

SNIP

> There's no universal advantage on either side here, just a different set of trade-offs. Certainly you'll never come close to the performance/$ direct storage gets you if you buy that in SAN form instead, but at higher budgets or feature requirements they may make sense anyway.

I agree with everything you've said here, and you've said it far more clearly than I could have.

I'd like to add that it may still be feasible to have a SAN and a db with locally attached storage. Talk the boss into a 4-port caching SAS controller and four very fast hard drives or something else on the server so that you can run tests to compare the performance of a rather limited on-board RAID set to the big SAN. For certain kinds of things, like loading tables, it will still be a very good idea to have local drives for caching and transforming data and such.

Going further, the argument for putting the db onto the SAN may be weakened if the amount of data on the db server can't and likely won't require a lot of space. A lot of back-office dbs are running in the sub-gigabyte range and will never grow to the size of the social security database. Even with dozens of apps, an in-house db server might be using no more than a few dozen gigabytes of storage. Given the cost and performance of large SAS and SATA drives, it's not at all unlikely that you can fit everything you need for the next five years on a single set of disks on a server that's twice as powerful as most internal db servers need.

You can hide the cost of the extra drives in the shadow of the receipt for the SAN.
Re: [PERFORM] Anyone using a SAN?
Should this be summarized somewhere in our docs; just a few lines with the trade-offs: direct storage = cheaper, faster; SAN = more configurable?

---

Scott Marlowe wrote:
> I agree with everything you've said here, and you've said it far more clearly than I could have.
>
> I'd like to add that it may still be feasible to have a SAN and a db with locally attached storage. Talk the boss into a 4-port caching SAS controller and four very fast hard drives or something else on the server so that you can run tests to compare the performance of a rather limited on-board RAID set to the big SAN. For certain kinds of things, like loading tables, it will still be a very good idea to have local drives for caching and transforming data and such.
>
> Going further, the argument for putting the db onto the SAN may be weakened if the amount of data on the db server can't and likely won't require a lot of space. A lot of back-office dbs are running in the sub-gigabyte range and will never grow to the size of the social security database. Even with dozens of apps, an in-house db server might be using no more than a few dozen gigabytes of storage. Given the cost and performance of large SAS and SATA drives, it's not at all unlikely that you can fit everything you need for the next five years on a single set of disks on a server that's twice as powerful as most internal db servers need.
>
> You can hide the cost of the extra drives in the shadow of the receipt for the SAN.

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PERFORM] Anyone using a SAN?
Thanks for all your input, it is very helpful. A SAN for our postgres deployment is probably sufficient in terms of performance, because we just don't have that much data. I'm a little concerned about needs for user and research databases, but if a project needs a big, fast database, it might be wise to have them shell out for DAS. My co-workers and I are meeting with a vendor in two weeks (3Par, specifically), and I think I have a better idea of what I should be looking at. I'll keep you all up on the situation. Keep the ideas coming as I still would like to know of any other important factors. Thanks again. Peter