Re: [PERFORM] Postgres low end processing.
Hi again all,

I've tested postgres 7.3.4 on Linux version 2.4.17 and this is what I found:

The initial instance took up 8372K and this fluctuated between +- 8372K and 10372K, plus +- 3500K for every connection. I did quite a few transactions on both connections, plus a few vacuums and a pg_dump, and the total memory usage didn't seem to go over 16M. I set all the _buffers, _mem and _fsm settings to the minimum and restarted every time, but this made absolutely no noticeable increase or decrease in total memory usage. (I used a program called gmemusage to get these stats.)

On the same machine, I tested postgres 7.1.2 with basically the same conf options (not _fsm) and got the following:

The initial instance was 1772K and fluctuated to +- 4000K, plus +- 3400K for every connection. Doing the same transactions, vacuum + pg_dump, total memory usage didn't really go over 11M, which was exactly what I needed. Although I've lived through some of the shortcomings of 7.1.2, it is still very stable, and works perfectly for what it is going to be used for. Again, here, I was only able to restrict things a little by changing the configuration options, but saw no major difference in memory usage.

Regards
Stef

On Mon, 6 Oct 2003 09:55:51 +0200 Stef <[EMAIL PROTECTED]> wrote:
=> Thanks for the replies,
=>
=> On Fri, 3 Oct 2003 11:08:48 -0700
=> Josh Berkus <[EMAIL PROTECTED]> wrote:
=> => 1. Make sure that the WAL files (pg_xlog) are on a separate disk from the
=> => database files, either through mounting or symlinking.
=>
=> I'm not sure I understand how this helps?
=>
=> => 2. Tweak the .conf file for low vacuum_mem (1024?), but vacuum very
=> => frequently, like every 1-5 minutes. Spend some time tuning your
=> => fsm_max_pages to the ideal level so that you're not allocating any extra
=> => memory to the FSM.
=> =>
=> => 3. If your concern is *average* CPU/RAM consumption, and not peak load
=> => activity, increase wal_files and checkpoint_segments to do more efficient
=> => batch processing of pending updates at the cost of some disk space. If peak
=> => load activity is a problem, don't do this.
=> =>
=> => 4. Tune all of your queries carefully to avoid anything requiring a
=> => RAM-intensive merge join or CPU-eating calculated expression hash join, or
=> => similar computation-or-RAM-intensive operations.
=>
=> Thanks, I'll try some of these, and post the results.
=> The actual machines seem to be Pentium I machines,
=> with 32M RAM. I've gathered that it is theoretically
=> possible, so now to go try it.
=>
=> Regards
=> Stef
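For reference, a minimal-footprint postgresql.conf in the spirit of what was tested might look like the sketch below. The values are illustrative near-minimums for the 7.3.x series, not tuned recommendations; check the defaults for your own version before copying any of them:

```
# Illustrative low-memory settings for PostgreSQL 7.3.x (assumed values,
# chosen near the documented minimums, not a recommendation).
shared_buffers = 16        # each buffer is 8KB; 16 is close to the floor
max_connections = 8        # each backend costs a few MB of private memory
sort_mem = 64              # KB available per sort/hash operation
vacuum_mem = 1024          # KB used by VACUUM, per the advice quoted above
max_fsm_relations = 10     # free space map: relations tracked
max_fsm_pages = 1000       # free space map: pages tracked
wal_buffers = 4            # WAL buffers, 8KB each
```

As Stef's measurements suggest, most of these only bound shared memory and per-operation work memory; the ~3.5M per-connection overhead of the backend process itself is largely unaffected by them.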
Re: [PERFORM] Postgres low end processing.
On Tue, 7 Oct 2003, Stef wrote:

> The initial instance took up 8372K and this fluctuated
> between +- 8372K and 10372K, plus +- 3500K for
> every connection.

Does that include/exclude the size of, say, shared code & libraries? I know Linux does copy-on-write forking, so it may be less in reality...

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] TPC-R benchmarks
On Thu, 2003-09-25 at 14:32, Jenny Zhang wrote:
> I am running TPC-H with scale factor of 1 on RedHat7.2 with the kernel
> 2.5.74. Q17 can always finish in about 7 seconds on my system. The
> execution plan is:

I just want to point out that we at the OSDL are not running an official TPC-X benchmark of any kind. We have fair-use implementations of these benchmarks, but because of differences our performance tests cannot be compared with the TPC's benchmark results.

> Aggregate (cost=780402.43..780402.43 rows=1 width=48)
>   -> Nested Loop (cost=0.00..780397.50 rows=1973 width=48)
>        Join Filter: ("inner".l_quantity < (subplan))
>        -> Seq Scan on part (cost=0.00..8548.00 rows=197 width=12)
>             Filter: ((p_brand = 'Brand#31'::bpchar) AND (p_container = 'LG CASE'::bpchar))
>        -> Index Scan using i_l_partkey on lineitem (cost=0.00..124.32 rows=30 width=36)
>             Index Cond: ("outer".p_partkey = lineitem.l_partkey)
>        SubPlan
>          -> Aggregate (cost=124.40..124.40 rows=1 width=11)
>               -> Index Scan using i_l_partkey on lineitem (cost=0.00..124.32 rows=30 width=11)
>                    Index Cond: (l_partkey = $0)
> (11 rows)
>
> Hope this helps,
> Jenny
>
> On Thu, 2003-09-25 at 12:40, Oleg Lebedev wrote:
> > I am running TPC-R benchmarks with a scale factor of 1, which corresponds
> > to approximately 1 GB database size, on PostgreSQL 7.3.4 installed on
> > CygWin on Windows XP. I dedicated 128 MB of shared memory to my postgres
> > installation.
> > Most of the queries were able to complete in a matter of minutes, but
> > query 17 was taking hours and hours. The query is shown below. Is there
> > any way to optimize it?
> >
> > select
> >     sum(l_extendedprice) / 7.0 as avg_yearly
> > from
> >     lineitem,
> >     part
> > where
> >     p_partkey = l_partkey
> >     and p_brand = 'Brand#11'
> >     and p_container = 'SM PKG'
> >     and l_quantity < (
> >         select
> >             0.2 * avg(l_quantity)
> >         from
> >             lineitem
> >         where
> >             l_partkey = p_partkey
> >     );
> >
> > Thanks.
> > Oleg

--
Timothy D. Witham - Lab Director - [EMAIL PROTECTED]
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton OR, 97005
(503)-626-2455 x11 (office) (503)-702-2871 (cell) (503)-626-2436 (fax)
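The 7-second plan above hinges on an index on lineitem.l_partkey being available: it serves both the join probe and the correlated subquery (the per-part average of l_quantity), so lineitem is never sequentially rescanned per outer row. The index name matches the plan, but the exact DDL is an assumption on my part:

```sql
-- Assumed definition of the index the plan above relies on. Without it,
-- the correlated subquery forces a seq scan of lineitem for every
-- qualifying part row, which is consistent with Q17 "taking hours".
CREATE INDEX i_l_partkey ON lineitem (l_partkey);

-- Refresh planner statistics so the new index is actually considered.
ANALYZE lineitem;
```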
Re: [PERFORM] TPC-R benchmarks
Tom,

I've found the problem with TPC-R query #19. And it, unfortunately, appears to be a problem in the PostgreSQL query planner.

To sum up the below: it appears that whenever a set of WHERE conditions exceeds a certain level of complexity, the planner just ignores all applicable indexes and goes for a seq scan. While this may be unavoidable to some degree, it seems to me that we need to raise the threshold of complexity at which it does this.

tpcr=# select version();
                              version
------------------------------------------------------------------
 PostgreSQL 7.3.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3 20030226 (prerelease) (SuSE Linux)
(1 row)

I've tested a number of indexes on the query, and found the two most efficient on subsets of the query. Thus:

explain analyze
select sum(l_extendedprice * (1 - l_discount)) as revenue
from lineitem, part
where (
    p_partkey = l_partkey
    and p_brand = 'Brand#33'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 8 and l_quantity <= 8 + 10
    and p_size between 1 and 5
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
);

QUERY PLAN
------------------------------------------------------------------
 Aggregate (cost=10380.70..10380.70 rows=1 width=30) (actual time=161.61..161.61 rows=1 loops=1)
   -> Nested Loop (cost=0.00..10380.67 rows=13 width=30) (actual time=81.54..161.47 rows=17 loops=1)
        -> Index Scan using idx_part_1 on part (cost=0.00..9466.33 rows=62 width=4) (actual time=81.21..137.24 rows=98 loops=1)
             Index Cond: (p_brand = 'Brand#33'::bpchar)
             Filter: (((p_container = 'SM CASE'::bpchar) OR (p_container = 'SM BOX'::bpchar) OR (p_container = 'SM PACK'::bpchar) OR (p_container = 'SM PKG'::bpchar)) AND (p_size >= 1) AND (p_size <= 5))
        -> Index Scan using idx_lineitem_3 on lineitem (cost=0.00..14.84 rows=1 width=26) (actual time=0.22..0.24 rows=0 loops=98)
             Index Cond: (("outer".p_partkey = lineitem.l_partkey) AND (lineitem.l_quantity >= 8::numeric) AND (lineitem.l_quantity <= 18::numeric))
             Filter: (((l_shipmode = 'AIR'::bpchar) OR (l_shipmode = 'AIR REG'::bpchar)) AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar))
 Total runtime: 161.71 msec

This also works for a similar query:

explain analyze
select sum(l_extendedprice * (1 - l_discount)) as revenue
from lineitem, part
where (
    p_partkey = l_partkey
    and p_brand = 'Brand#52'
    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
    and l_quantity >= 14 and l_quantity <= 14 + 10
    and p_size between 1 and 10
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
);

 Aggregate (cost=11449.36..11449.36 rows=1 width=30) (actual time=195.72..195.72 rows=1 loops=1)
   -> Nested Loop (cost=0.00..11449.29 rows=28 width=30) (actual time=56.42..195.39 rows=48 loops=1)
        -> Index Scan using idx_part_1 on part (cost=0.00..9466.33 rows=139 width=4) (actual time=56.15..153.17 rows=166 loops=1)
             Index Cond: (p_brand = 'Brand#52'::bpchar)
             Filter: (((p_container = 'MED BAG'::bpchar) OR (p_container = 'MED BOX'::bpchar) OR (p_container = 'MED PKG'::bpchar) OR (p_container = 'MED PACK'::bpchar)) AND (p_size >= 1) AND (p_size <= 10))
        -> Index Scan using idx_lineitem_3 on lineitem (cost=0.00..14.29 rows=1 width=26) (actual time=0.23..0.25 rows=0 loops=166)
             Index Cond: (("outer".p_partkey = lineitem.l_partkey) AND (lineitem.l_quantity >= 14::numeric) AND (lineitem.l_quantity <= 24::numeric))
             Filter: (((l_shipmode = 'AIR'::bpchar) OR (l_shipmode = 'AIR REG'::bpchar)) AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar))
 Total runtime: 195.82 msec
(9 rows)

If, however, I combine the two where clauses with an OR, the planner gets confused and insists on loading the entire tables into memory (even though I don't have that much memory):

explain
select sum(l_extendedprice * (1 - l_discount)) as revenue
from lineitem, part
where (
    p_partkey = l_partkey
    and p_brand = 'Brand#33'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 8 and l_quantity <= 8 + 10
    and p_size between 1 and 5 a
Re: [PERFORM] TPC-R benchmarks
Josh Berkus <[EMAIL PROTECTED]> writes:
> To sum up the below: it appears that whenever a set of WHERE conditions
> exceeds a certain level of complexity, the planner just ignores all
> applicable indexes and goes for a seq scan.

It looks to me like the planner is coercing the WHERE clause into canonical OR-of-ANDs form (DNF). That is often a good heuristic, but it seems unhelpful for this query.

> How can we fix this?

Feel free to propose improvements to the heuristics in src/backend/optimizer/prep/prepqual.c ...

regards, tom lane
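One standard workaround for this class of plan regression (not proposed in the thread itself, just a common technique) is to split the OR'd branches into separate aggregates that can each use the indexes shown in the earlier plans, then sum the per-branch results. A sketch, assuming the same two branches as the queries above:

```sql
-- Hypothetical rewrite of the OR'd query 19 form: each branch is planned
-- independently (so each can use the idx_part_1 / idx_lineitem_3 index
-- scans), and the per-branch revenues are summed at the end. The branches
-- are mutually exclusive on p_brand, so UNION ALL does not double-count.
select sum(revenue) as revenue
from (
    select sum(l_extendedprice * (1 - l_discount)) as revenue
    from lineitem, part
    where p_partkey = l_partkey
      and p_brand = 'Brand#33'
      and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
      and l_quantity >= 8 and l_quantity <= 8 + 10
      and p_size between 1 and 5
      and l_shipmode in ('AIR', 'AIR REG')
      and l_shipinstruct = 'DELIVER IN PERSON'
    union all
    select sum(l_extendedprice * (1 - l_discount))
    from lineitem, part
    where p_partkey = l_partkey
      and p_brand = 'Brand#52'
      and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
      and l_quantity >= 14 and l_quantity <= 14 + 10
      and p_size between 1 and 10
      and l_shipmode in ('AIR', 'AIR REG')
      and l_shipinstruct = 'DELIVER IN PERSON'
) as branches;
```

This sidesteps the qual-canonicalization issue entirely, at the cost of repeating the shared predicates in every branch.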