On May 13, 2013, at 4:24 PM, Jeff Janes wrote: > On Mon, May 13, 2013 at 3:36 PM, Mike McCann <mcc...@mbari.org> wrote: > > Increasing work_mem to 355 MB improves the performance by a factor of 2: > > stoqs_march2013_s=# set work_mem='355MB'; > SET > stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter > order by datavalue; > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------ > Sort (cost=422106.15..430560.68 rows=3381814 width=20) (actual > time=2503.078..2937.130 rows=3381814 loops=1) > Sort Key: datavalue > Sort Method: quicksort Memory: 362509kB > -> Seq Scan on stoqs_measuredparameter (cost=0.00..55359.14 rows=3381814 > width=20) (actual time=0.016..335.745 rows=3381814 loops=1) > Total runtime: 3094.601 ms > (5 rows) > > I tried changing random_page_cost to from 4 to 1 and saw no change. > > I'm wondering now what changes might get this query to run in less than one > second. > > > I think you are worrying about the wrong thing here. What is a web app going > to do with 3,381,814 rows, once it obtains them? Your current testing is not > testing the time it takes to stream that data to the client, or for the > client to do something meaningful with that data. > > If you only plan to actually fetch a few dozen of those rows, then you > probably need to incorporate that into your test, either by using a LIMIT, or > by using a mock-up of the actual application to do some timings. > > Also, what is the type and collation of the column you are sorting on? > non-'C' collations of text columns sort about 3 times slower than 'C' > collation does. > > > If all the data is in memory, then will faster CPU and memory be the things > that help? > > Yes, those would help (it is not clear to me which of the two would help > more), but I think you need to rethink your design of sending the entire > database table to the application server for each page-view. > > > Cheers, > > Jeff
Hi Jeff, The datavalue column is double precision: stoqs_march2013_s=# \d+ stoqs_measuredparameter Table "public.stoqs_measuredparameter" Column | Type | Modifiers | Storage | Description ----------------+------------------+----------------------------------------------------------------------+---------+------------- id | integer | not null default nextval('stoqs_measuredparameter_id_seq'::regclass) | plain | measurement_id | integer | not null | plain | parameter_id | integer | not null | plain | datavalue | double precision | not null | plain | Indexes: "stoqs_measuredparameter_pkey" PRIMARY KEY, btree (id) "stoqs_measuredparameter_measurement_id_parameter_id_key" UNIQUE CONSTRAINT, btree (measurement_id, parameter_id) "stoqs_measuredparameter_datavalue" btree (datavalue) "stoqs_measuredparameter_measurement_id" btree (measurement_id) "stoqs_measuredparameter_parameter_id" btree (parameter_id) Foreign-key constraints: "stoqs_measuredparameter_measurement_id_fkey" FOREIGN KEY (measurement_id) REFERENCES stoqs_measurement(id) DEFERRABLE INITIALLY DEFERRED "stoqs_measuredparameter_parameter_id_fkey" FOREIGN KEY (parameter_id) REFERENCES stoqs_parameter(id) DEFERRABLE INITIALLY DEFERRED Has OIDs: no Thanks for the suggestion and advice to examine the web app performance. We've actually taken quite a few steps to optimize how the web app works. The example query I provided is a simple worst-case one that we can use to help us decide on the proper hardware. An actual query performed by the web app is: stoqs_march2013_s=# explain analyze SELECT stoqs_measuredparameter.id, stoqs_march2013_s-# stoqs_parameter.name AS parameter__name, stoqs_march2013_s-# stoqs_parameter.standard_name AS parameter__standard_name, stoqs_march2013_s-# stoqs_measurement.depth AS measurement__depth, stoqs_march2013_s-# stoqs_measurement.geom AS measurement__geom, stoqs_march2013_s-# stoqs_instantpoint.timevalue AS measurement__instantpoint__timevalue, stoqs_march2013_s-# stoqs_platform.name AS measurement__instantpoint__activity__platform__name, stoqs_march2013_s-# stoqs_measuredparameter.datavalue AS datavalue, stoqs_march2013_s-# stoqs_parameter.units AS parameter__units stoqs_march2013_s-# FROM stoqs_parameter p1, stoqs_march2013_s-# stoqs_measuredparameter stoqs_march2013_s-# INNER JOIN stoqs_measurement ON (stoqs_measuredparameter.measurement_id = stoqs_measurement.id) stoqs_march2013_s-# INNER JOIN stoqs_instantpoint ON (stoqs_measurement.instantpoint_id = stoqs_instantpoint.id) stoqs_march2013_s-# INNER JOIN stoqs_parameter ON (stoqs_measuredparameter.parameter_id = stoqs_parameter.id) stoqs_march2013_s-# INNER JOIN stoqs_activity ON (stoqs_instantpoint.activity_id = stoqs_activity.id) stoqs_march2013_s-# INNER JOIN stoqs_platform ON (stoqs_activity.platform_id = stoqs_platform.id) stoqs_march2013_s-# INNER JOIN stoqs_measuredparameter mp1 ON mp1.measurement_id = stoqs_measuredparameter.measurement_id stoqs_march2013_s-# WHERE (p1.name = 'sea_water_sigma_t') stoqs_march2013_s-# AND (mp1.datavalue > 25.19) stoqs_march2013_s-# AND (mp1.datavalue < 26.01) stoqs_march2013_s-# AND (mp1.parameter_id = p1.id) stoqs_march2013_s-# AND (stoqs_instantpoint.timevalue <= '2013-03-17 19:05:06' stoqs_march2013_s(# AND stoqs_instantpoint.timevalue >= '2013-03-17 15:35:13' stoqs_march2013_s(# AND stoqs_parameter.name IN ('fl700_uncorr') stoqs_march2013_s(# AND stoqs_measurement.depth >= -1.88 stoqs_march2013_s(# AND stoqs_platform.name IN ('dorado') stoqs_march2013_s(# AND stoqs_measurement.depth <= 83.57) stoqs_march2013_s-# ORDER BY stoqs_activity.name ASC, stoqs_instantpoint.timevalue ASC; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------- Sort (cost=10741.41..10741.42 rows=1 width=1282) (actual time=770.211..770.211 rows=0 loops=1) Sort Key: stoqs_activity.name, stoqs_instantpoint.timevalue Sort Method: quicksort Memory: 25kB -> Hash Join (cost=3002.89..10741.40 rows=1 width=1282) (actual time=770.200..770.200 rows=0 loops=1) Hash Cond: (stoqs_instantpoint.activity_id = stoqs_activity.id) -> Nested Loop (cost=2983.69..10722.19 rows=3 width=954) (actual time=770.036..770.036 rows=0 loops=1) -> Nested Loop (cost=2983.69..9617.36 rows=191 width=946) (actual time=91.369..680.072 rows=20170 loops=1) -> Hash Join (cost=2983.69..8499.07 rows=193 width=842) (actual time=91.346..577.633 rows=20170 loops=1) Hash Cond: (stoqs_measuredparameter.parameter_id = stoqs_parameter.id) -> Nested Loop (cost=2982.38..8478.47 rows=4628 width=24) (actual time=91.280..531.408 rows=197746 loops=1) -> Nested Loop (cost=2982.38..4862.37 rows=512 width=4) (actual time=91.202..116.140 rows=20170 loops=1) -> Seq Scan on stoqs_parameter p1 (cost=0.00..1.30 rows=1 width=4) (actual time=0.002..0.011 rows=1 loops=1) Filter: ((name)::text = 'sea_water_sigma_t'::text) -> Bitmap Heap Scan on stoqs_measuredparameter mp1 (cost=2982.38..4854.40 rows=534 width=8) (actual time=91.194..109.846 rows=20170 loop s=1) Recheck Cond: ((datavalue > 25.19::double precision) AND (datavalue < 26.01::double precision) AND (parameter_id = p1.id)) -> BitmapAnd (cost=2982.38..2982.38 rows=534 width=0) (actual time=90.794..90.794 rows=0 loops=1) -> Bitmap Index Scan on stoqs_measuredparameter_datavalue (cost=0.00..259.54 rows=12292 width=0) (actual time=62.769..62.769 rows=23641 loops=1) Index Cond: ((datavalue > 25.19::double precision) AND (datavalue < 26.01::double precision)) -> Bitmap Index Scan on stoqs_measuredparameter_parameter_id (cost=0.00..2719.38 rows=147035 width=0) (actual time=27.412..2 7.412 rows=34750 loops=1) Index Cond: (parameter_id = p1.id) -> Index Scan using stoqs_measuredparameter_measurement_id on stoqs_measuredparameter (cost=0.00..6.98 rows=7 width=20) (actual time=0.008..0. 017 rows=10 loops=20170) Index Cond: (measurement_id = mp1.measurement_id) -> Hash (cost=1.30..1.30 rows=1 width=826) (actual time=0.012..0.012 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on stoqs_parameter (cost=0.00..1.30 rows=1 width=826) (actual time=0.007..0.010 rows=1 loops=1) Filter: ((name)::text = 'fl700_uncorr'::text) -> Index Scan using stoqs_measurement_pkey on stoqs_measurement (cost=0.00..5.78 rows=1 width=116) (actual time=0.004..0.004 rows=1 loops=20170) Index Cond: (id = stoqs_measuredparameter.measurement_id) Filter: ((depth >= (-1.88)::double precision) AND (depth <= 83.57::double precision)) -> Index Scan using stoqs_instantpoint_pkey on stoqs_instantpoint (cost=0.00..5.77 rows=1 width=16) (actual time=0.004..0.004 rows=0 loops=20170) Index Cond: (id = stoqs_measurement.instantpoint_id) Filter: ((timevalue <= '2013-03-17 19:05:06-07'::timestamp with time zone) AND (timevalue >= '2013-03-17 15:35:13-07'::timestamp with time zone)) -> Hash (cost=18.82..18.82 rows=30 width=336) (actual time=0.151..0.151 rows=7 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Hash Join (cost=1.09..18.82 rows=30 width=336) (actual time=0.035..0.145 rows=7 loops=1) Hash Cond: (stoqs_activity.platform_id = stoqs_platform.id) -> Seq Scan on stoqs_activity (cost=0.00..16.77 rows=177 width=66) (actual time=0.005..0.069 rows=177 loops=1) -> Hash (cost=1.07..1.07 rows=1 width=278) (actual time=0.014..0.014 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on stoqs_platform (cost=0.00..1.07 rows=1 width=278) (actual time=0.008..0.012 rows=1 loops=1) Filter: ((name)::text = 'dorado'::text) Total runtime: 770.445 ms (42 rows) We assume that steps taken to improve the worst-case query scenario will also improve these kind of queries. If anything above pops out as needing better planning please let us know that too! Thanks, Mike -- Mike McCann Software Engineer Monterey Bay Aquarium Research Institute 7700 Sandholdt Road Moss Landing, CA 95039-9644 Voice: 831.775.1769 Fax: 831.775.1736 http://www.mbari.org