[PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
Hi,

I plan to put 7.4-RC2 on our production servers in the next few hours. Since the hardware config and the performance-related GUC parameters are going to remain the same, I am interested in seeing the performance improvements in 7.4 as compared to 7.3. For this I plan to use OSDB 0.14 and compare the results for both cases. Does anyone have suggestions for comparing 7.4 against 7.3? Since this is only the second time I am using OSDB, any tips/guidance on its usage is also appreciated.

H/W config:

CPU: 4 x Intel(R) Xeon(TM) CPU 2.00GHz
MEM: 2 GB
I/O config: PGDATA on 1 RPM Ultra160 SCSI, pg_xlog on a similar separate SCSI disk

GUC:

shared_buffers = 1
max_fsm_relations = 5000
max_fsm_pages = 55099264
sort_mem = 16384
vacuum_mem = 8192

All other performance-related parameters have their default values, e.g.:

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

BTW, I get the following error at the moment:

/usr/local/bin/osdb-pg-ui --postgresql=no_hash_index "osdb"
"Invoked: /usr/local/bin/osdb-pg-ui --postgresql=no_hash_index"

create_tables()                  0.78 seconds   return value = 0
load()                           1.02 seconds   return value = 0
create_idx_uniques_key_bt()      0.64 seconds   return value = 0
create_idx_updates_key_bt()      0.61 seconds   return value = 0
create_idx_hundred_key_bt()      0.61 seconds   return value = 0
create_idx_tenpct_key_bt()       0.62 seconds   return value = 0
create_idx_tenpct_key_code_bt()  0.45 seconds   return value = 0
create_idx_tiny_key_bt()         0.46 seconds   return value = 0
create_idx_tenpct_int_bt()       0.46 seconds   return value = 0
create_idx_tenpct_signed_bt()    0.45 seconds   return value = 0
create_idx_uniques_code_h()      0.46 seconds   return value = 0
create_idx_tenpct_double_bt()    0.46 seconds   return value = 0
create_idx_updates_decim_bt()    0.45 seconds   return value = 0
create_idx_tenpct_float_bt()     0.46 seconds   return value = 0
create_idx_updates_int_bt()      0.46 seconds   return value = 0
create_idx_tenpct_decim_bt()     0.46 seconds   return value = 0
create_idx_hundred_code_h()      0.45 seconds   return value = 0
create_idx_tenpct_name_h()       0.46 seconds   return value = 0
create_idx_updates_code_h()      0.46 seconds   return value = 0
create_idx_tenpct_code_h()       0.45 seconds   return value = 0
create_idx_updates_double_bt()   0.46 seconds   return value = 0
create_idx_hundred_foreign()     0.41 seconds   return value = 0
populateDataBase()              11.54 seconds   return value = 0

Error in test Counting tuples at (6746)osdb.c:294:
... empty database -- empty results
perror() reports: Resource temporarily unavailable
someone sighup'd the parent

Any clue?

--
Regards
Mallah.
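One small sanity check before running the comparison might be to confirm that both installations really report the same performance-related settings. A minimal sketch from psql, covering just the GUCs listed above:

  -- run against both the 7.3 and the 7.4 server and compare the output
  SHOW shared_buffers;
  SHOW sort_mem;
  SHOW vacuum_mem;
  SHOW effective_cache_size;
  SHOW random_page_cost;
  -- SHOW ALL; lists everything if some other setting differs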
Re: [PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
The error mentioned in the first email has been overcome by running osdb on the same machine hosting the DB server.

Regds
Mallah.

Rajesh Kumar Mallah wrote:
> Hi, I plan to put 7.4-RC2 on our production servers in the next few hours.
> [snip]
> Error in test Counting tuples at (6746)osdb.c:294:
> ... empty database -- empty results
> perror() reports: Resource temporarily unavailable
> someone sighup'd the parent
>
> Any clue?
Re: [PERFORM] *very* slow query to summarize data for a month ...
Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> On Mon, 10 Nov 2003, Marc G. Fournier wrote:
>
> > explain analyze SELECT ts.company_id, company_name, SUM(ts.bytes) AS total_traffic
> >     FROM company c, traffic_logs ts
> >    WHERE c.company_id = ts.company_id
> >      AND month_trunc(ts.runtime) = '2003-10-01'
> >    GROUP BY company_name, ts.company_id;

So depending on how much work you're willing to do there are some more dramatic speedups you could get:

Use partial indexes like this (you'll need one for every month):

  create index i on traffic_log (company_id)
   where month_trunc(runtime) = '2003-10-01'

then group by company_id only so it can use the index:

  select * from company
    join (
      select company_id, sum(bytes) as total_traffic
        from traffic_log
       where month_trunc(runtime) = '2003-10-01'
       group by company_id
    ) as x using (company_id)
   order by company_name

Actually you might be able to get the same effect using function indexes like:

  create index i on traffic_log (month_trunc(runtime), company_id)

--
greg
Re: [PERFORM] *very* slow query to summarize data for a month ...
On Tue, 11 Nov 2003, Greg Stark wrote:

> Actually you might be able to get the same effect using function indexes
> like:
>
>   create index i on traffic_log (month_trunc(runtime), company_id)

Had actually thought of that one ... is it something that is only available in v7.4?

ams=# create index i on traffic_logs ( month_trunc(runtime), company_id );
ERROR:  parser: parse error at or near "," at character 54
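For reference, a rough sketch of the syntax difference between the releases; the index names are illustrative, and the second form assumes 7.4's new expression-index support (the "new feature" answered later in the thread):

  -- 7.3 and earlier: an index term can be a single function applied to column names
  CREATE INDEX tl_month ON traffic_logs (month_trunc(runtime));

  -- 7.4: expressions can be combined with ordinary columns in one index
  CREATE INDEX tl_month_company ON traffic_logs (month_trunc(runtime), company_id);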
Re: [PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
After a long battle with technology, [EMAIL PROTECTED] (Rajesh Kumar Mallah), an earthling, wrote:
> the error mentioned in first email has been overcome
> by running osdb on the same machine hosting the DB server.

Yes, it seems unrealistic to try to run the "client" on a separate host from the database.

I got the osdb benchmark running last week, and had to separate client from server. I had to jump through a fair number of hoops including copying data files over to the server. The benchmark software needs a bit more work...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/lsf.html
Nobody can fix the economy.  Nobody can be trusted with their finger on the button.  Nobody's perfect.  VOTE FOR NOBODY.
Re: [PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
Rajesh, Chris,

> I got the osdb benchmark running last week, and had to separate client
> from server. I had to jump through a fair number of hoops including
> copying data files over to the server. The benchmark software needs a
> bit more work...

What about the OSDL's TPC-derivative benchmarks? That's a much more respected database test, and probably less buggy than OSDB.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] *very* slow query to summarize data for a month ...
Marc,

> had actually thought of that one ... is it something that is only
> available in v7.4?

Yes. New feature.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] *very* slow query to summarize data for a month ...
On Tue, 11 Nov 2003, Dennis Bjorklund wrote:

> On Mon, 10 Nov 2003, Marc G. Fournier wrote:
>
> > explain analyze SELECT ts.company_id, company_name, SUM(ts.bytes) AS total_traffic
> >     FROM company c, traffic_logs ts
> >    WHERE c.company_id = ts.company_id
> >      AND month_trunc(ts.runtime) = '2003-10-01'
> >    GROUP BY company_name, ts.company_id;
>
> What if you do
>
> ts.runtime >= '2003-10-01' AND ts.runtime < '2003-11-01'
>
> and add an index like (runtime, company_name, company_id)?

Good thought, but even simplifying it to the *lowest* query possible, with no table joins, is painfully slow:

explain analyze SELECT ts.company_id, SUM(ts.bytes) AS total_traffic
  FROM traffic_logs ts
 WHERE month_trunc(ts.runtime) = '2003-10-01'
 GROUP BY ts.company_id;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=31630.84..31693.05 rows=829 width=16) (actual time=14862.71..26552.39 rows=144 loops=1)
   ->  Group  (cost=31630.84..31672.31 rows=8295 width=16) (actual time=9634.28..20967.07 rows=462198 loops=1)
         ->  Sort  (cost=31630.84..31651.57 rows=8295 width=16) (actual time=9634.24..12838.73 rows=462198 loops=1)
               Sort Key: company_id
               ->  Index Scan using tl_month on traffic_logs ts  (cost=0.00..31090.93 rows=8295 width=16) (actual time=0.26..6043.35 rows=462198 loops=1)
                     Index Cond: (month_trunc(runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 26659.35 msec
(7 rows)

-OR-

explain analyze SELECT ts.company_id, SUM(ts.bytes) AS total_traffic
  FROM traffic_logs ts
 WHERE ts.runtime >= '2003-10-01' AND ts.runtime < '2003-11-01'
 GROUP BY ts.company_id;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=81044.53..84424.21 rows=45062 width=16) (actual time=13307.52..29274.66 rows=144 loops=1)
   ->  Group  (cost=81044.53..83297.65 rows=450625 width=16) (actual time=10809.02..-673265.13 rows=462198 loops=1)
         ->  Sort  (cost=81044.53..82171.09 rows=450625 width=16) (actual time=10808.99..14069.79 rows=462198 loops=1)
               Sort Key: company_id
               ->  Seq Scan on traffic_logs ts  (cost=0.00..38727.35 rows=450625 width=16) (actual time=0.07..6801.92 rows=462198 loops=1)
                     Filter: ((runtime >= '2003-10-01 00:00:00'::timestamp without time zone) AND (runtime < '2003-11-01 00:00:00'::timestamp without time zone))
 Total runtime: 29385.97 msec
(7 rows)

Just as a side note, just doing a straight scan for the records, with no SUM()/GROUP BY involved, with the month_trunc() index is still >8k msec:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using tl_month on traffic_logs ts  (cost=0.00..31096.36 rows=8297 width=16) (actual time=0.96..5432.93 rows=462198 loops=1)
   Index Cond: (month_trunc(runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 8092.88 msec
(3 rows)

and without the index, >15k msec:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on traffic_logs ts  (cost=0.00..38719.55 rows=8297 width=16) (actual time=0.11..11354.45 rows=462198 loops=1)
   Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 15353.57 msec
(3 rows)

So the GROUP BY is affecting the overall time, but even without it, it's still taking a helluva long time ...

I'm going to modify my load script so that it dumps monthly totals to traffic_logs, and 'details' to a schema.traffic_logs table ... I don't need the 'per day totals' at the top level at all, only speed ... the 'per day totals' are only required at the 'per client' level, and moving the 'per day' data into a client schema will shrink the table significantly ...

If it wasn't for trying to pull in that 'whole month' summary, it would be fine :(
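For what it's worth, a rough sketch of what that kind of monthly summary table could look like; the table name, column list, and refresh statement are made up for illustration and are not taken from the actual load script:

  -- one row per company per month, so the month-level report stays small
  CREATE TABLE traffic_monthly (
      month       timestamp NOT NULL,
      company_id  integer   NOT NULL,
      total_bytes bigint    NOT NULL,
      PRIMARY KEY (month, company_id)
  );

  -- refreshed by the load script once a month's detail rows are in place
  INSERT INTO traffic_monthly (month, company_id, total_bytes)
  SELECT month_trunc(runtime), company_id, SUM(bytes)
    FROM traffic_logs
   WHERE month_trunc(runtime) = '2003-10-01'
   GROUP BY month_trunc(runtime), company_id;

  -- the whole-month summary then becomes a cheap join
  SELECT c.company_id, c.company_name, t.total_bytes
    FROM company c JOIN traffic_monthly t USING (company_id)
   WHERE t.month = '2003-10-01';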
Re: [PERFORM] *very* slow query to summarize data for a month ...
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:

> On Tue, 11 Nov 2003, Greg Stark wrote:
>
> > Actually you might be able to get the same effect using function indexes
> > like:
> >
> >   create index i on traffic_log (month_trunc(runtime), company_id)
>
> had actually thought of that one ... is it something that is only
> available in v7.4?

Hum, I thought you could do simple functional indexes like that in 7.3, but perhaps only single-column indexes.

In any case, given your situation I would seriously consider putting a "month" integer column on your table anyways. Then your index would be a simple (month, company_id) index.

--
greg
Re: [PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
Josh Berkus wrote:
> Rajesh, Chris,
>
> > I got the osdb benchmark running last week, and had to separate client
> > from server. I had to jump through a fair number of hoops including
> > copying data files over to the server. The benchmark software needs a
> > bit more work...
>
> What about the OSDL's TPC-derivative benchmarks? That's a much more
> respected database test, and probably less buggy than OSDB.

Hmm... really sorry! My pg_dump | psql is almost done; it should finish in the next 20 mins. Creating indexes at the moment :)

Really sorry, I can't roll back and delay any more because my website has been *unavailable* for the past 30 mins.

I ran OSDB 0.15 and pg_bench.

Regds
Mallah.
Re: [PERFORM] Suggestions for benchmarking 7.4RC2 against 7.3
RC2 has been running in production without any apparent problems so far. It is difficult to say at the moment how much speed gain there is until the heavy-duty batch SQL scripts are run by cron. Count(*) and GROUP BY on large tables are significantly (5x) faster, and the better error reporting has made it easier to spot faulty data, e.g. in foreign key violations.

Will post the OSDB 0.15 results on 7.3 & 7.4 soon.

Regds
Mallah.

Christopher Browne wrote:
> After a long battle with technology, [EMAIL PROTECTED] (Rajesh Kumar Mallah),
> an earthling, wrote:
> > the error mentioned in first email has been overcome
> > by running osdb on the same machine hosting the DB server.
>
> Yes, it seems unrealistic to try to run the "client" on a separate host
> from the database.
>
> I got the osdb benchmark running last week, and had to separate client
> from server. I had to jump through a fair number of hoops including
> copying data files over to the server. The benchmark software needs a
> bit more work...
Re: [PERFORM] *very* slow query to summarize data for a month ...
On 11 Nov 2003, Greg Stark wrote:

> "Marc G. Fournier" <[EMAIL PROTECTED]> writes:
>
> > On Tue, 11 Nov 2003, Greg Stark wrote:
> >
> > > Actually you might be able to get the same effect using function indexes
> > > like:
> > >
> > >   create index i on traffic_log (month_trunc(runtime), company_id)
> >
> > had actually thought of that one ... is it something that is only
> > available in v7.4?
>
> Hum, I thought you could do simple functional indexes like that in 7.3, but
> perhaps only single-column indexes.
>
> In any case, given your situation I would seriously consider putting a
> "month" integer column on your table anyways. Then your index would be a
> simple (month, company_id) index.

In 7.3 and before, you could only use column names as inputs to the index function, so you could cheat:

alter table test add alp int;
alter table test add omg int;
update test set alp=0;
update test set omg=13;

and then create a functional index:

create index test_xy on test (substr(info,alp,omg));

select * from test where substr(info,alp,omg) = 'abcd';
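And a sketch of the simpler alternative suggested above, a plain integer "month" column on traffic_logs; the YYYYMM encoding, the backfill, and the index name are just illustrative:

  -- add an integer column holding the month as YYYYMM, e.g. 200310 for October 2003
  ALTER TABLE traffic_logs ADD COLUMN month integer;

  UPDATE traffic_logs
     SET month = (EXTRACT(YEAR FROM runtime) * 100 + EXTRACT(MONTH FROM runtime))::integer;

  -- an ordinary two-column btree index, no functional-index support needed
  CREATE INDEX traffic_logs_month_company ON traffic_logs (month, company_id);

  -- the monthly rollup then filters on the integer column
  SELECT company_id, SUM(bytes) AS total_traffic
    FROM traffic_logs
   WHERE month = 200310
   GROUP BY company_id;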
[PERFORM] Value of Quad vs. Dual Processor machine
We are getting ready to spec out a new machine and are wondering about the wisdom of buying a quad versus a dual processor machine. Seeing as how postgres is not a threaded application, and this server will only be used for log/transaction analysis (it will only ever have a few large queries running), is there any performance to be gained, and if so is it worth the large cost? Any thoughts/experience are much appreciated...

--
Chris Field
[EMAIL PROTECTED]
Affinity Solutions Inc.
386 Park Avenue South
Suite 1209
New York, NY 10016
(212) 685-8748 ext. 32
Re: [PERFORM] Server Configs
Dear Gurus,

We are planning to add more db server hardware for the apps. The question is, what makes more sense regarding performance/scalability/price of the hardware...

There are a couple of apps, currently on a dual-cpu Dell server. The usage of the apps is going to increase quite a lot, and considering the prices, we are looking at the following options:

Option 1:
=========
Have each app on a separate db server (looking at 4 of these). The server being a PowerEdge 2650, dual 2.8GHz/512KB XEONs, 2GB RAM, PERC-3 RAID-5, split back plane (2+3), and 5 x 36GB HDDs (10K RPM).
Note: These servers are 1/3 the price of the quad-cpu 6650 server.

Option 2:
=========
Have two to three app dbs hosted on a single server. The server being a PowerEdge 6650, 4 x 2GHz/1MB XEONs, 8GB RAM, PERC-3 RAID-5, split back plane (2+3), and 5 x 36GB HDDs (10K RPM).
Note: This server is 3 times the price of option 1.

Appreciate your guidance.

Thanks,
Anjan
Re: [PERFORM] Value of Quad vs. Dual Processor machine
On Tue, 2003-11-11 at 18:32, Chris Field wrote:
> We are getting ready to spec out a new machine and are wondering about
> the wisdom of buying a quad versus a dual processor machine. Seeing as
> how postgres is not a threaded application, and this server will only be
> used for log/transaction analysis (it will only ever have a few large
> queries running). Is there any performance to be gained, and if so is
> it worth the large cost? Any thoughts/experience are much
> appreciated...

Since you're asking the question, I'll assume you don't have CPU intensive queries or monstrous loads.

I'd probably invest in a quad system with 2 chips in it (2 empty sockets) and put the difference in funds into a few extra GB of RAM or improved IO.

In 6 months or a year, if you start doing longer or more complex queries, toss in the other 2 chips. So long as you don't hit a memory limit, it'll be fine.
Re: [PERFORM] Value of Quad vs. Dual Processor machine
On Tue, 11 Nov 2003, Rod Taylor wrote:

> On Tue, 2003-11-11 at 18:32, Chris Field wrote:
> > We are getting ready to spec out a new machine and are wondering about
> > the wisdom of buying a quad versus a dual processor machine. ...
>
> Since you're asking the question, I'll assume you don't have CPU
> intensive queries or monstrous loads.
>
> I'd probably invest in a quad system with 2 chips in it (2 empty
> sockets) and put the difference in funds into a few extra GB of RAM or
> improved IO.
>
> In 6 months or a year, if you start doing longer or more complex
> queries, toss in the other 2 chips. So long as you don't hit a memory
> limit, it'll be fine.

Note that you want to carefully look at the difference in cost of the motherboard versus the CPUs. It's often the motherboard that raises the cost, not the CPUs so much. Although with Xeons, the CPUs are not cheap.

The second issue is that Intel (and AMD probably) only guarantee proper performance from chips in the same batch, so you may wind up replacing the two working CPUs with two new ones to go with the other two you'll be buying, to make sure that they work together.

My guess is that more CPUs aren't gonna help this problem a lot, so look more at fast RAM and lots of it, as well as a fast I/O subsystem. 2 CPUs should be plenty.
Re: [PERFORM] Value of Quad vs. Dual Processor machine
On 2003-11-11T17:40:14-0700, scott.marlowe wrote:
> 2 CPUs should be plenty.

for everyone? No, I must have been thinking of someone else :-)

/Allan
--
Allan Wind
P.O. Box 2022
Woburn, MA 01888-0022
USA
Re: [PERFORM] Value of Quad vs. Dual Processor machine
> On Tue, 11 Nov 2003, Rod Taylor wrote:
>
>> On Tue, 2003-11-11 at 18:32, Chris Field wrote:
>> > We are getting ready to spec out a new machine and are wondering about
>> > the wisdom of buying a quad versus a dual processor machine. ...
>>
>> Since you're asking the question, I'll assume you don't have CPU
>> intensive queries or monstrous loads.
>>
>> I'd probably invest in a quad system with 2 chips in it (2 empty
>> sockets) and put the difference in funds into a few extra GB of RAM or
>> improved IO.
>>
>> In 6 months or a year, if you start doing longer or more complex
>> queries, toss in the other 2 chips. So long as you don't hit a memory
>> limit, it'll be fine.
>
> Note that you want to carefully look at the difference in cost of the
> motherboard versus the CPUs. It's often the motherboard that raises the
> cost, not the CPUs so much. Although with Xeons, the CPUs are not cheap.
>
> The second issue is that Intel (and AMD probably) only guarantee proper
> performance from chips in the same batch, so you may wind up replacing the
> two working CPUs with two new ones to go with the other two you'll be
> buying, to make sure that they work together.
>
> My guess is that more CPUs aren't gonna help this problem a lot, so look
> more at fast RAM and lots of it, as well as a fast I/O subsystem.
>
> 2 CPUs should be plenty.

I agree that the additional CPUs won't help as much, since I haven't found any benefit in terms of individual query speed for a quad vs. a dual SMP box on benchmarks I've run on test machines I was considering purchasing. Quads are also expensive - on similar architectures the quad was 20k vs 7k for the dual.
Re: [PERFORM] Value of Quad vs. Dual Processor machine
On Tue, 2003-11-11 at 17:32, Chris Field wrote:
> We are getting ready to spec out a new machine and are wondering about
> the wisdom of buying a quad versus a dual processor machine. Seeing as
> how postgres is not a threaded application, and this server will only be
> used for log/transaction analysis (it will only ever have a few large
> queries running). Is there any performance to be gained, and if so is
> it worth the large cost? Any thoughts/experience are much
> appreciated...

Xeon or Opteron? The faster Opterons *really* blaze, especially in 64-bit mode. As others have said, though, RAM and I/O are most important.

--
Ron Johnson, Jr. [EMAIL PROTECTED]
Jefferson, LA USA

"As I like to joke, I may have invented it, but Microsoft made it popular"
David Bradley, regarding Ctrl-Alt-Del
Re: [PERFORM] Value of Quad vs. Dual Processor machine
We are looking at Xeon. We are currently running it on a quad Sun V880 compiled to be 64-bit and have been getting dreadful performance. I don't think we really have much to gain from going 64-bit.

----- Original Message -----
From: "Ron Johnson" <[EMAIL PROTECTED]>
To: "PgSQL Performance ML" <[EMAIL PROTECTED]>
Sent: Tuesday, November 11, 2003 8:24 PM
Subject: Re: [PERFORM] Value of Quad vs. Dual Processor machine

> On Tue, 2003-11-11 at 17:32, Chris Field wrote:
> > We are getting ready to spec out a new machine and are wondering about
> > the wisdom of buying a quad versus a dual processor machine. ...
>
> Xeon or Opteron? The faster Opterons *really* blaze, especially
> in 64-bit mode. As others have said, though, RAM and I/O are most
> important.
>
> --
> Ron Johnson, Jr. [EMAIL PROTECTED]
> Jefferson, LA USA
Re: [PERFORM] Value of Quad vs. Dual Processor machine
One thing I learned after spending about a week comparing the Athlon (2 GHz, 333 MHz frontside bus) and Xeon (2.4 GHz, 266 MHz frontside bus) platforms was that on average the select queries I was benchmarking ran 30% faster on the Athlon (this was with data cached in memory, so it may not apply to larger data sets where I/O is the limiting factor). I benchmarked against the Opteron 244 when it came out and it came in about the same as the Athlon, which makes sense since both use 333 MHz memory; the results were within +/- 5-10% of the Athlon's. From testing against a couple of other machines I noticed that performance was almost directly proportional to the memory bus speed under these conditions.

Not sure how these compare against the quad Sun, but the AMD chips returned the select queries faster than the Xeons in the informal investigations I did. Definitely try it before you buy it, if possible.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Field
Sent: Tuesday, November 11, 2003 6:13 PM
To: Ron Johnson; PgSQL Performance ML
Subject: Re: [PERFORM] Value of Quad vs. Dual Processor machine

> We are looking at Xeon. We are currently running it on a quad Sun V880
> compiled to be 64-bit and have been getting dreadful performance. I don't
> think we really have much to gain from going 64-bit.
> [snip]
Re: [PERFORM] *very* slow query to summarize data for a month ...
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:

> Just as a side note, just doing a straight scan for the records, with no
> SUM()/GROUP BY involved, with the month_trunc() index is still >8k msec:

Well, so the problem isn't the query at all; you just have too much data to massage online. You can preprocess the data offline into a more manageable amount of data for your online reports.

What I used to do for a similar situation was to run hourly queries sort of like this:

insert into data_aggregate (day, hour, company_id, total_bytes)
  (select trunc(now(),'day'), trunc(now(),'hour'), company_id, sum(bytes)
     from raw_data
    where time between trunc(now(),'hour') and trunc(now(),'hour') + '1 hour'::interval
    group by company_id
  )

[this was actually on oracle and the data looked kind of different, i'm making this up as i go along]

Then later the reports could run quickly based on data_aggregate instead of slowly based on the much larger data set accumulated by the minute. Once I had this schema set up it was easy to follow it for all of the rapidly growing data tables.

Now in my situation I had thousands of records accumulating per second, so hourly was already a big win. I originally chose hourly because I thought I might want time-of-day reports, but that never panned out. On the other hand it was a win when the system broke once, because I could easily see that and fix it before midnight, when it would have actually mattered. Perhaps in your situation you would want daily aggregates or something else.

One of the other advantages of these aggregate tables was that we could purge the old data much sooner with much less resistance from the business, since the reports were all still available and a lot of ad-hoc queries could still be done without the raw data anyways.

Alternatively you can just give up on online reports. Eventually you'll have some query that takes way more than 8s anyways. You can pregenerate the entire report as a batch job instead. Either send it off as a nightly e-mail, store it as an html or csv file for the web server, or (my favourite) store the data for the report as an sql table and then have multiple front-ends that do a simple "select *" to pull the data and format it.

--
greg