On Thu, 2010-01-07 at 13:38 +0100, Lefteris wrote:
> Reported query times are (in sec):
> MonetDB 7.9s
> InfoBright 12.13s
> LucidDB 54.8s
It needs to be pointed out that those databases are specifically
optimised for Data Warehousing, whereas Postgres core is optimised for
concurrent write workloads.
On 8/01/2010 2:11 AM, Nikolas Everett wrote:
This table is totally unnormalized. Normalize it and try again. You'll
probably see a huge speedup. Maybe even 10x. My mantra has always been
less data stored means less data to scan means faster scans.
Sometimes one intentionally denormalizes st
Craig Ringer writes:
> Can Pg even read partial records ? I thought it all operated on a page
> level, where if an index indicates that a particular value is present on
> a page the whole page gets read in and all records on the page are
> checked for the value of interest. No?
The whole page
On 7/01/2010 11:45 PM, Gurgel, Flavio wrote:
The table is very wide, which is probably why the tested databases
can
deal with it faster than PG. You could try and narrow the table
down
(for instance: remove the Div* fields) to make the data more
"relational-like". In real life, speedups in t
hubert depesz lubaczewski wrote:
Well, this query basically has to be slow. Correct approach to this
problem is to add precalculated aggregates...
The point of this data set and associated queries is to see how fast the
database can do certain types of queries on its own. Some other types
of
On Thu, Jan 07, 2010 at 01:38:41PM +0100, Lefteris wrote:
> airtraffic=# EXPLAIN ANALYZE SELECT "DayOfWeek", count(*) AS c FROM
> ontime WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER
> BY c DESC;
Well, this query basically has to be slow. Correct approach to this
problem is to add precalculated aggregates...
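For reference, a minimal sketch of that precalculated-aggregate approach, assuming the ontime schema from the benchmark (the summary-table name is illustrative, not from the thread; on PostgreSQL 9.3+ a materialized view would do the same job):
-- hypothetical summary table, refreshed whenever ontime is reloaded
CREATE TABLE ontime_day_counts AS
SELECT "Year", "DayOfWeek", count(*) AS flights
FROM ontime
GROUP BY "Year", "DayOfWeek";
-- the benchmark query then reads a few dozen rows instead of ~52 million
SELECT "DayOfWeek", sum(flights) AS c
FROM ontime_day_counts
WHERE "Year" BETWEEN 2000 AND 2008
GROUP BY "DayOfWeek"
ORDER BY c DESC;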
On Thu, Jan 7, 2010 at 11:57 PM, Lefteris wrote:
> Hi Greg,
>
> thank you for your help. The change I made to the dataset was just
> removing the last comma from the CSV files, as it was interpreted by pg
> as an extra column. The schema I used, the load script and queries can
> be found at:
>
> http://homepages.cwi.nl/~lsidir/postgres/
Hi Greg,
thank you for your help. The change I made to the dataset was just
removing the last comma from the CSV files, as it was interpreted by pg
as an extra column. The schema I used, the load script and queries can
be found at:
http://homepages.cwi.nl/~lsidir/postgres/
(I understood that if I
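For reference, a load step along those lines might look like the following, assuming one CSV file per year as in the linked scripts (the path is illustrative); the trailing comma has to be stripped first so it is not read as an extra, empty column:
-- server-side load; use \copy from psql if the file lives on the client,
-- and drop HEADER if the files carry no header row
COPY ontime FROM '/path/to/ontime_2008.csv' WITH CSV HEADER;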
Lefteris wrote:
So we all agree that the problem is on the scans:)
So the next question is why changing shared memory buffers will fix
that? i only have one session with one connection, do I have like many
reader workers or something?
I wouldn't expect it to. Large sequential scans like th
This table is totally unnormalized. Normalize it and try again. You'll
probably see a huge speedup. Maybe even 10x. My mantra has always been
less data stored means less data to scan means faster scans.
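As an illustration of that advice, a minimal normalization sketch against the benchmark's wide ontime table; the column and table names below are assumptions, not taken from the thread:
-- hypothetical lookup table for the repeated carrier code
CREATE TABLE carrier (
    carrier_id serial PRIMARY KEY,
    code       text UNIQUE NOT NULL
);
INSERT INTO carrier (code)
SELECT DISTINCT "UniqueCarrier" FROM ontime;
-- a narrower fact table keeps only the columns the queries touch, plus small keys
CREATE TABLE ontime_narrow AS
SELECT o."Year", o."Month", o."DayOfWeek",
       c.carrier_id, o."DepDelay", o."ArrDelay"
FROM ontime o
JOIN carrier c ON c.code = o."UniqueCarrier";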
On Thu, Jan 7, 2010 at 12:57 PM, Gurgel, Flavio wrote:
> - "Matthew Wakeling" escreve
- "Scott Marlowe" escreveu:
> You do know that indexes in postgresql are not "covering" right?
> I.e. after hitting the index, the db then has to hit the table to see if
> those rows are in fact visible. So there's no such thing in pgsql, at
> the moment, as an index only scan.
That w
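For context, an index matching the benchmark query would look like the statement below (the index name is mine). At the time of this thread it still required heap visits for visibility checks; PostgreSQL 9.2 later added index-only scans and PostgreSQL 11 added INCLUDE columns for exactly this pattern:
-- on 9.2+ this can answer the Year/DayOfWeek query without touching the heap,
-- provided the visibility map is largely set (e.g. after VACUUM)
CREATE INDEX ontime_year_dow_idx ON ontime ("Year", "DayOfWeek");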
On Thu, Jan 7, 2010 at 10:57 AM, Gurgel, Flavio wrote:
> - "Matthew Wakeling" escreveu:
>> On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
>> Postgres does not change a query plan according to the shared_buffers
>>
>> setting. It does not anticipate one step contributing to another step
>> in
>> th
- "Matthew Wakeling" escreveu:
> On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
> Postgres does not change a query plan according to the shared_buffers
>
> setting. It does not anticipate one step contributing to another step
> in
> this way. It does however make use of the effective_cache_size s
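Worth noting: effective_cache_size is only a planner hint, not an allocation, so it can be tried per session without touching the kernel. A sketch, with an illustrative value rather than a recommendation from the thread:
SET effective_cache_size = '12GB';  -- roughly the size of the OS page cache
EXPLAIN SELECT "DayOfWeek", count(*) AS c
FROM ontime
WHERE "Year" BETWEEN 2000 AND 2008
GROUP BY "DayOfWeek"
ORDER BY c DESC;  -- compare the chosen plan and costs against the default setting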
Alvaro Herrera wrote:
No amount of tinkering is going to change the fact that a seqscan is the
fastest way to execute these queries. Even if you got it to be all in
memory, it would still be much slower than the other systems which, I
gather, are using columnar storage and thus are perfectly suited to this kind of query.
On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
If one single query execution had a step that brought a page to the
buffercache, it's enough to increase another step's speed and change the
execution plan, since data access in memory is (usually) faster than
disk.
Postgres does not change a query
On Thu, Jan 7, 2010 at 4:57 PM, Ivan Voras wrote:
> 2010/1/7 Lefteris :
>> On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras wrote:
>>> On 7.1.2010 15:23, Lefteris wrote:
>>>
I think what you all said was very helpful and clear! The only part
that I still disagree/don't understand is the shared_buffer option:))
2010/1/7 Lefteris :
> On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras wrote:
>> On 7.1.2010 15:23, Lefteris wrote:
>>
>>> I think what you all said was very helpful and clear! The only part
>>> that I still disagree/don't understand is the shared_buffer option:))
>>
>> Did you ever try increasing shared_buffers to what was suggested (around
>> 4 GB) and see what happens (I didn't see it in your posts)?
- "Lefteris" escreveu:
> > Did you ever try increasing shared_buffers to what was suggested
> (around
> > 4 GB) and see what happens (I didn't see it in your posts)?
>
> No I did not to that yet, mainly because I need the admin of the
> machine to change the shmmax of the kernel and also beca
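For reference, the change under discussion is roughly the following, with an illustrative value. On the 8.x servers of that era it meant editing postgresql.conf (shared_buffers = 4GB) and raising kernel.shmmax; PostgreSQL 9.3+ uses far less System V shared memory, and 9.4+ can do it from SQL:
ALTER SYSTEM SET shared_buffers = '4GB';  -- 9.4+; otherwise edit postgresql.conf
-- shared_buffers only takes effect after a server restart
SHOW shared_buffers;  -- confirm the active value afterwards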
On Thu, Jan 7, 2010 at 3:05 PM, Lefteris wrote:
> On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras wrote:
>> On 7.1.2010 15:23, Lefteris wrote:
>>
>>> I think what you all said was very helpful and clear! The only part
>>> that I still disagree/don't understand is the shared_buffer option:))
>>
>> Did
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras wrote:
> On 7.1.2010 15:23, Lefteris wrote:
>
>> I think what you all said was very helpful and clear! The only part
>> that I still disagree/don't understand is the shared_buffer option:))
>
> Did you ever try increasing shared_buffers to what was suggested (around
> 4 GB) and see what happens (I didn't see it in your posts)?
On 7.1.2010 15:23, Lefteris wrote:
> I think what you all said was very helpful and clear! The only part
> that I still disagree/don't understand is the shared_buffer option:))
Did you ever try increasing shared_buffers to what was suggested (around
4 GB) and see what happens (I didn't see it in your posts)?
Thursday, January 7, 2010, 2:47:36 PM you wrote:
> so I understand from all of you that you don't consider the use of 25k
> for sorting to be the cause of the slowdown? Probably I am missing
Maybe you are reading the plan wrong:
- the sort needs only 25kB of memory, and finishes in sub-second time
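Spelling that out with the plan quoted later in the thread (annotation added here): in every plan line, actual time=A..B means the node produced its first row A ms after it started and its last row at B ms, per loop. The Sort spans 371188.821..371188.823, i.e. about 0.002 ms of its own work, while the Seq Scan below it runs from 190938.959 to 346180.079; nearly all the runtime is spent scanning and aggregating the ~52 million rows, not sorting seven of them. On PostgreSQL 9.0+ the BUFFERS option also shows how much of the scan came from cache versus disk:
EXPLAIN (ANALYZE, BUFFERS)
SELECT "DayOfWeek", count(*) AS c
FROM ontime
WHERE "Year" BETWEEN 2000 AND 2008
GROUP BY "DayOfWeek"
ORDER BY c DESC;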
On Thu, Jan 7, 2010 at 3:14 PM, Alvaro Herrera
wrote:
> Lefteris wrote:
>> Yes, I am reading the plan wrong! I thought that each row from the
>> plan reported the total time for the operation but it actually reports
>> the starting and ending point.
>>
>> So we all agree that the problem is on
Alvaro Herrera wrote:
> No amount of tinkering is going to change the fact that a seqscan is the
> fastest way to execute these queries. Even if you got it to be all in
> memory, it would still be much slower than the other systems which, I
> gather, are using columnar storage and thus are perfectly suited to this kind of query.
Lefteris wrote:
> Yes, I am reading the plan wrong! I thought that each row from the
> plan reported the total time for the operation but it actually reports
> the starting and ending point.
>
> So we all agree that the problem is on the scans:)
>
> So the next question is why changing shared
Yes, I am reading the plan wrong! I thought that each row from the
plan reported the total time for the operation but it actually reports
the starting and ending point.
So we all agree that the problem is on the scans:)
So the next question is why changing shared memory buffers will fix
that? i only have one session with one connection, do I have like many
reader workers or something?
In response to Lefteris :
> Hi Arjen,
>
> so I understand from all of you that you don't consider the use of 25k
> for sorting to be the cause of the slowdown? Probably I am missing
> something on the specific sort algorithm used by PG. My RAM does fill
> up, mainly by file buffers from linux, but
In response to Lefteris :
> Thank you all for your answers!
>
> Andrea, I see the other way around what you are saying:
>
> Sort (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
> Seq Scan on ontime (cost=0.00..7143875.40 rows=52775727 width=2)
Hi Arjen,
so I understand from all of you that you don't consider the use of 25k
for sorting to be the cause of the slowdown? Probably I am missing
something on the specific sort algorithm used by PG. My RAM does fill
up, mainly by file buffers from linux, but postgres process remains to
0.1% cons
On 7-1-2010 13:38 Lefteris wrote:
I decided to run the benchmark over postgres to get some more
experience and insights. Unfortunately, the query times I got from
postgres were not the expected ones:
Why were they not expected? In the given scenario, column databases are
having a huge advantage.
Thank you all for your answers!
Andrea, I see the other way around what you are saying:
Sort (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
Seq Scan on ontime (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079 rows
In response to Lefteris :
>
> airtraffic=# EXPLAIN ANALYZE SELECT "DayOfWeek", count(*) AS c FROM
> ontime WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER
> BY c DESC;
> QUERY PLAN
Hello
- "Lefteris" escreveu:
> Hi all,
>
> following the simple but interesting air-traffic benchmark published
> at:
> http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/
Quite interesting test, if you have the time to download all
Hi all,
following the simple but interesting air-traffic benchmark published at:
http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/
I decided to run the benchmark over postgres to get some more
experience and insights. Unfortunately, the query times I got from
postgres were not the expected ones: