Try partitioning. It should sort you out. -Peter-
On 5/25/09, Ramiro Diaz Trepat <ram...@diaztrepat.name> wrote: > > Hello list,I will try to make this as brief as possible. > I have a brother who is a scientist studding atmospheric problems. He was > trying to handle all of his data with flat files and MatLab, when I stepped > in and said, wait, you need a RDBMS to handle all this data. > So, he basically has 2 very simple tables, one describing an atmosphere > pixel (latitude, longitude and height) and a couple of other simple values. > The other table has values of different variables meassured at each pixel. > The table with the atmosphere pixels, currently has about 140MM records, > and the one the values about 1000MM records. They should both grow to > about twice this size. > I have not started yet to deal with the largest table, I am only doing test > with the smaller one (140MM rows), and they much slower than what we were > expecting. > A simple query with no joins like this: > > select trajectory from atmospherepoint where moment='1979-05-02 11:45:00' > and height >= 12000 and height <= 14000 and longitude >= 180 and longitude > <= 190 and latitude >= 0 and latitude <= 10; > > is taking almost 4 minutes in a decent multi core server. Also, the moment > equality test should yield no more than 2MM rows, so I thought that should > make things a lot faster. > > The explain returns the suspicious "Seq Scan" up front: > > Seq Scan on atmospherepoint (cost=0.00..5461788.08 rows=134 width=8) > Filter: ((moment = '1979-05-02 11:45:00'::timestamp without time zone) > AND (height >= 12000::double precision) AND (height <= 14000::double > precision) AND (longitude >= 180::double precision) AND (longitude <= > 190::double precision) AND (latitude >= 0::double precision) AND (latitude > <= 10::double precision)) > > but I have created indices for every column in the table and I have also > runned ANALIZE and VACUUM: > > Table "public.atmospherepoint2" > Column | Type | Modifiers > ------------+-----------------------------+----------- > id | integer | not null > trajectory | integer | not null > moment | timestamp without time zone | not null > longitude | real | not null > latitude | real | not null > height | real | not null > Indexes: > "atmospherepoint2_pkey" PRIMARY KEY, btree (id) > "atm_height_idx" btree (height) > "atm_latitude_idx" btree (latitude) > "atm_longitude_idx" btree (longitude) > "atm_moment_idx" btree (moment) > "atm_trajectory_idx" btree (trajectory) > Foreign-key constraints: > "atmospherepoint2_trajectory_fkey" FOREIGN KEY (trajectory) REFERENCES > trajectory2(id) > > > I will welcome a lot any advice or pointer on how to tune these tables to > work faster. > The tables don't change at all once the data has been loaded, they are only > used for queries. > Thank you very much. > > > r. > > >