[GENERAL] very large tables
Thank you all very much for your thorough replies. To be honest, many of the things you mention I had not heard of before. It seems that clustering by index must be the way to go; I now have to check whether most queries will benefit from it. I do not directly control the installation parameters of the database, and I do not have root access to that server, but I may be able to find another one where I can control these parameters. The configuration parameters I should look to increase are the ones related to shared memory, then? I know it is some sort of worst combination, to have such a large database with an inexperienced dba as myself :) r. On Mon, May 25, 2009 at 12:17 PM, Alban Hertroys < dal...@solfertje.student.utwente.nl> wrote: > On May 25, 2009, at 10:58 AM, Ramiro Diaz Trepat wrote: > > The table with the atmosphere pixels currently has about 140MM records, >> and the one with the values about 1000MM records. They should both grow to >> about twice this size. >> > > Did you tune postgres to use the available resources? By default it comes > with rather modest settings so it will run on low-spec hardware without > trouble. For a dataset this size modest hardware is obviously out of the > question, and so the default config will likely not be suitable. > > >> A simple query with no joins like this: >> >> select trajectory from atmospherepoint where moment='1979-05-02 11:45:00' >> and height >= 12000 and height <= 14000 and longitude >= 180 and longitude >> <= 190 and latitude >= 0 and latitude <= 10; >> > > I think the BETWEEN operator would make that query a bit easier on the > eyes, like so: > > select trajectory from atmospherepoint where moment='1979-05-02 11:45:00' > and height between 12000 and 14000 and longitude between 180 and 190 and > latitude between 0 and 10; > > is taking almost 4 minutes on a decent multi-core server. Also, the >> moment equality test should yield no more than 2MM rows, so I thought that >> should make things a lot faster. >> >> The explain returns the suspicious "Seq Scan" up front: >> >> Seq Scan on atmospherepoint (cost=0.00..5461788.08 rows=134 width=8) >> Filter: ((moment = '1979-05-02 11:45:00'::timestamp without time zone) >> AND (height >= 12000::double precision) AND (height <= 14000::double >> precision) AND (longitude >= 180::double precision) AND (longitude <= >> 190::double precision) AND (latitude >= 0::double precision) AND (latitude >> <= 10::double precision)) >> > > That query is expecting 134 rows; how accurate is that? An explain analyse > will show you that (and will take 4 minutes in this case). > > I think you'll need lots of memory if you want to keep those indexes in > memory. If I do a rough byte-count of the datatypes involved times the > number of records, I think just the index data (w/o metadata) takes up 40B * > 140M rows = 5.6GB. Scanning an index on disk is likely to be slow, which may > well be one of the reasons PG decides on a sequential scan. > > What is the distribution of the data you're querying for? If it's all over > the place then the planner made a pretty good guess that a sequential scan is > the best option; random disk I/O isn't very fast, while sequential I/O is usually > much faster. > > Before solving that you'll want to figure out whether this is a typical > query, or whether distributing the data differently would hurt other queries. If > it's a typical query, then clustering your data on an index and/or > partitioning will help. 
If it's not, it still will help this type of query > but it may hurt other types of queries. You'll want to optimise in the > right direction. > > As other people said already, partitioning is likely to help here. Your > data seems very suitable for partitioning too. I think I'd divide the > coordinates into a number of latitude/longitude slices (a grid seems suitable > to me, but I don't deal with atmospheric data). With 32 slices you'll have > <500k records per slice on average; whether that's sufficiently small you'll > have to test. > > Alternatively, if your data is clustered (on disk) by say > longitude/latitude, it may already help to define partial indexes on > longitude/latitude ranges, for example: > > CREATE INDEX atm_section1_idx ON atmospherepoint (longitude, latitude, height) WHERE longitude > BETWEEN 180 AND 190 AND latitude BETWEEN 0 AND 10; > > This is very similar to partitioning your data, but it doesn't require > splitting the data up across multiple partitions (tables). This could be > faster if you have queries spanning multiple table-partitions, but if > your data isn't clustered appropriately, finding it on disk will require more > random I/O than a partitioned layout. > Choices choices... The only way to find out what works best is to test, I'm > afraid. > > but I have created indices for every column in the table and I have also >> run ANALYZE and VACUUM: >> >> Table "public.atmospherepoint2" >> Column |Type | Modifiers >> +-
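A minimal sketch of the inheritance-based partitioning suggested above, in the PostgreSQL 8.x style; the slice bounds, table and index names are illustrative assumptions, not taken from the thread:

-- Child tables carry non-overlapping CHECK constraints on the slice column:
CREATE TABLE atmospherepoint_lat_0_10 (
    CHECK (latitude >= 0 AND latitude < 10)
) INHERITS (atmospherepoint);

CREATE TABLE atmospherepoint_lat_10_20 (
    CHECK (latitude >= 10 AND latitude < 20)
) INHERITS (atmospherepoint);

-- Each partition gets its own index on the columns the queries filter on:
CREATE INDEX atmospherepoint_lat_0_10_idx
    ON atmospherepoint_lat_0_10 (moment, latitude, longitude, height);

-- The planner only skips non-matching partitions when constraint
-- exclusion is enabled:
SET constraint_exclusion = on;

A query that filters on latitude then only touches the slices whose CHECK constraints overlap the requested range.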
[GENERAL] Excel and postgresql functions
I've built some functions for encapsulation and to avoid passing the same parameter over and over. I'd like to access the result from Excel, but it seems (and I'm not sure this is the definitive answer) that Excel can only pull data from views and tables. - can someone confirm Excel/Access 2003 can't return result sets coming from functions? - if Excel/Access can't return results from a function, is there a way to masquerade the function (accepting parameters) behind a table/view? -- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] Excel and postgresql functions
On Tue, 26 May 2009 11:41:50 +0200 Ivan Sergio Borgonovo wrote: > I've built some functions just for encapsulation reasons and > avoiding to pass the same parameter over and over. > > I'd like to access the result from Excel but it seems (and I'm not > pretty sure it is the definitive answer) excel can suck data just > coming from views and tables. > > - can someone confirm Excel/Access 2003 can't return result sets > coming from functions? > - if Excel/Access can't return results from a function is there a > way to masquerade the function (accepting parameters) behind a > table/view. For all the people forced to fight with software that has to be tricked rather than instructed... The fault is Excel's.
- Data -> Import External Data -> New Database Query
- Choose the ODBC connection you created for PostgreSQL
- Don't add any query.
- Edit the "SQL" directly
- Save an easily "greppable" query (eg. select 'ermenegildo';)
- Ignore the complaint.
- Close Microsoft Query.
- Say OK to "Import data" (just select a good "top left corner" for your table)
- Tools -> Macro -> Visual Basic Script Editor
- Search for your query and replace it with any valid SQL; add ? if you need parameters taken from cells
- Save
- You'll be asked which cell contains the parameter (I didn't try to use more than one parameter)
Now you can have an arbitrary query returned in an Excel sheet. So yes... PostgreSQL can happily work with Excel. -- Ivan Sergio Borgonovo http://www.webthatworks.it
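For the second half of the original question: a function call with fixed arguments can be masqueraded behind a view, which ODBC clients that only list tables and views will then see. A sketch with hypothetical names; note the parameter is frozen into the view definition, so truly dynamic parameters still need the MS Query trick above:

-- Hypothetical set-returning function hidden behind a view:
CREATE VIEW recent_orders AS
    SELECT * FROM orders_since(DATE '2009-01-01');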
Re: [GENERAL] How should I deal with disconnects during insert?
Hi, Sergey Samokhin writes: > Problem I'm talking about in this letter is related to how we deal > with extremely big amounts of data to be inserted into DB. Recently I > switched to using buffers. You might appreciate this blog entry: http://www.depesz.com/index.php/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/ Regards, -- dim
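For reference, the techniques usually recommended for fast bulk loading are batching many rows per transaction, multi-row INSERT (PostgreSQL 8.2+), and COPY. A minimal sketch against a hypothetical table:

BEGIN;
-- several rows per statement instead of one INSERT per row:
INSERT INTO samples (t, v) VALUES
    ('2009-05-26 10:00:00', 1.5),
    ('2009-05-26 10:01:00', 1.7),
    ('2009-05-26 10:02:00', 1.6);
COMMIT;

-- COPY is faster still for large batches, e.g. from psql:
-- \copy samples (t, v) FROM 'samples.csv' WITH CSV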
Re: [GENERAL] question on serial key
On 2009-05-22, Brandon Metcalf wrote: > g == gryz...@gmail.com writes: > > g> you should use it, whenever you need db to keep its own key internally. > g> Advantage of sequence is also the fact that you can have the sequence > g> value used on different columns/tables. > > g> My rule of thumb is, in that case: as long as it is a short type (not > g> toastable and/or variable length), and as long as it won't change, > g> and is unique - I can use it. Otherwise, I use a sequence to connect > g> rows internally for the database. > g> First rule is because of index access, and the way btree works. > g> Second is because an update of the value will update other rows too - and > g> HOT won't help you here, so that's not efficient. And also, forcing it > g> to be unique is harder. > > g> Hth. > > That does help. So, in my example of a table consisting of rows for > each periodic table element, the atomic number would suffice as a > unique key since, well, it's unique and not going to change. Right? Well, until some fool* wants to insert a row for deuterium into your table, and finds that spot taken by hydrogen. If you can guarantee that your chosen natural key is sufficient and is not going to give trouble in the case of marriages, isomerism, twin birth, isotopes, centenarians, or some other condition that makes a mockery of your chosen key, then go for it. *assuming you want it to be fool-proof.
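A sketch of the layout this advice points at (column names are illustrative): keep a surrogate key for internal references, and still enforce the natural candidate with a UNIQUE constraint:

CREATE TABLE element (
    id            serial  PRIMARY KEY,      -- surrogate key other tables reference
    atomic_number integer NOT NULL UNIQUE,  -- natural candidate, still enforced
    name          text    NOT NULL
);

If the "natural" key later turns out not to be unique after all (deuterium...), only the UNIQUE constraint has to change; foreign keys pointing at id survive.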
Re: [GENERAL] quoting values magic
On 2009-05-22, Brandon Metcalf wrote: > Assume I have an UPDATE statement that looks like > > UPDATE foo > SET > pattern = '$pattern', > shape = '$shape', > length = $length, > comment = '$comment' > WHERE foo_id = $foo_id > > and length is defined as NUMERIC. Is there any kind of magic that > would allow me to use the SQL above as is even if $length is not > defined? no, but you can get the same effect in a different way. > In other words, I'd like to avoid having to modify the SQL > to include or not include "length = $length" based on whether or not > $length is defined, as it's acceptable for it to be NULL in foo. > I can't say "length = '$length'" as '' is not valid input for NUMERIC. > > Hope that makes sense? option 1: $length = "length" If your language of choice (it appears to be similar to shell, PHP, or Perl) allows you to store the string value "length" in your $length variable, then the existing values of length will be retained in the update. option 2: case when '$length' = '' ... you can use case like this: UPDATE foo SET pattern = '$pattern', shape = '$shape', length = case when '$length'='' then length else '$length' end, comment = '$comment' WHERE foo_id = $foo_id here you can substitute any value you choose for the empty string; 0 or NULL may (or may not) be more appropriate.
[GENERAL] Re: Re: Re: Can not decompress a compressed string under plpy!
On 2009-05-21, Timmy wrote: > Yes, most of your guess is correct. > I'm using postgresql 8.3.x and ms windows 2000. > The compressed string is saved to the table in binary > format using psycopg. I had set the table field to the bytea data type. > I want to use plpythonu to decompress the stored > string. > Although I imported the zlib module, it still failed. > I can successfully get back the decompressed string if > I use psycopg either inside or outside of plpythonu. > But without psycopg, plpythonu itself can not decompress the string. > I can use psycopg inside plpythonu but I would need to write security > information (including user > name, password, server ip, port number, ...) into the connection string inside > plpythonu too. > This is a problem. try running stripslashes on the returned value. bytea values are returned double-slashed.
Re: [GENERAL] quoting values magic
j == ja...@xnet.co.nz writes: j> On 2009-05-22, Brandon Metcalf wrote: j> > Assume I have an UPDATE statement that looks like j> > j> > UPDATE foo j> > SET j> > pattern = '$pattern', j> > shape = '$shape', j> > length = $length, j> > comment = '$comment' j> > WHERE foo_id = $foo_id j> > j> > and length is defined as NUMERIC. Is there any kind of magic that j> > would allow me to use the SQL above as is even if $length is not j> > defined? j> no, but you can get the same effect in a different way. j> > In other words, I'd like to avoid having to modify the SQL j> > to include or not include "length = $length" based on whether or not j> > $length is defined as it's acceptable for it to be NULL in foo. j> > I can't say "length = '$length'" as '' is not valid input for NUMERIC. j> > j> > Hope that makes sense? j> option 1: $length = "length" j> If your language of choice (it appears to be similar to shell, PHP, or Perl) j> allows you to store the string value "length" in your $length variable j> then the existing values of length will be retained in the update. j> option 2: case when '$length' = '' ... j> you can use case like this: j> UPDATE foo j> SET j> pattern = '$pattern', j> shape = '$shape', j> length = case when '$length'='' then length else '$length' end, j> comment = '$comment' j> WHERE foo_id = $foo_id j> here you can substitute any value you choose for the empty string, j> 0 or NULL may (or may not) be more appropriate. The issue here is that these reduce back to my original problem. For example, if I use a CASE statement and I fall through to the ELSE, then the SQL is attempting to insert a "''" in a NUMERIC field, which is not valid. That is, it's trying to do UPDATE foo SET length='' WHERE foo_id=$foo_id and this fails. It's exactly this problem I'm trying to avoid by some slick use of SQL. I can always generate the SQL on the fly based on the value of $length, but I'd like to avoid this if possible. Maybe I'm misunderstanding your suggestion. -- Brandon
[GENERAL] Need beginning and ending date value for a particular week in the year
PG 8.1.17 For a given week number (2009w22) I need to calculate the beginning and ending date that makes up that particular week in the year. I want to use the beginning/ending date as part of a CHECK constraint on an inherited table, with each child table based on a week of the year. This is a function in the system someone wrote to get the current week of the year, with our week starting on a Monday, not Sunday:

CREATE OR REPLACE FUNCTION get_weeknum (TIMESTAMP WITH TIME ZONE) RETURNS VARCHAR AS $$
DECLARE
    tstamp ALIAS FOR $1;
    dow INTEGER;
BEGIN
    -- to_char(2005-01-01, "w"IW), for example, returns 2005w53 but we need 2004w53,
    -- so we return "w"IW for the Thursday of the week of the specified date
    SELECT INTO dow to_char(tstamp, 'D');
    -- "D" returns sun=1 - sat=7 but we need dow to be mon=1 - sun=7
    IF dow = 1 THEN
        dow = 7;
    ELSE
        dow = dow - 1;
    END IF;
    RETURN to_char(tstamp - interval '1 Day' * (dow - 4), '"w"IW');
END;
$$ LANGUAGE 'plpgsql';

So for the calculated week value (i.e. 2009w22) I need to be able to calculate the first and last day of the week (05/25/2009 and 05/31/2009). Is there a clean / fairly easy way to do this? I can think of doing some string comparisons and walking through date values to figure it out, but was wondering if there is a simpler way to do this that I am overlooking.
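For context, the target layout described above would look something like this (a sketch with a hypothetical parent table and column; the dates are the week-22 bounds computed later in this thread):

CREATE TABLE log_2009w22 (
    CHECK (logged_at >= DATE '2009-05-25'
       AND logged_at <  DATE '2009-06-01')
) INHERITS (log);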
Re: [GENERAL] quoting values magic
On May 26, 2009, at 6:37 PM, Brandon Metcalf wrote: j> option 2: case when '$length' = '' ... j> you can use case like this: j> UPDATE foo j> SET j> pattern = '$pattern', j> shape = '$shape', j> length = case when '$length'='' then length else '$length' end, j> comment = '$comment' j> WHERE foo_id = $foo_id j> here you can substitute any value you choose for the empty string, j> 0 or NULL may (or may not) be more appropriate. > The issue here is that these reduce back to my original problem. For > example, if I use a CASE statement and I fall through to the ELSE, > then the SQL is attempting to insert a "''" in a NUMERIC field which > is not valid. That is, it's trying to do No it doesn't, read that statement again ;) If $length = 'foo' it reads (leaving out the extra fields): UPDATE foo SET length = CASE WHEN 'foo'='' THEN length ELSE 'foo' END WHERE foo_id = $foo_id; Which evaluates to: UPDATE foo SET length = 'foo' WHERE foo_id = $foo_id; Whereas if $length = '' it reads: UPDATE foo SET length = CASE WHEN ''='' THEN length ELSE '' END WHERE foo_id = $foo_id Which evaluates to: UPDATE foo SET length = length WHERE foo_id = $foo_id Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
Re: [GENERAL] quoting values magic
d == dal...@solfertje.student.utwente.nl writes: d> On May 26, 2009, at 6:37 PM, Brandon Metcalf wrote: d> > j> option 2: case when '$length' = '' ... d> > d> > j> you can use case like this: d> > d> > j> UPDATE foo d> > j> SET d> > j> pattern = '$pattern', d> > j> shape = '$shape', d> > j> length = case when '$length'='' then length else d> > '$length' end, d> > j> comment = '$comment' d> > j> WHERE foo_id = $foo_id d> > d> > j> here you can substitute any value you choose for the empty string, d> > j> 0 or NULL may (or may not) be more appropriate. d> > d> > d> > The issue here is that these reduce back to my original problem. For d> > example, if I use a CASE statement and I fall through to the ELSE, d> > then the SQL is attempting to insert a "''" in a NUMERIC field which d> > is not valid. That is, it's trying to do d> No it doesn't, read that statement again ;) Oops. Indeed, you are correct. -- Brandon
Re: [GENERAL] Need beginning and ending date value for a particular week in the year
> So for the calculated week value (i.e. 2009w22) I need to be able to calculate the first and last day of the week (05/25/2009 and 05/31/2009). Is there a clean / fairly easy way to do this? I can think of doing some string comparisons and walking through date values to figure it out but was wondering if there is a simpler way to do this that I am overlooking. well, a few things that come to mind. Figure out how many days are in the first partial week prior to week 1 of this year; that would be something like 7 - day_of_week('<year>-01-01'). Then take '1 day'::interval * (week-1)*7 + that offset, and add that to '<year>-01-01' ... add '6 day'::interval to get the last day of the week...
Re: [GENERAL] Need beginning and ending date value for a particular week in the year
On May 26, 2009, at 8:03 PM, Keaton Adams wrote: > PG 8.1.17 For a given week number (2009w22) I need to calculate the beginning and ending date that makes up that particular week in the year. I want to use the beginning/ending date as part of a CHECK constraint on an inherited table, with each child table based on a week of the year. What I ended up doing was generating a table with: week int, year int, start date, end date. The beauty is that you can constrain your data with foreign keys to that table. I'm not sure that would work for partitioning though (I think that's where you're heading?), now that I think of it. Nevertheless, you could copy that data into your table and put a check constraint on that instead. I recall using generate_series() and EXTRACT(week FROM ...) to populate the table in one pass for several years, but I don't have the exact incantation at hand now. I'd have to experiment a bit to get that back again; I don't have access to it anymore. You may like to know that Postgres 8.3 has native support for ISO8601 week calculations (http://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT). Maybe you should upgrade. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
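One possible incantation for populating such a week table on 8.1 (a sketch, untested; the table and column names are hypothetical). It steps through Mondays with the integer form of generate_series and derives the ISO year/week with to_char, since the timestamp form of generate_series and EXTRACT(isoyear ...) only arrived in later releases:

INSERT INTO week_dim (year, week, start_date, end_date)
SELECT to_char(d, 'IYYY')::int,   -- ISO week-numbering year
       to_char(d, 'IW')::int,     -- ISO week number
       d,                         -- Monday
       d + 6                      -- Sunday
FROM (SELECT DATE '2008-12-29' + 7 * i AS d   -- Monday of 2009w01
      FROM generate_series(0, 103) AS s(i)) AS mondays;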
Re: [GENERAL] quoting values magic
Brandon Metcalf writes: > d == dal...@solfertje.student.utwente.nl writes: > d> On May 26, 2009, at 6:37 PM, Brandon Metcalf wrote: > d> > The issue here is that these reduce back to my original problem. For > d> > example, if I use a CASE statement and I fall through to the ELSE, > d> > then the SQL is attempting to insert a "''" in a NUMERIC field which > d> > is not valid. That is, it's trying to do > d> No it doesn't, read that statement again ;) > Oops. Indeed, you are correct. I think there is a problem though. If you have case when '$length'='' then length else '$length' end then what the parser is going to see is a CASE expression with a variable (known to be NUMERIC) in one arm and an unknown-type literal constant in the other arm. So it's going to decide that the literal must be NUMERIC too, and that type coercion will fail if the literal is really just ''. Some experimentation suggests that you might get away with case when '$length'='' then length else '$length'::text::numeric end so that the text-to-numeric conversion is delayed to runtime. However this is a bit fragile (it's dependent on some undocumented details of the constant-expression-folding behavior) and it also requires hardwiring knowledge that length is indeed numeric into your SQL command. On the whole I'd suggest going with NULL, not empty string, as your representation of a missing update value if at all possible. Then the previously-suggested COALESCE solution will work, and you aren't relying on any shaky assumptions about when and how the parser will try to enforce validity of the datatype value. regards, tom lane
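The COALESCE variant Tom mentions would look something like this (a sketch, assuming the application passes a real bind parameter that is NULL when no new length was supplied, rather than interpolating a string):

UPDATE foo
   SET pattern = $1,
       shape   = $2,
       length  = COALESCE($3, length),  -- a NULL parameter keeps the old value
       comment = $4
 WHERE foo_id = $5;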
Re: [GENERAL] Code tables, conditional foreign keys?
"A deep unwavering belief is a sure sign that you're missing something." -- Unknown I had no intention of sparking an ideological discussion. I read Joe's article reference previously - a simple case for using a normalized database. I would pretty much agree with his assessment from beginning to end. However, he really doesn't address my scenario at all. I'm not trying to mash my database together into a single table, I'm trying to deal with the fact that we have hundreds of tables with nearly identical syntax, but further, deal with the concept of "code tables". See our product has to work in many venues, and each venue has their own set of code-table data that they'd like to support. Worse, they often use similar values for the different things, so the "natural key" is just not natural. Sometimes I've seen venues "re-using" the old code table value from previous years to mean new things in current/future years. Yes, this is a bad, bad, bad idea but it was still there and it's still my job to deal with it. Surrogate keys are used to make sure that 15 to mean "BS College Degree" in venue A aren't confused with 15 to mean "No High School Education" in another venue. They cover a similar value, EG: applicant's educational level. Some values don't translate at all, (EG: differing representations of vocational arts) so using our own code table set and then translating doesn't work consistently, either. So we have multiple, distinct sets of data to be used within a single field. Either that, or we create massive data tables with every possible different set of otherwise similar data, each of which has a foreign key to a table with a slightly different name, which is, far and away, even uglier. (EG: applicants.ca_edlevel, applicants.or_edlevel applicants.nv_edlevel, applicants.southca_edlevel...) educational level is one example, there are hundreds that we have to deal with! So back to the first question: is there a way to have a conditional foreign key? On Saturday 23 May 2009 17:22:36 Lew wrote: > Conrad Lender wrote: > > I didn't intend any disrespect to Joe Celko. I have read a number of his > > articles, which tend to be well written and informative. Last year, when > > I posted to comp.databases asking for advice on whether to refactor that > > table, he wrote "You will have to throw it all out and start over with a > > relational design", "Throw away the idiot who did the EAV. This is not a > > good design -- in fact, it is not a design at all", and "This is basic > > stuff!!" Then he copied the same EAV example that was linked earlier by > > Rodrigo, claiming that "someone like me" had suggested it. With all the > > respect I have for Mr. Celko, that was hardly helpful, as that example > > and the situation I had described were quite different. It also did not > > encourage me to follow his advice and start from scratch (and fire my > > boss, who was the mentioned "idiot"). > > If we fired every boss who actually is an idiot there would be about half > the number of bosses. > > All kidding aside, why is the boss specifying a database architecture? > That is not the boss's job. > > > I understand the problems that can arise from bad design choices, and I > > know that Celko is vehemently opposed to anything that resembles EAV, > > For good reasons. > > > but I felt that in our case "throwing it all away" would be excessive. > > Perhaps not. I had a situation some years ago where a supervisor would not > let me normalize a database and consequently the project nearly failed. 
> Fortunately, the company assigned a new team lead/project manager who did > the normalization or it would have been a disaster. Trying to make a bad > approach work is often, if not always, more expensive than replacing it > with a good approach. > > > We had safeguards to ensure referential integrity, and keeping the > > values in the same table allowed us to let users manage them all with > > the same form. So I guess it's like Stefan Keller said in a different > > thread today: "Know when to break the rules." > > Managing all the values in the same form is not intrinsically connected to > whether one stores the values in an EAV layout. > > Telling oneself that one should know when to break the rules is not the > same as knowing when to break the rules. They are the rules for good > reason. > > All I'm saying is that EAV is a very problematic approach. I've been on > projects that tried to use it, and while that didn't make me an expert on > the matter by any means, it gave me some cause to trust Mr. Celko's opinion > on the matter. > > -- > Lew
Re: [GENERAL] Need beginning and ending date value for a particular week in the year
This looks great and is a much easier solution to the problem than what I had planned. Thanks! Keaton

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (21-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01')));
      ?column?
---------------------
 2009-05-18 00:00:00
(1 row)

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (21-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01'))) + '6 days'::interval;
      ?column?
---------------------
 2009-05-24 00:00:00
(1 row)

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (22-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01')));
      ?column?
---------------------
 2009-05-25 00:00:00
(1 row)

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (22-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01'))) + '6 days'::interval;
      ?column?
---------------------
 2009-05-31 00:00:00
(1 row)

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (1-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01')));
      ?column?
---------------------
 2008-12-29 00:00:00
(1 row)

mydb=# select ( date('2009-01-01') + ('1 day'::interval * (1-1)*7)) - ('1 day'::interval * (select 7 - extract(dow from timestamp '2009-01-01'))) + '6 days'::interval;
      ?column?
---------------------
 2009-01-04 00:00:00
(1 row)

On 5/26/09 12:18 PM, "John R Pierce" wrote: > So for the calculated week value (i.e. 2009w22) I need to be able to > calculate the first and last day of the week (05/25/2009 and > 05/31/2009). Is there a clean / fairly easy way to do this? I can > think of doing some string comparisons and walking through date values > to figure it out but was wondering if there is a simpler way to do this > that I am overlooking. well, a few things that come to mind. Figure out how many days are in the first partial week prior to week 1 of this year; that would be something like 7 - day_of_week('<year>-01-01'). Then take '1 day'::interval * (week-1)*7 + that offset, and add that to '<year>-01-01' ... add '6 day'::interval to get the last day of the week...
Re: [GENERAL] Code tables, conditional foreign keys?
On 26/05/09 20:48, Benjamin Smith wrote: > "A deep unwavering belief is a sure sign that you're missing > something." -- Unknown > > I had no intention of sparking an ideological discussion. I know, my apologies for going off-topic. I just had a deja-vu when I saw Celko's article about EAV disasters mentioned again in a very similar context. > So back to the first question: is there a way to have a > conditional foreign key? I can only suggest what we've done in this situation. We had a table setup similar to yours (tables like "customer" referencing many small sets like customer type or education level, with the exact same structure). All of the small sets were combined in one table (which is what reminded people of EAV design). Using your original example ...

create table codetables (
    id serial primary key,
    name varchar unique not null
);

create table codevalues (
    id serial primary key,
    codetables_id integer not null references codetables(id),
    value varchar not null,
    unique(codetables_id, value)
);

create table customers (
    customer_types_id integer not null references codevalues(id),
    customer_taxcode_id integer references codevalues(id)
);

... you need to make sure that customer_types_id references the correct codetable set within codevalues. To do this, we added CHECK constraints in our tables:

CREATE TABLE customer (
    ...
    customer_type_id INT NOT NULL,
    -- this is the standard FK to codevalues
    CONSTRAINT fk_customer_type_id FOREIGN KEY (customer_type_id)
        REFERENCES codevalues (id),
    -- this makes sure that the correct set is referenced
    CONSTRAINT check_customer_type
        CHECK (belongs_to_codetable('customer_type', customer_type_id))
);

CREATE FUNCTION belongs_to_codetable (VARCHAR(255), INT) RETURNS BOOLEAN AS '
    SELECT EXISTS (
        SELECT 1
        FROM codetables ct
        JOIN codevalues cv ON cv.codetables_id = ct.id
        AND ct.name = $1
        AND cv.id = $2
    )
' LANGUAGE 'SQL';

We used different names, so this is untested, but in principle it should do what you require. Whether this is a good design or not... I'm still not sure. Joe Celko would grill me for doing something like this. - Conrad
Re: [GENERAL] quoting values magic
t == t...@sss.pgh.pa.us writes: t> Brandon Metcalf writes: t> > d == dal...@solfertje.student.utwente.nl writes: t> > d> On May 26, 2009, at 6:37 PM, Brandon Metcalf wrote: t> > d> > The issue here is that these reduce back to my original problem. For t> > d> > example, if I use a CASE statement and I fall through to the ELSE, t> > d> > then the SQL is attempting to insert a "''" in a NUMERIC field which t> > d> > is not valid. That is, it's trying to do t> > d> No it doesn't, read that statement again ;) t> > Oops. Indeed, you are correct. t> I think there is a problem though. If you have t> case when '$length'='' then length else '$length' end t> then what the parser is going to see is a CASE expression with a t> variable (known to be NUMERIC) in one arm and an unknown-type literal t> constant in the other arm. So it's going to decide that the literal t> must be NUMERIC too, and that type coercion will fail if the literal t> is really just ''. t> Some experimentation suggests that you might get away with t> case when '$length'='' then length else '$length'::text::numeric end I think this is what I meant to say :) Even if $length contains a number, the statement ends up as length = '5.8', for example - a string literal that has to be coerced to NUMERIC, and that coercion is what fails when the string is empty. -- Brandon
Re: [GENERAL] Need beginning and ending date value for a particular week in the year
Alban Hertroys wrote: > I recall using generate_series() and EXTRACT(week FROM ...) to populate > the table in one pass for several years, but I don't have the exact > incantation at hand now. I'd have to experiment a bit to get that back > again, I don't have access to it anymore. There's something similar in the Wiki: http://wiki.postgresql.org/wiki/Date_and_Time_dimensions -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
[GENERAL] How to speed up the first-time-searching in pgsql?
Hi all, I have a table which has more than 10 million records in pgsql, which is running on Windows 2003. During the night, nobody searches the database. In the morning, when people start the application, it takes more than 30 seconds to get results back. After several runs of the same search, the results come back in 8 seconds. Is this related to some kind of process priority setting in Windows 2003? If so, how do I set the pgsql processes to the highest priority? Thanks in advance. ouyang
Re: [GENERAL] How to speed up the first-time-searching in pgsql?
zxo102 ouyang wrote: Hi all, I have a table which has more than 10 million records in pgsql, which is running on Windows 2003. During the night, nobody searches the database. In the morning, when people start the application, it takes more than 30 seconds to get results back. After several runs of the same search, the results come back in 8 seconds. Is this related to some kind of process priority setting in Windows 2003? If so, how do I set the pgsql processes to the highest priority? are there other applications running on this same server? my first guess is, other programs are doing disk IO when the postgres database is idle, and pushing the postgres stuff out of the in-memory cache. Windows is pretty aggressive about reclaiming idle memory, and there's not much you can do about it. maybe schedule something to run in the AM shortly before the regular database users show up which will do queries that force a full table scan on each table; this will cause all the important data to swap back in.
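A sketch of such a warm-up job (table names are hypothetical); each count(*) walks the whole heap, pulling the table back through the OS cache:

-- warmup.sql, e.g. run from the Windows task scheduler shortly before
-- the first users arrive:  psql -f warmup.sql mydb
SELECT count(*) FROM bigtable1;
SELECT count(*) FROM bigtable2;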
[GENERAL] How to initiate a new log file?
While debugging applications interacting with pg, the log file (under pg_log) gets quite long. The log file typically has a name such as postgresql-2009-05-26_00.log; sometimes a new file will be generated on the same day, and the portion of the name that is all zeroes will change to a larger number for each change during that day. I would like to know what triggers the name change and how I can force a change. Thanks, Ray
Re: [GENERAL] How to initiate a new log file?
ray wrote: While debugging applications interacting with pg, the log file (under pg_log) gets quite long. The log file typically has a name such as postgresql-2009-05-26_00.log; sometimes a new file will be generated on the same day, and the portion of the name that is all zeroes will change to a larger number for each change during that day. I would like to know what triggers the name change and how I can force a change. by default, a new log is created at midnight (hence the _00) and when the server is restarted (which would have a timestamp related to when it was restarted) you can force a new logfile with... SELECT pg_rotate_logfile();
Re: [GENERAL] How to speed up the first-time-searching in pgsql?
On Tue, May 26, 2009 at 5:57 PM, zxo102 ouyang wrote: > Hi all, > I have a table which has more than 10millions records in pgsql which is > running on window 2003. During night, nobody search the database. > In morning, when people start to the application, it will take more than 30 > seconds to get database back. After several times of same searching, the > results can be returned in 8 seconds. > Is this related to some kinds of process priority setting in window > 2003? If so, how do I set the pgsql processes in highest priority? There are two common types of "look how we've optimized our OS and now your database server runs like crap" scenarios. The most common is the one where the OS has simply stopped caching your database files because other things are happening. Not much you can do about that one. The other is the VM in your OS slowly swapping out all of pgsql's shared_buffers because they appear idle, making even more room to cache files on the machine for processes that are not pgsql. Which is why the other poster has asked what other things this server does. If the OS is busy swapping out idle chunks of memory for more cache, you can stop it in linux by adjusting the vm.swappiness setting. No clue how to do that in windows, but googling on swappiness and windows might help. It's a good idea to put a database on its own server for these reasons. Also, in the morning, have a cron job crank up that does "select * from mybigtable" for each big table to load it into cache. This is possibly made worse if you've lowered your random_page_cost to near 1, and have effective_cache_size cranked up. Those settings are likely right for your setup, but first thing in the morning they're wrong. Actual random page cost really is 10 or more, and the effective cache size means nothing because the kernel cache is full of stuff that's NOT pgsql files. In which case the tendency towards index access and not seq scan is really gonna cost you. Hence the need for the select * from bigtable queries to prime the pump.
Re: [GENERAL] How to speed up the first-time-searching in pgsql?
On Tue, 26 May 2009, Scott Marlowe wrote: Also, in the morning, have a cron job crank up that does "select * from mybigtable" for each big table to load it into cache. Just to clarify: on 8.3 and later versions, doing this doesn't do what some people expect. Sequential scans like that will continuously re-use a 256KB section of the PostgreSQL shared_buffers space, so this won't cause all of that to get paged back in if the problem is related to it being swapped out. It will pass everything through the OS buffer cache though and prime it usefully, which might be all that's actually needed. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [GENERAL] How to speed up the first-time-searching in pgsql?
Greg Smith writes: > On Tue, 26 May 2009, Scott Marlowe wrote: >> Also, in the morning, have a cron job crank up that does "select * from >> mybigtable" for each big table to load it into cache. > Just to clarify: on 8.3 and later versions, doing this doesn't do what > some people expect. Sequential scans like that will continuously re-use a > 256KB section of the PostgreSQL shared_buffers space, so this won't cause > all of that to get paged back in if the problem is related to it being > swapped out. It will pass everything through the OS buffer cache though > and prime it usefully, which might be all that's actually needed. Bearing in mind that this is a Windows server ... I seem to recall that the conventional wisdom is still to keep shared_buffers relatively small on Windows. So priming the OS cache is exactly what it's about. (Keeping that down should also help avoid the other scenario Scott was worried about, where shared memory itself gets paged out.) regards, tom lane
Re: [GENERAL] How to speed up the first-time-searching in pgsql?
On Tue, May 26, 2009 at 7:43 PM, Tom Lane wrote: > Greg Smith writes: >> On Tue, 26 May 2009, Scott Marlowe wrote: >>> Also, in the morning, have a cron job crank up that does "select * from >>> mybigtable" for each big table to load it into cache. > >> Just to clarify: on 8.3 and later versions, doing this doesn't do what >> some people expect. Sequential scans like that will continuously re-use a >> 256KB section of the PostgreSQL shared_buffers space, so this won't cause >> all of that to get paged back in if the problem is related to it being >> swapped out. It will pass everything through the OS buffer cache though >> and prime it usefully, which might be all that's actually needed. > > Bearing in mind that this is a Windows server ... I seem to recall that > the conventional wisdom is still to keep shared_buffers relatively small > on Windows. So priming the OS cache is exactly what it's about. > (Keeping that down should also help avoid the other scenario Scott was > worried about, where shared memory itself gets paged out.) Yeah, I thought it was pretty obvious I was talking OS cache up there.
[GENERAL] Regular expression and array
I won't go into details about why I'm using this field as an array, but how would I select all the rows that have the first name 'Tom' in the 'names' field?

CREATE TABLE test (
    id integer,
    names character varying[]
);

INSERT INTO test VALUES (1, '{"''Josh Berkus''","''Peter Eisentraut''","''Marc Fournier''"}');
INSERT INTO test VALUES (2, '{"''Tom Lane''","''Bruce Momjian''","''Dave Page''"}');
INSERT INTO test VALUES (3, '{"''Jan Wieck''","''Oleg Bartunov''","''Joe Conway''"}');
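One way to do it, as a sketch: the stored elements carry literal single quotes, so the pattern below matches a quote followed by "Tom " at the start of an element. array_to_string flattens the array to text, which works on 8.x releases where unnest() is not yet available:

SELECT id, names
  FROM test
 WHERE array_to_string(names, ',') ~ '''Tom ';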