[SQL] Question regarding indices
Hello List,

I have a small question about the order of values in a query. Assume I have a table with the following fields:

uid INT, data BIGINT, hits INT

And a unique index on (uid, data). I use the libpq C API to query data from the table. The query is something like this:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

Would the speed of the query be influenced if I sorted the data? I can imagine that just querying a handful of bigints would not make a big difference, but what about several thousand values? Would sorting them and sending the SQL query with ordered data influence the speed of the query?

// Steve
[SQL] Question about PQexecParams
Hello list,
I would like to call a function from my C application by using libpq and
PQexecParams. My problem is that I don't know how to specify that I want to
send an array to the function.
Assume the function is called lookup_data and takes the following parameters:
lookup_data(integer,integer,bigint[])
I would like to specify the OIDs with my query. How would I do that? Assume I
would like to pass 3 bigint values:
const char *paramValues[5];
Oid paramTypes[5];
int paramLengths[5];
int paramFormats[5];
int32_t ivalue1 = htonl(value1);
paramValues[0] = (char *)&ivalue1;
paramTypes[0] = INT4OID;
paramLengths[0] = sizeof(ivalue1);
paramFormats[0] = 1;
int32_t ivalue2 = htonl(value2);
paramValues[1] = (char *)&ivalue2;
paramTypes[1] = INT4OID;
paramLengths[1] = sizeof(ivalue2);
paramFormats[1] = 1;
etc...
How would I tell libpq that the next 3 values are an array of bigint?
I tried to use INT8OID and specify the query like below but that did not work:
SELECT * FROM lookup_data($1,$2,{$3,$4,$5})
Probably I have to set the query to be:
SELECT * FROM lookup_data($1,$2,{$3})
Or:
SELECT * FROM lookup_data($1,$2,$3)
But what would I set for paramTypes? How can I say that the values are an array
of bigint? I assume that I can set paramValues to be an array and paramLengths
to be the size of one value multiplied by the number of elements in the array.
I am somewhat lost and don't know how to call the function and pass an array through
libpq.
Can anyone help me with this?
// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 10:05:18 -0400
> From: Michael Gould
> To: Steve
> Subject: Re: [SQL] Question regarding indices
>
> Steve,

Hello Michael,

> If I remember correctly the sort only works on the final result set and so
> doing a sort before the select isn't going to help

I think you got me wrong. It's not that I want the RESULT to be sorted. I only asked if sorting the values in the query would have an effect on the speed of the query.

Or, to ask it the other way around: is an index in PostgreSQL sorted or not? If it is sorted and PostgreSQL is looking for entries in the index, then having the values pre-sorted would probably be beneficial. Right?

// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 11:04:16 -0400
> From: Tom Lane
> To: "Steve"
> CC: [email protected]
> Subject: Re: [SQL] Question regarding indices
>
> "Steve" writes:
> > Would the speed of the query be influenced if I would sort the data?
>
> It's unlikely to make enough difference to be worth the trouble.

Making a quick sort is ultra easy in C. Anyway... is there a difference in the speed of the query with pre-sorted values or not? If there is one, then I will go and sort the values. Right now I have a quick sort implemented, but I will probably do a lazy quick sort and then a final insertion sort, because insertion sort is faster than quick sort on an almost-ordered dataset.

Probably I am splitting hairs here about the speed, but I really want to minimize the time PostgreSQL needs to return the data. I personally am happy with the speed when using PostgreSQL, but the application I use has a MySQL driver too, and I have some users claiming that MySQL is faster than PostgreSQL, transfers less data over the wire, etc., and I want to optimize the PostgreSQL part to be on the same level as the MySQL part. So everything that helps squeeze the last nanosecond out of the PostgreSQL side is welcome. I already switched to binary transmission in order to minimize the data sent over the wire, and I have added a function that does more or less what MySQL's proprietary "INSERT ... ON DUPLICATE KEY UPDATE" does.

I hate it when users compare apples with oranges, but what can I do? You cannot explain to them that PostgreSQL is different and more standards-compliant, and that the chance of losing data is lower with PostgreSQL than with everything MySQL is doing (MyISAM tables, etc.). It's pointless to explain it to them. It's like trying to explain to a mole how the sun shines.

So all I want is to explore the available capabilities of PostgreSQL to get the best out of libpq that I can. If you have any recommendation on what I should look at in order to get better speed, let me know.

> regards, tom lane

// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 11:08:00 -0400
> From: Lew
> To: [email protected]
> Subject: Re: [SQL] Question regarding indices
>
> On 09/11/2010 08:29 AM, Steve wrote:
> > Would the speed of the query be influenced if I would sort the data?
>
> What do you mean by "sort the data"? Which data?

I mean sorting the values in the brackets. Instead of:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

I would then send this:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (19,201,590,659,2033,2499)

Of course this is a small dataset, but the query usually has thousands of elements and not only the above 6.

> > I can imagine that just querying a bunch of bigint would not make a
> > big difference but what about several thousand of values? Would sorting
> > them and sending the SQL query with ordered data influence the speed of the query?
>
> Send the query from where to where?

Sending the query from my application to the PostgreSQL server.

> Are you referring to a sort of the items in the IN subselect?

Yes.

> My guess is that sorting that won't matter but it's only a WAG.

What is "WAG"?

> --
> Lew

SteveB
Re: [SQL] Question about PQexecParams
Original message:
> Date: Sun, 12 Sep 2010 01:52:04 +0400
> From: Dmitriy Igrishin
> To: Steve
> CC: [email protected]
> Subject: Re: [SQL] Question about PQexecParams
>
> Hey Steve,
>
> > I tried to use INT8OID and specify the query like below but that did not work:
> > SELECT * FROM lookup_data($1,$2,{$3,$4,$5})
>
> Incorrect.
>
> > Probably I have to set the query to be:
> > SELECT * FROM lookup_data($1,$2,{$3})
>
> Incorrect.
>
> > Or:
> > SELECT * FROM lookup_data($1,$2,$3)
>
> Correct.

Thanks.

> You may specify a data type by OID (1016 for bigint[]; please refer to
> http://www.postgresql.org/docs/9.0/static/catalog-pg-type.html
> to obtain information about types) or attach an explicit cast to a
> parameter symbol to force treating it as bigint[] (or any specified type), e.g.
> SELECT * FROM lookup_data($1, $2, $3::bigint[])

Thanks.

> Please note that in this case you must pass to paramValues[2] a textual
> representation of bigint[], e.g. '{1,2,3}'.
> It's not necessary to specify a length for text-format parameters (it is ignored).
> The length is essential only if you transmit data in binary format.

Really? I must use a textual representation of the array? Why? I searched the Internet up and down and, as far as I can tell, there is a possibility to send the array in binary. I have to add a special header to the array, do the usual host-to-network byte order conversion, and then I can send the array in binary.

Unfortunately I cannot find enough information about the format of the whole header + array. The header looks easy to create (just 3 times 4 bytes for: 1) the number of dimensions (aka ndims), 2) whether the array has null elements (aka hasnull), 3) the array element OID (aka typeid, in my case INT8OID aka 20)), followed by the content of the array. And here I have a problem: I don't know how the data following the header should look. I think each value is in a block of 8 bytes (converted from host to network order), but I am not sure (the examples I have seen are all for int4 and not for bigint). I am confused by the two examples I have found so far. One of them divides those 8 bytes into two 4-byte blocks and adds something they call "dims" and "lbound".

I have no clue what that is. I think the PostgreSQL function array_recv() is responsible for the format, but I cannot find any documentation about the format of a binary array representation. Maybe you know a place where I can read about how to send an array of int64_t to the PostgreSQL backend in binary? I know that I could go the textual-representation path, but I really want to send the data in binary. And I don't want to / can't use libpqtypes (which would, by the way, make the task ultra easy). Maybe I cannot see the forest for the trees, but I really cannot find any documentation on how to create a correct struct representing an array datatype. Can it be that this part is not documented at all?

> > I am somehow lost and don't know how to call the function and pass an
> > array to libpq.
> >
> > Can any one help me with this?
>
> Hope this helps.

Yes. You helped me a bit. But I am still not where I want/need to be.

> Regards,
> Dmitriy

// Steve
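For reference, a quick way to confirm the OIDs mentioned above, and what the explicit-cast form of the call looks like (a sketch based on the thread's example function, not a tested program):

-- int8 (bigint) is oid 20; its array type _int8 (bigint[]) is oid 1016
SELECT oid, typname FROM pg_type WHERE typname IN ('int8', '_int8');

-- Query string for PQexecParams using the cast suggested above; with a text-format
-- parameter, paramValues[2] would simply point at the string "{2033,2499,590}",
-- and paramTypes[2] can be left 0 so the server infers the type from the cast.
SELECT * FROM lookup_data($1, $2, $3::bigint[]);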
[SQL] Question on string value expression wildcarding
I have LIKE expressions:

(a) name LIKE 'kp.dhs.%'
(b) name LIKE 'kp.dhs%'

where the name column contains strings prefixed with "kp.dhs.". I'm using postgresql 7.0.2.

Expression (a) fails to match any names, while (b) matches all strings prefixed with "kp.dhs", including (as expected) those prefixed with "kp.dhs.". So I take it that ".%" has some special meaning in wildcarding, but my (limited) SQL references don't mention this case.

Is this To Be Expected SQL behavior? If so, what expression can be used to match only strings prefixed with "kp.dhs."?

Thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
[SQL] Using a postgres table to maintain unique id?
Hi,

I have a distributed application that needs unique 64-bit "id" values. The current implementation keeps the current value for this id in a file on one system and uses a process to provide access to this file via CORBA.

However, I also use Postgres (7.0.2) throughout this application and it seems cleaner to me to keep the current id value in a table and just use postgres to provide access (with a trigger function to increment the id on access).

Is this reasonable? Is it fast? (I need 10 or more IDs generated each second.) Can I avoid having the table gradually fill with "old" rows for this entry, and thus avoid the need to run VACUUM ANALYZE periodically?

Any tips on how to implement the trigger would be appreciated. (Is it possible to generate an int8 sequence value?)

Thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
Re: [SQL] Using a postgres table to maintain unique id?
Poet/Joshua Drake wrote:
>
> > However, I also use Postgres (7.0.2) throughout this
> > application and it seems cleaner to me to keep the current
> > id value in a table and just use postgres to provide access
> > (with a trigger function to increment the id on access).
>
> Why not a sequence?

Can someone show me how to create (and use) an int8 sequence?

> > Is this reasonable? Is it fast? (I need 10 or more IDs
> > generated each second.) Can I avoid having the table
> > gradually fill with "old" rows for this entry, and thus
> > avoid the need to run VACUUM ANALYZE periodically?
>
> The only problem I have had with this type of thing is when a number gets
> deleted, it does not get recycled.

Fortunately, I don't want any number to ever get recycled - the id needs to be unique throughout the 25+ year lifetime of the project.

The table would have a single row with a single column. Selecting that table cell would return the current value, but leave the value incremented in the table cell (it's ok if it increments the value before returning).

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
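For reference (this was not available on the 7.0.x server in the thread): sequences in modern PostgreSQL are 64-bit, and since version 10 the element type can be declared explicitly, so a bigint sequence is just:

-- the AS clause is PostgreSQL 10+; the sequence name is a placeholder
CREATE SEQUENCE global_id_seq AS bigint;

-- hand out ids
SELECT nextval('global_id_seq');

-- or attach it to a column as its default
CREATE TABLE things (
    id bigint DEFAULT nextval('global_id_seq') PRIMARY KEY,
    payload text
);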
Re: [SQL] Using a postgres table to maintain unique id?
Marten Feldtmann wrote:
>
> Throw away all the "hardwired" stuff and do it with software. I
> once described an algorithm on one of these lists for creating
> unique values for clients with minimal interaction with the
> database.
>
> The result: query once at the beginning of your application,
> generate your ids "offline" at the maximum speed you may
> have, and store your last generated id when your client
> finishes. Superior to all the "hardwired" database solutions!

Yes, but...

(1) The application I have is composed of about 50 processes running on 3 different OS/architectures (Linux/intel, Solaris/sparc, and VxWorks/ppc). The IDs I need must be unique across all processes (I suppose one solution would be to provide each ID with a unique prefix based on the process that is running, but...)

(2) Some of these systems are real-time boxes that might get rebooted at any moment, or might hang for hardware-related reasons [I'd like to be able to say that all of the processes could detect imminent failure, but unfortunately, I can't]. So determining when a client "finishes" is not always possible, which prevents (he claims) the above solution from claiming ID uniqueness.

However, it might be sufficient to provide a process on the postgres DB machine (if *that* machine dies, *everything* stops...) that serves IDs via CORBA to all the other applications and (internally) uses the "software" approach given above. This process could "sync" with the database every N seconds or so (where N might be < 1.0). This, while still not guaranteeing uniqueness, would at least come pretty close...

It would still be nice to avoid having to VACUUM ANALYZE this table, though, and it "feels" as though it is duplicating functionality already provided by postgres DB backends.

I'll think about this solution - thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
Re: [SQL] Using a postgres table to maintain unique id?
Thomas Swan wrote:
>
> At 11/13/2000 06:22 PM -0800, Michael Teter wrote:
> >
> > From what I can tell (both from docs and doing a
> > describe on sequences in my database), a postgresql
> > sequence is an int4, not an int8, and thus you are
> > limited to a max of 2.1 billion values.
> >
> > If you require an int8 sequence, you'll probably have
> > to manage your own and just use an int8 column.
>
> I had originally started using int8 and creating custom sequences. However,
> as mentioned in a previous post, there is an inherent performance penalty in
> using int8 over int4. Tom Lane advised me that the int8 routines are an
> emulated or synthesized data type. In the tests I did on our 7.0.2 server I
> noticed about a 25-30% decrease in performance when using complex joins on
> tables containing referential keys and primary keys all in the int8 data type.
>
> This might be something to think about as well.

Thanks. Because of these and other comments people have made, I've gone back to using a flat_file-with-server approach instead of adding a table to my postgres DB. While an int4 *might* work, it doesn't handle the "worst-case" scenario (which is up around 15 billion values).

Thanks to everyone for your comments and suggestions!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
[SQL] Selecting Most Recent Row
Ok, here is the problem.

Table: Widgets
Fields: Unique_Key, DateStamp, Batch_Number, Instructions

Basic Select Statement:
select Unique_Key from Widgets where Batch_Number >= 'inputedvalue'

Problem: Some Batch_Numbers might be duplicated over a period of time. I want to select the most recent of these Batch Numbers.

--
Steve Meynell
Candata Systems
Re: [SQL] Selecting Most Recent Row
Joel,

Thank you very much. I gave that a try and it worked perfectly. It definitely was the distinct keyword I was missing.

Thanks Again,
Steve

Joel Burton wrote:
>
> Will DateStamp be the date of insertion? If so, is it that you want
> the record for the most recent (largest) datestamp for each
> batch_number?
>
> something like
>
> SELECT DISTINCT ON (batch_number) unique_key, datestamp,
> batch_number, instructions FROM widgets ORDER BY batch_number,
> datestamp desc;
>
> (sort by batch then by date (last first) and show the first (aka
> 'distinct') row, considering only the batch_number for distinctness)
[SQL] Help Retrieving Latest Record
Hi,

I am not sure if this went through the first time, so here it is again.

Ok, what I am trying to do is select out of the database the latest record meeting a certain criteria.

Example:

 Number | Fruit   | Date
--------+---------+----------------
 15     | Apples  | July 20, 1999
 20     | Oranges | June 7, 2000
 13     | Pears   | Jan 31, 2000
 17     | Apples  | April 10, 1999

Now what I need to do is select the oranges out because the date is the latest one, something like:

select * from basket where max(date);

This would yield me: 20  Oranges  June 7, 2000

I know this doesn't work, but I need something like it. Or something like:

select * from basket where max(date) and fruit='Apples';

This would yield me: 15  Apples  July 20, 1999

Thank you in advance,

--
Steve Meynell
Candata Systems
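For the record, some ways the "latest record" query can actually be written (a sketch against the basket example above; DISTINCT ON and ORDER BY ... LIMIT are PostgreSQL-isms, the max() subquery is the portable form):

-- newest row overall
SELECT * FROM basket ORDER BY date DESC LIMIT 1;

-- newest row for a particular fruit
SELECT * FROM basket
 WHERE fruit = 'Apples'
   AND date = (SELECT max(date) FROM basket WHERE fruit = 'Apples');

-- newest row per fruit in one query
SELECT DISTINCT ON (fruit) * FROM basket ORDER BY fruit, date DESC;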
[SQL] Heres a good one...
Ok, here is what looks like a good one that has stumped me. Let me set it up...

I have two tables in my database test. They are called journal and distrib.

journal looks like this (condensed):

         Table "journal"
   Attribute    |  Type   | Modifier
----------------+---------+----------
 objectid       | integer |
 posting_date   | char(4) |
 gl_update_flag | char(1) |

And distrib looks like this (condensed):

          Table "distrib"
    Attribute     |  Type   | Modifier
------------------+---------+----------
 objectid         | integer |
 distrib_objectid | integer |
 source_objectid  | integer |

The dataset for each is as follows.

journal:

test=# select * from journal;
 objectid | posting_date | gl_update_flag
----------+--------------+----------------
      100 | March 31     | H
      101 | March 31     | H
      102 | April 02     | Y
      103 | April 02     | H
      104 | March 14     | H
      105 | February 01  | H
      106 | February 01  | H
(7 rows)

distrib:

test=# select * from distrib;
 objectid | distrib_objectid | source_objectid
----------+------------------+-----------------
        1 |              103 |             100
        2 |              104 |             100
        3 |              101 |             102
        4 |              101 |             105
(4 rows)

Now the trick here is... I want to update the gl_update_flag in journal to Y for all of the records where the gl_update_flag is H now and the posting date is before or on March 31, and where the objectid from journal matches either the distrib_objectid or the source_objectid from the distrib table we also need to update the opposite journal entry to Y. And can this be done in one command?

An example from the above data set: journal objectid 100 would change, and so would 103 and 104. Journal objectid 101 would update 101 and 102 and 105. And journal objectid 106 would only update 106.

Any Ideas?

Thanks in Advance,
Steve
[SQL] Re: Heres a good one...
First, thank you very much for trying, but you were a bit off the mark - though close. Second, posting_date being of type char(4) is quite legal and irrelevant for this exercise, but for argument's sake let's say char(15). Apart from all that...

Your first SQL statement was close, but it only updated 4 out of the possible 6 that it should have updated. journal.objectid 103 should have been updated because journal.objectid 100 was, and journal.objectid 106 should have been because it alone met the "before March 31 and update flag = H" criteria. But it is a start for me, thanks.

And your second SQL statement just didn't work. It said 'ERROR: parser: parse error at or near "select"'.

Steve

Anuradha Ratnaweera wrote:

> First, posting_date in journal can _NOT_ be of type char(4)! I guess it is a "date".
>
> Try
>
> update journal set gl_update_flag='Y' from distrib where
> journal.gl_update_flag = 'H' and journal.posting_date <= '2001-03-31' and
> (journal.objectid = distrib.distrib_objectid or journal.objectid =
> distrib.source_objectid)
>
> or
>
> update journal set gl_update_flag='Y' where gl_update_flag = 'H' and
> posting_date <= '2001-03-31' and ((objectid in select distrib_objectid
> from distrib) or (objectid in select source_objectid from distrib))
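A possible fix-up (a sketch, not from the thread): the parse error in the second suggestion is just the missing parentheses around the IN subqueries, and the missed rows come from neither statement propagating the flag to the "opposite" journal entry. Assuming posting_date were a real, comparable date (it is not with the char() column above), a two-statement version could look like this:

BEGIN;

-- 1. flag the "opposite" entries of rows that qualify on date + flag
UPDATE journal
   SET gl_update_flag = 'Y'
  FROM distrib d, journal src
 WHERE src.gl_update_flag = 'H'
   AND src.posting_date <= '2001-03-31'
   AND ((src.objectid = d.distrib_objectid AND journal.objectid = d.source_objectid)
     OR (src.objectid = d.source_objectid AND journal.objectid = d.distrib_objectid));

-- 2. flag the qualifying rows themselves
UPDATE journal
   SET gl_update_flag = 'Y'
 WHERE gl_update_flag = 'H'
   AND posting_date <= '2001-03-31';

COMMIT;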
[SQL] Doing a regexp-based search/replace?
Hello:

I've got a table containing property_id's with values of the form ###-. I would like to discard the '-' onwards (and I can't use a substr() because I am not guaranteed a) that the '-' portion exists, or b) at what position it exists).

If this were a text file, I would use a sed expression such as:

cat textfile | sed 's/-.*$//'

I've been looking for a way to do this with PostgreSQL but so far haven't found a function that seems suitable. I thought maybe I could do it with translate, but translate doesn't appear to work with regular expressions. So far I've tried things like:

select translate(property_id, '-.*', '') from mytable;

I need to do this because the -.* portion of my property_id was entered in error, and I would like to do an update on the entire table and just have the left-hand side of the property_id column remaining.

Any ideas? Thank you in advance.

- ---< LINUX: The choice of a GNU generation. >-
Steve Frampton <[EMAIL PROTECTED]>  http://www.LinuxNinja.com
GNU Privacy Guard ID: D055EBC5 (see http://www.gnupg.org for details)
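For what it's worth, two SQL-level equivalents of that sed expression (a sketch; regexp_replace only exists on releases much newer than the 7.x-era server in this thread, while the position()/substr() form should work anywhere):

-- modern PostgreSQL (8.1+): the direct equivalent of sed 's/-.*$//'
UPDATE mytable SET property_id = regexp_replace(property_id, '-.*$', '');

-- without regexp_replace: trim at the first '-', leaving rows with no '-' untouched
UPDATE mytable
   SET property_id = substr(property_id, 1, position('-' in property_id) - 1)
 WHERE position('-' in property_id) > 0;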
Re: [SQL] Granting database level permissions...
Have a look at pg_hba.conf in your data dir. It's all in there.

Steve

"Thomas Swan" <[EMAIL PROTECTED]> wrote:
> Is it possible to grant database level access to a user in PostgreSQL?
>
> I have created a user and database, and I want the user to have full
> control over that database. I have been through the online docs and
> news groups but have not found a statement or answer to the question.
>
> Thomas
[SQL] list of tables ?
Can anyone point me in the right direction?

I need to list all the tables in a database.

I've looked at pgadmin_tables, which is empty, and pga_schema, which contains a single row I don't want to parse...

Is there an easier way to get a list of tables?

I'm on 7.2.

Ta,
Steve Brett
Re: [SQL] list of tables ? -update to question ...
Sorry... I didn't make myself clear... I have of course come across \dt before... what I meant was via SQL, as in 'select tablelist from '

Steve

> -----Original Message-----
> From: Stephane Schildknecht
> Sent: 11 July 2002 15:06
> To: Steve Brett
> Subject: Re: [SQL] list of tables ?
>
> On Thu 11/07/2002 at 16:00, Steve Brett wrote:
> > can anyone point me in the right direction ?
> >
> > i need to list all the tables in a database.
> >
> > is there an easier way to get a list of tables ?
> >
> > i'm on 7.2
>
> \d
Re: [SQL] list of tables ?
Thanks.

Steve

> -----Original Message-----
> From: Achilleus Mantzios
> Sent: 11 July 2002 15:10
> To: Steve Brett
> Cc: Pgsql-Sql (E-mail)
> Subject: Re: [SQL] list of tables ?
>
> On Thu, 11 Jul 2002, Steve Brett wrote:
>
> > i need to list all the tables in a database.
> > is there an easier way to get a list of tables ?
> > i'm on 7.2
>
> select * from pg_tables;
>
> --
> Achilleus Mantzios
> S/W Engineer
> IT dept
> Dynacom Tankers Mngmt
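For completeness, a couple of plain-SQL variants (a sketch; information_schema arrived in 7.4, just after the 7.2 release in this thread):

-- hide the system tables
SELECT tablename FROM pg_tables WHERE tablename NOT LIKE 'pg_%';

-- SQL-standard catalog view, 7.4 and later
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';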
Re: [SQL] how do i import my sql query result to a file
\? will get you a list of the commands in psql.

Steve

> -----Original Message-----
> From: Joseph Syjuco
> Sent: 18 July 2002 22:47
> Subject: [SQL] how do i import my sql query result to a file
>
> how do i import results of my select query to a file
> thanks
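To spell out the usual answers that \? will point at (a sketch; the file paths are placeholders):

-- inside psql: redirect query output to a file, then back to stdout
\o /tmp/result.txt
SELECT * FROM mytable;
\o

-- server-side: write a whole table to a file on the database server
COPY mytable TO '/tmp/mytable.txt';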
Re: [SQL] Ran out of connections
You probably didn't need to reboot - I suspect you could have probably restarted PostgreSQL and Apache (quick version) or killed the extra postgres processes. I suspect you need to look carefully at your code and method of connecting (ie. are you using mod-perl, plain old cgi perl, PHP or what). A problem with Apache 1.x is that of connection pooling/persistence. If you don't use persistent connections you suffer the overhead of opening and closing the database connection as required by your web process. You can use mod-perl and keep your connections open but if your pool of web processes that have open connections exceeds the number of connections allowed by PostgreSQL you will have problems (pooling of database connections between processes is problematic so each Apache process has its own connection). Be sure that there isn't a bug causing a cgi to abort leaving a stranded connection. I don't have experience with local unix socket connections where the client has died but when I have windoze users reboot when they had an open connection it will wait till the TCP/IP connection times out (frequently for an hour) before the PostgreSQL connection is closed. If the timeout is similar for local unix sockets then a failure in the cgi could leave open connections and you will run out quickly. If you are doing lots of database backed work you may want to check out AOLserver (http://www.aolserver.com/). It has a multi-threaded architecture featuring connection pooling and persistence "out of the box." Oh, it's free/open-source as well. Of course you can also get pooling/persistence with enterprise Java solutions such as JBoss (www.jboss.org). Cheers, Steve On Wednesday 04 December 2002 2:08 pm, Mike Diehl wrote: > Hi all. > > Twice this week, I've come to work to find my Postgres server out of > connections... effectively freezing my web server. > > Today, before I rebooted the entire system, I did a ps -auxw and kept the > file to study. I didn't find too many clients running. But I did find a > whole LOT of postgres processes running, idle. BTW, one of the postgres > processes was doing a vacuum analyze. I'm running 7.2. > > Can anyone tell me how to fix this? The out put of the ps command can be > seen at http://dominion.dyndns.org/~mdiehl/ps.txt > > Thanx in advance, ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] Ran out of connections
Doing anything unusual? Forking processes, opening multiple connections within a single CGI? Have you seen any evidence that a process that opens a connection is failing to complete normally? -Steve On Wednesday 04 December 2002 3:52 pm, Mike Diehl wrote: > On Wednesday 04 December 2002 03:25 pm, Roberto Mello wrote: > > On Wed, Dec 04, 2002 at 03:08:35PM -0700, Mike Diehl wrote: > > > Can anyone tell me how to fix this? The out put of the ps command > > > can be seen at http://dominion.dyndns.org/~mdiehl/ps.txt > > > > Are you using PHP by chance? I've seen this behavior under > > Apache+PHP before. My "fix" (workaround rather) was to disable > > persistent connections. > > Nope. I'm using Perl and cgi. I've got some perl that runs via cron, and > some more that runs via apache. I'm not even using ModPerl. > > It did occur to me that since some of my scripts communicate with other > devices, that I may have some IO blocking, or zombies, but the ps output > didn't indicate that. I can't see that many scripts running. Usually, I > see one postgres process for each script/cgi that is running. Not in this > case. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Need help paging through record sets
Sort of depends on the nature of your application. You can use offset to get specific chunks: select * from foo order by date limit 100 offset 100; You should be aware, however, that on a very large table this can be quite inefficient as you will have to do the select and sort on the large table just to get the next chunk. (With only 600 tuples you will probably have everything in memory anyway so it's not really a problem.) Also, if the data is actively updated you could get a tuple added or deleted between page views which would mess up the offsets and cause someone to miss an item or get a duplicate. If you want to page through a small subset of a large file you can use cursors or temporary tables but you will have to be sure your connection persistence, session management and such can accomodate such an arrangement. Cheers, Steve On Friday 20 December 2002 12:53 pm, [EMAIL PROTECTED] wrote: > Hello everyone, and thanks for reading my first newbie post. :-) > > I am a neopyhte PHP and postgreSQL user, with a website at > www.the-athenaeum.org. We store (among other things) artworks, which > people can view in a list, sorted by artist name, date, medium, etc. > > We now have enough works that I need to rewrite the PHP listings script > (and its embedded SQL) so that users can page through records. As an > example, if a user is looking at works by date ascending, they may want to > see 100 records at a time. Since we have 600+ records, there would be 7 > pages. They'd start on the first page (of course!) and there would be > links to pages 2 through 7 as well, just like with results pages of a > Google search. They could, from page 1, click any of the other pages to go > immdiately to that set of 100 records for display. > > I see this kind of thing all over the place, and in looking it up, I see > most solutions use "SELECT TOP x", which postgreSQL doesn't seem to have. > I know how to use LIMIT, but that always starts from the top. I could add > a piece to the WHERE clause, say something like "WHERE date > 01-02-1853", > but how do I know where the cutoffs are several pages along, without > retrieving the whole record set? > > I suppose the optimal solution for me would be to sort all of the records, > then be able to select a range from that sorted record set. So, if they > click the link to page 3, I'd like to do this (in pseudocode): > > 1. SORT records by the date field, descending > 2. Retrieve only records 200-299 from the sorted list > > Is there a way to do that? How is it done elsewhere? > > Thanks in advance for your help, > Chris McCormick, webmaster > The Athenaeum - Interactive Humanities Online > www.the-athenaeum.org > > > > > ---(end of broadcast)--- > TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] empty arrays
Caution!
In 7.2.x your statement is interpreted by the parser to be a single element
with an empty string which is converted to a zero. If you do this instead:
create table test_table ( test_column integer[], another_column integer );
CREATE
steve=# insert into test_table (another_column) values (1);
INSERT 7140977 1
steve=# select * from test_table;
 test_column | another_column
-------------+----------------
             |              1
(1 row)
You will see that there is no item in the array which you can confirm with:
select array_dims(test_column) from test_table;
 array_dims
------------

(1 row)
(Which seemingly oddly shows nothing instead of 0 since there is not yet
really any array to have dimensions. Per the docs, an array can be null but
it cannot contain SQL null elements. What the docs don't say is that
requesting an element beyond either end of an array does return a SQL null.)
Here's the gotcha: when you upgrade to 7.3.x your statement will generate an
error since an empty string is no longer arbitrarily assumed to be a zero
(either in integer fields or arrays) so if you attempt the same you will get:
create table test_table ( test_column integer[], another_column integer );
CREATE TABLE
tati=> insert into test_table (test_column) values ( '{ }' );
ERROR: pg_atoi: zero-length string
If you want to insert a null array you are best off saying so:
insert into test_table (test_column) values (null);
INSERT 17053 1
tati=> select * from test_table;
 test_column | another_column
-------------+----------------
             |
(1 row)
Cheers,
Steve
On Monday 30 December 2002 9:40 am, [EMAIL PROTECTED] wrote:
> using: psql (PostgreSQL) 7.2.1
>
> why does an empty array return an array of length 1 rather than array of
> length 0? one would think that the results below would have returned { }
> instead of {0}.
>
> simple test using psql:
>
> # create table test_table ( test_column integer[] );
> CREATE
> # insert into test_table (test_column) values ( '{ }' );
> INSERT 43475 1
> # select * from test_table;
>
>  test_column
> -------------
>  {0}
> (1 row)
>
>
> i want to be able to store a zero-length array field in the database. how
> do i specify this with sql?
>
> Regards,
>
> Floyd Shackelford
> 4 Peaks Technology Group, Inc.
> VOICE: 334.735.9428
> FAX: 916.404.7125
> EMAIL: [EMAIL PROTECTED]
> ICQ #: 161371538
>
> Shackelford Motto: ACTA NON VERBA - Actions, not words
>
> Alabama StateMotto: AUDEMUS JURA NOSTRA DEFENDERE - We Dare Defend Our
> Rights
Re: [SQL] index on to_char(created, 'YYYY') doesn't work
I don't have "real-life" experience with partial indexes but... You probably won't see an increase in speed unless the index use can get you down to a really small fraction of your total row count (I don't know just how small but remember being surprised at the number but realizing, on reflection, that it made sense. It was something like 10% or less). One thing you could try is to use a partial index (one containing only the rows in which you are interested). Here's a really dumb example: create index foo on session (username) where username is null and to_char(created, '') = '2002'; Why dumb? Because the index will only contain nulls. You could probably choose a more intelligent index based on the other queries you do. Still, this index could increase your query speed considerably (as long as your where in creating the index matches the where in your query - if you change your query to 2003 you will be back to a sequential scan). BTW, I tried to create an index on the to_char function and had no luck - seems like it should work but it doesn't on 7.2.3 or 7.3.1. Cheers, Steve On Wednesday 15 January 2003 4:08 am, Andreas Joseph Krogh wrote: > On Wednesday 15 January 2003 16:12, you wrote: > > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote: > > > -BEGIN PGP SIGNED MESSAGE- > > > Hash: SHA1 > > > > > > On Wednesday 15 January 2003 11:37, you wrote: > > > > The following does not work: > > > > > > > > create index session_u_idx on session (to_char(created, '')); > > > > ERROR: parser: parse error at or near "''" at character 57 > > > > > > > > Can I make a function to do this and index using the result of that > > > > funtion? Do anyone have an example of such a function? > > > > > > I tried the following function: > > > - - > > > create function drus (timestamp) returns varchar AS' > > > DECLARE > > > str_created VARCHAR; > > > created ALIAS FOR $1; > > > BEGIN > > > str_created:= to_char(created, ''''); > > > RETURN str_created; > > > END; > > > ' LANGUAGE 'plpgsql'; > > > > add > > WITH (iscachable) > > Thank you, not _that_ works:-) > But now this doesn't work: > create index session_u_idx on session (drus(created), username); > > Can't I have multicolumn-indexes with functions? Any idea how to rewrite > that so it works? > Here is my session table: > CREATE TABLE session ( > session_id varchar(256) NOT NULL PRIMARY KEY, > created timestamp DEFAULT 'now' NOT NULL, > last_accessed timestamp NOT NULL, > destroyed timestamp NOT NULL, > username varchar -- Allow sessions from not logged in users > ); > > Here is my query I wish to optimize using indexes: > SELECT to_char(created, 'IW') as week, count(session_id) from session WHERE > username IS NULL and to_char(created, '') = '2002' group by week ORDER > BY week; > > Any hints on optimizing this query, index-usage etc? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] SQL to list databases?
psql -E causes psql to show its "behind the scenes" queries, so try:

psql -lE

(that's a lowercase ell before the E)

Cheers,
Steve

On Thursday 23 January 2003 10:56 am, Ben Siders wrote:
> Is there a query that will return all the databases available, similar
> to what psql -l does?
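The query behind psql -l boils down to something like this, which can be run from any client:

SELECT datname FROM pg_database ORDER BY datname;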
Re: [SQL] bytea
That would be convenient but there are some difficulties. Say you have a
function that worked something like:
insert into a_table values (..., filetobytea('/home/me/myfile'), ...);
It would be a nice ability to have but the server may not be on the same
machine (or even the same type/os/filesystem) as the client so sending a
pathname to the server is of little use. Such an ability must be handled by
the client which can read/translate the local file into the appropriate byte
sequence to send to the server.
I'm toying with a client program in C program that would include the ability
to take a file on the client machine and store it in a bytea field but
haven't come up with a good way to handle the necessary buffer space (ie. I
don't see any way to "stream" data to the server - it appears I need to
allocate sufficient buffer space for the entire file plus the entire escaped
version of the file before I can generate the query which causes me some
other problems.) Any suggestions would be appreciated.
Cheers,
Steve
On Wednesday 05 February 2003 2:25 am, Adrian Chong wrote:
> Hi Christoph,
>
> Thanks for your reply. But what I want to do is loading a file of a
> particular path with a sql statement in psql. Why I need to care about how
> the file looks like? Thanks.
>
> Adrian
>
> - Original Message -
> From: "Christoph Haller" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Wednesday, February 05, 2003 6:15 PM
> Subject: Re: [SQL] bytea
>
> > > I have a table containing a field of type bytea:
> > >
> > > CREATE TABLE a_table (
> > > a_field bytea
> > > );
> > >
> > > How can I import a file in a SQL script? What function I can use?
> >
> > The documentation says as in PostgreSQL 7.2.1 (I doubt this changed
> > significantly since)
> >
> > Octets of certain values must be escaped (but all octet values may be
> > escaped) when used as part of a string literal in an SQL
> > statement. In general, to escape an octet, it is converted into the
> > three-digit octal number equivalent of its decimal octet value, and
> > preceded by two backslashes.
> >
> > In general it goes like this
> > INSERT INTO a_table ( a_field ) VALUES ( '\\000\\001\\002\\003' ) ;
> > to load the first four ASCII characters.
> > You did not mention how your file looks like.
> > There is also a C function available called PQescapeBytea
> > which does all the required escaping to store memory areas in bytea
> > columns.
> > Refer to Command Execution Functions within libpq - C Library for
> > details.
> >
> > Regards, Christoph
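For the archives, one SQL-level way to get binary data into the bytea column above without hand-escaping every octet is decode() with hex input (a sketch against the a_table example quoted above):

-- the bytes 0x00 0x01 0x02 0x03, equivalent to the escaped INSERT shown in the quote
INSERT INTO a_table (a_field) VALUES (decode('00010203', 'hex'));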
[SQL] Replacing a simple nested query?
I've got a simple nested query: select * from attributes where id in (select id from attributes where (name='obsid') and (value='oid00066')); that performs abysmally. I've heard this described as the 'classic WHERE IN' problem. Is there a better way to obtain the same results? The inner select identifies a set of ids (2049 of them, to be exact) that are then used to locate records that have the same id (about 30-40K of those, including the aforementioned 2049). Thanks! -Steve -- Steve Wampler -- [EMAIL PROTECTED] Quantum materiae materietur marmota monax si marmota monax materiam possit materiari? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Replacing a simple nested query?
On Sun, 2003-07-13 at 14:50, Steve Wampler wrote: > I've got a simple nested query: > > select * from attributes where id in (select id from > attributes where (name='obsid') and (value='oid00066')); > > that performs abysmally. I've heard this described as the > 'classic WHERE IN' problem. > > Is there a better way to obtain the same results? The inner > select identifies a set of ids (2049 of them, to be exact) > that are then used to locate records that have the same id > (about 30-40K of those, including the aforementioned 2049). For the record, Joe Conway and Hannu Krosing both provided the same solution: select at.* from attributes_table at, attributes a where at.id = a.id and a.name='obsid' and a.value='oid00066'; which is several orders of infinity faster than than my naive approach above: - lab.devel.configdb=# explain analyze select * from attributes_table where id in (select id from attributes where (name='obsid') and (value='oid00066')) order by id; NOTICE: QUERY PLAN: Index Scan using id_index on attributes_table (cost=0.00..8773703316.10 rows=241201 width=59) (actual time=136297.91..3418016.04 rows=32799 loops=1) SubPlan -> Materialize (cost=18187.48..18187.48 rows=15 width=25) (actual time=0.01..1.68 rows=1979 loops=482402) -> Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) (actual time=0.27..251.95 rows=2049 loops=1) Total runtime: 3418035.38 msec -- lab.devel.configdb=# explain analyze select at.* from attributes_table at, attributes a where at.id = a.id and a.name='obsid' and a.value='oid00066'; NOTICE: QUERY PLAN: Nested Loop (cost=0.00..18739.44 rows=217 width=84) (actual time=0.76..1220.65 rows=32799 loops=1) -> Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) (actual time=0.47..507.31 rows=2049 loops=1) -> Index Scan using id_index on attributes_table at (cost=0.00..35.80 rows=12 width=59) (actual time=0.11..0.31 rows=16 loops=2049) Total runtime: 1235.42 msec --- My thanks to both Joe and Hannu! Steve -- Steve Wampler -- [EMAIL PROTECTED] Quantum materiae materietur marmota monax si marmota monax materiam possit materiari? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Table Partitioning and Rules
OK, so basically you are trying to keep a hundered some odd attributes on everyone in the US. It's possible that a 150 column table is properly normalized (I have a similar situation) but it is rare. Suppose it is really properly normalized. You can still benefit from indexes on just some of the columns by choosing those most commonly used in queries. You may also want to research partial indexes (create index foo ... where bar=baz) which can under certain circumstances be far smaller and faster than full indexes. Breaking the table up won't improve the full table scan - in fact as each sub-table's file grows it will probably fragment on the disk much worse than a single growing file would which will, along with all the overhead of joining all the tables, make things worse. Review your structure carefully. Plan on $$$ for the hardware. Cheers, Steve On Thursday 17 July 2003 12:22 pm, Girish Bajaj wrote: > The problem is that Im worried about sequential scans. This particular > table can have upto 150 cols and 250 million records. Now we have a > reporting requirement that someone could select on ANY col and filter on > any col as well. Meaning someone could so a SELECT on col number > 1,2,310,1000 from contact where col num 75='X' and col num 139 = 'Y'. > > I cant possibly index all the cols in the table. So I thought Id best > manage the data by splitting up the table into multiple partitions and > eventually depending on application logic, only scan those tables that are > necessary to scan sequentially instead of the whole big table. > > Im getting a little confused here cause eventually I would want to join in > this 250 million gigantic table as well.. and that would be a real big > problem causing loads of sequential scans wouldn't it? > > Thanks, > Girish > > > -Original Message- > From: Josh Berkus [mailto:[EMAIL PROTECTED] > Sent: Thursday, July 17, 2003 2:03 PM > To: Richard Huxton; Girish Bajaj; [EMAIL PROTECTED] > Subject: Re: [SQL] Table Partitioning and Rules > > Girish, > > > > Essentially Im trying to store a persons information in a table in the > > > database. Since we could have millions of people, with duplicates! Ive > > > decided we need to partition the table into segments where all people > > with > > > > the LastName starting from A to G will be in one table. H-N will be in > > > another table and O-Z in the third. Ive created a VIEW that does a > > > UNION > > on > > > > all the tables. > > This sounds hideously inefficient and a management headache besides. I > think > PostgreSQL will accept up to 2 billion rows in any one table, and splitting > stuff into 3 tables will not improve your performance ... quite the > opposite. > > Change your database design. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Calc
On Wednesday 15 October 2003 12:59 pm, Muhyiddin A.M Hayat wrote:
> Dear all,
>
> I have the below table
>
>  id | db  | cr
> ----+-----+-----
>   1 |     | 200
>   2 | 100 |
>   3 | 100 |
>   4 | 150 |
>   5 |     | 200
>
> I would like to calc the balance, so it looks like
>
>  id | db  | cr  | bal
> ----+-----+-----+------
>   1 |     | 200 |  200
>   2 | 100 |     |  100
>   3 | 100 |     |    0
>   4 | 150 |     | -150
>   5 |     | 200 |   50
>
> What can I do to get a result like that

Something like this:

select id, db, cr, (select sum(cr-db) from calc sub where sub.id <= calc.id) from calc;

This of course assumes that id indicates the correct order of the entries, and it will blow up if you allow nulls for cr or db (which you shouldn't, since that would literally be interpreted as "they withdrew 10 and deposited an unknown amount"). If you have null values already and they should be interpreted as 0, just do this:

select id, db, cr, (select sum(coalesce(cr,0)-coalesce(db,0)) from calc sub where sub.id <= calc.id) from calc;

I assume no responsibility for potential lack of scalability of this query. :) It's quite possible a faster solution exists - we'll see what comes up on the list.

Cheers,
Steve
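On later releases (8.4 and up) the same running balance can be written with a window function instead of the correlated subquery - a sketch against the same table:

SELECT id, db, cr,
       sum(coalesce(cr,0) - coalesce(db,0)) OVER (ORDER BY id) AS bal
  FROM calc
 ORDER BY id;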
Re: [SQL] Calc
On Thursday 16 October 2003 10:37 am, Muhyiddin A.M Hayat wrote:
> > Something like this:
> >
> > select id,db,cr,(select sum(cr-db) from calc sub where sub.id <=
> > calc.id) from calc;
> >
> > This of course assumes that ID indicates the correct order of the
> > entries and it will blow up if you allow nulls for cr or db ...
>
> If the data comes from a "View" without an ID, what can I do?
>
> My View:
>
>  trx_date   | trx_time | descriptions                         | payment_method | debet | credit | creator
> ------------+----------+---------------------------------------+----------------+-------+--------+---------
>  2003-10-09 | 21:55:02 | Resto Biling : 13,800, Paid : 10,000 | Visa           |  3800 |        | middink
>  2003-10-16 | 03:28:30 | Payment - Thank You                  | Visa           |       |   4.00 | middink
>  2003-10-08 | 18:17:40 | Payment - Thank You                  | Cash           |       |   5.00 |

The id field only keeps the transactions in the correct order so you can sum the previous transactions. You can do the same thing with your date and time fields (though using a single datetime field would be easier). A caution, though: you will have trouble if two transactions share the same date and time.

Cheers,
Steve
[SQL] Copying rows between tables?
I know I can select from one table into a new table, but is there a way to select from one table into an existing table (aside from copying everything to a file, editing the file and then copying from that file)? Without destroying existing entries, of course...

I have an application where old records are archived into 'archive' tables. Occasionally there is a need to copy some of these old records into the 'active' table.

Thanks for any pointers!
Steve

--
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.
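The usual answer is INSERT ... SELECT; a sketch with made-up table and column names:

-- copy selected archived rows back into the active table
INSERT INTO active_table
SELECT * FROM archive_table WHERE archived_on >= '2003-01-01';

-- or list the columns explicitly if the two tables don't line up exactly
INSERT INTO active_table (id, name, value)
SELECT id, name, value FROM archive_table WHERE id IN (1, 2, 3);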
Re: [SQL] User defined types -- Social Security number...
On Monday 01 March 2004 8:54 am, Dana Hudes wrote: > I would represent an SSN as numeric(9,0). > an int 32 would work though. > 2**31 is > 9 > > On Sun, 29 Feb 2004, Christopher Browne wrote: > > The world rejoiced as [EMAIL PROTECTED] (Michael Chaney) wrote: > > > Look, you're thinking way too hard on this. An SSN is a > > > 9-digit number, nothing more. There are some 9-digit numbers > > > which aren't valid SSN's, and you might want to get fancy and > > > create a constraint for that. > > > > > > Regardless, you are making a *major* mistake of confusing data > > > storage with rendering. It is common to *render* an SSN as > > > xxx-xx- and its cousin the FETID (Federal Employers Tax ID) > > > as xx-xxx. To store the dashes makes no sense. They're in > > > the same place each time, it's wasted data. I missed the start of this thread but will chime in with a comment anyway. My rule is to select an appropriate numeric type of data if you will be doing numeric types of things to it, character types if you will be doing character manipulations, etc. I don't know of any good reasons to need to know SSN/6.9, sqrt(SSN), SSN+7.86 but there are plenty of good reasons to need the first three characters (the "area number"), the middle two characters (the "group number", and the last 4 characters (the "serial number", often (ab)used as a password for banking and other purposes). While the dashes certainly don't need to be stored, they are not in arbitrary positions as they delimit the parts of the SSN noted above. One might even want to store the SSN in three columns, the first linked to a table of valid area numbers, the second error checked so "00" is not valid and so on or get even more fancy and error check against: http://www.ssa.gov/employer/highgroup.txt. It all depends on one's specific requirements. Google and you will find SSN info pages such as: http://proagency.tripod.com/usasssearch.html http://www.ssa.gov/foia/stateweb.html Cheers, Steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] "=" operator vs. "IS"
> I'm just curious - why is it not possible to use the "=" operator > to compare values with NULL? I suspect that the SQL standard > specified it that way, but I can't see any ambiguity in an > expression like "AND foo.bar = NULL". Is it because NULL does not > "equal" any value, and the expression should be read as "foo.bar is > unknown"? Or is there something else I'm missing? You've got it. NULL is the _absence_ of a known value so any comparison or operation on it yields an unknown result. So why can't you use = NULL? Consider a list of names and ages where Jack's and Jill's ages are null. Now we run a query to list people who are of the same age. Should Jack and Jill be listed as being the same age? Of course not. You can't compare whether the two unknown values are equal any more than you could determine whether or not they are over 18. The SQL spec and PostgreSQL properly use and enforce this interpretation of NULL. The correct way to ask your question is ...where foo.bar is null... Cheers, Steve ---(end of broadcast)--- TIP 8: explain analyze is your friend
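A quick illustration of the three-valued logic at work (the first two yield NULL -- unknown -- rather than true or false):
select null = null;   -- null (unknown)
select 25 = null;     -- null (unknown)
select null is null;  -- true
select 25 is null;    -- false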
Re: [SQL] locks and triggers. give me an advice please
> often, I am turning triggers off and on to perform a mass operation > on a table, and i am interested how should i care of another user > operations. > > the scene is: > table t1 with user defined triggers > and many tables reference t1, (so FK triggers defined on t1) > > the operation i want to perform on t1 makes a great load to a > server and have no use in triggers at all. > the best way to perform this operation is to delete all records, > modify, and insert them back without changing any adjuscent table. > (this way takes a few seconds.) > so i turn off triggers on t1 completely (updating > pg_class.reltriggers) operate > and turn on triggers on t1. > > it works fine. > > the question is: > > what should i do to prevent other users of data modification on the > t1 and the adjuscent tables while triggers is off ? If I understand your question correctly you should use a transaction and lock the table; begin transaction; lock t1 in access exclusive mode; Turn off triggers and do your updates. (Note, "truncate t1" is faster than "delete from t1" followed by a "vacuum full" and you might consider running "reindex table t1" after your mass update or if appropriate drop your indexes, load the data, then recreate them.) Re-establish triggers. commit; --end of transaction unlocks the table Cheers, Steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
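Spelled out end to end it might look roughly like this (untested sketch; updating pg_class directly requires superuser rights and all the usual caveats about touching system catalogs apply):
begin;
lock table t1 in access exclusive mode;

-- turn off all triggers (user-defined and FK) on t1
update pg_class set reltriggers = 0 where relname = 't1';

truncate t1;   -- or "delete from t1" if truncate is not an option here
-- ... reload / modify the data ...

-- turn the triggers back on
update pg_class
   set reltriggers = (select count(*) from pg_trigger where tgrelid = pg_class.oid)
 where relname = 't1';

commit;  -- end of transaction unlocks the table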
Re: [SQL] Cast NULL into Timestamp?
On Friday 10 December 2004 11:24 am, Wei Weng wrote: > I have a table > > create table temp > ( > tempdate timestamp, > tempname varchar(10) > ); > > And I tried to insert the following: > > insert into temp (tempname, tempdate) > select distinct 'tempname', null from some_other_relevant_table; > > And I got an error that says "column "tempdate" is of type > timestamp ... but expression is of type text... will need to > rewrite or cast the expression". > > I really need the distinct. Is there any way to cast this NULL into > a timestamp or any other workarounds? How about: insert into temp (tempname) select distinct 'tempname' from some_other_relevant_table; Unless there's something you have left out in describing your setup this will leave the tempdate column null. Cheers, Steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
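If you do want tempdate to appear in the column list, an explicit cast on the NULL should also get past the type check:
insert into temp (tempname, tempdate)
select distinct 'tempname', null::timestamp
  from some_other_relevant_table;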
[SQL] Looking up table names by REFERENCES
Hello all. I'm trying to write a recursive procedure to allow me to handle some data backup routines and deal with conflicts when restoring from within the program. Essentially, what I'd like to be able to do is if a table called "image" has a column called "file_id" which references a column called "file_id" in another table called "file" I want to be able to determine that pragmatically. If I wanted to backup all of the information about images in the database, I would need to backup all of the information about the file(s) each image corresponds to. For instance, I can get a list of all (user) table names with: SELECT relname AS table_name, oid FROM pg_class WHERE NOT relname ~ 'pg_.*' AND NOT relname ~ 'pga_.*' AND NOT relname ~ '.*_pkey' AND NOT relname ~ '.*_id_key' ORDER BY relname; and I can get a list of column names and their types (for the "image" table) with: SELECT a.attname AS field, t.typname AS type FROM pg_class c, pg_attribute a, pg_type t WHERE c.relname = 'image' and a.attnum > 0 and a.attrelid = c.oid and a.atttypid = t.oid ORDER BY a.attnum; Surely there's a simple way I can trace REFERENCES in a particular column across tables? Any help would be most appreciated, especially if I could be cc'd directly. Cheers Steve Castellotti
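P.S. The closest I have come so far is poking at pg_constraint (7.3 and later), which looks promising even if I am not sure it is the intended route:
select conname,
       conrelid::regclass  as referencing_table,
       confrelid::regclass as referenced_table
  from pg_constraint
 where contype = 'f';
-- conkey/confkey hold the column numbers of the referencing and
-- referenced columns; joining them against pg_attribute.attnum
-- should give back the column names.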
[SQL] Pagination with Output Variables?
I have a MSSQL Stored Procedure that returns a particular section of a
resultset depending on the desired "page". This procedure also has a
parameter marked as OUTPUT which indicates the total number of records in
the actual resultset. Is there a way to do something similar in PG or does
TotalRecords need to be part of the type returned by the function.
MSSQL Procedure
CREATE PROCEDURE PageCustomers (
@Page int,
@PerPage int,
@TotalRecords int OUTPUT
) AS
--Do stuff to determine number of records
SELECT @TotalRecords = COUNT(*) FROM tmpCustomers
--Get the actual data
SELECT CustomerID, CustomerName, EmailAddress FROM tmpCustomers WHERE ...
Would the only way to do this in PG to have my function return a type like
this?
CREATE TYPE "PageCustomers_type"
AS ("TotalRecords" int4, "CustomerID" int4, "CustomerName" varchar(100),
"EmailAddress" varchar(100));
Can a function return two type results? So that the first type would be just
the total number of records, and the second type would be the resultset of
customers?
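If the single-type route is the way to go, here is roughly what I picture the function looking like (untested sketch; the ordering, the page arithmetic and the assumption that tmpCustomers carries the same quoted column names are all mine):
CREATE OR REPLACE FUNCTION "PageCustomers"(p_page int, p_perpage int)
RETURNS SETOF "PageCustomers_type" AS $$
DECLARE
    r "PageCustomers_type";
    v_total int4;
BEGIN
    --Do stuff to determine number of records
    SELECT count(*) INTO v_total FROM tmpCustomers;
    --Get the actual data, stamping the total onto every row
    FOR r IN
        SELECT v_total, "CustomerID", "CustomerName", "EmailAddress"
          FROM tmpCustomers
         ORDER BY "CustomerID"
         LIMIT p_perpage OFFSET (p_page - 1) * p_perpage
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;
-- SELECT * FROM "PageCustomers"(1, 20);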
Thanks,
Steve
---(end of broadcast)---
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
[SQL] Junk queries with variables?
I really have to be missing something here and this probably a *really* noob question. I don't have a problem running little junk queries in the pgAdmin query window(SELECT blah FROM blah, INSERT INTO blah, etc...), but I can't figure out how to run queries with variables outside of a function. I just want to use variables without having to go about creating and dropping a function for every stupid little query I need to write. Example: amount int4 := 1000; earliest_date timestamp := current_timestamp; SELECT ... I always get the error: "ERROR: syntax error at or near "amount" at character 1". What have I done wrong, or am I missing? Thanks, Steve ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] Junk queries with variables?
> Michael's given you one option - another to look at is PREPARE/EXECUTE > PREPARE my_query(int4) AS INSERT INTO foo VALUES ($1); > EXECUTE my_query(1); > EXECUTE my_query(7); > ... This doesn't seem to be quite what I'm looking for. PREPARE according to the docs is for a one line statement. I'm looking for the ability to do multiple operations, basically using it for general DB queries that aren't really needed more than once. This again comes from my MSSQL background where you can go to town and just start writing out TSQL to do one off queries. Thanks, Steve ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [SQL] Junk queries with variables?
> > I don't know about pgAdmin, but in psql you can use \set: > > \set id 1 > SELECT * FROM foo WHERE id = :id; > > \set name '\'Some Name\'' > SELECT * FROM foo WHERE name = :name; > Whenever I try the above I get an error at the backslash. Do I need to create a different language for this? Right now I only have plpgsql available. Thanks, Steve ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Junk queries with variables?
> > In pgadmins SQL-window SQL is the 'language' of choice. Or it is > rather the > only language. Thus if you intend to program plTk or PL/pgSQL, there's no > way around defining a function. > > (At first you have to define a new language in your schema) I'm a little confused, is there no way around this, or are you saying I need to use CREATE LANGUAGE to define a new language to use? Currently the only language I have for the DB is plpgsql. Thanks, Steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Make COUNT(*) Faster?
Chris Browne wrote: > None of those transactions have COMMITted, so there are some 78 tuples > "in limbo" spread across 16 transactions. > > If there were some "single secret place" with a count, how would you > suggest it address those 78 tuples and 16 transactions that aren't yet > (and maybe never will be) part of the count? Hmmm, I understand this and don't doubt it, but out of curiousity, how does the current SELECT COUNT(*) handle this? It doesn't lock the entire table while counting (I assume) so the current implementation is really just an approximate count in the above scenario anyway. Or even when not, since the true 'count' is likely to have changed by the time the user does anything with the result of SELECT COUNT(*) on any active table (and on an inactive table, pg_class.reltuples is nearly as good as SELECT COUNT(*) and far faster to get to.) I assume this has been beaten well past death, but I don't see why it wouldn't be possible to keep pg_class.reltuples a bit more up-to-date instead of updating it only on vacuums. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [SQL] Make COUNT(*) Faster?
Bruno Wolff III wrote: > No, it is an exact count. Yes, for the transaction, but it's an approximation of the number of tuples in the table - which is probably what the people who worry about its cost are more interested in (an approximate count for the table). I'm also claiming that a true count for any active table is meaningless and am *not* suggesting that effort be spent on trying to produce such a true count. >>I assume this has been beaten well past death, but I don't see why it >>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date >>instead of updating it only on vacuums. > > > Because it costs resources to keep track of that and people don't usually need > exact tuple counts for whole tables. Yes, we agree completely! (Which is why I said 'a bit more' instead of 'exactly' above.) My uses for COUNT(*) are to get 'reasonable' approximate counts of the table sizes - not true counts, but approximate values. Unfortunately, pg_class.reltuples gets too far off too fast for me to use it as a consistent guide to current table size. If you Folks Who Know believe that simply keeping pg_class.reltuples 'closer' to the actual table size is too expensive, I'll accept that [after all, I have to right now anyway], but I'm surprised that it is, given all the other work that must go on at the start/close of a transaction. I also understand that 'reasonable' and 'closer' are vague terms. In the example scenerio where there were around 80 rows in an indeterminate state, my claim is that, in a table of around a million rows, it doesn't matter whether some portion of those indeterminate rows are included in an approximation of the table size or not (though it might in a table of 100 'true' rows - but the decision to ask for a true 'transaction' count (slow) or an approximate table size (fast) should be left to the user in either case). So, leave COUNT(*) alone. But it would be very handy to have a way to get an approximate table size that is more accurate than is provided by a pg_class.reltuples that is only updated on vacuums. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Make COUNT(*) Faster?
Tom Lane wrote: > Steve Wampler <[EMAIL PROTECTED]> writes: > >>So, leave COUNT(*) alone. But it would be very handy to have a >>way to get an approximate table size that is more accurate than is >>provided by a pg_class.reltuples that is only updated on vacuums. > > If you want something cheap, you could use the same technique the > planner uses nowadays: take RelationGetNumberOfBlocks() (which is > guaranteed accurate) and multiply by reltuples/relpages. I don't > see anyplace where RelationGetNumberOfBlocks is directly exposed to > users now, but it'd be trivial to code up a couple of C functions to > provide this functionality. Yes - this would be an excellent approximation for my needs! The solution that Dawid Kuroczko suggested (just call "explain select * on ..." and parse the result) would be equivalent these days, right? (I think in the 7.x versions the planner was just using pg_class.reltuples, which wouldn't have helped.) If true, I can handle that parsing myself easily enough without exposing RelationGetNumberOfBlocks. Thanks (Tom and Dawid)! -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
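For the record, the estimate is easy to scrape out of the plan text; the rows figure below is the planner's current guess at the table size (numbers made up):
explain select * from mytable;
                         QUERY PLAN
-----------------------------------------------------------
 Seq Scan on mytable  (cost=0.00..458.00 rows=10000 width=244)
(1 row)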
Re: [SQL] Make COUNT(*) Faster?
Christopher Browne wrote: >>I'm also claiming that a true count for any active table is >>meaningless and am *not* suggesting that effort be spent on trying >>to produce such a true count. > > > That's a pretty big assumption that would in fact be WRONG. Please reread the message from Bruno and reconcile the above statement with his assertion (which I believe) that there is *no* single true count for an active table. [I'm defining 'active' as currently undergoing insert/copy/delete/update actions]. > We have managers interested in counting the number of objects we have > around (As a domain registry, what objects would you imagine those > might be :-)), and they're keen on possibly even being able to > reconcile those counts from day to day based on transaction activity. If Bruno is correct, then they need to do this reconcilation from within a single transaction (the same one that does the COUNT(*)) - or else they are working on an 'inactive' table [one not currently accepting changes]. If neither condition holds, then isn't the result they are using from COUNT(*) currently is *already* an approximation? > Leaping into some sort of vague guesstimation would destroy the > ability to do any kind of analysis of activity, and I daresay enrage > them. No doubt! Let's hope the above conditions hold. > There may be times that a really rough guess can suffice; there are > other times when exactness is absolutely vital. But, as others have said, COUNT(*) does not return a true count for a table, but rather just a true count for the *current transaction*. So COUNT(*)'s from different simultaneous transactions may very well produce different values. > Creating a "fast but WRONG COUNT(*)" which prevented getting the exact > answer that the present implementation provides would be a severe > misfeature. Agreed - note that I did not suggest replacing the current COUNT(*) with an inexact version, but wanted (and now have) a quick way to get a reasonable approximation of the current table size. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[SQL] Problem with "NOT IN (subquery)
Hi, I have a table named "famille" whose structure and content is:
 famille_code | famille_mere_famille_code |  famille_libelle  | famille_niveau
--------------+---------------------------+-------------------+----------------
 00           |                           | Mère              | 0
 I1           | 00                        | Composants        | 1
 IN0001       | I1                        | Micro-processeurs | 2
 IN0002       | I1                        | Mémoires RAM      | 2
 INS001       | IN0002                    | DDR-SDRAM         | 3
 INS002       | IN0002                    | DDR2-SDRAM        | 3
 INS003       | IN0002                    | SDR-SDRAM         | 3
 IN0003       | I1                        | Cartes mères      | 2
 IN0004       | I1                        | Disques durs      | 2
 IN0005       | I1                        | Cartes graphiques | 2
 IN0006       | I1                        | Cartes son        | 2
 IN0007       | I1                        | Réseau            | 2
 IN0008       | I1                        | Lecteurs CD/DVD   | 2
 IN0009       | I1                        | Graveurs CD/DVD   | 2
 IN0010       | I1                        | Boîtiers          | 2
 I2           | 00                        | Périphériques     | 1
 IN0011       | I2                        | Clé USB           | 2
 IN0012       | I2                        | Modems            | 2
 IN0013       | I2                        | Imprimantes       | 2
(19 rows)
The first col is the family id and the second is the mother family id. I would like to get a list of all families that never appear in col n°2, i.e. the families that aren't nodes but leaves. The query I made was "SELECT * FROM famille WHERE famille_code NOT IN (SELECT DISTINCT famille_mere_famille_code FROM famille);" But the DB returns 0 records instead of 15. If I use a list instead of a subquery it works normally but it's not easy to manage it like this. So if anyone can help me please ... ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] plsql / time statement
Daniel Caune wrote: Hi, Is there any option to set so that psql provides the execution time of each SQL statement executed? \timing (either as a manual command or as a default in your .psqlrc file). Cheers, Steve ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] how to solve this problem
I have a table where there are 20 columns named vinc1, vinc2, vinc3, vinc4, etc the values contained in each column are simply 1 or 0 (each column is dichotomic) 1 means presence 0 means absence I would obtain a column (new_column) containing the name of the columns, comma separated, where the value is = 1 for example: vinc1 vinc2 vinc3 vinc4 new_column 1 0 1 0 vinc1,vinc3 0 0 0 1 vinc4 0 1 1 1 vinc2,vinc3,vinc4 can someone help me to find the best way to obtain this result??? thank you very much Here's a brute-force method. Maybe someone else has a more elegant way. More info on the nature of your data and what you are trying to obtain from it would help in finding such elegance. select substr( case when vinc1 = 1 then ',vinc1' else '' end || case when vinc2 = 1 then ',vinc2' else '' end || ... || case when vinc20 = 1 then ',vinc20' else '' end , 2) as new_column ... As to the binary representation mentioned elsewhere, the idea is that you can view vinc1-vinc20 as bits in a binary number equal to vinc1 + 2*vinc2 + 4*vinc3 + 8*vinc4...2^19*vinc20. Whether or not this is useful depends on what you are trying to do. Cheers, Steve ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] it's not NULL, then what is it?
... canon=# select count(maf) from gallo.sds_seq_reg_shw canon-# where maf ISNULL; count --- 0 (1 row) I believe count will only count not-null anyway so this will always return zero. Try count(*) instead of count(maf). Here's an example: st...@[local]=> select * from barr; LOG: duration: 0.226 ms a | b ---+--- a | b c | d | e (3 rows) st...@[local]=> select coalesce(a, 'a is null'), coalesce(b, 'b is null') from barr; LOG: duration: 0.283 ms coalesce | coalesce ---+-- a | b c | d a is null | e (3 rows) st...@[local]=> select count(a) from barr; LOG: duration: 0.339 ms count --- 2 (1 row) st...@[local]=> select count(*) from barr where a isnull; LOG: duration: 0.350 ms count --- 1 (1 row) -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] SQL report
[email protected] wrote: I have the following senario. I have a tracking system. The system will record the status of an object regularly, all the status records are stored in one table. And it will keep a history of maximum 1000 status record for each object it tracks. The maximum objects the system will track is 100,000. Which means I will potentially have a table size of 100 million records. I have to generate a report on the latest status of all objects being tracked at a particular point in time, and also I have to allow user to sort and filter on different columes in the status record displayed in the report. ... Just wanna to know if anyone have a different approach to my senario. Thanks alot. Not knowing all the details of your system, here are some things you could experiment with: 1. Add a "latest record id" field in your object table (automatically updated with a trigger) that would allow you to do a simple join with the tracking table. I suspect that such a join will be far faster than calculating "max" 100,000 times at the expense of a slightly larger main table. 2. Add a "current record flag" in the status table that simply flags the most recent record for each object (again, use triggers to keep the flag appropriately updated). This would also eliminate the need for the "max" subquery. You could even create a partial index filtering on the "current record flag" which could speed things up if the reporting query is written correctly. 3. Partition the table into a "current status table" and "historical status table" (each inheriting from the main table). Use a trigger so that anytime a new status record in added, the old "current" record is moved from the "current" to the "historical" table and the new one added to the "current" table. The latest status report will only need a simple join on the "current" table with a max size of 100,000 rather than a more complex query over a 100,000,000 record table. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
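For example, option 2 might be sketched roughly like this (completely untested; every table, column and trigger name below is invented):
create table status (
    object_id  integer     not null,
    reading    text,
    taken_at   timestamptz not null default now(),
    is_current boolean     not null default true
);

create or replace function status_flag_current() returns trigger as $$
begin
    -- clear the flag on the previous "current" row for this object
    update status
       set is_current = false
     where object_id = NEW.object_id
       and is_current;
    return NEW;
end;
$$ language plpgsql;

create trigger status_flag_current_trg
    before insert on status
    for each row execute procedure status_flag_current();

-- partial index covering only the latest row for each object
create index status_current_idx on status (object_id) where is_current;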
Re: [SQL] Month/year between two dates
Bor wrote:
Hi to all,
I have a very simple question. Let's say that I have three records (id, date
from, date to):
1 2009-01-01 2009-08-31
2 2009-08-01 2009-08-10
3 2009-08-11 2009-08-31
Now I want to get records, "related" to a single month/year data (two
integers). For 2009/08 (int1 = 2009, int2 = 8) I should get all three
records, for 2009/05 only record 1, but for 2009/11 none of the records.
Is there any simple way to do this? A query would do :).
Thanks alot.
Lots of ways. The following springs to mind but I'm sure there are
simpler ways (I'm assuming the date_from and date_to are data-type date
and you are stuck with using int for year and month).
Use date_trunc to convert any date in a month to the first of the month
and the following should work (untested):
...
date_trunc('month', date_from) <= (int1::text || '-' || int2::text ||
'-1')::date and
date_trunc('month', date_to) >= (int1::text || '-' || int2::text ||
'-1')::date
...
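Plugged in for 2009/08 against a hypothetical table it would come out something like this (still untested); it should return all three of your sample rows, while 2009/05 would return only row 1 and 2009/11 none:
select *
  from mytable
 where date_trunc('month', date_from) <= (2009::text || '-' || 8::text || '-1')::date
   and date_trunc('month', date_to)   >= (2009::text || '-' || 8::text || '-1')::date;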
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] LIMIT 1; The Integer part only
bilal ghayyad wrote: 1) When writing the function (I mean sql function) in the postgresql, I noticed the use for LIMIT 1, but did not understand what does it mean and why we use it? Limit return to 1 record (or other specified number). Note that without ORDER BY, there is no guarantee of which record will show up at the top of the list. See also OFFSET. I.e. to skip the first 30 records and only return the next 10 (records 31-40) in your list: SELECT * from foo order by bar LIMIT 10 OFFSET 30; 2) Also in the function (the sql function) in the postgresql, if I need to take the integer part of the number, which math function can do this for me? For example, I have 3.900 and I need only the 3 (the integer part), which math function to be used? floor(3.900) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Table Design for Hierarchical Data
Lee Hachadoorian wrote: I am trying to come up with a structure to store employment data by NAICS (North American Industrial Classification System). The data uses a hierarchical encoding scheme ranging between 2 and 5 digits. That is, each 2-digit code includes all industries beginning with the same two digits. 61 includes 611 which includes 6111, 6112, 6113, etc. A portion of the hierarchy is shown after the sig. From the http://www.census.gov/eos/www/naics/ website: "NAICS is a two- through six-digit hierarchical classification system, offering five levels of detail. Each digit in the code is part of a series of progressively narrower categories, and the more digits in the code signify greater classification detail. The first two digits designate the economic sector, the third digit designates the subsector, the fourth digit designates the industry group, the fifth digit designates the NAICS industry, and the sixth digit designates the national industry. The five-digit NAICS code is the level at which there is comparability in code and definitions for most of the NAICS sectors across the three countries participating in NAICS (the United States, Canada, and Mexico). The six-digit level allows for the United States, Canada, and Mexico each to have country-specific detail. A complete and valid NAICS code contains six digits." I think I'd be inclined to store it as defined above with tables for sector, subsector, industry-group and NAICS-industry. So the NAICS table might have a primary key of industry_code (11131, Orange Groves) and a industry_group column with a foreign-key constraint to the industry-group table (1113, Fruit and Tree Nut Farming). You might add a constraint to ensure that the industry-group is the appropriate substring of the naics code and so on up the heirarchy. If you are dealing with importing a large amount of static source data for analysis, these tables will also be tailor-made places to do pre-aggregation. Adjacency lists work well in certain cases where the depths of the trees are variable or indeterminate. For example, think of an employee->boss org-chart for a large company. The maintenance supervisor for an area might be a dozen levels below the CEO and be a several levels above the branch night janitor while the CEO's personal assistant is just one level down but with no direct reports. The CTE/recursive-query features in 8.4 are great for this. But in the case you have described, the number of levels is well defined as is the type of information associated with each level. But this all depends on the nature of your source data, how often it is updated, how big it is and the questions you want answered. It might be perfectly acceptable to just have the 5-digit code on all your individual data records and do something like select ... group by substr(full_naics_code,1,3) where substr(full_naics_code,1,2)='61'). In this case you will still want to keep the NAICS definition table separate and link to it. One question that might impact this is the coding of your source data. Is it all full 5-digit coding or are some records coded at a high level of detail and others only to the top-level? One way to store this data would be to store at the most granular level (5-digit NAICS) and then aggregate up if I wanted employment at the 4-, 3-, or 2-digit level. The problem is that because of nondisclosure rules, the data is sometimes censored at the more specific level. I might, for example, have data for 6114, but not 61141, 61142, 61143. 
For a different branch of the tree, I might have data at the 5-digit level while for yet another branch I might have data only to the 3-digit level (not 4 or 5). I think that means I have to store all data at multiple levels, even if some of the higher-level data could be reconstructed from other, lower-level data. What do you mean by censored? Is the data supplied to you pre-aggregated to some level and censored to preserve confidentiality or are do you have the record-level source data and the responsibility to suppress data in your reports? Is the data suppression ad-hoc (i.e. someone comes to you and says don't display these five aggregates), based on simple rules (don't display any aggregate with fewer than 15 records) or on more complex rules (don't display any data that would allow calculation of a group of fewer than 15)? My guess is that the multi-table scenario will be better suited to flagging aggregates for suppression. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] creating a versioning system for sets?
I'm working on a web app for a quality control checklist. I already have a table set up, but I have a hunch that our model is sub-optimal and I could get some better performance.I'm hoping someone on this list can help me think clearly about how to express this efficiently in SQL. Each checklist has dozens, sometimes hundreds of questions. Each question has between 2 and 10 possible answers. Each Question is a varchar string, and so is each answer. A completed checklist is when all of the questions are associated with one of its possible answers -- i.e., when one answer is chosen. Checklists are different for different purposes, and they can change over time. So to keep completed checklists from inadvertently changing when we want to have changes in new checklists, we have templates. Templates, Questions, and Answers are a mirror of Checklists, Questions, and Answers, and represent the 'current version' of the checklist. We don't do the checklists ourselves, but provide them for our clients, so that is another level of grouping going on. So the current table hierarchy looks like this - Clients - Templates - TemplateQuestions - TemplateQuestionAnswers - Checklists - ChecklistQuestions - ChecklistQuestionAnswers Because we don't want changes in the current template to 'go back in time' and change completed checklists, data is copied from Templates into Checklists when a user goes to start a new checklist. As you can guess, this creates a lot of duplication. In ChecklistQuestionAnswers, out of about a million answer rows, there are only 4,000 distinct answers. Of course, TemplatesQuestionAnswers has duplication too, but not as bad. So what I'm think I want to do is create a versioning system for checklist templates, so I can save on space by storing unique questions with unique sets of answers only once. That way, instead of duplicating text wholesale, I can just link a Checklist against a *version* of a Template, and then a checklist set is which answer was chosen for which question. Here's what I've sketched out so far. "A clients has many templates. A template has many revisions, but only one current revision. Each revision has many questions, and each question has many ( between 2 and 10 ) answers. Each Checklist relates to one Template. Each checklist has a set answers that indicate the answer select for each question in its version of the template." Questions /* all unique question wordings */ Questions.id Questions.question Answers /* all unique answer wordings. */ Answers.id Answers.answer Templates Templates.client_id /* relates to client table. */ Templates.template_name Templates.current_version /* this is related to TemplateVersions.version_number */ TemplateVersions /* A logical grouping of a set of questions and answers */ TemplateVersions.version TemplateVersions.template_id /* relates this version to a template. 
*/ TemplateQuestions TemplateQuestions.template_version /* relates a question to a template version */ TemplateQuestions.question_id /* relates a unique question to this template version */ TemplateQuestions.id TemplateQuestionAnswers TemplateQuestionAnswers.template_question_id /* relates this answer to a particular template version question */ TemplateQuestionAnswers.answer_id /* relates the unique question to a unique answer */ TemplateQuestionAnswers.id Checklists Checklists.id Checklists.template_version /* relates this question to a template version -- associating this checklist to a client happens through this relationship */ ChecklistAnswers /* ( I might call this something other than 'Answers' since the lack of ChecklistQuestionAnswers breaks 'name symmetry' with TemplateQuestionAnswers ) */ ChecklistAnswers.checklist_id ChecklistAnswers.question_id ChecklistAnswers.answer_id The rub I'm getting hung up on is guaranteeing that ChecklistAnswers associates a proper question-and-answer pair -- the relationship that exists in the version of the Template that its Checklist parent is referencing. In other words, each row in ChecklistAnswers must 'mirror' a question_id from TemplateQuestions to one child question from TemplateQuestionAnswers, form the template_version in Checklists. I'm trying to think of how to do this and my thinking process short circuits here. This is really the 'deliverable' of the database -- a completed checklist -- so all the other templates and everything is sort of epiphenomenal or an abstraction of that. If I can't get this working, I've missed the whole point! This seems a *little* unwieldy, so I'm wondering if I'm making a solution whose complexity is not worth the space-savings I might get from implementing it. Also note, I've simplified this a bit. There are other dimensions of complexity, such as a category system for grouping questions for reporting, but I don't think we need to get into that here. -
Re: [SQL] What does PostgreSQL do when time goes backward?
John Hasler wrote: Frank writes: My ntp client changes clock (by small amount) at any time: Jul 25 05:29:38 bax ntpd[10269]: adjusting local clock by 0.098724s Jul 25 05:31:43 bax ntpd[10269]: adjusting local clock by 0.038991s Jul 25 06:13:38 bax ntpd[10269]: adjusting local clock by -0.037131s Jul 25 15:01:52 bax ntpd[10269]: adjusting local clock by -0.112429s Ken writes: These do seem to be larger values than you might expect from a clock conditioned with ntpd. Is it a VM or is there something going on that would stop or suspend your system? John writes: There is certainly something wrong there. I saw very bad clock performance on one Linux box I had (dual-single core AMD cpus, no VMs), even with NTP, until I changed the clocksource kernel parameter to hpet. Unfortunately (or fortunately) I no longer have that box. -- Steve Wampler -- [email protected] The gods that smiled on your birth are now laughing out loud. -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Cumulative result with increment
On 02/07/2011 01:11 PM, Shawn Tayler wrote: Hello, I am struggling with what is probably a simple problem but I am coming up blank. In the interest of full disclosure I am not a very savvy programmer by any stretch. I have a table of data from an external source which contains numbers of events per operating group per hour per day, kind of like this: Group | events | duration | timestamp The Group, events, and duration (milliseconds) are all integers in char format. Timestamp is as stated. The records are at 1 hour increments. I need to sum the events and duration fields in 8 hour shifts (mid to 8AM, 8AM to 4PM, etc). I'd like the output to be in table format ala: Group | events | duration | date | shift -------+--------+-------------+------------+------- 100 | 26 | 00:00:25.00 | 2011-01-01 | Day I have a function to convert the duration to an interval already and can make one to do the shift labeling. It's the rest that has me stumped. Any suggestions or direction? As always, your help is greatly appreciated. I'm not sure exactly what you want but it sounds like you just want an output column that has the shift instead of the timestamp. You can then sum on that column. Don't know what your shifts are called, but this will give you dog-, day- and night-shifts based on your times: case when extract(hour from "timestamp") < 8 then 'dog' when extract(hour from "timestamp") < 16 then 'day' else 'night' end as shiftname This can be used for grouping as well as display. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
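Putting it all together, something along these lines might get you the whole summary in one pass (untested; I am guessing the table is called eventlog, that the hour column is literally named "timestamp", that events and duration need casting since they are stored as char, and the shift labels are made up):
select "Group",
       sum(events::int)                     as events,
       sum(duration::int) * interval '1 ms' as duration,
       "timestamp"::date                    as date,
       case when extract(hour from "timestamp") < 8  then 'Mid'
            when extract(hour from "timestamp") < 16 then 'Day'
            else 'Swing'
       end                                  as shift
  from eventlog
 group by 1, 4, 5
 order by 1, 4, 5;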
Re: [SQL] convert in GMT time zone without summer time
On 04/16/2011 05:02 AM, Jasen Betts wrote:
On 2011-04-15, LaraK wrote:
Hello,
I want write a function that converts a timestamp with time zone to the UTC
zone. But it should all be stored in the winter time.
Done! All timestamp with time zone information is stored internally in UTC.
But you need to be sure you really understand date/time manipulation in
PostgreSQL so you don't reinvent the wheel.
[CODE]
SELECT
to_char(CONVERT_TO_UTC(to_timestamp('2011-03-22 14:17:00', 'YYYY-MM-DD
hh24:MI:SS'), 'CET'), 'yyyy-mm-dd hh24:MI:SS') AS winter,
to_char(CONVERT_TO_UTC(to_timestamp('2011-04-22 14:17:00', 'YYYY-MM-DD
hh24:MI:SS'), 'CET'), 'yyyy-mm-dd hh24:MI:SS') AS summer
[/CODE]
must come out:
[CODE]
WINTER | SUMMER
---------------------+---------------------
2011-03-22 13:17:00 | 2011-04-22 12:17:00
[/CODE]
that test case is ambiguous your inputs are timespamptz but
have an unspecified timezone (and so get the zone appropriate to
your time locale). I'm assuming your time locale is "Europe/Berlin"
and you really mean the following:
SELECT to_char(CONVERT_TO_UTC( '2011-03-22 14:17:00+01'::timestamptz
,'CET'),'yyyy-mm-dd hh24:MI:SS') AS winter, to_char(CONVERT_TO_UTC(
'2011-04-22 14:17:00+02'::timestamptz ,'CET'),'yyyy-mm-dd hh24:MI:SS')
AS summer;
If you can use the correct time zone name, everything is done for you.
Better yet, it will keep working when the timezone rules change (if you
apply your patches regularly) or for other time zones:
steve=> select '2011-03-22 14:17:00 Europe/Berlin' at time zone 'UTC';
      timezone
---------------------
2011-03-22 13:17:00
(1 row)
steve=> select '2011-04-22 14:17:00 Europe/Berlin' at time zone 'UTC';
      timezone
---------------------
2011-04-22 12:17:00
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] convert in GMT time zone without summer time
On 05/03/2011 12:15 AM, LaraK wrote:
Very good!
Another question:
I want to convert the string to the type TIMESTAMP WITH TIME ZONE. What do I
have to format? 'TZ' does not.
select to_timestamp('2011-03-22 14:17:00 Europe/Berlin', 'YYYY-MM-DD
HH:MI:SS TZ')
Just cast it to a timestamp with time zone:
select timestamptz '2011-04-22 19:17:00 Europe/Berlin';
Remember...
The value of a timestamp with time zone is always stored internally as UTC.
When a timestamp with time zone is displayed, the time zone is based on
the client's default, the "set timezone to" statement or the "at time
zone" clause in the query.
In the case of an explicit "at time zone" clause, the result becomes a
timestamp without time zone data type (that is why the previous static
example with the "at time zone" clause was a timestamp without time zone).
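For example, pg_typeof (available in 8.4) shows what the "at time zone" form actually returns:
select pg_typeof(timestamptz '2011-04-22 19:17:00 Europe/Berlin' at time zone 'UTC');
          pg_typeof
-----------------------------
 timestamp without time zone
(1 row)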
A timestamp with time zone is useful to identify a specific point in
time. "Bin Laden's death was announced at...", "shuttle Endeavor
launched at...", "Amazon EC2 crashed at...". Most timestamp data I
encounter is of this type.
A timestamp without time zone might be useful for data like "Breakfast
is served at 7am". Presumably a hotel chain would serve at 7am in each
hotel and not have all hotels serve at 7am corporate headquarters time.
It takes a bit of time to wrap your head around time and time zones but
it would be well worth your time to carefully read
http://www.postgresql.org/docs/8.4/interactive/datatype-datetime.html
(IIRC, you are using 8.4) a couple times.
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] self join
On 05/14/2011 07:36 PM, Jasen Betts wrote: use the "NOT IN" operator with a subquery to retch the disallowed values Hmmm, "retch" as a synonym for "output"? I've seen more than one case where that is an appropriate description. :) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] How to remove a set of characters in text-columns ?
On 06/30/2011 12:28 PM, Andreas wrote:
Hi,
how can I remove a set of characters in text-columns ?
Say I'd like to remove { } ( ) ' " , ; . : !
Of course I can chain replace ( replace ( replace ( replace ( ... , ''
) and replace the chars one by one against an empty string ''.
There might be a more elegant way.
Is there ?
regards
Look at regexp_replace()
http://www.postgresql.org/docs/9.0/static/functions-string.html
http://www.postgresql.org/docs/9.0/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] why these results?
On 08/01/2011 03:50 PM, Wes James wrote: select count(*) from table; count --- 100 (1 row) is correct select count(*) from table where col::text ~~* '%text%'; count --- 1 (1 row) is correct. But now if I do: select count(*) from table where col::text !~~* '%text%'; count --- 98 (1 row) Shouldn't it be 99? That is out of 100 records there is one that has "text" in column "col" so the !~~* should return 99 rows. ?? -wes select count(*) from table where col is null; (null is neither equal nor not-equal to anything, even null) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Mysterious column "name"
On 08/09/2011 08:34 AM, Oliveiros d'Azevedo Cristina wrote: Strange... Tables have hidden columns but AFAIK, "name" is not one of them... http://www.postgresql.org/docs/9.0/interactive/ddl-system-columns.html ... Good day! I found one strange results for sql-query. create table testtable ( f1 int, f2 int ); insert into testtable(f1, f2) values(1, 100); select testtable.* from testtable; // it is ok select testtable from testtable; // it is ok select testtable.name from testtable; // returns strange result. - name name - "(1,100)" - no error message about nonexistent column. Result is similar to second variant, but pgAdmin3 shows not full record. ... See http://www.postgresql.org/docs/9.0/static/rowtypes.html on composite data types and scroll to section 8.15.3. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] how to calculate differences of timestamps?
On 09/26/2011 06:31 PM, Andreas wrote: How could I calculate differences of timestamps in a log-table? Table log ( user_id integer, login boolean, ts timestamp ) So login = true would be a login-event and login = false a logout. Is there a way to find the matching login/logout to calculate the difference? Or is there a better table "design" to do this? One way is a sub-select: select o.user_id, o.ts as logout_time, (select max(i.ts) from log i where i.user_id = o.user_id and i.ts < o.ts and i.login ) as login_time from log o where not o.login ; This will give you login/logout time pairs. Just replace the "," with a "-" if you are interested in login duration. Depending on the frequency and duration of logins and the number of users you may have to play with indexes though an index on ts will probably suffice for most cases. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] plpgsql function executed multiple times for each return value
We have a plpgsql function called paymentcalc, which calculates the payment necessary to pay off a loan. It's defined like this: CREATE OR REPLACE FUNCTION paymentcalc(IN amount numeric, IN interestrate numeric, IN termmonths integer, IN paymentfreq integer, IN dueday1 integer, IN dueday2 integer, IN borrowdate date, IN firstdue date, IN gapins character, IN lifeins character, IN disabilityins character, OUT payment numeric, OUT finalpayment numeric, OUT finaldue date) RETURNS record AS ... LANGUAGE 'plpgsql' STABLE; We want to execute this function, with inputs from a table, and return the calculated values as separate columns: select (p).payment, (p).finalpayment, (p).finaldue from (select paymentcalc(amount, interestrate / 100, termmonths, paymentfreq, dueday1, dueday2, borrowdate, firstdue, gapins, lifeins, disins) as p from apps where id = 100) s This works, but using "RAISE NOTICE" we've determined that this executes the paymentcalc function 3 times! It seems to execute the function once for each return value. The function is time consuming, and we want to execute it only once. Thinking that maybe a table returning function would work better, we tried this: CREATE OR REPLACE FUNCTION paymentcalc2(IN amount numeric, IN interestrate numeric, IN termmonths integer, IN paymentfreq integer, IN dueday1 integer, IN dueday2 integer, IN borrowdate date, IN firstdue date, IN gapins character, IN lifeins character, IN disabilityins character) RETURNS SETOF paymentcalc_return AS ... LANGUAGE 'plpgsql' STABLE; However, we now have difficulty trying to send parameters from a table as inputs to this function. We want to do something like: select * from paymentcalc2(amount, interestrate / 100, termmonths, paymentfreq, dueday1, dueday2, borrowdate, firstdue, gapins, lifeins, disins) from apps where id = 100 But obviously this query won't work with two "from"s. We can't do a join between paymentcalc2 and apps, because we get "invalid reference to FROM-clause entry for table apps". So my questions are: 1) How do we cause the paymentcalc function to be executed only once? and 2) How do we call a table returning function with inputs from a table? Thank you very much! Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
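One thing we stumbled onto that seems to help, though we are not sure how safe it is to rely on: adding an OFFSET 0 to the subselect appears to keep the planner from flattening it, and the RAISE NOTICE then fires only once:
select (p).payment, (p).finalpayment, (p).finaldue
  from (select paymentcalc(amount, interestrate / 100, termmonths, paymentfreq,
                           dueday1, dueday2, borrowdate, firstdue,
                           gapins, lifeins, disins) as p
          from apps
         where id = 100
        offset 0) s;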
Re: [SQL] internal format of timstamp?
On 12/29/2011 12:42 PM, Jasen Betts wrote: On 2011-12-29, Lars Gustafsson wrote: Hi, I am trying to recover a lot of deleted rows from a database ( pg 8.2.3 ) , not my database, I promise….. When using the tool pgfsck I get good results, but timestamp is not implemented. When trying to export as int8 i get fx. 4735129360236469258 representing december 29, 2011, 16:30 But how should I do the conversion from the numeric value to the actual timestamp ? possibly that's a floating point timestamp try it as float8 instead of int8. I am not an internals person, but suspect that your first step should be to check pg_config to see if the server from which you are attempting to recover data was compiled with --enable-integer-datetimes. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] compare table names
On 01/09/2012 08:28 AM, Tony Capobianco wrote: I see what you're saying: pg=# select tablename from pg_tables where tablename like 'tmp_staging%' and tablename< 'tmp_staging1230' and tablename> 'tmp_staging1228'; tablename tmp_staging1229 This query is part of a larger script where I want to dynamically select tablenames older than 10 days and drop them. The tables are created in a tmp_stagingMMDD format. I know postgres does not maintain object create times, how can I write this to select tables from pg_tables that are older than 10 days? Thanks. Tony ...Ah, there's the missing part - the 1229 represents a date that is missing year information. If you can change things up a bit, I'd add the year to the name "tmp_stagingMMDD" which makes the query easy. (We do this in a few cases where we are given blocks of data that are valid through a certain date. Each block of data is a child of the main table and has a name that represents the last date the data is valid. A daily script drops any partition that has expired.) If you can't add the year, you will be stuck with extra work to properly handle the first 10-days of each year. Alternately, you could have a separate table that just tracks the creation dates of the temporary tables and be free from any requirement to have dates be part of the table names. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Unable To Modify Table
On 01/12/2012 08:42 AM, Carlos Mennens wrote: I seem to have an issue where I can't modify a table due to another tables foreign key association: ... How does one accomplish my goal? Is this difficult to change or once that foreign key is created, are you stuck with that particular constraint? Try updating the values in both tables within a transaction with constraints set to deferred: http://www.postgresql.org/docs/current/static/sql-set-constraints.html Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] date range to set of dates expansion
On 01/19/2012 07:16 AM, Gary Stainburn wrote: On Thursday 19 January 2012 15:11:46 Gary Stainburn wrote: I'll be upgrading my live server as soon as possible, but in the meantime can anyone suggest a way I can do the same thing using Postgresql 8.1 until I can evaluate 8.4 on my live systems? I'm sure most here would recommend moving to 9.1 rather than 8.4. Better performance, cooler replication functionality, more advanced in-place upgrade capabilities for future upgrades, a couple years longer before end-of-life, advances to windowing functions and other SQL commands and much other goodness. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Update Mass Data in Field?
On 01/26/2012 03:59 PM, Carlos Mennens wrote: I'm new to SQL so I'm looking for a way to change several email addresses with one command. For example everyone has a 'holyghost.org' domain and I need to change a few 100 email addresses in the field 'emp_email'. I need to UPDATE employees table which has a COLUMN 'emp_email' and change %holyghost.org to %ghostsoftware.com. I tried: UPDATE employees SET emp_email = '%ghostsoftware.com' WHERE emp_email LIKE '%holyghost.org'; It didn't update anything when I ran the command. Does anyone know how I need to correct my SQL statement above to change everyone's email address? ...set emp_email = regexp_replace(emp_email, '@holyghost.org$', '@ghostsoftware.com') where emp_email ~ '@holyghost.org$'... This is using the regular expression match and regexp_replace to ensure that the pattern is anchored at the end of the field and includes the "@" sign in the expression to avoid accidentally matching something like [email protected]. You can always do a select of the emp_email alongside the replacement expression to be sure it will do what you want before actually updating your database. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] querying a column w/ timestamp with timezone datatype
On 01/30/2012 07:00 AM, Anson Abraham wrote:
I an 9.1 PG database: I have a column which is a timestamp w/ time
zone. So the value I have as one record in table is: 15:55:24.342848+00
If i want to find records less or greater than that timestamp, how do
I construct the query?
select * from schema.table where tscol >= '15:55:24.342848+00';
select * from schema.table where tscol >= '15:55:24.342848+00'::timestamp;
select * from schema.table where tscol >= cast('15:55:24.342848+00' as
timestamp with time zone);
do not work. Do I have to convert the value to a string and substr to
15:55:24 and then convert back to a timestamp? It's been a long while
since I had to query a pg table w/ a timestamp with time zone value.
Any help here would be appreciated.
Those aren't timestamps - timestamps include the date part.
Perhaps you are thinking about a *time* with time zone (a type that
exists due to SQL requirements but which is a somewhat nonsensical type,
the use of which is not recommended):
http://www.postgresql.org/docs/9.1/static/datatype-datetime.html#DATATYPE-TIMEZONES
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] time interval math
On 02/08/2012 12:01 PM, Edward W. Rouse wrote: I'm still working on getting this to work, but the summary is this: I am getting several (many) intervals of hour, minutes and seconds. I need a sum of the absolute value these intervals, similar to the SUM(ABS()) function for numbers; and I need to divide this sum by an integer (bigint). Getting the intervals is no problem, but I can't find built in functions for the rest. Currently on 8.3, want to upgrade to 9.x but I can't until this is finished. Do these functions exist, or will I be forced to convert to seconds, do the math and then convert back to hour-minute-second format (I am assuming from current data that, after the divide, the result should be in the minute: second range). You will have to do some work on your own. Time and intervals are tricky beasts and depend on the oddities of daylight saving rules. Even though you are only using hours/minutes/seconds the interval type also supports days and months. A day interval is probably 24 hours but could be 23 or 25 if it crosses a DST boundary. Months have different numbers of days. You have situations where adding and subtracting a month does not give the original date: select '2011-03-31'::date - '1 month'::interval + '1 month'::interval; ?column? - 2011-03-28 00:00:00 There is no abs(interval) function but, if you know that all your intervals are basic H:M:S and that you won't have any difficulty due to problems similar to the above you can mimic it with: ...case when myinterval < '0'::interval then '0'::interval - myinterval else myinterval end... You are even allowed to sum that and divide it (though I suspect there are some interesting corner-cases waiting to be discovered): ...sum(case when myinterval < '0'::interval then '0'::interval - myinterval else myinterval end)/2... Before you upgrade, be sure to read the release notes and test your calculations. The way intervals are handled, especially regarding intervals across DST boundaries, have changed over time. IIRC most of those changes were pre-8.3 but haven't looked recently. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Setting the process title, or can I?
On 03/20/2012 03:14 AM, Bèrto ëd Sèra wrote: I currently have an emergency ... As an emergency procedure we have set a script that each minute has a look at the situation and runs pg_cancel_backend() against anything that has been waiting for more than X secs. Then it sleeps one more minute... ... Is there anyway I can mark the process running the filter, maybe using the update_process_title feature? I'd like to have something I can see from a ps command, grep for it and kill -15 it, without wasting our scarce resources on one more pg process to use info from pg_stat_activity. Several things spring to mind. 1. The script that starts the filter can note its pid and set an at-job to kill it after 30 minutes. I've used this technique with success. 2. Create a new role specifically to run the filter. The role is generally visible in ps. But this all begs the question. Why not: 3. Set the statement_timeout so long-running statements will be automatically killed. This can be done on a per-role basis so simply set it as desired for the role used by the problematic Java program. It is a default and can be changed by the user so if there are individual statements that need a longer timeout you can change just for those statements. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
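For example (the role name here is invented), the default for the problematic application role could be set with:
alter role java_app set statement_timeout = '30s';
and any individual statement that legitimately needs longer can override it just for itself:
set statement_timeout = '10min';
-- run the long statement here
reset statement_timeout;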
Re: [SQL] how to concatenate in PostgreSQL
On 03/24/2012 04:43 AM, Rehan Saleem wrote: hi , how can we concatinate these lines and execute sql command In what? Psql? A PL/pgSQL function. C/Java/PHP/Python/Perl/Erlang/Lua? setsql = 'select user,username, firstname ' set sql += ' lastname, cardno from table1 where userid=' + 5 exec(sqi) where 5 is the userid from table1 thanks Cheers, Steve
Re: [SQL] how to concatenate in PostgreSQL
On 03/27/2012 07:48 AM, Rehan Saleem wrote: well i am quite sure its PostgreSQL forum and it is obvious, i am asking this to concatenate in plpgsql. *From:* Steve Crawford *To:* [email protected] *Sent:* Monday, March 26, 2012 9:08 PM *Subject:* Re: [SQL] how to concatenate in PostgreSQL On 03/24/2012 04:43 AM, Rehan Saleem wrote: hi , how can we concatinate these lines and execute sql command In what? Psql? A PL/pgSQL function. C/Java/PHP/Python/Perl/Erlang/Lua? setsql = 'select user,username, firstname ' set sql += ' lastname, cardno from table1 where userid=' + 5 exec(sqi) where 5 is the userid from table1 thanks Cheers, Steve Yes, after about a decade of PostgreSQL use and forum participation I think I know which forum this is. Sarcasm and biting at people who want to help you is not a good way to make friends and get help on the forums. If you meant PL/pgSQL then say it. It is not the same as SQL though the syntax is generally similar. Look at DECLARE and BEGIN for starters. I've seen too many threads get reset back to the start once some incorrect assumptions get corrected to feel that up-front clarification is a waste of time. http://wiki.postgresql.org/wiki/Guide_to_reporting_problems is more oriented to problem reporting than general questions but has helpful advice nonetheless. Back to the issue at hand...others have pointed out the || operator and the issue with NULL. It appears you are doing a lot of conversion requiring writing of PL/pgSQL functions. Spend a few minutes skimming: http://www.postgresql.org/docs/current/static/plpgsql.html As to the topic of executing the command including many examples see: http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN If you are writing functions that must use dynamically generated table and column names, become familiar with the quote_ident, quote_literal and quote_nullable functions. Cheers, Steve
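For what it is worth, a stripped-down PL/pgSQL version of your example might look roughly like this (untested; the return types are guessed and the column list is lifted from your pseudo-code):
create or replace function lookup_user(p_table text, p_userid integer)
returns table (username text, firstname text, lastname text, cardno text)
as $$
begin
    -- build the statement dynamically; quote_ident protects the table name,
    -- and the userid is passed as a parameter via USING rather than concatenated
    return query execute
        'select username, firstname, lastname, cardno from '
        || quote_ident(p_table)
        || ' where userid = $1'
        using p_userid;
end;
$$ language plpgsql;
-- select * from lookup_user('table1', 5);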
Re: [SQL] Simple way to get missing number
On 04/24/2012 07:15 AM, Emi Lu wrote:
> Good morning,
> May I know is there a simple sql command which could return missing
> numbers please?
> For example, t1(id integer) values = 1, 2, 3 ... 500
> select miss_num(id) from t1;
> Will return:
> ===
> 37, 800, 8001

select generate_series(1,500) except select id from t1;

Example:

select anumber from fooo;
 anumber
---------
       1
       3
       5
       7
       9
      11
      13
      15

select generate_series(1,15) except select anumber from fooo order by 1;
 generate_series
-----------------
               2
               4
               6
               8
              10
              12
              14

Cheers,
Steve

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Simple way to get missing number
On 04/24/2012 11:10 AM, Emi Lu wrote:
I got it and thank you very much for everyone's help!!
It seems that "left join where is null" is faster compared with
"except". And my final query is:
select num as missing
from generate_series(5000, #{max_id}) t(num)
left join t1 on (t.num = t1.id)
where t1.id is null;
BTW, there are many options. Two more of them include EXISTS:
select allnumbers from generate_series(1,15) as allnumbers where not
exists (select 1 from fooo where fooo.anumber=allnumbers.allnumbers);
And IN:
select allnumbers from generate_series(1,15) as allnumbers where
allnumbers not in (select anumber from fooo);
They all give you the same result. The "right" choice will depend on the
size of your table, how it is indexed, how fully it is populated and
even on your version of PostgreSQL. (Apologies for the funky field/table
naming.)
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Simple Upgrade from PostgreSQL version 8.1.11 (With schemas)
On 07/10/2012 08:28 AM, Chris Preston wrote:
> Hello all,
> How far can I get to a higher version of PostgreSQL by just entering a
> command line instruction to upgrade without any major effort?
> Regards
> Chris Preston

At the simplest you just do a dump (using the dump tools from the *new*
version of PostgreSQL) and restore to the new version. If you aren't worried
about down-time and have the absolute simplest use-case then it isn't too
difficult.

Most people have to deal with reviewing the release notes to see how they
affect the overall system, setting up a test of the new version to shake out
issues and planning the upgrade to minimize down-time. Among the things that
have changed over the past few releases are a tightening up of casting rules
and changing the default for standard_conforming_strings to on, either of
which could cause serious breakage.

Once you get to a newer version (anything past 8.3? 8.4?) you can use the new
upgrade tools moving forward to minimize downtime during the upgrade process,
but you will *always* need to test and evaluate before deploying.

Cheers,
Steve

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
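A minimal sketch of the dump-and-restore path described above. Host and
database names are placeholders; the key point is that pg_dump and pg_dumpall
come from the new installation:

# run the *new* version's pg_dump against the old server, restore into the new one
pg_dump -h oldserver mydb | psql -h newserver mydb

# roles and other global objects are dumped separately
pg_dumpall -h oldserver --globals-only | psql -h newserver postgres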
Re: [SQL] Split a string to rows?
On 01/07/2013 11:44 AM, Emi Lu wrote: Is there a function to split a string to different rows?... Have you looked at regexp_split_to_table? Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
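For illustration, a quick example of regexp_split_to_table (the sample string
is made up, not from the original question):

select regexp_split_to_table('red,green,blue', ',');
 regexp_split_to_table
-----------------------
 red
 green
 blue

regexp_split_to_array is the companion function when an array is more
convenient than a set of rows.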
Re: [SQL] how can I replace all instances of a pattern
On 03/26/2013 06:08 AM, James Sharrett wrote:
> I'm trying to remove all instances of non-alphanumeric or underscore
> characters from a query result for further use. This is part of a function
> I'm writing that is in plpgsql.
>
> Examples:
> Original value
> 'My text1'
> 'My text 2'
> 'My-text-3'
> 'My_text4'
> 'My!text5'
>
> Desired
> 'Mytext1'
> 'Mytext2'
> 'Mytext3'
> 'My_text4' (no change)
> 'Mytext5'
>
> The field containing the text is column_name. I tried the following:
>
> Select regexp_replace(column_name,'\W','') from mytable
>
> This deals with the correct characters but only does the first instance of
> the character so the output is:
> 'My text1'
> 'Mytext 2' (wrong)
> 'Mytext-3' (wrong)
> 'My_text4'
> 'My!text5'
>
> I managed to get the desired output by writing the text into a variable
> through a loop and then just keep looping on the variable until all the
> characters are removed:
>
> sql_qry := 'select column_name from mytable';
> for sql_record in execute sql_qry loop
>     curr_record := sql_record.column_name;
>     while length(substring(curr_record from '\W')) > 0 loop
>         curr_record := regexp_replace(curr_record, '\W','');
>     end loop;
>     rest of the code
>
> This works but it seems like a lot of work to do something this simple but
> I cannot find any function that will replace all instances of a string AND
> can base it on a regular expression pattern. Is there a better way to do
> this in 9.1?

You were on the right track with regexp_replace but you need to add a global
flag:

regexp_replace(column_name,'\W','','g')

See examples under
http://www.postgresql.org/docs/9.1/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

Cheers,
Steve
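For illustration, the global flag applied to one of the sample values above:

select regexp_replace('My-text-3', '\W', '', 'g');
 regexp_replace
----------------
 Mytext3
(1 row)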
Re: [SQL] Unique index VS unique constraint
Unique indexes can be partial, i.e. defined with a where clause (that must be
included in a query so that PostgreSQL knows to use that index) whereas
unique constraints cannot.

JORGE MALDONADO wrote
> I have searched for information about the difference between "unique index"
> and "unique constraint" in PostgreSQL without getting to a specific answer,
> so I kindly ask for an explanation that helps me clarify such concept.

A constraint says what valid data looks like. An index stores data in such a
way as to enhance search performance.

Uniqueness is a constraint. It happens to be implemented via the creation of
a unique index since an index is quickly able to search all existing values
in order to determine if a given value already exists.

PostgreSQL has chosen to allow a user to create a unique index directly,
instead of only via a constraint, but one should not do so. The uniqueness
property is a constraint and so a "unique index" without a corresponding
constraint is an improper model. If you look at the model without any indexes
(which are non-model objects) you would not be aware of the fact that
duplicates are not allowed, yet in the implementation that is indeed the case.

Logically the constraint layer sits on top of an index and performs its
filtering of incoming data so that the index can focus on its roles of
storing and retrieving. Extending this thought, the underlying index should
always be non-unique and a unique filter/constraint would use that index for
validation before passing the new value along. However, practicality leads to
the current situation where the index takes on the added role of enforcing
uniqueness. This is not the case for any other constraint, but the UNIQUE
constraint's case is so integral to PRIMARY KEY usage that the special-case
behavior is understandable and much more performant.

Conceptually the index is an implementation detail and uniqueness should be
associated only with constraints.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Unique-index-VS-unique-constraint-tp5773386p5773427.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
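To make the distinction above concrete, a small sketch (table and column
names are invented for illustration, not taken from the thread):

-- a unique constraint: part of the data model; PostgreSQL backs it with a
-- unique index automatically
ALTER TABLE customer ADD CONSTRAINT customer_email_key UNIQUE (email);

-- a unique index created directly; this one is partial, something a
-- constraint cannot express
CREATE UNIQUE INDEX customer_active_email_idx ON customer (email) WHERE active;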
[SQL] Aggregating both tables in a join?
Hi,

I've got a table which models the relationship between flights and legs,
which is defined like this:

create table flight_leg (
    flight_id integer references flight(id),
    leg_id integer references leg(id)
);

One leg may be part of many flights and one flight may be made up of 1 or
more legs. I also have a table which maps flights to comparable flights. The
mapping is created manually. It is defined like this:

create table cmp_flight (
    flight_id integer references flight(id),
    cmp_flight_id integer references flight(id)
);

The flight and leg tables referenced are omitted for brevity.

I want to be able to show all the flight_ids and their leg counts as well as
the cmp_flight_ids and leg counts. e.g.

insert into flight_leg values (1, 1);
insert into flight_leg values (1, 2);
insert into flight_leg values (2, 3);
insert into flight_leg values (2, 4);
insert into flight_leg values (2, 5);
insert into flight_leg values (3, 3);
insert into flight_leg values (4, 1);
insert into flight_leg values (5, 3);
insert into flight_leg values (5, 4);
insert into flight_leg values (6, 4);
insert into flight_leg values (7, 5);

insert into cmp_flight values (1, 2);
insert into cmp_flight values (1, 3);
insert into cmp_flight values (4, 5);
insert into cmp_flight values (4, 6);
insert into cmp_flight values (4, 7);

The result I'm looking for is:

 flight_id | num_legs | cmp_flight_id | cmp_num_legs
-----------+----------+---------------+--------------
         1 |        2 |             2 |            3
         1 |        2 |             3 |            1
         4 |        1 |             5 |            2
         4 |        1 |             6 |            1
         4 |        1 |             7 |            1

This does the trick:-

select cf.flight_id, tl1.num_legs, cf.cmp_flight_id, tl2.num_legs cmp_num_legs
from cmp_flight cf
join (select fl1.flight_id fid1, count(*) num_legs
      from flight_leg fl1 group by fl1.flight_id) tl1 on fid1 = cf.flight_id
join (select fl2.flight_id fid2, count(*) num_legs
      from flight_leg fl2 group by fl2.flight_id) tl2 on fid2 = cf.cmp_flight_id;

Is there a better way to go about it? I've considered keeping track of the
number of legs in the flight table, using a trigger, but it doesn't seem
relationally 'clean'.

Kind Regards,
Steve Sabljak

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [SQL] COPY FROM query
Hi,
For what it's worth, WindowsXP (at least) seems to currently support
forward slashes at the OS level, so this Postgres behavior isn't as odd
as it might seem. If you enclose your paths with double quote (")
marks, Windows will even accept Unix style paths for some instructions
on the command line:
dir "/temp"
Feeding Windows API calls with forward slashes seems to work with
everything I've tried so far, so using them in Postgres seems perfectly
smart and reasonable.
Hope that helps,
Steve
At 11:03 AM 2/12/2007, [EMAIL PROTECTED] wrote:
Date: Mon, 12 Feb 2007 09:09:08 +0900
From: Paul Lambert <[EMAIL PROTECTED]>
To: Joe <[EMAIL PROTECTED]>
Cc: [email protected]
Subject: Re: COPY FROM query.
Message-ID: <[EMAIL PROTECTED]>
Joe wrote:
> Hi Paul,
>
> On Mon, 2007-02-12 at 08:19 +0900, Paul Lambert wrote:
>> I'm attempting to copy from a table into a file using a select
query
>> inside the copy.
>>
>> The following is my command:
>>
>> COPY (SELECT DISTINCT ON (dealer_id,appraisal_id) * FROM
>> appraisals_temp) TO 'c:\autodrs_appraisal_new.txt' WITH DELIMITER
AS '^'
>> CSV HEADER;
>>
>> I get the following returned:
>>
>> WARNING: nonstandard use of escape in a string literal
>> LINE 1: ...ealer_id,appraisal_id) * FROM appraisals_temp) TO
'C:\autodr...
>> ^
>> HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
>>
>> ERROR: relative path not allowed for COPY to file
>> SQL state: 42602
>>
>>
>> (The caret character is pointing to the M in FROM)
>
> I believe that on Windows you need to use double backslashes, i.e.,
'c:\
> \autodrs_appraisal_new.txt', although the regular slash may also
work,
> i.e., 'c:/autodrs_appraisal_new.txt'.
>
> Joe
>
>
> ---(end of
broadcast)---
> TIP 5: don't forget to increase your free space map settings
>
>
If this is the case, it is strange that the first copy statement works
as that is also only using a single backslash.
Having said that, if I switch it to a forward slash it works without
error... odd, given that weenblows standard is backslash.
Thanks for the help though.
--
Paul Lambert
Database Administrator
AutoLedgers
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] Change the Default Database
Hi Rommel,

I thought I'd throw a little more input in, as an alternative to psql. I use
Ruby on Rails with Postgres, and for testing, it drops the testing database,
recreates it and loads the structures dumped from the development database,
before running tests - to ensure a clean test every time (it also has a tool
for re-building the development database programmatically, which one can run
before all this). So, Rails has to solve the same problem you have, and it
doesn't use psql to do it. My investigation into its method is as so:

1) Initiate a connection to Pg server via Pg network port.
1.a) Dump dev database structures to file
2) Drop database 'test'
3) Create database 'test' (with correct parameters)
4) Initiate a new connection to Pg server via port, request connection to
newly created 'test' database
5) Load dev database structures from file to 'test' database.

The key obviously is reconnecting over the port connection and specifying the
database you wish to connect to - I think this is how PGAdmin does the job
too (when you click on "SQL" you're making a port level connection to the
server, specifying which database you want to talk with - from what I can
tell, Pg doesn't let you operate across databases, as MS SQL does - one
connection = one database).

I'm not nearly as expert as others on the list, so any corrections to the
above analysis would be welcome.

Sincerely,
Steve

At 04:49 AM 2/27/2007, [EMAIL PROTECTED] wrote:
Date: Mon, 26 Feb 2007 18:02:38 -0400
From: "Rommel the iCeMAn" <[EMAIL PROTECTED]>
To:
Subject: Re: Change Default Database
Message-ID: <[EMAIL PROTECTED]>

I seem to be blundering a lot today! I thought I was replying to the entire
list, didn't realize I replied to one person :-) Nothing was wrong with my
script, I assumed that since it was generated by pgAdmin that I could run it
inside pgAdmin. It works perfectly when I run it using psql. It is very much
like SQL Server's osql command line tool. Thanks to everyone that helped ...

So I've learnt three valuable lessons:
1. It's better to execute scripts using the command line tool 'psql'.
2. Make sure I'm replying to the list and not to an individual.
3. Do not post HTML messages!!! :-))

Thanks again everyone.

Rommel Edwards
Software Developer, Barbados, Caribbean.

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
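For comparison, the same drop/recreate/load cycle described above can be done
by hand with the standard command-line tools (database names here are just
placeholders):

# dump the development structure, then rebuild the test database from it
pg_dump --schema-only devdb > structure.sql
dropdb testdb
createdb testdb
psql -d testdb -f structure.sql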
[SQL] GiST index question: performance
Hi,

First off, can I say how much I love GiST? It's already solved a few problems
for me that seemed impossible to solve in real-time queries. Thanks to
everyone who works on that project!

I'm developing a geographic index based on a set of zip code boundaries.
Points of interest (POI) will fall within some boundaries and not others. I
need to search to find which POI are within a specified boundary.

I think I have two options (see below) and I'm wondering if anyone has an
opinion or experience as to whether one or the other will have substantially
different performance characteristics. I can obviously test when I get that
far, but I'd prefer to try the anticipated faster route first, if anyone has
existing experience they can share:

1) Index a series of circles of NN radius around each boundary marker
(lat/long point). Run a search on POI for those that fall within any of the
specified circles.

2) Index a set of polygons that mark the "minimum area" around the boundary
markers in question. Run a search on POI that fall within this single
polygon. The polygon will have more points, but there will be more circles to
search - my understanding of GiST is limited so I'm not sure if there's a
performance benefit to searching many circles or a few polygons.

My tables are of this size:
# of POI: 50,000
# of zip blocks (with and without regions): 217,000
# of zip blocks in a given city (and hence in a given polygon): ~5

Any thoughts or ideas?

Thank you,

Steve

p.s. I could use a GIS system alongside of Postgres but performance and
efficiency are key to this system, and it seems to me that raw GiST indexed
SQL queries are going to be fastest and create the lowest load on the server?

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
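A rough sketch of option 1 above using the built-in geometric types and a
GiST index. The table, column and zip values are invented, and the radius is
expressed in raw lat/long degrees, so this is approximate only:

-- one circle of fixed radius around each boundary marker
create table zip_area (
    zip text,
    boundary circle
);
create index zip_area_boundary_gist on zip_area using gist (boundary);

-- POI held as points; find the POI inside any circle belonging to one zip
select p.*
from poi p
join zip_area z on z.boundary @> p.location  -- circle contains point
where z.zip = '12345';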
Re: [SQL] GiST index question: performance
Thanks Oleg - very interesting stuff you are working on. You may recall I exchanged emails with you on openfts a little while ago - my ISP that manages my Pg SQL server is (in my interests) concerned about installing anything non-standard (read: unstable) onto their server. I was able to get them to install your TSearch2 b/c it's been proven many times, but I'm hesitant to even bring up Q3C since it's less widely deployed. The search method I proposed in my first email is not totally accurate but just searching circles with radii using a GiST index and standard Pg circle datatypes seems like a "close enough" solution for me (as opposed to Q3C's conical search intersections with a spherical projection). I realize that at higher latitudes my circles will be elliptical but our needs are for approximations that are very fast rather than accurate and the radii being searched are small relative to the size of the sphere (I.e. when searching Nome, find everything in +/- 40 miles and especially don't return Anchorage POI).. It's an end user database, so if the query takes 500ms, that's really too long. On the Q3C site, I see that your measure of speed is processing many, many rows in 20 hours, which is a whole different ballgame. :) Do you have a thought as to whether GiST is going to be faster/more efficient with Pg standard types of polygons or circles? I suppose I should just test out both, and quit wasting your time. I'll certainly repost to the list with whatever I uncover. I really do appreciate the help you've provided. Sincerely, Steve At 12:21 PM 3/5/2007, you wrote: On Mon, 5 Mar 2007, Steve Midgley wrote: Hi, First off, can I say how much I love GiST? It's already solved a few problems for me that seemed impossible to solve in real-time queries. Thanks to everyone who works on that project! Thanks, Steve ! I'm developing a geographic index based on a set of zip code boundaries. Points of interest (POI) will fall within some boundaries and not others. I need to search to find which POI are within a specified boundary. You POI is what we call ConeSearch query in astronomy. Please, take a look on Q3C algorithm available from http://q3c.sf.net. Some information http://www.sai.msu.su/~megera/wiki/SkyPixelization This is what we use in our Virtual Observatory project and we're able to work with 10^9 objects on moderate hardware. It doesn't use GiST but special pixelization scheme allow to use standard Btree. I think have two options (see below) and I'm wondering if anyone has an opinion or experience as to whether one or the other will have substantially different performance characteristics. I can obviously test when I get that far, but I'd prefer to try the anticipated faster route first, if anyone has existing experience they can share: 1) Index a series of circles of NN radius around each boundary marker (lat/long point). Run a search on POI for those that fall within any of the specified circles. 2) Index a set of polygons that mark the "minimum area" around the boundary markers in question. Run a search on POI that fall within this single polygon. The polygon will have more points, but there will be more circles to search - my understanding of GiST is limited so I'm not sure if there's a performance benefit to searching many circles or a few polygons. My tables are of this size: # of POI: 50,000 # of zip blocks (with and without regions): 217,000 # of zip blocks in a given city (and hence in a given polygon): ~5 Any thoughts or ideas? Thank you, Steve p.s. 
I could use a GIS system alongside of Postgres but performance and efficiency are key to this system, and it seems to me that raw GiST indexed SQL queries are going to be fastest and create the lowest load on the server? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [email protected], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [SQL] GiST index question: performance
Hi Peter, :) All my Pg code is written via (or handed to) an abstraction layer, and I actually write no functions or stored procedures at all. I write using Rails, so in this case it's a Ruby library called ActiveRecord which has a Postgres module that allows me to talk via "ActiveRecord-speak" or via direct Postgres sql commands. (For example, AR has no idea how to create a GiST index, so I issue that DDL statement manually using the special syntax - also AR is not always so smart about SQL queries so tricky ones I write by hand). Maybe I misunderstand Q3C completely but it looks like C code that has to be installed into the Postgres server itself - not a series of SQL functions that can implemented on an unmodified server. I think my ISP is fine with anything that gets installed via user-level privileges. Anything that requires root and/or anything that involves binary code they are more cautious about. To be fair, I'm cautious about the same things, but given Oleg's reputation and contributions to Pg, I wouldn't be so concerned about Q3C specifically. Am I ignorant of something fundamental in this conversation? I really do appreciate any education or insight here. Are C code "patches" or functions more of a risk to server stability/reliability than higher level code? Or am I speaking gibberish? Thanks, Steve At 01:01 AM 3/6/2007, Peter Eisentraut wrote: Steve Midgley wrote: > my ISP that manages my Pg SQL server is (in my interests) > concerned about installing anything non-standard (read: unstable) > onto their server. I was able to get them to install your TSearch2 > b/c it's been proven many times, but I'm hesitant to even bring up > Q3C since it's less widely deployed. How do you manage to get your own code installed under that theory? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] A form of inheritance with PostgreSQL
Hi Greg,

While not in a C++ framework, you might find that it's not too hard to
implement something similar in your system - It's called "Single Table
Inheritance." References to the Ruby on Rails implementation here:
http://wiki.rubyonrails.org/rails/pages/SingleTableInheritance

It's based on Martin Fowler's Patterns of Enterprise Architecture book -
please find references to his original patterns here:
http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html

The key, I believe, is simply adding a "type" and a "parent_id" to the
"class" table, so you can model all your types and their hierarchical
relations. Fowler's diagram is pretty clear. I think then you would store the
data in another table (or tables) and link into this inheritance structure to
establish ancestry for any piece of data (some people try to store the data
in this table too, but I think that's a mistake personally).

If I understand what you're trying to do, you can use this design pattern in
your application language to implement an inheritance scheme without any
special database features (i.e. in a SQL-standard manner).

I hope this is helpful,

Steve

At 12:28 AM 3/9/2007, [EMAIL PROTECTED] wrote:
Date: Thu, 08 Mar 2007 13:01:51 -0500
From: Greg Toombs <[EMAIL PROTECTED]>
To: [email protected]
Subject: A form of inheritance with PostgreSQL
Message-ID: <[EMAIL PROTECTED]>

Hello. I'm trying to figure out how to nicely implement a C++ class-like
system with PostgreSQL. Consider the following:

Tables Fruit, Apple, Orange

I want to design the foreign key scheme such that there are relations between
fruit and apple, and fruit and orange, that imply that apple is a fruit, and
orange is a fruit.

I don't want to eliminate the existence of Apple and Orange tables, because
there will be columns specific to both Apple and Orange; if I include these
columns in Fruit, then if Fruit is an Orange, the Apple columns will be
needlessly present in Apple rows.

The different ways of implementing this scheme that I've thought of (some
uglier than others):

- Have Fruit contain foreign keys to both Apple and Orange, and write a check
constraint in Fruit specifying that exactly one of (Apple FK, Orange FK)
needs to be non-null. The disadvantage of this method is that it isn't
exactly loosely coupled. For every other fruit type table I implemented I'd
have to go back and add a foreign key in Fruit.

- Have a foreign key in Apple to Fruit, and in Orange to Fruit; then somehow
create a constraint that imposes uniqueness on the union of foreign keys in
both Apple and Orange. To figure out what type of fruit a Fruit row is, run a
query for foreign keys in Orange and Apple matching the primary key of Fruit.
You'd also want to somehow create a constraint that the result of this query
should always return exactly one row (perhaps with a trigger?)

Any advice will be appreciated! As I'm relatively new to Postgre, I might
need some help with the actual implementation as well.

Thank you.

- Greg

---(end of broadcast)---
TIP 6: explain analyze is your friend
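A hedged SQL sketch of the pattern described above (all names are invented;
this is one way to model it, not code from the thread):

-- one row per type; parent_id expresses the inheritance hierarchy
create table fruit_type (
    id serial primary key,
    type text not null,                           -- e.g. 'fruit', 'apple', 'orange'
    parent_id integer references fruit_type(id)   -- null for the root type
);

-- data rows link to their type, which establishes their ancestry
create table fruit_item (
    id serial primary key,
    type_id integer not null references fruit_type(id),
    name text
);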
Re: [SQL] log file permissions?
Gerardo Herzig wrote:
> Hi dudes. I have my pg log file rotation configured so i have a
> psql_`today`.log. The thing is, can i configure postgres so the file
> permissions (today is 600) could by 640, so developers could login and
> tailf the logfile (without using the postgres superuser, course)?

This really belongs over in admin.

Are you logging through syslog or stdout (i.e. what are the following
settings in postgresql.conf and are they in any way overridden by your
startup scripts)?:

syslog =
syslog_facility =
syslog_ident =

If you are logging through syslog, then postgres has no control over those
permissions. You will have to look at your syslog daemon configuration.

Cheers,
Steve

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [SQL] Issue with copying data from a text file.
Hi,

I think I had the exact same problem as you do a while back and I solved it
by removing the header row and the "CSV HEADER" clause of the statement. For
the large files I had, it was easier (for me) to remove the header row than
it was to escape out all the quotes (or regen the file):

COPY deal_lines_temp_load FROM 'c:/temp/autodrs_deal_lines.txt' WITH DELIMITER AS '^';

I think the parser doesn't look for nor generate quoted rows except when CSV
is specified. It would be nice if there was a way to specify a "HEADER" row
without invoking CSV parsing rules (friendly hint to core devs!) :)

Let us all know if that works!

Steve

At 03:14 AM 3/20/2007, you wrote:
Date: Tue, 20 Mar 2007 11:25:38 +0900
From: Paul Lambert <[EMAIL PROTECTED]>
To: [email protected]
Subject: Issue with copying data from a text file.
Message-ID: <[EMAIL PROTECTED]>

I have a procedure in place that copies data from a caret delimited text file
into a table storing some information. One of the fields in the table
contains an item description which may contain item dimensions such as -
17" alloy wheels

The problem I am getting when I do my load is I believe due to the presence
of the double quotation marks giving the copy the impression that it is to
include the information following as a single text string until it gets to
the next set of double quotes. As a result, I get the following:

AutoDRS=# COPY deal_lines_temp_load FROM 'c:/temp/autodrs_deal_lines.txt'
WITH DELIMITER AS '^' CSV HEADER;
ERROR: value too long for type character varying(30)
CONTEXT: COPY deal_lines_temp_load, line 87, column order_desc: "17 5 spoke
alloy wheels.^1291.18^117.38^983.69^1291.18^^C^^

The column as you can see is defined as a 30 character field, the load
contains in this column ^17" 5 spoke alloy wheels.^

I note an option in the COPY command to specify the quote character,
defaulting to double quote. The problem being a single quote will also be
used in the data, as will other characters. Is there any way to get a copy to
have no quote character? I.e. read the file and put whatever is between the
caret characters straight into the appropriate field exactly as is.

TIA,
Paul.

--
Paul Lambert
Database Administrator
AutoLedgers

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [SQL] [pgsql-sql] Daily digest v1.2492 (19 messages)
Hi John,
It sounds like a disk-bound operation, so cpu is not maxed out. I'm not
clear on all the details of your operation but it sounds like you're
using Java to do row-by-row based inserts, selects and updates within a
transaction, from a file. This can be a very slow process if you have
many rows. The OS stats you describe fit that theory (but not
conclusively).
If you are using (pseudo-)code such as:
Open file {
Read line {
select from Pg: "select from [other_table] where val =
[line[colN]]"
exec to Pg: "insert into [table] (col1, col2, ...) values
(line[col1], line[col2]..."
}
}
You can radically speed up such a system by using the "copy"
(http://www.postgresql.org/docs/8.2/interactive/sql-copy.html) command
to load all the data at once from the file into Pg and then do
post-processing with Java/SQL to get all the fields looking right.
Doing a bulk update with a join across several tables is so much faster
than looping through them with a wrapper in Java (or other lang) you
won't believe it.
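A hedged sketch of that pattern (table, column and file names are invented;
note that COPY FROM a server-side file needs superuser rights, so \copy from
psql or COPY FROM STDIN from the client is the usual route):

-- 1. load the raw file into a staging table in one shot
create temp table staging (key_val text, col1 text, col2 text);
copy staging from '/path/to/datafile.txt';

-- 2. then do the lookups and inserts in bulk, set-based, instead of per row
insert into target_table (col1, col2, other_id)
select s.col1, s.col2, o.id
from staging s
join other_table o on o.val = s.key_val;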
I hope this helps and is on-topic for you.
Steve
At 09:38 AM 4/3/2007, [EMAIL PROTECTED] wrote:
Date: Tue, 03 Apr 2007 22:16:13 +0800
From: John Summerfield <[EMAIL PROTECTED]>
To: [email protected]
Subject: A long-running transaction
Message-ID: <[EMAIL PROTECTED]>
I have a Java (java 1.1) program that I wrote some years ago, to read
records from a text file and insert them into a PostgreSQL database.
One of the assumptions I made was that one file contained one day's
data, maybe as many as 1500 records, and I coded it to do the whole
lot
as one transaction so either a single file was loaded in its entirety,
or none of its data was.
I lost the Java code, but revived the idea and I've collected about
two
years' data using (Linux) shell scripts, and loading the data using
psql.
Then, I found the Java code on a disused hard disk:-)
I made the necessary changes for it to build in java 1.5, and used
psql
to extract data from my new database in the correct format for the old
program. This time, I have a little more data than I ever loaded at
once
before:
[EMAIL PROTECTED]:~$ wc -l testdata
6242217 testdata
[EMAIL PROTECTED]:~$ \ls -hl testdata
-rw-r--r-- 1 summer summer 285M 2007-03-28 22:32 testdata
[EMAIL PROTECTED]:~$
Now, it wouldn't surprise me if postgresql used lots of memory - but
how
much could it possibly need? My laptop, where I first tried this, has
1.25 Gbytes, so I could allow it some.
It wouldn't surprise me a lot if it used lots of memory and caused all
sorts of problems, but no, it's chugging away, still using no more RAM
than it could have had on my old Pentium 133 all those years ago.
In the process of checking it out, I've set it running on a machine
with
a AMD Sempron(tm) 2400+ running Kubuntu 6.10 (kernel is
2.6.17-6-server-xen0) and 512 Mbytes of RAM.
This is the java program:-)
summer pts/6:0.0 Thu205days 1:07 1:07
/usr/bin/gij-4.1 -cp /usr/s
It's been running five days so far, and I can see where it's up to by
attaching strace. It's reading 2k of the input file every few seconds.
Okay, clearly something's wrong, and I don't think it's all my cruddy
code.
No probs swapping:
[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        460800     456472       4328          0        860     262164
-/+ buffers/cache:      193448     267352
Swap:      1461872        284    1461588
[EMAIL PROTECTED]:~$
It is hitting the disk pretty hard now on this machine, but the
laptop's
still going too, and the disk seems to run about half the time, part
of
a second running, part idle (but the intervals are getting shorter).
It struck me as fairly curious that neither postgresql nor the
application was hogging the CPU.
Perhaps the laptop is more interesting: look at the size of the buffer
pool:
[EMAIL PROTECTED]:~> free
             total       used       free     shared    buffers     cached
Mem:       1295528    1268548      26980          0       3976     392388
-/+ buffers/cache:      872184     423344
Swap:      1941496      32656    1908840
[EMAIL PROTECTED]:~>
Again, no problem with over-use of RAM, and I'm logged on using KDE
too
and that's running fine.
It's been running a little longer here:
summer pts/2328Mar07 5days 25:12 25:11 java -cp
/home/summer/Classes/:/usr/share/p
This is Sun's Java 1.5 on OpenSUSE 10.2.
This is what suggested I should write:
[EMAIL PROTECTED]:~> procinfo
Linux 2.6.18.8-0.1-default ([EMAIL PROTECTED]) (gcc 4.1.2 20061115) #1
1CPU [Echidna.]
Memory:      Total       Used       Free     Shared    Buffers
Mem:       1295528    1271720      23808          0       3716
Swap:      1941496      32656    1908840
Bootup: Tue Mar 27 18:50:19 2007    Load avera
Re: [SQL] Question on interval
Rodrigo De León wrote:
> On 4/20/07, Wei Weng <[EMAIL PROTECTED]> wrote:
>> Hi all.
>>
>> How do I write a query that converts an integer to the interval type?
>>
>> Like convert integer 10 to INTERVAL '10 seconds'?
>>
>> The integer is a column in a table though, so it is more like convert
>> integer tbl.theInteger to INTERVAL 'tbl.theInteger seconds".
>>
>> Thanks!
>>
>> Wei
>
> select (10||' sec')::interval;

Or, if you prefer:

select 10*'1 second'::interval;

Cheers,
Steve

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
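Applied to the column from the question above (tbl and theInteger are the
names used in the original post):

select theInteger * interval '1 second' from tbl;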
[SQL] Selecting rows with "static" ordering
Hello, I have a strange problem (or one that I've never had before anyway). I am searching for a list of "id's" for a given table (these id values are generated at run-time and held statically in an application-local variable). From that application, I want to retrieve all those rows, and I want them in the order they are currently stored in that variable. So take for example this foreign application variable: ids = "3,2,5,1,4" The application then executes this sql: select * from table where id in (3,2,5,1,4) As-is, of course, the above query will return the 5 records in a semi-random (i.e. unpredictable/unreliable) order. And I don't want to just "order by id" - I want to "order by id(3,2,5,1,4)" (if you see what I mean) Is there a "neat trick" that anyone knows for pulling this off in a single query? Basically right now I'm issuing 5 queries to the backend to ensure ordering but this horribly inefficient. Any input or advice would be appreciated, Steve ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [SQL] subtract a day from the NOW function
Scott Marlowe wrote:
> Campbell, Lance wrote:
>> Michael,
>> So based on your feedback would it be better to do option A or B below?
>>
>> 1) I have a timestamp field, "some_timestamp", in table "some_table".
>> 2) I want to compare field "some_timestamp" to the current date - 1 day.
>> I need to ignore hours, minutes and seconds.
>>
> You might want to use date_trunc then:
>
> select * from sometable where date_trunc('day',tiemstampfield) >
> date_trunc('day',now() - interval '1 day');
>
> or something like that.
Beware in the "or something like that category" that PostgreSQL
considers "1 day" to be "24 hours" thus depending on whether the
timestampfield is with or without TZ and where you do your truncation
(before or after subtracting), you can end up with unexpected results in
the vicinity of DST changes:
select '2007-03-12'::timestamptz - '1 day'::interval;
?column?
2007-03-10 23:00:00-08
select '2007-03-12'::timestamp - '1 day'::interval;
?column?
-
2007-03-11 00:00:00
Especially note that truncating a timestamptz preserves the timezone
info so you will very likely need to address issues on the days that
Daylight Saving starts or ends:
select date_trunc('day',current_timestamp);
date_trunc
2007-06-07 00:00:00-07
Cheers,
Steve
---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
Re: [SQL] subtract a day from the NOW function
Michael Glaesemann wrote:
> On Jun 7, 2007, at 13:58 , Steve Crawford wrote:
>> Beware in the "or something like that category" that PostgreSQL
>> considers "1 day" to be "24 hours"
>
> Actually, recent versions of PostgreSQL take into account daylight saving
> time in accordance with the current PostgreSQL time zone setting, so
> '1 day' in the context of timestamptz +/- interval may be 23, 24, or
> 25 hours.

Interesting - thanks. That's one more thing I need to check when upgrading my
server.

If my reading is correct, there are some subtle gotchas here. If I go back
and try on a 7.4 machine it appears that interval makes a DST correction if
the interval includes a unit of "month" or greater but does not make a
correction for "week" or "day" intervals. On 8.2 I'm seeing an adjustment if
the interval includes units of "day" or greater (i.e. '1 day' and
'1 day 01:00' get adjusted but '24 hours' and '25 hours' do not).

But PG doesn't follow the same rules in subtracting timestamptz values, so
operations involving timestamps and intervals are (sometimes) not reversible:

select timestamptz '2007-11-05' - timestamptz '2007-11-04';
    ?column?
-----------------
 1 day 01:00:00

select timestamptz '2007-11-04' + interval '1 day 01:00:00';
        ?column?
------------------------
 2007-11-05 01:00:00-08

Cheers,
Steve

---(end of broadcast)---
TIP 6: explain analyze is your friend
[SQL] Increment a sequence by more than one
Hi,

I'm writing an import app in a third party language. It's going to use "copy
to" to move data from STDIN to a postgres (8.2) table. There are some
complexities though: it's going to copy the records to a "versioned" table
first, and then at a later time the records will be copied by a different
procedure to the "live" table. The live table and versioned table are
identical in terms of their field definitions. But there is no sequence
associated with the versioned table (whose primary key is "id" plus
"import_group_id", whereas the live table's pk is just "id"). So all
versioned table entries must already "know" what their id would be in the
live table. (This makes sense for other business processes we have, but it's
a bit of a problem in this instance.)

My problem: I'd like to be able to grab a block of id's from the live table's
pk sequence. So let's say my importer has 5,000 new rows to import and the
current max pk in the live table is 540,203. I'd like to be able to increment
the primary key sequence in such a way that I get a block of ids all to
myself and the sequence is reset to 545,203 with a guarantee that all the
id's between 540,203 and 545,203 are unused.

I'm guessing this can be done with a stored procedure, but if possible I'd
like to be able to run this command from my third party app without calling a
stored procedure (we try not to use stored procedures here b/c we code
entirely in this third party language - if we had to, it's possible we could
install a stored procedure though).

But since I've seen so much magic on display from people on this list, I'm
going to ask if it's possible to do this solely from PG SQL sent from a third
party language? The main tricky bit seems to be ensuring that everything is
locked so two such increment calls at the same time don't yield overlapping
blocks of ids. Is there a way to "lock" the sequence generator for the
duration of a "nextval" and "setval" call? Since pk sequence functions like
nextval cannot be rolled back, I'm guessing that "begin/end" won't do the
job? I.e.:

-- need "magic lock" statement on pk sequence here
nextval        -- returns 540203
setval(545203) -- now sequence is set to where I want it and I "own" 5000 id's
-- release magic lock here

My fallback is to just have a routine that calls "nextval" a bunch of times
and stores all the id's it gets - they may or may not be sequential but
they'll be unique. This is going to be a really slow way to get a large
number of id's of course and just seems plain wrongheaded in many ways.

Any insights? All help is appreciated and input on a better way to solve the
problem completely is of course welcome as well.

Sincerely,
Steve
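One hedged sketch of the fallback approach above, done in a single statement
rather than thousands of round trips (the sequence name is assumed; nextval
is atomic, so the returned ids are guaranteed unique even across concurrent
importers, though not necessarily contiguous):

-- fetch 5000 unique ids from the live table's pk sequence in one query
select nextval('live_table_id_seq') as id
from generate_series(1, 5000);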
