[SQL] Question regarding indices
Hello List,

I have a small question about the order of values in a query. Assume I have a table with the following fields:

uid INT, data BIGINT, hits INT

And a unique index on (uid, data). I use the libpq C API to query data from the table. The query is something like this:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

Would the speed of the query be influenced if I sorted the data? I can imagine that just querying a handful of bigints would not make a big difference, but what about several thousand values? Would sorting them and sending the SQL query with ordered data influence the speed of the query?

// Steve
[SQL] Question about PQexecParams
Hello list,
I would like to call a function from my C application by using libpq and
PQexecParams. My problem is that I don't know how to specify that I want to
send an array to the function.
Assume the function is called lookup_data and takes the following parameters:
lookup_data(integer,integer,bigint[])
I would like to specify the OIDs with my query. How would I do that? Assume I
would like to pass 3 bigint values:
const char *paramValues[5];
Oid paramTypes[5];
int paramLengths[5];
int paramFormats[5];
int32_t ivalue1 = htonl(value1);
paramValues[0] = (char *)&ivalue1;
paramTypes[0] = INT4OID;
paramLengths[0] = sizeof(ivalue1);
paramFormats[0] = 1;
int32_t ivalue2 = htonl(value2);
paramValues[1] = (char *)&ivalue2;
paramTypes[1] = INT4OID;
paramLengths[1] = sizeof(ivalue2);
paramFormats[1] = 1;
etc...
How would I tell libpq that the next 3 values are an array of bigint?
I tried to use INT8OID and specify the query like below but that did not work:
SELECT * FROM lookup_data($1,$2,{$3,$4,$5})
Probably I have to set the query to be:
SELECT * FROM lookup_data($1,$2,{$3})
Or:
SELECT * FROM lookup_data($1,$2,$3)
But what would I set for paramTypes? How can I say that the values are an array
of bigint? I assume that I can set paramValues to be an array and paramLengths
to be the size of one value multiplied by the number of elements in the array.
I am somewhat lost and don't know how to call the function and pass an array through
libpq.
Can anyone help me with this?
// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 10:05:18 -0400
> From: Michael Gould
> To: Steve
> Subject: Re: [SQL] Question regarding indices
>
> Steve,

Hello Michael,

> If I remember correctly the sort only works on the final result set and so
> doing a sort before the select isn't going to help

I think you got me wrong. It's not that I want the RESULT to be sorted. I only asked if sorting the values in the query would have an effect on the speed of the query.

Or, to ask it the other way around: is an index in PostgreSQL sorted or not? If it is sorted and PostgreSQL is looking for entries in the index, then having the values pre-sorted would probably be beneficial. Right?

// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 11:04:16 -0400
> From: Tom Lane
> To: "Steve"
> CC: [email protected]
> Subject: Re: [SQL] Question regarding indices
>
> "Steve" writes:
> > Would the speed of the query be influenced if I would sort the data?
>
> It's unlikely to make enough difference to be worth the trouble.

Making a quick sort is ultra easy in C. Anyway... is there a difference in the speed of the query with pre-sorted values or not? If there is one, then I will go and sort the values. Right now I have a quick sort implemented, but I will probably do a lazy quick sort and then a final insertion sort, because insertion sort is faster than quick sort on an almost-ordered dataset.

Probably I am splitting hairs here about the speed, but I really want to minimize the time PostgreSQL needs to return the data. I personally am happy with the speed when using PostgreSQL, but the application I use has a MySQL driver too, and I have some users claiming that MySQL is faster than PostgreSQL, transfers less data over the wire, etc., and I want to optimize the PostgreSQL part to be on the same level as the MySQL part. So everything that helps squeeze the last nanosecond out of the PostgreSQL side is welcome. I already switched to binary transmission in order to minimize the data sent over the wire, and I have added a function that does more or less what MySQL's proprietary "INSERT ... ON DUPLICATE KEY UPDATE" does.

I hate it when users compare apples with oranges, but what can I do? You cannot explain to them that PostgreSQL is different and more standards-compliant, and that the chance of losing data is lower with PostgreSQL than with everything MySQL is doing (MyISAM tables, etc.). It's pointless to explain it to them. It's like trying to explain to a mole how the sun shines.

So all I want is to explore the available capabilities of PostgreSQL to get the best out of libpq that I can. If you have any recommendation on what I should look at in order to get better speed, let me know.

> regards, tom lane

// Steve
Re: [SQL] Question regarding indices
Original message:
> Date: Sat, 11 Sep 2010 11:08:00 -0400
> From: Lew
> To: [email protected]
> Subject: Re: [SQL] Question regarding indices
>
> On 09/11/2010 08:29 AM, Steve wrote:
> > Would the speed of the query be influenced if I would sort the data?
>
> What do you mean by "sort the data"? Which data?

I mean sorting the values in the brackets. Instead of:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

I would then send this:

SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (19,201,590,659,2033,2499)

Of course this is a small dataset, but the query usually has thousands of elements and not only the above 6.

> > I can imagine that just querying a bunch of bigint would not make a
> > big difference but what about several thousand of values? Would sorting
> > them and sending the SQL query with ordered data influence the speed of the query?
>
> Send the query from where to where?

Sending the query from my application to the PostgreSQL server.

> Are you referring to a sort of the items in the IN subselect?

Yes.

> My guess is that sorting that won't matter but it's only a WAG.

What is "WAG"?

> --
> Lew

SteveB
Re: [SQL] Question about PQexecParams
Original message:
> Date: Sun, 12 Sep 2010 01:52:04 +0400
> From: Dmitriy Igrishin
> To: Steve
> CC: [email protected]
> Subject: Re: [SQL] Question about PQexecParams
>
> Hey Steve,
>
> > I tried to use INT8OID and specify the query like below but that did not work:
> > SELECT * FROM lookup_data($1,$2,{$3,$4,$5})
>
> Incorrect.
>
> > Probably I have to set the query to be:
> > SELECT * FROM lookup_data($1,$2,{$3})
>
> Incorrect.
>
> > Or:
> > SELECT * FROM lookup_data($1,$2,$3)
>
> Correct.

Thanks.

> You may specify a data type by OID (1016 for bigint[]; please refer to
> http://www.postgresql.org/docs/9.0/static/catalog-pg-type.html
> to obtain information about types) or attach an explicit cast to a
> parameter symbol to force treating it as bigint[] (or any specified type), e.g.
> SELECT * FROM lookup_data($1, $2, $3::bigint[])

Thanks.

> Please note that in this case you must pass to paramValues[2] a textual
> representation of bigint[], e.g. '{1,2,3}'.
> It's not necessary to specify a length for text-format parameters (it is ignored).
> The length is essential only if you transmit data in binary format.

Really? I must use a textual representation of the array? Why? I searched the Internet up and down and, as far as I can tell, there is a possibility to send the array in binary. I have to add a special header to the array, do the usual host-to-network byte order conversion, and then I can send the array in binary.

Unfortunately I cannot find enough information about the format of the whole header + array. The header looks easy to create (just 3 times 4 bytes for: 1) the number of dimensions (aka ndims), 2) whether the array has null elements (aka hasnull), 3) the array element OID (aka typeid, in my case INT8OID aka 20)), followed by the content of the array. And here I have a problem: I don't know how the data following the header should look. I think each value is in a block of 8 bytes (converted from host to network order), but I am not sure (the examples I have seen are all for int4 and not for bigint). I am confused by the two examples I have found so far. One of them divides those 8 bytes into two 4-byte blocks and adds something they call "dims" and "lbound".

I have no clue what that is. I think the PostgreSQL function array_recv() is responsible for the format, but I cannot find any documentation about the format of a binary array representation. Maybe you know a place where I can read about how to send an array of int64_t to the PostgreSQL backend in binary? I know that I could go the textual-representation path, but I really want to send the data in binary. And I don't want to / can't use libpqtypes (which would, by the way, make the task ultra easy). Maybe I cannot see the forest for the trees, but I really cannot find any documentation on how to create a correct struct representing an array datatype. Can it be that this part is not documented at all?

> > I am somehow lost and don't know how to call the function and pass an
> > array to libpq.
> >
> > Can any one help me with this?
>
> Hope this helps.

Yes. You helped me a bit. But I am still not where I want/need to be.

> Regards,
> Dmitriy

// Steve
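For reference, a quick way to confirm the OIDs mentioned above, and what the explicit-cast form of the call looks like (a sketch based on the thread's example function, not a tested program):

-- int8 (bigint) is oid 20; its array type _int8 (bigint[]) is oid 1016
SELECT oid, typname FROM pg_type WHERE typname IN ('int8', '_int8');

-- Query string for PQexecParams using the cast suggested above; with a text-format
-- parameter, paramValues[2] would simply point at the string "{2033,2499,590}",
-- and paramTypes[2] can be left 0 so the server infers the type from the cast.
SELECT * FROM lookup_data($1, $2, $3::bigint[]);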
[SQL] Question on string value expression wildcarding
I have LIKE expressions:

(a) name LIKE 'kp.dhs.%'
(b) name LIKE 'kp.dhs%'

where the name column contains strings prefixed with "kp.dhs.". I'm using postgresql 7.0.2.

Expression (a) fails to match any names, while (b) matches all strings prefixed with "kp.dhs", including (as expected) those prefixed with "kp.dhs.". So I take it that ".%" has some special meaning in wildcarding, but my (limited) SQL references don't mention this case.

Is this To Be Expected SQL behavior? If so, what expression can be used to match only strings prefixed with "kp.dhs."?

Thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
[SQL] Using a postgres table to maintain unique id?
Hi,

I have a distributed application that needs unique 64-bit "id" values. The current implementation keeps the current value for this id in a file on one system and uses a process to provide access to this file via CORBA.

However, I also use Postgres (7.0.2) throughout this application and it seems cleaner to me to keep the current id value in a table and just use postgres to provide access (with a trigger function to increment the id on access).

Is this reasonable? Is it fast? (I need 10 or more IDs generated each second.) Can I avoid having the table gradually fill with "old" rows for this entry, and thus avoid the need to run VACUUM ANALYZE periodically?

Any tips on how to implement the trigger would be appreciated. (Is it possible to generate an int8 sequence value?)

Thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
Re: [SQL] Using a postgres table to maintain unique id?
Poet/Joshua Drake wrote:
>
> > However, I also use Postgres (7.0.2) throughout this
> > application and it seems cleaner to me to keep the current
> > id value in a table and just use postgres to provide access
> > (with a trigger function to increment the id on access).
>
> Why not a sequence?

Can someone show me how to create (and use) an int8 sequence?

> > Is this reasonable? Is it fast? (I need 10 or more IDs
> > generated each second.) Can I avoid having the table
> > gradually fill with "old" rows for this entry, and thus
> > avoid the need to run VACUUM ANALYZE periodically?
>
> The only problem I have had with this type of thing is when a number gets
> deleted, it does not get recycled.

Fortunately, I don't want any number to ever get recycled - the id needs to be unique throughout the 25+ year lifetime of the project.

The table would have a single row with a single column. Selecting that table cell would return the current value, but leave the value incremented in the table cell (it's ok if it increments the value before returning).

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
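For reference (this was not available on the 7.0.x server in the thread): sequences in modern PostgreSQL are 64-bit, and since version 10 the element type can be declared explicitly, so a bigint sequence is just:

-- the AS clause is PostgreSQL 10+; the sequence name is a placeholder
CREATE SEQUENCE global_id_seq AS bigint;

-- hand out ids
SELECT nextval('global_id_seq');

-- or attach it to a column as its default
CREATE TABLE things (
    id bigint DEFAULT nextval('global_id_seq') PRIMARY KEY,
    payload text
);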
Re: [SQL] Using a postgres table to maintain unique id?
Marten Feldtmann wrote:
>
> Throw away all the "hardwired" stuff and do it with software. I
> once described an algorithm on one of these lists for creating
> unique values for clients with minimal interaction with the
> database.
>
> The result: query once at the beginning of your application,
> generate your ids "offline" at the maximum speed you may
> have, and store your last generated id when your client
> finishes. Superior to all the "hardwired" database solutions!

Yes, but...

(1) The application I have is composed of about 50 processes running on 3 different OS/architectures (Linux/intel, Solaris/sparc, and VxWorks/ppc). The IDs I need must be unique across all processes (I suppose one solution would be to provide each ID with a unique prefix based on the process that is running, but...)

(2) Some of these systems are real-time boxes that might get rebooted at any moment, or might hang for hardware-related reasons [I'd like to be able to say that all of the processes could detect imminent failure, but unfortunately, I can't]. So determining when a client "finishes" is not always possible, which prevents (he claims) the above solution from claiming ID uniqueness.

However, it might be sufficient to provide a process on the postgres DB machine (if *that* machine dies, *everything* stops...) that serves IDs via CORBA to all the other applications and (internally) uses the "software" approach given above. This process could "sync" with the database every N seconds or so (where N might be < 1.0). This, while still not guaranteeing uniqueness, would at least come pretty close...

It would still be nice to avoid having to VACUUM ANALYZE this table, though, and it "feels" as though it is duplicating functionality already provided by postgres DB backends.

I'll think about this solution - thanks!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
Re: [SQL] Using a postgres table to maintain unique id?
Thomas Swan wrote:
>
> At 11/13/2000 06:22 PM -0800, Michael Teter wrote:
> >
> > From what I can tell (both from docs and doing a
> > describe on sequences in my database), a postgresql
> > sequence is an int4, not an int8, and thus you are
> > limited to a max of 2.1 billion values.
> >
> > If you require an int8 sequence, you'll probably have
> > to manage your own and just use an int8 column.
>
> I had originally started using int8 and creating custom sequences. However,
> as mentioned in a previous post, there is an inherent performance penalty in
> using int8 over int4. Tom Lane advised me that the int8 routines are an
> emulated or synthesized data type. In the tests I did on our 7.0.2 server I
> noticed about a 25-30% decrease in performance when using complex joins on
> tables containing referential keys and primary keys all in the int8 data type.
>
> This might be something to think about as well.

Thanks. Because of these and other comments people have made, I've gone back to using a flat_file-with-server approach instead of adding a table to my postgres DB. While an int4 *might* work, it doesn't handle the "worst-case" scenario (which is up around 15 billion values).

Thanks to everyone for your comments and suggestions!

--
Steve Wampler - SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]
[SQL] Selecting Most Recent Row
Ok, here is the problem.

Table: Widgets
Fields: Unique_Key, DateStamp, Batch_Number, Instructions

Basic Select Statement:
select Unique_Key from Widgets where Batch_Number >= 'inputedvalue'

Problem: Some Batch_Numbers might be duplicated over a period of time. I want to select the most recent of these Batch Numbers.

--
Steve Meynell
Candata Systems
Re: [SQL] Selecting Most Recent Row
Joel,

Thank you very much. I gave that a try and it worked perfectly. It definitely was the distinct keyword I was missing.

Thanks Again,
Steve

Joel Burton wrote:
>
> Will DateStamp be the date of insertion? If so, is it that you want
> the record for the most recent (largest) datestamp for each
> batch_number?
>
> something like
>
> SELECT DISTINCT ON (batch_number) unique_key, datestamp,
> batch_number, instructions FROM widgets ORDER BY batch_number,
> datestamp desc;
>
> (sort by batch then by date (last first) and show the first (aka
> 'distinct') row, considering only the batch_number for distinctness)
[SQL] Help Retrieving Latest Record
Hi,

I am not sure if this went through the first time, so here it is again.

Ok, what I am trying to do is select out of the database the latest record meeting a certain criteria.

Example:

 Number | Fruit   | Date
--------+---------+----------------
 15     | Apples  | July 20, 1999
 20     | Oranges | June 7, 2000
 13     | Pears   | Jan 31, 2000
 17     | Apples  | April 10, 1999

Now what I need to do is select the oranges out because the date is the latest one, something like:

select * from basket where max(date);

This would yield me: 20  Oranges  June 7, 2000

I know this doesn't work, but I need something like it. Or something like:

select * from basket where max(date) and fruit='Apples';

This would yield me: 15  Apples  July 20, 1999

Thank you in advance,

--
Steve Meynell
Candata Systems
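For the record, some ways the "latest record" query can actually be written (a sketch against the basket example above; DISTINCT ON and ORDER BY ... LIMIT are PostgreSQL-isms, the max() subquery is the portable form):

-- newest row overall
SELECT * FROM basket ORDER BY date DESC LIMIT 1;

-- newest row for a particular fruit
SELECT * FROM basket
 WHERE fruit = 'Apples'
   AND date = (SELECT max(date) FROM basket WHERE fruit = 'Apples');

-- newest row per fruit in one query
SELECT DISTINCT ON (fruit) * FROM basket ORDER BY fruit, date DESC;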
[SQL] Heres a good one...
Ok, here is what looks like a good one that has stumped me. Let me set it up...

I have two tables in my database test. They are called journal and distrib.

journal looks like this (condensed):

         Table "journal"
   Attribute    |  Type   | Modifier
----------------+---------+----------
 objectid       | integer |
 posting_date   | char(4) |
 gl_update_flag | char(1) |

And distrib looks like this (condensed):

          Table "distrib"
    Attribute     |  Type   | Modifier
------------------+---------+----------
 objectid         | integer |
 distrib_objectid | integer |
 source_objectid  | integer |

The dataset for each is as follows.

journal:

test=# select * from journal;
 objectid | posting_date | gl_update_flag
----------+--------------+----------------
      100 | March 31     | H
      101 | March 31     | H
      102 | April 02     | Y
      103 | April 02     | H
      104 | March 14     | H
      105 | February 01  | H
      106 | February 01  | H
(7 rows)

distrib:

test=# select * from distrib;
 objectid | distrib_objectid | source_objectid
----------+------------------+-----------------
        1 |              103 |             100
        2 |              104 |             100
        3 |              101 |             102
        4 |              101 |             105
(4 rows)

Now the trick here is... I want to update the gl_update_flag in journal to Y for all of the records where the gl_update_flag is H now and the posting date is before or on March 31, and where the objectid from journal matches either the distrib_objectid or the source_objectid from the distrib table we also need to update the opposite journal entry to Y. And can this be done in one command?

An example from the above data set: journal objectid 100 would change, and so would 103 and 104. Journal objectid 101 would update 101 and 102 and 105. And journal objectid 106 would only update 106.

Any Ideas?

Thanks in Advance,
Steve
[SQL] Re: Heres a good one...
First, thank you very much for trying, but you were a bit off the mark - though close. Second, posting_date being of type char(4) is quite legal and irrelevant for this exercise, but for argument's sake let's say char(15). Apart from all that...

Your first SQL statement was close, but it only updated 4 out of the possible 6 that it should have updated. journal.objectid 103 should have been updated because journal.objectid 100 was, and journal.objectid 106 should have been because it alone met the "before March 31 and update flag = H" criteria. But it is a start for me, thanks.

And your second SQL statement just didn't work. It said 'ERROR: parser: parse error at or near "select"'.

Steve

Anuradha Ratnaweera wrote:

> First, posting_date in journal can _NOT_ be of type char(4)! I guess it is a "date".
>
> Try
>
> update journal set gl_update_flag='Y' from distrib where
> journal.gl_update_flag = 'H' and journal.posting_date <= '2001-03-31' and
> (journal.objectid = distrib.distrib_objectid or journal.objectid =
> distrib.source_objectid)
>
> or
>
> update journal set gl_update_flag='Y' where gl_update_flag = 'H' and
> posting_date <= '2001-03-31' and ((objectid in select distrib_objectid
> from distrib) or (objectid in select source_objectid from distrib))
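A possible fix-up (a sketch, not from the thread): the parse error in the second suggestion is just the missing parentheses around the IN subqueries, and the missed rows come from neither statement propagating the flag to the "opposite" journal entry. Assuming posting_date were a real, comparable date (it is not with the char() column above), a two-statement version could look like this:

BEGIN;

-- 1. flag the "opposite" entries of rows that qualify on date + flag
UPDATE journal
   SET gl_update_flag = 'Y'
  FROM distrib d, journal src
 WHERE src.gl_update_flag = 'H'
   AND src.posting_date <= '2001-03-31'
   AND ((src.objectid = d.distrib_objectid AND journal.objectid = d.source_objectid)
     OR (src.objectid = d.source_objectid AND journal.objectid = d.distrib_objectid));

-- 2. flag the qualifying rows themselves
UPDATE journal
   SET gl_update_flag = 'Y'
 WHERE gl_update_flag = 'H'
   AND posting_date <= '2001-03-31';

COMMIT;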
[SQL] Doing a regexp-based search/replace?
Hello:

I've got a table containing property_id's with values of the form ###-. I would like to discard the '-' onwards (and I can't use a substr() because I am not guaranteed a) that the '-' portion exists, or b) at what position it exists).

If this were a text file, I would use a sed expression such as:

cat textfile | sed 's/-.*$//'

I've been looking for a way to do this with PostgreSQL but so far haven't found a function that seems suitable. I thought maybe I could do it with translate, but translate doesn't appear to work with regular expressions. So far I've tried things like:

select translate(property_id, '-.*', '') from mytable;

I need to do this because the -.* portion of my property_id was entered in error, and I would like to do an update on the entire table and just have the left-hand side of the property_id column remaining.

Any ideas? Thank you in advance.

- ---< LINUX: The choice of a GNU generation. >-
Steve Frampton <[EMAIL PROTECTED]>  http://www.LinuxNinja.com
GNU Privacy Guard ID: D055EBC5 (see http://www.gnupg.org for details)
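For what it's worth, two SQL-level equivalents of that sed expression (a sketch; regexp_replace only exists on releases much newer than the 7.x-era server in this thread, while the position()/substr() form should work anywhere):

-- modern PostgreSQL (8.1+): the direct equivalent of sed 's/-.*$//'
UPDATE mytable SET property_id = regexp_replace(property_id, '-.*$', '');

-- without regexp_replace: trim at the first '-', leaving rows with no '-' untouched
UPDATE mytable
   SET property_id = substr(property_id, 1, position('-' in property_id) - 1)
 WHERE position('-' in property_id) > 0;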
Re: [SQL] Granting database level permissions...
Have a look at pg_hba.conf in your data dir. It's all in there.

Steve

"Thomas Swan" <[EMAIL PROTECTED]> wrote:
> Is it possible to grant database level access to a user in PostgreSQL?
>
> I have created a user and database, and I want the user to have full
> control over that database. I have been through the online docs and
> news groups but have not found a statement or answer to the question.
>
> Thomas
[SQL] list of tables ?
Can anyone point me in the right direction?

I need to list all the tables in a database.

I've looked at pgadmin_tables, which is empty, and pga_schema, which contains a single row I don't want to parse...

Is there an easier way to get a list of tables?

I'm on 7.2.

Ta,
Steve Brett
Re: [SQL] list of tables ? -update to question ...
Sorry... I didn't make myself clear... I have of course come across \dt before... what I meant was via SQL, as in 'select tablelist from '

Steve

> -----Original Message-----
> From: Stephane Schildknecht
> Sent: 11 July 2002 15:06
> To: Steve Brett
> Subject: Re: [SQL] list of tables ?
>
> On Thu 11/07/2002 at 16:00, Steve Brett wrote:
> > can anyone point me in the right direction ?
> >
> > i need to list all the tables in a database.
> >
> > is there an easier way to get a list of tables ?
> >
> > i'm on 7.2
>
> \d
Re: [SQL] list of tables ?
Thanks.

Steve

> -----Original Message-----
> From: Achilleus Mantzios
> Sent: 11 July 2002 15:10
> To: Steve Brett
> Cc: Pgsql-Sql (E-mail)
> Subject: Re: [SQL] list of tables ?
>
> On Thu, 11 Jul 2002, Steve Brett wrote:
>
> > i need to list all the tables in a database.
> > is there an easier way to get a list of tables ?
> > i'm on 7.2
>
> select * from pg_tables;
>
> --
> Achilleus Mantzios
> S/W Engineer
> IT dept
> Dynacom Tankers Mngmt
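For completeness, a couple of plain-SQL variants (a sketch; information_schema arrived in 7.4, just after the 7.2 release in this thread):

-- hide the system tables
SELECT tablename FROM pg_tables WHERE tablename NOT LIKE 'pg_%';

-- SQL-standard catalog view, 7.4 and later
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';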
Re: [SQL] how do i import my sql query result to a file
\? will get you a list of the commands in psql.

Steve

> -----Original Message-----
> From: Joseph Syjuco
> Sent: 18 July 2002 22:47
> Subject: [SQL] how do i import my sql query result to a file
>
> how do i import results of my select query to a file
> thanks
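To spell out the usual answers that \? will point at (a sketch; the file paths are placeholders):

-- inside psql: redirect query output to a file, then back to stdout
\o /tmp/result.txt
SELECT * FROM mytable;
\o

-- server-side: write a whole table to a file on the database server
COPY mytable TO '/tmp/mytable.txt';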
Re: [SQL] Ran out of connections
You probably didn't need to reboot - I suspect you could have probably restarted PostgreSQL and Apache (quick version) or killed the extra postgres processes. I suspect you need to look carefully at your code and method of connecting (ie. are you using mod-perl, plain old cgi perl, PHP or what). A problem with Apache 1.x is that of connection pooling/persistence. If you don't use persistent connections you suffer the overhead of opening and closing the database connection as required by your web process. You can use mod-perl and keep your connections open but if your pool of web processes that have open connections exceeds the number of connections allowed by PostgreSQL you will have problems (pooling of database connections between processes is problematic so each Apache process has its own connection). Be sure that there isn't a bug causing a cgi to abort leaving a stranded connection. I don't have experience with local unix socket connections where the client has died but when I have windoze users reboot when they had an open connection it will wait till the TCP/IP connection times out (frequently for an hour) before the PostgreSQL connection is closed. If the timeout is similar for local unix sockets then a failure in the cgi could leave open connections and you will run out quickly. If you are doing lots of database backed work you may want to check out AOLserver (http://www.aolserver.com/). It has a multi-threaded architecture featuring connection pooling and persistence "out of the box." Oh, it's free/open-source as well. Of course you can also get pooling/persistence with enterprise Java solutions such as JBoss (www.jboss.org). Cheers, Steve On Wednesday 04 December 2002 2:08 pm, Mike Diehl wrote: > Hi all. > > Twice this week, I've come to work to find my Postgres server out of > connections... effectively freezing my web server. > > Today, before I rebooted the entire system, I did a ps -auxw and kept the > file to study. I didn't find too many clients running. But I did find a > whole LOT of postgres processes running, idle. BTW, one of the postgres > processes was doing a vacuum analyze. I'm running 7.2. > > Can anyone tell me how to fix this? The out put of the ps command can be > seen at http://dominion.dyndns.org/~mdiehl/ps.txt > > Thanx in advance, ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] Ran out of connections
Doing anything unusual? Forking processes, opening multiple connections within a single CGI? Have you seen any evidence that a process that opens a connection is failing to complete normally? -Steve On Wednesday 04 December 2002 3:52 pm, Mike Diehl wrote: > On Wednesday 04 December 2002 03:25 pm, Roberto Mello wrote: > > On Wed, Dec 04, 2002 at 03:08:35PM -0700, Mike Diehl wrote: > > > Can anyone tell me how to fix this? The out put of the ps command > > > can be seen at http://dominion.dyndns.org/~mdiehl/ps.txt > > > > Are you using PHP by chance? I've seen this behavior under > > Apache+PHP before. My "fix" (workaround rather) was to disable > > persistent connections. > > Nope. I'm using Perl and cgi. I've got some perl that runs via cron, and > some more that runs via apache. I'm not even using ModPerl. > > It did occur to me that since some of my scripts communicate with other > devices, that I may have some IO blocking, or zombies, but the ps output > didn't indicate that. I can't see that many scripts running. Usually, I > see one postgres process for each script/cgi that is running. Not in this > case. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Need help paging through record sets
Sort of depends on the nature of your application. You can use offset to get specific chunks: select * from foo order by date limit 100 offset 100; You should be aware, however, that on a very large table this can be quite inefficient as you will have to do the select and sort on the large table just to get the next chunk. (With only 600 tuples you will probably have everything in memory anyway so it's not really a problem.) Also, if the data is actively updated you could get a tuple added or deleted between page views which would mess up the offsets and cause someone to miss an item or get a duplicate. If you want to page through a small subset of a large file you can use cursors or temporary tables but you will have to be sure your connection persistence, session management and such can accomodate such an arrangement. Cheers, Steve On Friday 20 December 2002 12:53 pm, [EMAIL PROTECTED] wrote: > Hello everyone, and thanks for reading my first newbie post. :-) > > I am a neopyhte PHP and postgreSQL user, with a website at > www.the-athenaeum.org. We store (among other things) artworks, which > people can view in a list, sorted by artist name, date, medium, etc. > > We now have enough works that I need to rewrite the PHP listings script > (and its embedded SQL) so that users can page through records. As an > example, if a user is looking at works by date ascending, they may want to > see 100 records at a time. Since we have 600+ records, there would be 7 > pages. They'd start on the first page (of course!) and there would be > links to pages 2 through 7 as well, just like with results pages of a > Google search. They could, from page 1, click any of the other pages to go > immdiately to that set of 100 records for display. > > I see this kind of thing all over the place, and in looking it up, I see > most solutions use "SELECT TOP x", which postgreSQL doesn't seem to have. > I know how to use LIMIT, but that always starts from the top. I could add > a piece to the WHERE clause, say something like "WHERE date > 01-02-1853", > but how do I know where the cutoffs are several pages along, without > retrieving the whole record set? > > I suppose the optimal solution for me would be to sort all of the records, > then be able to select a range from that sorted record set. So, if they > click the link to page 3, I'd like to do this (in pseudocode): > > 1. SORT records by the date field, descending > 2. Retrieve only records 200-299 from the sorted list > > Is there a way to do that? How is it done elsewhere? > > Thanks in advance for your help, > Chris McCormick, webmaster > The Athenaeum - Interactive Humanities Online > www.the-athenaeum.org > > > > > ---(end of broadcast)--- > TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] empty arrays
Caution!
In 7.2.x your statement is interpreted by the parser to be a single element
with an empty string which is converted to a zero. If you do this instead:
create table test_table ( test_column integer[], another_column integer );
CREATE
steve=# insert into test_table (another_column) values (1);
INSERT 7140977 1
steve=# select * from test_table;
 test_column | another_column
-------------+----------------
             |              1
(1 row)
You will see that there is no item in the array which you can confirm with:
select array_dims(test_column) from test_table;
 array_dims
------------

(1 row)
(Which seemingly oddly shows nothing instead of 0 since there is not yet
really any array to have dimensions. Per the docs, an array can be null but
it cannot contain SQL null elements. What the docs don't say is that
requesting an element beyond either end of an array does return a SQL null.)
Here's the gotcha: when you upgrade to 7.3.x your statement will generate an
error since an empty string is no longer arbitrarily assumed to be a zero
(either in integer fields or arrays) so if you attempt the same you will get:
create table test_table ( test_column integer[], another_column integer );
CREATE TABLE
tati=> insert into test_table (test_column) values ( '{ }' );
ERROR: pg_atoi: zero-length string
If you want to insert a null array you are best off saying so:
insert into test_table (test_column) values (null);
INSERT 17053 1
tati=> select * from test_table;
 test_column | another_column
-------------+----------------
             |
(1 row)
Cheers,
Steve
On Monday 30 December 2002 9:40 am, [EMAIL PROTECTED] wrote:
> using: psql (PostgreSQL) 7.2.1
>
> why does an empty array return an array of length 1 rather than array of
> length 0? one would think that the results below would have returned { }
> instead of {0}.
>
> simple test using psql:
>
> # create table test_table ( test_column integer[] );
> CREATE
> # insert into test_table (test_column) values ( '{ }' );
> INSERT 43475 1
> # select * from test_table;
>
>  test_column
> -------------
>  {0}
> (1 row)
>
>
> i want to be able to store a zero-length array field in the database. how
> do i specify this with sql?
>
> Regards,
>
> Floyd Shackelford
> 4 Peaks Technology Group, Inc.
> VOICE: 334.735.9428
> FAX: 916.404.7125
> EMAIL: [EMAIL PROTECTED]
> ICQ #: 161371538
>
> Shackelford Motto: ACTA NON VERBA - Actions, not words
>
> Alabama StateMotto: AUDEMUS JURA NOSTRA DEFENDERE - We Dare Defend Our
> Rights
Re: [SQL] index on to_char(created, 'YYYY') doesn't work
I don't have "real-life" experience with partial indexes but... You probably won't see an increase in speed unless the index use can get you down to a really small fraction of your total row count (I don't know just how small but remember being surprised at the number but realizing, on reflection, that it made sense. It was something like 10% or less). One thing you could try is to use a partial index (one containing only the rows in which you are interested). Here's a really dumb example: create index foo on session (username) where username is null and to_char(created, '') = '2002'; Why dumb? Because the index will only contain nulls. You could probably choose a more intelligent index based on the other queries you do. Still, this index could increase your query speed considerably (as long as your where in creating the index matches the where in your query - if you change your query to 2003 you will be back to a sequential scan). BTW, I tried to create an index on the to_char function and had no luck - seems like it should work but it doesn't on 7.2.3 or 7.3.1. Cheers, Steve On Wednesday 15 January 2003 4:08 am, Andreas Joseph Krogh wrote: > On Wednesday 15 January 2003 16:12, you wrote: > > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote: > > > -BEGIN PGP SIGNED MESSAGE- > > > Hash: SHA1 > > > > > > On Wednesday 15 January 2003 11:37, you wrote: > > > > The following does not work: > > > > > > > > create index session_u_idx on session (to_char(created, '')); > > > > ERROR: parser: parse error at or near "''" at character 57 > > > > > > > > Can I make a function to do this and index using the result of that > > > > funtion? Do anyone have an example of such a function? > > > > > > I tried the following function: > > > - - > > > create function drus (timestamp) returns varchar AS' > > > DECLARE > > > str_created VARCHAR; > > > created ALIAS FOR $1; > > > BEGIN > > > str_created:= to_char(created, ''''); > > > RETURN str_created; > > > END; > > > ' LANGUAGE 'plpgsql'; > > > > add > > WITH (iscachable) > > Thank you, not _that_ works:-) > But now this doesn't work: > create index session_u_idx on session (drus(created), username); > > Can't I have multicolumn-indexes with functions? Any idea how to rewrite > that so it works? > Here is my session table: > CREATE TABLE session ( > session_id varchar(256) NOT NULL PRIMARY KEY, > created timestamp DEFAULT 'now' NOT NULL, > last_accessed timestamp NOT NULL, > destroyed timestamp NOT NULL, > username varchar -- Allow sessions from not logged in users > ); > > Here is my query I wish to optimize using indexes: > SELECT to_char(created, 'IW') as week, count(session_id) from session WHERE > username IS NULL and to_char(created, '') = '2002' group by week ORDER > BY week; > > Any hints on optimizing this query, index-usage etc? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] SQL to list databases?
psql -E causes psql to show its "behind the scenes" queries, so try:

psql -lE

(that's a lowercase ell before the E)

Cheers,
Steve

On Thursday 23 January 2003 10:56 am, Ben Siders wrote:
> Is there a query that will return all the databases available, similar
> to what psql -l does?
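The query behind psql -l boils down to something like this, which can be run from any client:

SELECT datname FROM pg_database ORDER BY datname;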
Re: [SQL] bytea
That would be convenient but there are some difficulties. Say you have a
function that worked something like:
insert into a_table values (..., filetobytea('/home/me/myfile'), ...);
It would be a nice ability to have but the server may not be on the same
machine (or even the same type/os/filesystem) as the client so sending a
pathname to the server is of little use. Such an ability must be handled by
the client which can read/translate the local file into the appropriate byte
sequence to send to the server.
I'm toying with a client program in C program that would include the ability
to take a file on the client machine and store it in a bytea field but
haven't come up with a good way to handle the necessary buffer space (ie. I
don't see any way to "stream" data to the server - it appears I need to
allocate sufficient buffer space for the entire file plus the entire escaped
version of the file before I can generate the query which causes me some
other problems.) Any suggestions would be appreciated.
Cheers,
Steve
On Wednesday 05 February 2003 2:25 am, Adrian Chong wrote:
> Hi Christoph,
>
> Thanks for your reply. But what I want to do is loading a file of a
> particular path with a sql statement in psql. Why I need to care about how
> the file looks like? Thanks.
>
> Adrian
>
> - Original Message -
> From: "Christoph Haller" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Wednesday, February 05, 2003 6:15 PM
> Subject: Re: [SQL] bytea
>
> > > I have a table containing a field of type bytea:
> > >
> > > CREATE TABLE a_table (
> > > a_field bytea
> > > );
> > >
> > > How can I import a file in a SQL script? What function I can use?
> >
> > The documentation says as in PostgreSQL 7.2.1 (I doubt this changed
> > significantly since)
> >
> > Octets of certain values must be escaped (but all octet values may be
> > escaped) when used as part of a string literal in an SQL
> > statement. In general, to escape an octet, it is converted into the
> > three-digit octal number equivalent of its decimal octet value, and
> > preceded by two backslashes.
> >
> > In general it goes like this
> > INSERT INTO a_table ( a_field ) VALUES ( '\\000\\001\\002\\003' ) ;
> > to load the first four ASCII characters.
> > You did not mention how your file looks like.
> > There is also a C function available called PQescapeBytea
> > which does all the required escaping to store memory areas in bytea
> > columns.
> > Refer to Command Execution Functions within libpq - C Library for
> > details.
> >
> > Regards, Christoph
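For the archives, one SQL-level way to get binary data into the bytea column above without hand-escaping every octet is decode() with hex input (a sketch against the a_table example quoted above):

-- the bytes 0x00 0x01 0x02 0x03, equivalent to the escaped INSERT shown in the quote
INSERT INTO a_table (a_field) VALUES (decode('00010203', 'hex'));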
[SQL] Replacing a simple nested query?
I've got a simple nested query: select * from attributes where id in (select id from attributes where (name='obsid') and (value='oid00066')); that performs abysmally. I've heard this described as the 'classic WHERE IN' problem. Is there a better way to obtain the same results? The inner select identifies a set of ids (2049 of them, to be exact) that are then used to locate records that have the same id (about 30-40K of those, including the aforementioned 2049). Thanks! -Steve -- Steve Wampler -- [EMAIL PROTECTED] Quantum materiae materietur marmota monax si marmota monax materiam possit materiari? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Replacing a simple nested query?
On Sun, 2003-07-13 at 14:50, Steve Wampler wrote: > I've got a simple nested query: > > select * from attributes where id in (select id from > attributes where (name='obsid') and (value='oid00066')); > > that performs abysmally. I've heard this described as the > 'classic WHERE IN' problem. > > Is there a better way to obtain the same results? The inner > select identifies a set of ids (2049 of them, to be exact) > that are then used to locate records that have the same id > (about 30-40K of those, including the aforementioned 2049). For the record, Joe Conway and Hannu Krosing both provided the same solution: select at.* from attributes_table at, attributes a where at.id = a.id and a.name='obsid' and a.value='oid00066'; which is several orders of infinity faster than than my naive approach above: - lab.devel.configdb=# explain analyze select * from attributes_table where id in (select id from attributes where (name='obsid') and (value='oid00066')) order by id; NOTICE: QUERY PLAN: Index Scan using id_index on attributes_table (cost=0.00..8773703316.10 rows=241201 width=59) (actual time=136297.91..3418016.04 rows=32799 loops=1) SubPlan -> Materialize (cost=18187.48..18187.48 rows=15 width=25) (actual time=0.01..1.68 rows=1979 loops=482402) -> Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) (actual time=0.27..251.95 rows=2049 loops=1) Total runtime: 3418035.38 msec -- lab.devel.configdb=# explain analyze select at.* from attributes_table at, attributes a where at.id = a.id and a.name='obsid' and a.value='oid00066'; NOTICE: QUERY PLAN: Nested Loop (cost=0.00..18739.44 rows=217 width=84) (actual time=0.76..1220.65 rows=32799 loops=1) -> Index Scan using name_index on attributes_table (cost=0.00..18187.48 rows=15 width=25) (actual time=0.47..507.31 rows=2049 loops=1) -> Index Scan using id_index on attributes_table at (cost=0.00..35.80 rows=12 width=59) (actual time=0.11..0.31 rows=16 loops=2049) Total runtime: 1235.42 msec --- My thanks to both Joe and Hannu! Steve -- Steve Wampler -- [EMAIL PROTECTED] Quantum materiae materietur marmota monax si marmota monax materiam possit materiari? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Table Partitioning and Rules
OK, so basically you are trying to keep a hundered some odd attributes on everyone in the US. It's possible that a 150 column table is properly normalized (I have a similar situation) but it is rare. Suppose it is really properly normalized. You can still benefit from indexes on just some of the columns by choosing those most commonly used in queries. You may also want to research partial indexes (create index foo ... where bar=baz) which can under certain circumstances be far smaller and faster than full indexes. Breaking the table up won't improve the full table scan - in fact as each sub-table's file grows it will probably fragment on the disk much worse than a single growing file would which will, along with all the overhead of joining all the tables, make things worse. Review your structure carefully. Plan on $$$ for the hardware. Cheers, Steve On Thursday 17 July 2003 12:22 pm, Girish Bajaj wrote: > The problem is that Im worried about sequential scans. This particular > table can have upto 150 cols and 250 million records. Now we have a > reporting requirement that someone could select on ANY col and filter on > any col as well. Meaning someone could so a SELECT on col number > 1,2,310,1000 from contact where col num 75='X' and col num 139 = 'Y'. > > I cant possibly index all the cols in the table. So I thought Id best > manage the data by splitting up the table into multiple partitions and > eventually depending on application logic, only scan those tables that are > necessary to scan sequentially instead of the whole big table. > > Im getting a little confused here cause eventually I would want to join in > this 250 million gigantic table as well.. and that would be a real big > problem causing loads of sequential scans wouldn't it? > > Thanks, > Girish > > > -Original Message- > From: Josh Berkus [mailto:[EMAIL PROTECTED] > Sent: Thursday, July 17, 2003 2:03 PM > To: Richard Huxton; Girish Bajaj; [EMAIL PROTECTED] > Subject: Re: [SQL] Table Partitioning and Rules > > Girish, > > > > Essentially Im trying to store a persons information in a table in the > > > database. Since we could have millions of people, with duplicates! Ive > > > decided we need to partition the table into segments where all people > > with > > > > the LastName starting from A to G will be in one table. H-N will be in > > > another table and O-Z in the third. Ive created a VIEW that does a > > > UNION > > on > > > > all the tables. > > This sounds hideously inefficient and a management headache besides. I > think > PostgreSQL will accept up to 2 billion rows in any one table, and splitting > stuff into 3 tables will not improve your performance ... quite the > opposite. > > Change your database design. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Calc
On Wednesday 15 October 2003 12:59 pm, Muhyiddin A.M Hayat wrote:
> Dear all,
>
> I have the below table
>
>  id | db  | cr
> ----+-----+-----
>   1 |     | 200
>   2 | 100 |
>   3 | 100 |
>   4 | 150 |
>   5 |     | 200
>
> I would like to calc the balance, so it looks like
>
>  id | db  | cr  | bal
> ----+-----+-----+------
>   1 |     | 200 |  200
>   2 | 100 |     |  100
>   3 | 100 |     |    0
>   4 | 150 |     | -150
>   5 |     | 200 |   50
>
> What can I do to get a result like that

Something like this:

select id, db, cr, (select sum(cr-db) from calc sub where sub.id <= calc.id) from calc;

This of course assumes that id indicates the correct order of the entries, and it will blow up if you allow nulls for cr or db (which you shouldn't, since that would literally be interpreted as "they withdrew 10 and deposited an unknown amount"). If you have null values already and they should be interpreted as 0, just do this:

select id, db, cr, (select sum(coalesce(cr,0)-coalesce(db,0)) from calc sub where sub.id <= calc.id) from calc;

I assume no responsibility for potential lack of scalability of this query. :) It's quite possible a faster solution exists - we'll see what comes up on the list.

Cheers,
Steve
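On later releases (8.4 and up) the same running balance can be written with a window function instead of the correlated subquery - a sketch against the same table:

SELECT id, db, cr,
       sum(coalesce(cr,0) - coalesce(db,0)) OVER (ORDER BY id) AS bal
  FROM calc
 ORDER BY id;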
Re: [SQL] Calc
On Thursday 16 October 2003 10:37 am, Muhyiddin A.M Hayat wrote:
> > Something like this:
> >
> > select id,db,cr,(select sum(cr-db) from calc sub where sub.id <=
> > calc.id) from calc;
> >
> > This of course assumes that ID indicates the correct order of the
> > entries and it will blow up if you allow nulls for cr or db ...
>
> If the data comes from a "View" without an ID, what can I do?
>
> My View:
>
>  trx_date   | trx_time | descriptions                         | payment_method | debet | credit | creator
> ------------+----------+---------------------------------------+----------------+-------+--------+---------
>  2003-10-09 | 21:55:02 | Resto Biling : 13,800, Paid : 10,000 | Visa           |  3800 |        | middink
>  2003-10-16 | 03:28:30 | Payment - Thank You                  | Visa           |       |   4.00 | middink
>  2003-10-08 | 18:17:40 | Payment - Thank You                  | Cash           |       |   5.00 |

The id field only keeps the transactions in the correct order so you can sum the previous transactions. You can do the same thing with your date and time fields (though using a single datetime field would be easier). A caution, though: you will have trouble if two transactions share the same date and time.

Cheers,
Steve
[SQL] Copying rows between tables?
I know I can select from one table into a new table, but is there a way to select from one table into an existing table (aside from copying everything to a file, editing the file and then copying from that file)? Without destroying existing entries, of course...

I have an application where old records are archived into 'archive' tables. Occasionally there is a need to copy some of these old records into the 'active' table.

Thanks for any pointers!
Steve

--
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.
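The usual answer is INSERT ... SELECT; a sketch with made-up table and column names:

-- copy selected archived rows back into the active table
INSERT INTO active_table
SELECT * FROM archive_table WHERE archived_on >= '2003-01-01';

-- or list the columns explicitly if the two tables don't line up exactly
INSERT INTO active_table (id, name, value)
SELECT id, name, value FROM archive_table WHERE id IN (1, 2, 3);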
Re: [SQL] User defined types -- Social Security number...
On Monday 01 March 2004 8:54 am, Dana Hudes wrote: > I would represent an SSN as numeric(9,0). > an int 32 would work though. > 2**31 is > 9 > > On Sun, 29 Feb 2004, Christopher Browne wrote: > > The world rejoiced as [EMAIL PROTECTED] (Michael Chaney) wrote: > > > Look, you're thinking way too hard on this. An SSN is a > > > 9-digit number, nothing more. There are some 9-digit numbers > > > which aren't valid SSN's, and you might want to get fancy and > > > create a constraint for that. > > > > > > Regardless, you are making a *major* mistake of confusing data > > > storage with rendering. It is common to *render* an SSN as > > > xxx-xx- and its cousin the FETID (Federal Employers Tax ID) > > > as xx-xxx. To store the dashes makes no sense. They're in > > > the same place each time, it's wasted data. I missed the start of this thread but will chime in with a comment anyway. My rule is to select an appropriate numeric type of data if you will be doing numeric types of things to it, character types if you will be doing character manipulations, etc. I don't know of any good reasons to need to know SSN/6.9, sqrt(SSN), SSN+7.86 but there are plenty of good reasons to need the first three characters (the "area number"), the middle two characters (the "group number", and the last 4 characters (the "serial number", often (ab)used as a password for banking and other purposes). While the dashes certainly don't need to be stored, they are not in arbitrary positions as they delimit the parts of the SSN noted above. One might even want to store the SSN in three columns, the first linked to a table of valid area numbers, the second error checked so "00" is not valid and so on or get even more fancy and error check against: http://www.ssa.gov/employer/highgroup.txt. It all depends on one's specific requirements. Google and you will find SSN info pages such as: http://proagency.tripod.com/usasssearch.html http://www.ssa.gov/foia/stateweb.html Cheers, Steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] "=" operator vs. "IS"
> I'm just curious - why is it not possible to use the "=" operator > to compare values with NULL? I suspect that the SQL standard > specified it that way, but I can't see any ambiguity in an > expression like "AND foo.bar = NULL". Is it because NULL does not > "equal" any value, and the expression should be read as "foo.bar is > unknown"? Or is there something else I'm missing? You've got it. NULL is the _absence_ of a known value so any comparison or operation on it yields an unknown result. So why can't you use = NULL? Consider a list of names and ages where Jack's and Jill's ages are null. Now we run a query to list people who are of the same age. Should Jack and Jill be listed as being the same age? Of course not. You can't compare whether the two unknown values are equal any more than you could determine whether or not they are over 18. The SQL spec and PostgreSQL properly use and enforce this interpretation of NULL. The correct way to ask your question is ...where foo.bar is null... Cheers, Steve ---(end of broadcast)--- TIP 8: explain analyze is your friend
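A quick illustration of the three-valued logic at work (the first two yield NULL -- unknown -- rather than true or false):
select null = null;   -- null (unknown)
select 25 = null;     -- null (unknown)
select null is null;  -- true
select 25 is null;    -- false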
Re: [SQL] locks and triggers. give me an advice please
> often, I am turning triggers off and on to perform a mass operation > on a table, and i am interested how should i care of another user > operations. > > the scene is: > table t1 with user defined triggers > and many tables reference t1, (so FK triggers defined on t1) > > the operation i want to perform on t1 makes a great load to a > server and have no use in triggers at all. > the best way to perform this operation is to delete all records, > modify, and insert them back without changing any adjuscent table. > (this way takes a few seconds.) > so i turn off triggers on t1 completely (updating > pg_class.reltriggers) operate > and turn on triggers on t1. > > it works fine. > > the question is: > > what should i do to prevent other users of data modification on the > t1 and the adjuscent tables while triggers is off ? If I understand your question correctly you should use a transaction and lock the table; begin transaction; lock t1 in access exclusive mode; Turn off triggers and do your updates. (Note, "truncate t1" is faster than "delete from t1" followed by a "vacuum full" and you might consider running "reindex table t1" after your mass update or if appropriate drop your indexes, load the data, then recreate them.) Re-establish triggers. commit; --end of transaction unlocks the table Cheers, Steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
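Spelled out end to end it might look roughly like this (untested sketch; updating pg_class directly requires superuser rights and all the usual caveats about touching system catalogs apply):
begin;
lock table t1 in access exclusive mode;

-- turn off all triggers (user-defined and FK) on t1
update pg_class set reltriggers = 0 where relname = 't1';

truncate t1;   -- or "delete from t1" if truncate is not an option here
-- ... reload / modify the data ...

-- turn the triggers back on
update pg_class
   set reltriggers = (select count(*) from pg_trigger where tgrelid = pg_class.oid)
 where relname = 't1';

commit;  -- end of transaction unlocks the table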
Re: [SQL] Cast NULL into Timestamp?
On Friday 10 December 2004 11:24 am, Wei Weng wrote: > I have a table > > create table temp > ( > tempdate timestamp, > tempname varchar(10) > ); > > And I tried to insert the following: > > insert into temp (tempname, tempdate) > select distinct 'tempname', null from some_other_relevant_table; > > And I got an error that says "column "tempdate" is of type > timestamp ... but expression is of type text... will need to > rewrite or cast the expression". > > I really need the distinct. Is there any way to cast this NULL into > a timestamp or any other workarounds? How about: insert into temp (tempname) select distinct 'tempname' from some_other_relevant_table; Unless there's something you have left out in describing your setup this will leave the tempdate column null. Cheers, Steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
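If you do want tempdate to appear in the column list, an explicit cast on the NULL should also get past the type check:
insert into temp (tempname, tempdate)
select distinct 'tempname', null::timestamp
  from some_other_relevant_table;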
[SQL] Looking up table names by REFERENCES
Hello all. I'm trying to write a recursive procedure to allow me to handle some data backup routines and deal with conflicts when restoring from within the program. Essentially, what I'd like to be able to do is if a table called "image" has a column called "file_id" which references a column called "file_id" in another table called "file" I want to be able to determine that pragmatically. If I wanted to backup all of the information about images in the database, I would need to backup all of the information about the file(s) each image corresponds to. For instance, I can get a list of all (user) table names with: SELECT relname AS table_name, oid FROM pg_class WHERE NOT relname ~ 'pg_.*' AND NOT relname ~ 'pga_.*' AND NOT relname ~ '.*_pkey' AND NOT relname ~ '.*_id_key' ORDER BY relname; and I can get a list of column names and their types (for the "image" table) with: SELECT a.attname AS field, t.typname AS type FROM pg_class c, pg_attribute a, pg_type t WHERE c.relname = 'image' and a.attnum > 0 and a.attrelid = c.oid and a.atttypid = t.oid ORDER BY a.attnum; Surely there's a simple way I can trace REFERENCES in a particular column across tables? Any help would be most appreciated, especially if I could be cc'd directly. Cheers Steve Castellotti
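P.S. The closest I have come so far is poking at pg_constraint (7.3 and later), which looks promising even if I am not sure it is the intended route:
select conname,
       conrelid::regclass  as referencing_table,
       confrelid::regclass as referenced_table
  from pg_constraint
 where contype = 'f';
-- conkey/confkey hold the column numbers of the referencing and
-- referenced columns; joining them against pg_attribute.attnum
-- should give back the column names.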
[SQL] Pagination with Output Variables?
I have a MSSQL Stored Procedure that returns a particular section of a
resultset depending on the desired "page". This procedure also has a
parameter marked as OUTPUT which indicates the total number of records in
the actual resultset. Is there a way to do something similar in PG or does
TotalRecords need to be part of the type returned by the function.
MSSQL Procedure
CREATE PROCEDURE PageCustomers (
@Page int,
@PerPage int,
@TotalRecords int OUTPUT
) AS
--Do stuff to determine number of records
SELECT @TotalRecords = COUNT(*) FROM tmpCustomers
--Get the actual data
SELECT CustomerID, CustomerName, EmailAddress FROM tmpCustomers WHERE ...
Would the only way to do this in PG to have my function return a type like
this?
CREATE TYPE "PageCustomers_type"
AS ("TotalRecords" int4, "CustomerID" int4, "CustomerName" varchar(100),
"EmailAddress" varchar(100));
Can a function return two type results? So that the first type would be just
the total number of records, and the second type would be the resultset of
customers?
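If the single-type route is the way to go, here is roughly what I picture the function looking like (untested sketch; the ordering, the page arithmetic and the assumption that tmpCustomers carries the same quoted column names are all mine):
CREATE OR REPLACE FUNCTION "PageCustomers"(p_page int, p_perpage int)
RETURNS SETOF "PageCustomers_type" AS $$
DECLARE
    r "PageCustomers_type";
    v_total int4;
BEGIN
    --Do stuff to determine number of records
    SELECT count(*) INTO v_total FROM tmpCustomers;
    --Get the actual data, stamping the total onto every row
    FOR r IN
        SELECT v_total, "CustomerID", "CustomerName", "EmailAddress"
          FROM tmpCustomers
         ORDER BY "CustomerID"
         LIMIT p_perpage OFFSET (p_page - 1) * p_perpage
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;
-- SELECT * FROM "PageCustomers"(1, 20);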
Thanks,
Steve
---(end of broadcast)---
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
[SQL] Junk queries with variables?
I really have to be missing something here and this probably a *really* noob question. I don't have a problem running little junk queries in the pgAdmin query window(SELECT blah FROM blah, INSERT INTO blah, etc...), but I can't figure out how to run queries with variables outside of a function. I just want to use variables without having to go about creating and dropping a function for every stupid little query I need to write. Example: amount int4 := 1000; earliest_date timestamp := current_timestamp; SELECT ... I always get the error: "ERROR: syntax error at or near "amount" at character 1". What have I done wrong, or am I missing? Thanks, Steve ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] Junk queries with variables?
> Michael's given you one option - another to look at is PREPARE/EXECUTE > PREPARE my_query(int4) AS INSERT INTO foo VALUES ($1); > EXECUTE my_query(1); > EXECUTE my_query(7); > ... This doesn't seem to be quite what I'm looking for. PREPARE according to the docs is for a one line statement. I'm looking for the ability to do multiple operations, basically using it for general DB queries that aren't really needed more than once. This again comes from my MSSQL background where you can go to town and just start writing out TSQL to do one off queries. Thanks, Steve ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [SQL] Junk queries with variables?
> > I don't know about pgAdmin, but in psql you can use \set: > > \set id 1 > SELECT * FROM foo WHERE id = :id; > > \set name '\'Some Name\'' > SELECT * FROM foo WHERE name = :name; > Whenever I try the above I get an error at the backslash. Do I need to create a different language for this? Right now I only have plpgsql available. Thanks, Steve ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Junk queries with variables?
> > In pgadmins SQL-window SQL is the 'language' of choice. Or it is > rather the > only language. Thus if you intend to program plTk or PL/pgSQL, there's no > way around defining a function. > > (At first you have to define a new language in your schema) I'm a little confused, is there no way around this, or are you saying I need to use CREATE LANGUAGE to define a new language to use? Currently the only language I have for the DB is plpgsql. Thanks, Steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Make COUNT(*) Faster?
Chris Browne wrote: > None of those transactions have COMMITted, so there are some 78 tuples > "in limbo" spread across 16 transactions. > > If there were some "single secret place" with a count, how would you > suggest it address those 78 tuples and 16 transactions that aren't yet > (and maybe never will be) part of the count? Hmmm, I understand this and don't doubt it, but out of curiousity, how does the current SELECT COUNT(*) handle this? It doesn't lock the entire table while counting (I assume) so the current implementation is really just an approximate count in the above scenario anyway. Or even when not, since the true 'count' is likely to have changed by the time the user does anything with the result of SELECT COUNT(*) on any active table (and on an inactive table, pg_class.reltuples is nearly as good as SELECT COUNT(*) and far faster to get to.) I assume this has been beaten well past death, but I don't see why it wouldn't be possible to keep pg_class.reltuples a bit more up-to-date instead of updating it only on vacuums. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [SQL] Make COUNT(*) Faster?
Bruno Wolff III wrote: > No, it is an exact count. Yes, for the transaction, but it's an approximation of the number of tuples in the table - which is probably what the people who worry about its cost are more interested in (an approximate count for the table). I'm also claiming that a true count for any active table is meaningless and am *not* suggesting that effort be spent on trying to produce such a true count. >>I assume this has been beaten well past death, but I don't see why it >>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date >>instead of updating it only on vacuums. > > > Because it costs resources to keep track of that and people don't usually need > exact tuple counts for whole tables. Yes, we agree completely! (Which is why I said 'a bit more' instead of 'exactly' above.) My uses for COUNT(*) are to get 'reasonable' approximate counts of the table sizes - not true counts, but approximate values. Unfortunately, pg_class.reltuples gets too far off too fast for me to use it as a consistent guide to current table size. If you Folks Who Know believe that simply keeping pg_class.reltuples 'closer' to the actual table size is too expensive, I'll accept that [after all, I have to right now anyway], but I'm surprised that it is, given all the other work that must go on at the start/close of a transaction. I also understand that 'reasonable' and 'closer' are vague terms. In the example scenerio where there were around 80 rows in an indeterminate state, my claim is that, in a table of around a million rows, it doesn't matter whether some portion of those indeterminate rows are included in an approximation of the table size or not (though it might in a table of 100 'true' rows - but the decision to ask for a true 'transaction' count (slow) or an approximate table size (fast) should be left to the user in either case). So, leave COUNT(*) alone. But it would be very handy to have a way to get an approximate table size that is more accurate than is provided by a pg_class.reltuples that is only updated on vacuums. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Make COUNT(*) Faster?
Tom Lane wrote: > Steve Wampler <[EMAIL PROTECTED]> writes: > >>So, leave COUNT(*) alone. But it would be very handy to have a >>way to get an approximate table size that is more accurate than is >>provided by a pg_class.reltuples that is only updated on vacuums. > > If you want something cheap, you could use the same technique the > planner uses nowadays: take RelationGetNumberOfBlocks() (which is > guaranteed accurate) and multiply by reltuples/relpages. I don't > see anyplace where RelationGetNumberOfBlocks is directly exposed to > users now, but it'd be trivial to code up a couple of C functions to > provide this functionality. Yes - this would be an excellent approximation for my needs! The solution that Dawid Kuroczko suggested (just call "explain select * on ..." and parse the result) would be equivalent these days, right? (I think in the 7.x versions the planner was just using pg_class.reltuples, which wouldn't have helped.) If true, I can handle that parsing myself easily enough without exposing RelationGetNumberOfBlocks. Thanks (Tom and Dawid)! -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
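For the record, the estimate is easy to scrape out of the plan text; the rows figure below is the planner's current guess at the table size (numbers made up):
explain select * from mytable;
                         QUERY PLAN
-----------------------------------------------------------
 Seq Scan on mytable  (cost=0.00..458.00 rows=10000 width=244)
(1 row)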
Re: [SQL] Make COUNT(*) Faster?
Christopher Browne wrote: >>I'm also claiming that a true count for any active table is >>meaningless and am *not* suggesting that effort be spent on trying >>to produce such a true count. > > > That's a pretty big assumption that would in fact be WRONG. Please reread the message from Bruno and reconcile the above statement with his assertion (which I believe) that there is *no* single true count for an active table. [I'm defining 'active' as currently undergoing insert/copy/delete/update actions]. > We have managers interested in counting the number of objects we have > around (As a domain registry, what objects would you imagine those > might be :-)), and they're keen on possibly even being able to > reconcile those counts from day to day based on transaction activity. If Bruno is correct, then they need to do this reconcilation from within a single transaction (the same one that does the COUNT(*)) - or else they are working on an 'inactive' table [one not currently accepting changes]. If neither condition holds, then isn't the result they are using from COUNT(*) currently is *already* an approximation? > Leaping into some sort of vague guesstimation would destroy the > ability to do any kind of analysis of activity, and I daresay enrage > them. No doubt! Let's hope the above conditions hold. > There may be times that a really rough guess can suffice; there are > other times when exactness is absolutely vital. But, as others have said, COUNT(*) does not return a true count for a table, but rather just a true count for the *current transaction*. So COUNT(*)'s from different simultaneous transactions may very well produce different values. > Creating a "fast but WRONG COUNT(*)" which prevented getting the exact > answer that the present implementation provides would be a severe > misfeature. Agreed - note that I did not suggest replacing the current COUNT(*) with an inexact version, but wanted (and now have) a quick way to get a reasonable approximation of the current table size. -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[SQL] Problem with "NOT IN (subquery)
Hi, I have a table named "famille" whose structure and content is:
 famille_code | famille_mere_famille_code |  famille_libelle  | famille_niveau
--------------+---------------------------+-------------------+----------------
 00           |                           | Mère              | 0
 I1           | 00                        | Composants        | 1
 IN0001       | I1                        | Micro-processeurs | 2
 IN0002       | I1                        | Mémoires RAM      | 2
 INS001       | IN0002                    | DDR-SDRAM         | 3
 INS002       | IN0002                    | DDR2-SDRAM        | 3
 INS003       | IN0002                    | SDR-SDRAM         | 3
 IN0003       | I1                        | Cartes mères      | 2
 IN0004       | I1                        | Disques durs      | 2
 IN0005       | I1                        | Cartes graphiques | 2
 IN0006       | I1                        | Cartes son        | 2
 IN0007       | I1                        | Réseau            | 2
 IN0008       | I1                        | Lecteurs CD/DVD   | 2
 IN0009       | I1                        | Graveurs CD/DVD   | 2
 IN0010       | I1                        | Boîtiers          | 2
 I2           | 00                        | Périphériques     | 1
 IN0011       | I2                        | Clé USB           | 2
 IN0012       | I2                        | Modems            | 2
 IN0013       | I2                        | Imprimantes       | 2
(19 rows)
The first col is the family id and the second is the mother family id. I would like to get a list of all families that never appear in col n°2, i.e. the families that aren't nodes but leaves. The query I made was "SELECT * FROM famille WHERE famille_code NOT IN (SELECT DISTINCT famille_mere_famille_code FROM famille);" But the DB returns 0 records instead of 15. If I use a list instead of a subquery it works normally but it's not easy to manage it like this. So if anyone can help me please ... ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] plsql / time statement
Daniel Caune wrote: Hi, Is there any option to set so that psql provides the execution time of each SQL statement executed? \timing (either as a manual command or as a default in your .psqlrc file). Cheers, Steve ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] how to solve this problem
I have a table where there are 20 columns named vinc1, vinc2, vinc3, vinc4, etc the values contained in each column are simply 1 or 0 (each column is dichotomic) 1 means presence 0 means absence I would obtain a column (new_column) containing the name of the columns, comma separated, where the value is = 1 for example: vinc1 vinc2 vinc3 vinc4 new_column 1 0 1 0 vinc1,vinc3 0 0 0 1 vinc4 0 1 1 1 vinc2,vinc3,vinc4 can someone help me to find the best way to obtain this result??? thank you very much Here's a brute-force method. Maybe someone else has a more elegant way. More info on the nature of your data and what you are trying to obtain from it would help in finding such elegance. select substr( case when vinc1 = 1 then ',vinc1' else '' end || case when vinc2 = 1 then ',vinc2' else '' end || ... || case when vinc20 = 1 then ',vinc20' else '' end , 2) as new_column ... As to the binary representation mentioned elsewhere, the idea is that you can view vinc1-vinc20 as bits in a binary number equal to vinc1 + 2*vinc2 + 4*vinc3 + 8*vinc4...2^19*vinc20. Whether or not this is useful depends on what you are trying to do. Cheers, Steve ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] it's not NULL, then what is it?
... canon=# select count(maf) from gallo.sds_seq_reg_shw canon-# where maf ISNULL; count --- 0 (1 row) I believe count will only count not-null anyway so this will always return zero. Try count(*) instead of count(maf). Here's an example: st...@[local]=> select * from barr; LOG: duration: 0.226 ms a | b ---+--- a | b c | d | e (3 rows) st...@[local]=> select coalesce(a, 'a is null'), coalesce(b, 'b is null') from barr; LOG: duration: 0.283 ms coalesce | coalesce ---+-- a | b c | d a is null | e (3 rows) st...@[local]=> select count(a) from barr; LOG: duration: 0.339 ms count --- 2 (1 row) st...@[local]=> select count(*) from barr where a isnull; LOG: duration: 0.350 ms count --- 1 (1 row) -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] SQL report
[email protected] wrote: I have the following senario. I have a tracking system. The system will record the status of an object regularly, all the status records are stored in one table. And it will keep a history of maximum 1000 status record for each object it tracks. The maximum objects the system will track is 100,000. Which means I will potentially have a table size of 100 million records. I have to generate a report on the latest status of all objects being tracked at a particular point in time, and also I have to allow user to sort and filter on different columes in the status record displayed in the report. ... Just wanna to know if anyone have a different approach to my senario. Thanks alot. Not knowing all the details of your system, here are some things you could experiment with: 1. Add a "latest record id" field in your object table (automatically updated with a trigger) that would allow you to do a simple join with the tracking table. I suspect that such a join will be far faster than calculating "max" 100,000 times at the expense of a slightly larger main table. 2. Add a "current record flag" in the status table that simply flags the most recent record for each object (again, use triggers to keep the flag appropriately updated). This would also eliminate the need for the "max" subquery. You could even create a partial index filtering on the "current record flag" which could speed things up if the reporting query is written correctly. 3. Partition the table into a "current status table" and "historical status table" (each inheriting from the main table). Use a trigger so that anytime a new status record in added, the old "current" record is moved from the "current" to the "historical" table and the new one added to the "current" table. The latest status report will only need a simple join on the "current" table with a max size of 100,000 rather than a more complex query over a 100,000,000 record table. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
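For example, option 2 might be sketched roughly like this (completely untested; every table, column and trigger name below is invented):
create table status (
    object_id  integer     not null,
    reading    text,
    taken_at   timestamptz not null default now(),
    is_current boolean     not null default true
);

create or replace function status_flag_current() returns trigger as $$
begin
    -- clear the flag on the previous "current" row for this object
    update status
       set is_current = false
     where object_id = NEW.object_id
       and is_current;
    return NEW;
end;
$$ language plpgsql;

create trigger status_flag_current_trg
    before insert on status
    for each row execute procedure status_flag_current();

-- partial index covering only the latest row for each object
create index status_current_idx on status (object_id) where is_current;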
Re: [SQL] Month/year between two dates
Bor wrote:
Hi to all,
I have a very simple question. Let's say that I have three records (id, date
from, date to):
1 2009-01-01 2009-08-31
2 2009-08-01 2009-08-10
3 2009-08-11 2009-08-31
Now I want to get records, "related" to a single month/year data (two
integers). For 2009/08 (int1 = 2009, int2 = 8) I should get all three
records, for 2009/05 only record 1, but for 2009/11 none of the records.
Is there any simple way to do this? A query would do :).
Thanks alot.
Lots of ways. The following springs to mind but I'm sure there are
simpler ways (I'm assuming the date_from and date_to are data-type date
and you are stuck with using int for year and month).
Use date_trunc to convert any date in a month to the first of the month
and the following should work (untested):
...
date_trunc('month', date_from) <= (int1::text || '-' || int2::text ||
'-1')::date and
date_trunc('month', date_to) >= (int1::text || '-' || int2::text ||
'-1')::date
...
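Plugged in for 2009/08 against a hypothetical table it would come out something like this (still untested); it should return all three of your sample rows, while 2009/05 would return only row 1 and 2009/11 none:
select *
  from mytable
 where date_trunc('month', date_from) <= (2009::text || '-' || 8::text || '-1')::date
   and date_trunc('month', date_to)   >= (2009::text || '-' || 8::text || '-1')::date;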
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] LIMIT 1; The Integer part only
bilal ghayyad wrote: 1) When writing the function (I mean sql function) in the postgresql, I noticed the use for LIMIT 1, but did not understand what does it mean and why we use it? Limit return to 1 record (or other specified number). Note that without ORDER BY, there is no guarantee of which record will show up at the top of the list. See also OFFSET. I.e. to skip the first 30 records and only return the next 10 (records 31-40) in your list: SELECT * from foo order by bar LIMIT 10 OFFSET 30; 2) Also in the function (the sql function) in the postgresql, if I need to take the integer part of the number, which math function can do this for me? For example, I have 3.900 and I need only the 3 (the integer part), which math function to be used? floor(3.900) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Table Design for Hierarchical Data
Lee Hachadoorian wrote: I am trying to come up with a structure to store employment data by NAICS (North American Industrial Classification System). The data uses a hierarchical encoding scheme ranging between 2 and 5 digits. That is, each 2-digit code includes all industries beginning with the same two digits. 61 includes 611 which includes 6111, 6112, 6113, etc. A portion of the hierarchy is shown after the sig. From the http://www.census.gov/eos/www/naics/ website: "NAICS is a two- through six-digit hierarchical classification system, offering five levels of detail. Each digit in the code is part of a series of progressively narrower categories, and the more digits in the code signify greater classification detail. The first two digits designate the economic sector, the third digit designates the subsector, the fourth digit designates the industry group, the fifth digit designates the NAICS industry, and the sixth digit designates the national industry. The five-digit NAICS code is the level at which there is comparability in code and definitions for most of the NAICS sectors across the three countries participating in NAICS (the United States, Canada, and Mexico). The six-digit level allows for the United States, Canada, and Mexico each to have country-specific detail. A complete and valid NAICS code contains six digits." I think I'd be inclined to store it as defined above with tables for sector, subsector, industry-group and NAICS-industry. So the NAICS table might have a primary key of industry_code (11131, Orange Groves) and a industry_group column with a foreign-key constraint to the industry-group table (1113, Fruit and Tree Nut Farming). You might add a constraint to ensure that the industry-group is the appropriate substring of the naics code and so on up the heirarchy. If you are dealing with importing a large amount of static source data for analysis, these tables will also be tailor-made places to do pre-aggregation. Adjacency lists work well in certain cases where the depths of the trees are variable or indeterminate. For example, think of an employee->boss org-chart for a large company. The maintenance supervisor for an area might be a dozen levels below the CEO and be a several levels above the branch night janitor while the CEO's personal assistant is just one level down but with no direct reports. The CTE/recursive-query features in 8.4 are great for this. But in the case you have described, the number of levels is well defined as is the type of information associated with each level. But this all depends on the nature of your source data, how often it is updated, how big it is and the questions you want answered. It might be perfectly acceptable to just have the 5-digit code on all your individual data records and do something like select ... group by substr(full_naics_code,1,3) where substr(full_naics_code,1,2)='61'). In this case you will still want to keep the NAICS definition table separate and link to it. One question that might impact this is the coding of your source data. Is it all full 5-digit coding or are some records coded at a high level of detail and others only to the top-level? One way to store this data would be to store at the most granular level (5-digit NAICS) and then aggregate up if I wanted employment at the 4-, 3-, or 2-digit level. The problem is that because of nondisclosure rules, the data is sometimes censored at the more specific level. I might, for example, have data for 6114, but not 61141, 61142, 61143. 
For a different branch of the tree, I might have data at the 5-digit level while for yet another branch I might have data only to the 3-digit level (not 4 or 5). I think that means I have to store all data at multiple levels, even if some of the higher-level data could be reconstructed from other, lower-level data. What do you mean by censored? Is the data supplied to you pre-aggregated to some level and censored to preserve confidentiality or are do you have the record-level source data and the responsibility to suppress data in your reports? Is the data suppression ad-hoc (i.e. someone comes to you and says don't display these five aggregates), based on simple rules (don't display any aggregate with fewer than 15 records) or on more complex rules (don't display any data that would allow calculation of a group of fewer than 15)? My guess is that the multi-table scenario will be better suited to flagging aggregates for suppression. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] creating a versioning system for sets?
I'm working on a web app for a quality control checklist. I already have a table set up, but I have a hunch that our model is sub-optimal and I could get some better performance.I'm hoping someone on this list can help me think clearly about how to express this efficiently in SQL. Each checklist has dozens, sometimes hundreds of questions. Each question has between 2 and 10 possible answers. Each Question is a varchar string, and so is each answer. A completed checklist is when all of the questions are associated with one of its possible answers -- i.e., when one answer is chosen. Checklists are different for different purposes, and they can change over time. So to keep completed checklists from inadvertently changing when we want to have changes in new checklists, we have templates. Templates, Questions, and Answers are a mirror of Checklists, Questions, and Answers, and represent the 'current version' of the checklist. We don't do the checklists ourselves, but provide them for our clients, so that is another level of grouping going on. So the current table hierarchy looks like this - Clients - Templates - TemplateQuestions - TemplateQuestionAnswers - Checklists - ChecklistQuestions - ChecklistQuestionAnswers Because we don't want changes in the current template to 'go back in time' and change completed checklists, data is copied from Templates into Checklists when a user goes to start a new checklist. As you can guess, this creates a lot of duplication. In ChecklistQuestionAnswers, out of about a million answer rows, there are only 4,000 distinct answers. Of course, TemplatesQuestionAnswers has duplication too, but not as bad. So what I'm think I want to do is create a versioning system for checklist templates, so I can save on space by storing unique questions with unique sets of answers only once. That way, instead of duplicating text wholesale, I can just link a Checklist against a *version* of a Template, and then a checklist set is which answer was chosen for which question. Here's what I've sketched out so far. "A clients has many templates. A template has many revisions, but only one current revision. Each revision has many questions, and each question has many ( between 2 and 10 ) answers. Each Checklist relates to one Template. Each checklist has a set answers that indicate the answer select for each question in its version of the template." Questions /* all unique question wordings */ Questions.id Questions.question Answers /* all unique answer wordings. */ Answers.id Answers.answer Templates Templates.client_id /* relates to client table. */ Templates.template_name Templates.current_version /* this is related to TemplateVersions.version_number */ TemplateVersions /* A logical grouping of a set of questions and answers */ TemplateVersions.version TemplateVersions.template_id /* relates this version to a template. 
*/ TemplateQuestions TemplateQuestions.template_version /* relates a question to a template version */ TemplateQuestions.question_id /* relates a unique question to this template version */ TemplateQuestions.id TemplateQuestionAnswers TemplateQuestionAnswers.template_question_id /* relates this answer to a particular template version question */ TemplateQuestionAnswers.answer_id /* relates the unique question to a unique answer */ TemplateQuestionAnswers.id Checklists Checklists.id Checklists.template_version /* relates this question to a template version -- associating this checklist to a client happens through this relationship */ ChecklistAnswers /* ( I might call this something other than 'Answers' since the lack of ChecklistQuestionAnswers breaks 'name symmetry' with TemplateQuestionAnswers ) */ ChecklistAnswers.checklist_id ChecklistAnswers.question_id ChecklistAnswers.answer_id The rub I'm getting hung up on is guaranteeing that ChecklistAnswers associates a proper question-and-answer pair -- the relationship that exists in the version of the Template that its Checklist parent is referencing. In other words, each row in ChecklistAnswers must 'mirror' a question_id from TemplateQuestions to one child question from TemplateQuestionAnswers, form the template_version in Checklists. I'm trying to think of how to do this and my thinking process short circuits here. This is really the 'deliverable' of the database -- a completed checklist -- so all the other templates and everything is sort of epiphenomenal or an abstraction of that. If I can't get this working, I've missed the whole point! This seems a *little* unwieldy, so I'm wondering if I'm making a solution whose complexity is not worth the space-savings I might get from implementing it. Also note, I've simplified this a bit. There are other dimensions of complexity, such as a category system for grouping questions for reporting, but I don't think we need to get into that here. -
Re: [SQL] What does PostgreSQL do when time goes backward?
John Hasler wrote: Frank writes: My ntp client changes clock (by small amount) at any time: Jul 25 05:29:38 bax ntpd[10269]: adjusting local clock by 0.098724s Jul 25 05:31:43 bax ntpd[10269]: adjusting local clock by 0.038991s Jul 25 06:13:38 bax ntpd[10269]: adjusting local clock by -0.037131s Jul 25 15:01:52 bax ntpd[10269]: adjusting local clock by -0.112429s Ken writes: These do seem to be larger values than you might expect from a clock conditioned with ntpd. Is it a VM or is there something going on that would stop or suspend your system? John writes: There is certainly something wrong there. I saw very bad clock performance on one Linux box I had (dual-single core AMD cpus, no VMs), even with NTP, until I changed the clocksource kernel parameter to hpet. Unfortunately (or fortunately) I no longer have that box. -- Steve Wampler -- [email protected] The gods that smiled on your birth are now laughing out loud. -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Cumulative result with increment
On 02/07/2011 01:11 PM, Shawn Tayler wrote: Hello, I am struggling with what is probably a simple problem but I am coming up blank. In the interest of full disclosure I am not a very savvy programmer by any stretch. I have a table of data from an external source which contains numbers of events per operating group per hour per day, kind of like this: Group | events | duration | timestamp The Group, events, and duration (milliseconds) are all integers in char format. Timestamp is as stated. The records are at 1 hour increments. I need to sum the events and duration fields in 8 hour shifts (mid to 8AM, 8AM to 4PM, etc). I'd like the output to be in table format ala: Group | events | duration | date | shift -------+--------+-------------+------------+------- 100 | 26 | 00:00:25.00 | 2011-01-01 | Day I have a function to convert the duration to an interval already and can make one to do the shift labeling. It's the rest that has me stumped. Any suggestions or direction? As always, your help is greatly appreciated. I'm not sure exactly what you want but it sounds like you just want an output column that has the shift instead of the timestamp. You can then sum on that column. Don't know what your shifts are called, but this will give you dog-, day- and night-shifts based on your times: case when extract(hour from "timestamp") < 8 then 'dog' when extract(hour from "timestamp") < 16 then 'day' else 'night' end as shiftname This can be used for grouping as well as display. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
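Putting it all together, something along these lines might get you the whole summary in one pass (untested; I am guessing the table is called eventlog, that the hour column is literally named "timestamp", that events and duration need casting since they are stored as char, and the shift labels are made up):
select "Group",
       sum(events::int)                     as events,
       sum(duration::int) * interval '1 ms' as duration,
       "timestamp"::date                    as date,
       case when extract(hour from "timestamp") < 8  then 'Mid'
            when extract(hour from "timestamp") < 16 then 'Day'
            else 'Swing'
       end                                  as shift
  from eventlog
 group by 1, 4, 5
 order by 1, 4, 5;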
Re: [SQL] convert in GMT time zone without summer time
On 04/16/2011 05:02 AM, Jasen Betts wrote:
On 2011-04-15, LaraK wrote:
Hello,
I want write a function that converts a timestamp with time zone to the UTC
zone. But it should all be stored in the winter time.
Done! All timestamp with time zone information is stored internally in UTC.
But you need to be sure you really understand date/time manipulation in
PostgreSQL so you don't reinvent the wheel.
[CODE]
SELECT
to_char(CONVERT_TO_UTC(to_timestamp('2011-03-22 14:17:00', 'YYYY-MM-DD
hh24:MI:SS'), 'CET'), 'yyyy-mm-dd hh24:MI:SS') AS winter,
to_char(CONVERT_TO_UTC(to_timestamp('2011-04-22 14:17:00', 'YYYY-MM-DD
hh24:MI:SS'), 'CET'), 'yyyy-mm-dd hh24:MI:SS') AS summer
[/CODE]
must come out:
[CODE]
WINTER | SUMMER
---------------------+---------------------
2011-03-22 13:17:00 | 2011-04-22 12:17:00
[/CODE]
that test case is ambiguous your inputs are timespamptz but
have an unspecified timezone (and so get the zone appropriate to
your time locale). I'm assuming your time locale is "Europe/Berlin"
and you really mean the following:
SELECT to_char(CONVERT_TO_UTC( '2011-03-22 14:17:00+01'::timestamptz
,'CET'),'yyyy-mm-dd hh24:MI:SS') AS winter, to_char(CONVERT_TO_UTC(
'2011-04-22 14:17:00+02'::timestamptz ,'CET'),'yyyy-mm-dd hh24:MI:SS')
AS summer;
If you can use the correct time zone name, everything is done for you.
Better yet, it will keep working when the timezone rules change (if you
apply your patches regularly) or for other time zones:
steve=> select '2011-03-22 14:17:00 Europe/Berlin' at time zone 'UTC';
      timezone
---------------------
2011-03-22 13:17:00
(1 row)
steve=> select '2011-04-22 14:17:00 Europe/Berlin' at time zone 'UTC';
      timezone
---------------------
2011-04-22 12:17:00
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] convert in GMT time zone without summer time
On 05/03/2011 12:15 AM, LaraK wrote:
Very good!
Another question:
I want to convert the string to the type TIMESTAMP WITH TIME ZONE. What do I
have to format? 'TZ' does not.
select to_timestamp('2011-03-22 14:17:00 Europe/Berlin', 'YYYY-MM-DD
HH:MI:SS TZ')
Just cast it to a timestamp with time zone:
select timestamptz '2011-04-22 19:17:00 Europe/Berlin';
Remember...
The value of a timestamp with time zone is always stored internally as UTC.
When a timestamp with time zone is displayed, the time zone is based on
the client's default, the "set timezone to" statement or the "at time
zone" clause in the query.
In the case of an explicit "at time zone" clause, the result becomes a
timestamp without time zone data type (that is why the previous static
example with the "at time zone" clause was a timestamp without time zone).
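For example, pg_typeof (available in 8.4) shows what the "at time zone" form actually returns:
select pg_typeof(timestamptz '2011-04-22 19:17:00 Europe/Berlin' at time zone 'UTC');
          pg_typeof
-----------------------------
 timestamp without time zone
(1 row)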
A timestamp with time zone is useful to identify a specific point in
time. "Bin Laden's death was announced at...", "shuttle Endeavor
launched at...", "Amazon EC2 crashed at...". Most timestamp data I
encounter is of this type.
A timestamp without time zone might be useful for data like "Breakfast
is served at 7am". Presumably a hotel chain would serve at 7am in each
hotel and not have all hotels serve at 7am corporate headquarters time.
It takes a bit of time to wrap your head around time and time zones but
it would be well worth your time to carefully read
http://www.postgresql.org/docs/8.4/interactive/datatype-datetime.html
(IIRC, you are using 8.4) a couple times.
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] self join
On 05/14/2011 07:36 PM, Jasen Betts wrote: use the "NOT IN" operator with a subquery to retch the disallowed values Hmmm, "retch" as a synonym for "output"? I've seen more than one case where that is an appropriate description. :) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] How to remove a set of characters in text-columns ?
On 06/30/2011 12:28 PM, Andreas wrote:
Hi,
how can I remove a set of characters in text-columns ?
Say I'd like to remove { } ( ) ' " , ; . : !
Of course I can chain replace ( replace ( replace ( replace ( ... , ''
) and replace the chars one by one against an empty string ''.
There might be a more elegant way.
Is there ?
regards
Look at regexp_replace()
http://www.postgresql.org/docs/9.0/static/functions-string.html
http://www.postgresql.org/docs/9.0/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] why these results?
On 08/01/2011 03:50 PM, Wes James wrote: select count(*) from table; count --- 100 (1 row) is correct select count(*) from table where col::text ~~* '%text%'; count --- 1 (1 row) is correct. But now if I do: select count(*) from table where col::text !~~* '%text%'; count --- 98 (1 row) Shouldn't it be 99? That is out of 100 records there is one that has "text" in column "col" so the !~~* should return 99 rows. ?? -wes select count(*) from table where col is null; (null is neither equal nor not-equal to anything, even null) Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Mysterious column "name"
On 08/09/2011 08:34 AM, Oliveiros d'Azevedo Cristina wrote: Strange... Tables have hidden columns but AFAIK, "name" is not one of them... http://www.postgresql.org/docs/9.0/interactive/ddl-system-columns.html ... Good day! I found one strange results for sql-query. create table testtable ( f1 int, f2 int ); insert into testtable(f1, f2) values(1, 100); select testtable.* from testtable; // it is ok select testtable from testtable; // it is ok select testtable.name from testtable; // returns strange result. - name name - "(1,100)" - no error message about nonexistent column. Result is similar to second variant, but pgAdmin3 shows not full record. ... See http://www.postgresql.org/docs/9.0/static/rowtypes.html on composite data types and scroll to section 8.15.3. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] how to calculate differences of timestamps?
On 09/26/2011 06:31 PM, Andreas wrote: How could I calculate differences of timestamps in a log-table? Table log ( user_id integer, login boolean, ts timestamp ) So login = true would be a login-event and login = false a logout. Is there a way to find the matching login/logout to calculate the difference? Or is there a better table "design" to do this? One way is a sub-select: select o.user_id, o.ts as logout_time, (select max(i.ts) from log i where i.user_id = o.user_id and i.ts < o.ts and i.login ) as login_time from log o where not o.login ; This will give you login/logout time pairs. Just replace the "," with a "-" if you are interested in login duration. Depending on the frequency and duration of logins and the number of users you may have to play with indexes though an index on ts will probably suffice for most cases. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] plpgsql function executed multiple times for each return value
We have a plpgsql function called paymentcalc, which calculates the payment necessary to pay off a loan. It's defined like this: CREATE OR REPLACE FUNCTION paymentcalc(IN amount numeric, IN interestrate numeric, IN termmonths integer, IN paymentfreq integer, IN dueday1 integer, IN dueday2 integer, IN borrowdate date, IN firstdue date, IN gapins character, IN lifeins character, IN disabilityins character, OUT payment numeric, OUT finalpayment numeric, OUT finaldue date) RETURNS record AS ... LANGUAGE 'plpgsql' STABLE; We want to execute this function, with inputs from a table, and return the calculated values as separate columns: select (p).payment, (p).finalpayment, (p).finaldue from (select paymentcalc(amount, interestrate / 100, termmonths, paymentfreq, dueday1, dueday2, borrowdate, firstdue, gapins, lifeins, disins) as p from apps where id = 100) s This works, but using "RAISE NOTICE" we've determined that this executes the paymentcalc function 3 times! It seems to execute the function once for each return value. The function is time consuming, and we want to execute it only once. Thinking that maybe a table returning function would work better, we tried this: CREATE OR REPLACE FUNCTION paymentcalc2(IN amount numeric, IN interestrate numeric, IN termmonths integer, IN paymentfreq integer, IN dueday1 integer, IN dueday2 integer, IN borrowdate date, IN firstdue date, IN gapins character, IN lifeins character, IN disabilityins character) RETURNS SETOF paymentcalc_return AS ... LANGUAGE 'plpgsql' STABLE; However, we now have difficulty trying to send parameters from a table as inputs to this function. We want to do something like: select * from paymentcalc2(amount, interestrate / 100, termmonths, paymentfreq, dueday1, dueday2, borrowdate, firstdue, gapins, lifeins, disins) from apps where id = 100 But obviously this query won't work with two "from"s. We can't do a join between paymentcalc2 and apps, because we get "invalid reference to FROM-clause entry for table apps". So my questions are: 1) How do we cause the paymentcalc function to be executed only once? and 2) How do we call a table returning function with inputs from a table? Thank you very much! Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
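One thing we stumbled onto that seems to help, though we are not sure how safe it is to rely on: adding an OFFSET 0 to the subselect appears to keep the planner from flattening it, and the RAISE NOTICE then fires only once:
select (p).payment, (p).finalpayment, (p).finaldue
  from (select paymentcalc(amount, interestrate / 100, termmonths, paymentfreq,
                           dueday1, dueday2, borrowdate, firstdue,
                           gapins, lifeins, disins) as p
          from apps
         where id = 100
        offset 0) s;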
Re: [SQL] internal format of timstamp?
On 12/29/2011 12:42 PM, Jasen Betts wrote: On 2011-12-29, Lars Gustafsson wrote: Hi, I am trying to recover a lot of deleted rows from a database ( pg 8.2.3 ) , not my database, I promise….. When using the tool pgfsck I get good results, but timestamp is not implemented. When trying to export as int8 i get fx. 4735129360236469258 representing december 29, 2011, 16:30 But how should I do the conversion from the numeric value to the actual timestamp ? possibly that's a floating point timestamp try it as float8 instead of int8. I am not an internals person, but suspect that your first step should be to check pg_config to see if the server from which you are attempting to recover data was compiled with --enable-integer-datetimes. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] compare table names
On 01/09/2012 08:28 AM, Tony Capobianco wrote: I see what you're saying: pg=# select tablename from pg_tables where tablename like 'tmp_staging%' and tablename< 'tmp_staging1230' and tablename> 'tmp_staging1228'; tablename tmp_staging1229 This query is part of a larger script where I want to dynamically select tablenames older than 10 days and drop them. The tables are created in a tmp_stagingMMDD format. I know postgres does not maintain object create times, how can I write this to select tables from pg_tables that are older than 10 days? Thanks. Tony ...Ah, there's the missing part - the 1229 represents a date that is missing year information. If you can change things up a bit, I'd add the year to the name "tmp_stagingMMDD" which makes the query easy. (We do this in a few cases where we are given blocks of data that are valid through a certain date. Each block of data is a child of the main table and has a name that represents the last date the data is valid. A daily script drops any partition that has expired.) If you can't add the year, you will be stuck with extra work to properly handle the first 10-days of each year. Alternately, you could have a separate table that just tracks the creation dates of the temporary tables and be free from any requirement to have dates be part of the table names. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Unable To Modify Table
On 01/12/2012 08:42 AM, Carlos Mennens wrote: I seem to have an issue where I can't modify a table due to another tables foreign key association: ... How does one accomplish my goal? Is this difficult to change or once that foreign key is created, are you stuck with that particular constraint? Try updating the values in both tables within a transaction with constraints set to deferred: http://www.postgresql.org/docs/current/static/sql-set-constraints.html Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] date range to set of dates expansion
On 01/19/2012 07:16 AM, Gary Stainburn wrote: On Thursday 19 January 2012 15:11:46 Gary Stainburn wrote: I'll be upgrading my live server as soon as possible, but in the meantime can anyone suggest a way I can do the same thing using Postgresql 8.1 until I can evaluate 8.4 on my live systems? I'm sure most here would recommend moving to 9.1 rather than 8.4. Better performance, cooler replication functionality, more advanced in-place upgrade capabilities for future upgrades, a couple years longer before end-of-life, advances to windowing functions and other SQL commands and much other goodness. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Update Mass Data in Field?
On 01/26/2012 03:59 PM, Carlos Mennens wrote: I'm new to SQL so I'm looking for a way to change several email addresses with one command. For example everyone has a 'holyghost.org' domain and I need to change a few 100 email addresses in the field 'emp_email'. I need to UPDATE employees table which has a COLUMN 'emp_email' and change %holyghost.org to %ghostsoftware.com. I tried: UPDATE employees SET emp_email = '%ghostsoftware.com' WHERE emp_email LIKE '%holyghost.org'; It didn't update anything when I ran the command. Does anyone know how I need to correct my SQL statement above to change everyone's email address? ...set emp_email = regexp_replace(emp_email, '@holyghost.org$', '@ghostsoftware.com') where emp_email ~ '@holyghost.org$'... This is using the regular expression match and regexp_replace to ensure that the pattern is anchored at the end of the field and includes the "@" sign in the expression to avoid accidentally matching something like [email protected]. You can always do a select of the emp_email alongside the replacement expression to be sure it will do what you want before actually updating your database. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] querying a column w/ timestamp with timezone datatype
On 01/30/2012 07:00 AM, Anson Abraham wrote:
I an 9.1 PG database: I have a column which is a timestamp w/ time
zone. So the value I have as one record in table is: 15:55:24.342848+00
If i want to find records less or greater than that timestamp, how do
I construct the query?
select * from schema.table where tscol >= '15:55:24.342848+00';
select * from schema.table where tscol >= '15:55:24.342848+00'::timestamp;
select * from schema.table where tscol >= cast('15:55:24.342848+00' as
timestamp with time zone);
do not work. Do I have to convert the value to a string and substr to
15:55:24 and then convert back to a timestamp? It's been a long while
since I had to query a pg table w/ a timestamp with time zone value.
Any help here would be appreciated.
Those aren't timestamps - timestamps include the date part.
Perhaps you are thinking about a *time* with time zone (a type that
exists due to SQL requirements but which is a somewhat nonsensical type,
the use of which is not recommended):
http://www.postgresql.org/docs/9.1/static/datatype-datetime.html#DATATYPE-TIMEZONES
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] time interval math
On 02/08/2012 12:01 PM, Edward W. Rouse wrote: I'm still working on getting this to work, but the summary is this: I am getting several (many) intervals of hour, minutes and seconds. I need a sum of the absolute value these intervals, similar to the SUM(ABS()) function for numbers; and I need to divide this sum by an integer (bigint). Getting the intervals is no problem, but I can't find built in functions for the rest. Currently on 8.3, want to upgrade to 9.x but I can't until this is finished. Do these functions exist, or will I be forced to convert to seconds, do the math and then convert back to hour-minute-second format (I am assuming from current data that, after the divide, the result should be in the minute: second range). You will have to do some work on your own. Time and intervals are tricky beasts and depend on the oddities of daylight saving rules. Even though you are only using hours/minutes/seconds the interval type also supports days and months. A day interval is probably 24 hours but could be 23 or 25 if it crosses a DST boundary. Months have different numbers of days. You have situations where adding and subtracting a month does not give the original date: select '2011-03-31'::date - '1 month'::interval + '1 month'::interval; ?column? - 2011-03-28 00:00:00 There is no abs(interval) function but, if you know that all your intervals are basic H:M:S and that you won't have any difficulty due to problems similar to the above you can mimic it with: ...case when myinterval < '0'::interval then '0'::interval - myinterval else myinterval end... You are even allowed to sum that and divide it (though I suspect there are some interesting corner-cases waiting to be discovered): ...sum(case when myinterval < '0'::interval then '0'::interval - myinterval else myinterval end)/2... Before you upgrade, be sure to read the release notes and test your calculations. The way intervals are handled, especially regarding intervals across DST boundaries, have changed over time. IIRC most of those changes were pre-8.3 but haven't looked recently. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Setting the process title, or can I?
On 03/20/2012 03:14 AM, Bèrto ëd Sèra wrote: I currently have an emergency ... As an emergency procedure we have set a script that each minute has a look at the situation and runs pg_cancel_backend() against anything that has been waiting for more than X secs. Then it sleeps one more minute... ... Is there anyway I can mark the process running the filter, maybe using the update_process_title feature? I'd like to have something I can see from a ps command, grep for it and kill -15 it, without wasting our scarce resources on one more pg process to use info from pg_stat_activity. Several things spring to mind. 1. The script that starts the filter can note its pid and set an at-job to kill it after 30 minutes. I've used this technique with success. 2. Create a new role specifically to run the filter. The role is generally visible in ps. But this all begs the question. Why not: 3. Set the statement_timeout so long-running statements will be automatically killed. This can be done on a per-role basis so simply set it as desired for the role used by the problematic Java program. It is a default and can be changed by the user so if there are individual statements that need a longer timeout you can change just for those statements. Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
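For example (the role name here is invented), the default for the problematic application role could be set with:
alter role java_app set statement_timeout = '30s';
and any individual statement that legitimately needs longer can override it just for itself:
set statement_timeout = '10min';
-- run the long statement here
reset statement_timeout;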
Re: [SQL] how to concatenate in PostgreSQL
On 03/24/2012 04:43 AM, Rehan Saleem wrote: hi , how can we concatinate these lines and execute sql command In what? Psql? A PL/pgSQL function. C/Java/PHP/Python/Perl/Erlang/Lua? setsql = 'select user,username, firstname ' set sql += ' lastname, cardno from table1 where userid=' + 5 exec(sqi) where 5 is the userid from table1 thanks Cheers, Steve
Re: [SQL] how to concatenate in PostgreSQL
On 03/27/2012 07:48 AM, Rehan Saleem wrote: well i am quite sure its PostgreSQL forum and it is obvious, i am asking this to concatenate in plpgsql. *From:* Steve Crawford *To:* [email protected] *Sent:* Monday, March 26, 2012 9:08 PM *Subject:* Re: [SQL] how to concatenate in PostgreSQL On 03/24/2012 04:43 AM, Rehan Saleem wrote: hi , how can we concatinate these lines and execute sql command In what? Psql? A PL/pgSQL function. C/Java/PHP/Python/Perl/Erlang/Lua? setsql = 'select user,username, firstname ' set sql += ' lastname, cardno from table1 where userid=' + 5 exec(sqi) where 5 is the userid from table1 thanks Cheers, Steve Yes, after about a decade of PostgreSQL use and forum participation I think I know which forum this is. Sarcasm and biting at people who want to help you is not a good way to make friends and get help on the forums. If you meant PL/pgSQL then say it. It is not the same as SQL though the syntax is generally similar. Look at DECLARE and BEGIN for starters. I've seen too many threads get reset back to the start once some incorrect assumptions get corrected to feel that up-front clarification is a waste of time. http://wiki.postgresql.org/wiki/Guide_to_reporting_problems is more oriented to problem reporting than general questions but has helpful advice nonetheless. Back to the issue at hand...others have pointed out the || operator and the issue with NULL. It appears you are doing a lot of conversion requiring writing of PL/pgSQL functions. Spend a few minutes skimming: http://www.postgresql.org/docs/current/static/plpgsql.html As to the topic of executing the command including many examples see: http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN If you are writing functions that must use dynamically generated table and column names, become familiar with the quote_ident, quote_literal and quote_nullable functions. Cheers, Steve
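For what it is worth, a stripped-down PL/pgSQL version of your example might look roughly like this (untested; the return types are guessed and the column list is lifted from your pseudo-code):
create or replace function lookup_user(p_table text, p_userid integer)
returns table (username text, firstname text, lastname text, cardno text)
as $$
begin
    -- build the statement dynamically; quote_ident protects the table name,
    -- and the userid is passed as a parameter via USING rather than concatenated
    return query execute
        'select username, firstname, lastname, cardno from '
        || quote_ident(p_table)
        || ' where userid = $1'
        using p_userid;
end;
$$ language plpgsql;
-- select * from lookup_user('table1', 5);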
Re: [SQL] Simple way to get missing number
On 04/24/2012 07:15 AM, Emi Lu wrote:
> Good morning,
> May I know is there a simple sql command which could return missing
> numbers please?
> For example, t1(id integer) values = 1, 2, 3 ... 500
> select miss_num(id) from t1;
> Will return:
> ===
> 37, 800, 8001

select generate_series(1,500) except select id from t1;

Example:

select anumber from fooo;
 anumber
---------
       1
       3
       5
       7
       9
      11
      13
      15

select generate_series(1,15) except select anumber from fooo order by 1;
 generate_series
-----------------
               2
               4
               6
               8
              10
              12
              14

Cheers,
Steve

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Simple way to get missing number
On 04/24/2012 11:10 AM, Emi Lu wrote:
I got it and thank you very much for everyone's help!!
It seems that "left join where is null" is faster compared with
"except". And my final query is:
select num as missing
from generate_series(5000, #{max_id}) t(num)
left join t1 on (t.num = t1.id)
where t1.id is null;
BTW, there are many options. Two more of them include EXISTS:
select allnumbers from generate_series(1,15) as allnumbers where not
exists (select 1 from fooo where fooo.anumber=allnumbers.allnumbers);
And IN:
select allnumbers from generate_series(1,15) as allnumbers where
allnumbers not in (select anumber from fooo);
They all give you the same result. The "right" choice will depend on the
size of your table, how it is indexed, how fully it is populated and
even on your version of PostgreSQL. (Apologies for the funky field/table
naming.)
Cheers,
Steve
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Simple Upgrade from PostgreSQL version 8.1.11 (With schemas)
On 07/10/2012 08:28 AM, Chris Preston wrote:
> Hello all,
> How far can I get to a higher version of PostgreSQL by just entering a
> command line instruction to upgrade without any major effort?
> Regards
> Chris Preston

At the simplest you just do a dump (using the dump tools from the *new*
version of PostgreSQL) and restore to the new version. If you aren't worried
about down-time and have the absolute simplest use-case then it isn't too
difficult.

Most people have to deal with reviewing the release notes to see how they
affect the overall system, setting up a test of the new version to shake out
issues and planning the upgrade to minimize down-time. Among the things that
have changed over the past few releases are a tightening up of casting rules
and changing the default for standard_conforming_strings to on, either of
which could cause serious breakage.

Once you get to a newer version (anything past 8.3? 8.4?) you can use the new
upgrade tools moving forward to minimize downtime during the upgrade process,
but you will *always* need to test and evaluate before deploying.

Cheers,
Steve

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
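A minimal sketch of the dump-and-restore path described above. Host and
database names are placeholders; the key point is that pg_dump and pg_dumpall
come from the new installation:

# run the *new* version's pg_dump against the old server, restore into the new one
pg_dump -h oldserver mydb | psql -h newserver mydb

# roles and other global objects are dumped separately
pg_dumpall -h oldserver --globals-only | psql -h newserver postgres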
Re: [SQL] Split a string to rows?
On 01/07/2013 11:44 AM, Emi Lu wrote: Is there a function to split a string to different rows?... Have you looked at regexp_split_to_table? Cheers, Steve -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
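For illustration, a quick example of regexp_split_to_table (the sample string
is made up, not from the original question):

select regexp_split_to_table('red,green,blue', ',');
 regexp_split_to_table
-----------------------
 red
 green
 blue

regexp_split_to_array is the companion function when an array is more
convenient than a set of rows.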
Re: [SQL] how can I replace all instances of a pattern
On 03/26/2013 06:08 AM, James Sharrett wrote:
> I'm trying to remove all instances of non-alphanumeric or underscore
> characters from a query result for further use. This is part of a function
> I'm writing that is in plpgsql.
>
> Examples:
> Original value
> 'My text1'
> 'My text 2'
> 'My-text-3'
> 'My_text4'
> 'My!text5'
>
> Desired
> 'Mytext1'
> 'Mytext2'
> 'Mytext3'
> 'My_text4' (no change)
> 'Mytext5'
>
> The field containing the text is column_name. I tried the following:
>
> Select regexp_replace(column_name,'\W','') from mytable
>
> This deals with the correct characters but only does the first instance of
> the character so the output is:
> 'My text1'
> 'Mytext 2' (wrong)
> 'Mytext-3' (wrong)
> 'My_text4'
> 'My!text5'
>
> I managed to get the desired output by writing the text into a variable
> through a loop and then just keep looping on the variable until all the
> characters are removed:
>
> sql_qry := 'select column_name from mytable';
> for sql_record in execute sql_qry loop
>     curr_record := sql_record.column_name;
>     while length(substring(curr_record from '\W')) > 0 loop
>         curr_record := regexp_replace(curr_record, '\W','');
>     end loop;
>     rest of the code
>
> This works but it seems like a lot of work to do something this simple but
> I cannot find any function that will replace all instances of a string AND
> can base it on a regular expression pattern. Is there a better way to do
> this in 9.1?

You were on the right track with regexp_replace but you need to add a global
flag:

regexp_replace(column_name,'\W','','g')

See examples under
http://www.postgresql.org/docs/9.1/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

Cheers,
Steve
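For illustration, the global flag applied to one of the sample values above:

select regexp_replace('My-text-3', '\W', '', 'g');
 regexp_replace
----------------
 Mytext3
(1 row)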
Re: [SQL] Unique index VS unique constraint
Unique indexes can be partial, i.e. defined with a where clause (that must be
included in a query so that PostgreSQL knows to use that index) whereas
unique constraints cannot.

JORGE MALDONADO wrote
> I have searched for information about the difference between "unique index"
> and "unique constraint" in PostgreSQL without getting to a specific answer,
> so I kindly ask for an explanation that helps me clarify such concept.

A constraint says what valid data looks like. An index stores data in such a
way as to enhance search performance.

Uniqueness is a constraint. It happens to be implemented via the creation of
a unique index since an index is quickly able to search all existing values
in order to determine if a given value already exists.

PostgreSQL has chosen to allow a user to create a unique index directly,
instead of only via a constraint, but one should not do so. The uniqueness
property is a constraint and so a "unique index" without a corresponding
constraint is an improper model. If you look at the model without any indexes
(which are non-model objects) you would not be aware of the fact that
duplicates are not allowed, yet in the implementation that is indeed the case.

Logically the constraint layer sits on top of an index and performs its
filtering of incoming data so that the index can focus on its roles of
storing and retrieving. Extending this thought, the underlying index should
always be non-unique and a unique filter/constraint would use that index for
validation before passing the new value along. However, practicality leads to
the current situation where the index takes on the added role of enforcing
uniqueness. This is not the case for any other constraint, but the UNIQUE
constraint's case is so integral to PRIMARY KEY usage that the special-case
behavior is understandable and much more performant.

Conceptually the index is an implementation detail and uniqueness should be
associated only with constraints.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Unique-index-VS-unique-constraint-tp5773386p5773427.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
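To make the distinction above concrete, a small sketch (table and column
names are invented for illustration, not taken from the thread):

-- a unique constraint: part of the data model; PostgreSQL backs it with a
-- unique index automatically
ALTER TABLE customer ADD CONSTRAINT customer_email_key UNIQUE (email);

-- a unique index created directly; this one is partial, something a
-- constraint cannot express
CREATE UNIQUE INDEX customer_active_email_idx ON customer (email) WHERE active;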
[SQL] Aggregating both tables in a join?
Hi,

I've got a table which models the relationship between flights and legs,
which is defined like this:

create table flight_leg (
    flight_id integer references flight(id),
    leg_id integer references leg(id)
);

One leg may be part of many flights and one flight may be made up of 1 or
more legs. I also have a table which maps flights to comparable flights. The
mapping is created manually. It is defined like this:

create table cmp_flight (
    flight_id integer references flight(id),
    cmp_flight_id integer references flight(id)
);

The flight and leg tables referenced are omitted for brevity.

I want to be able to show all the flight_ids and their leg counts as well as
the cmp_flight_ids and leg counts. e.g.

insert into flight_leg values (1, 1);
insert into flight_leg values (1, 2);
insert into flight_leg values (2, 3);
insert into flight_leg values (2, 4);
insert into flight_leg values (2, 5);
insert into flight_leg values (3, 3);
insert into flight_leg values (4, 1);
insert into flight_leg values (5, 3);
insert into flight_leg values (5, 4);
insert into flight_leg values (6, 4);
insert into flight_leg values (7, 5);

insert into cmp_flight values (1, 2);
insert into cmp_flight values (1, 3);
insert into cmp_flight values (4, 5);
insert into cmp_flight values (4, 6);
insert into cmp_flight values (4, 7);

The result I'm looking for is:

 flight_id | num_legs | cmp_flight_id | cmp_num_legs
-----------+----------+---------------+--------------
         1 |        2 |             2 |            3
         1 |        2 |             3 |            1
         4 |        1 |             5 |            2
         4 |        1 |             6 |            1
         4 |        1 |             7 |            1

This does the trick:-

select cf.flight_id, tl1.num_legs, cf.cmp_flight_id, tl2.num_legs cmp_num_legs
from cmp_flight cf
join (select fl1.flight_id fid1, count(*) num_legs
      from flight_leg fl1 group by fl1.flight_id) tl1 on fid1 = cf.flight_id
join (select fl2.flight_id fid2, count(*) num_legs
      from flight_leg fl2 group by fl2.flight_id) tl2 on fid2 = cf.cmp_flight_id;

Is there a better way to go about it? I've considered keeping track of the
number of legs in the flight table, using a trigger, but it doesn't seem
relationally 'clean'.

Kind Regards,
Steve Sabljak

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [SQL] COPY FROM query
Hi,
For what it's worth, WindowsXP (at least) seems to currently support
forward slashes at the OS level, so this Postgres behavior isn't as odd
as it might seem. If you enclose your paths with double quote (")
marks, Windows will even accept Unix style paths for some instructions
on the command line:
dir "/temp"
Feeding Windows API calls with forward slashes seems to work with
everything I've tried so far, so using them in Postgres seems perfectly
smart and reasonable.
Hope that helps,
Steve
At 11:03 AM 2/12/2007, [EMAIL PROTECTED] wrote:
Date: Mon, 12 Feb 2007 09:09:08 +0900
From: Paul Lambert <[EMAIL PROTECTED]>
To: Joe <[EMAIL PROTECTED]>
Cc: [email protected]
Subject: Re: COPY FROM query.
Message-ID: <[EMAIL PROTECTED]>
Joe wrote:
> Hi Paul,
>
> On Mon, 2007-02-12 at 08:19 +0900, Paul Lambert wrote:
>> I'm attempting to copy from a table into a file using a select
query
>> inside the copy.
>>
>> The following is my command:
>>
>> COPY (SELECT DISTINCT ON (dealer_id,appraisal_id) * FROM
>> appraisals_temp) TO 'c:\autodrs_appraisal_new.txt' WITH DELIMITER
AS '^'
>> CSV HEADER;
>>
>> I get the following returned:
>>
>> WARNING: nonstandard use of escape in a string literal
>> LINE 1: ...ealer_id,appraisal_id) * FROM appraisals_temp) TO
'C:\autodr...
>> ^
>> HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
>>
>> ERROR: relative path not allowed for COPY to file
>> SQL state: 42602
>>
>>
>> (The caret character is pointing to the M in FROM)
>
> I believe that on Windows you need to use double backslashes, i.e.,
'c:\
> \autodrs_appraisal_new.txt', although the regular slash may also
work,
> i.e., 'c:/autodrs_appraisal_new.txt'.
>
> Joe
>
>
> ---(end of
broadcast)---
> TIP 5: don't forget to increase your free space map settings
>
>
If this is the case, it is strange that the first copy statement works
as that is also only using a single backslash.
Having said that, if I switch it to a forward slash it works without
error... odd, given that weenblows standard is backslash.
Thanks for the help though.
--
Paul Lambert
Database Administrator
AutoLedgers
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] Change the Default Database
Hi Rommel,

I thought I'd throw a little more input in, as an alternative to psql. I use
Ruby on Rails with Postgres, and for testing, it drops the testing database,
recreates it and loads the structures dumped from the development database,
before running tests - to ensure a clean test every time (it also has a tool
for re-building the development database programmatically, which one can run
before all this). So, Rails has to solve the same problem you have, and it
doesn't use psql to do it. My investigation into its method is as so:

1) Initiate a connection to Pg server via Pg network port.
1.a) Dump dev database structures to file
2) Drop database 'test'
3) Create database 'test' (with correct parameters)
4) Initiate a new connection to Pg server via port, request connection to
newly created 'test' database
5) Load dev database structures from file to 'test' database.

The key obviously is reconnecting over the port connection and specifying the
database you wish to connect to - I think this is how PGAdmin does the job
too (when you click on "SQL" you're making a port level connection to the
server, specifying which database you want to talk with - from what I can
tell, Pg doesn't let you operate across databases, as MS SQL does - one
connection = one database).

I'm not nearly as expert as others on the list, so any corrections to the
above analysis would be welcome.

Sincerely,
Steve

At 04:49 AM 2/27/2007, [EMAIL PROTECTED] wrote:
Date: Mon, 26 Feb 2007 18:02:38 -0400
From: "Rommel the iCeMAn" <[EMAIL PROTECTED]>
To:
Subject: Re: Change Default Database
Message-ID: <[EMAIL PROTECTED]>

I seem to be blundering a lot today! I thought I was replying to the entire
list, didn't realize I replied to one person :-) Nothing was wrong with my
script, I assumed that since it was generated by pgAdmin that I could run it
inside pgAdmin. It works perfectly when I run it using psql. It is very much
like SQL Server's osql command line tool. Thanks to everyone that helped ...

So I've learnt three valuable lessons:
1. It's better to execute scripts using the command line tool 'psql'.
2. Make sure I'm replying to the list and not to an individual.
3. Do not post HTML messages!!! :-))

Thanks again everyone.

Rommel Edwards
Software Developer, Barbados, Caribbean.

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
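For comparison, the same drop/recreate/load cycle described above can be done
by hand with the standard command-line tools (database names here are just
placeholders):

# dump the development structure, then rebuild the test database from it
pg_dump --schema-only devdb > structure.sql
dropdb testdb
createdb testdb
psql -d testdb -f structure.sql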
[SQL] GiST index question: performance
Hi,

First off, can I say how much I love GiST? It's already solved a few problems
for me that seemed impossible to solve in real-time queries. Thanks to
everyone who works on that project!

I'm developing a geographic index based on a set of zip code boundaries.
Points of interest (POI) will fall within some boundaries and not others. I
need to search to find which POI are within a specified boundary.

I think I have two options (see below) and I'm wondering if anyone has an
opinion or experience as to whether one or the other will have substantially
different performance characteristics. I can obviously test when I get that
far, but I'd prefer to try the anticipated faster route first, if anyone has
existing experience they can share:

1) Index a series of circles of NN radius around each boundary marker
(lat/long point). Run a search on POI for those that fall within any of the
specified circles.

2) Index a set of polygons that mark the "minimum area" around the boundary
markers in question. Run a search on POI that fall within this single
polygon. The polygon will have more points, but there will be more circles to
search - my understanding of GiST is limited so I'm not sure if there's a
performance benefit to searching many circles or a few polygons.

My tables are of this size:
# of POI: 50,000
# of zip blocks (with and without regions): 217,000
# of zip blocks in a given city (and hence in a given polygon): ~5

Any thoughts or ideas?

Thank you,

Steve

p.s. I could use a GIS system alongside of Postgres but performance and
efficiency are key to this system, and it seems to me that raw GiST indexed
SQL queries are going to be fastest and create the lowest load on the server?

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
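A rough sketch of option 1 above using the built-in geometric types and a
GiST index. The table, column and zip values are invented, and the radius is
expressed in raw lat/long degrees, so this is approximate only:

-- one circle of fixed radius around each boundary marker
create table zip_area (
    zip text,
    boundary circle
);
create index zip_area_boundary_gist on zip_area using gist (boundary);

-- POI held as points; find the POI inside any circle belonging to one zip
select p.*
from poi p
join zip_area z on z.boundary @> p.location  -- circle contains point
where z.zip = '12345';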
Re: [SQL] GiST index question: performance
Thanks Oleg - very interesting stuff you are working on. You may recall I exchanged emails with you on openfts a little while ago - my ISP that manages my Pg SQL server is (in my interests) concerned about installing anything non-standard (read: unstable) onto their server. I was able to get them to install your TSearch2 b/c it's been proven many times, but I'm hesitant to even bring up Q3C since it's less widely deployed. The search method I proposed in my first email is not totally accurate but just searching circles with radii using a GiST index and standard Pg circle datatypes seems like a "close enough" solution for me (as opposed to Q3C's conical search intersections with a spherical projection). I realize that at higher latitudes my circles will be elliptical but our needs are for approximations that are very fast rather than accurate and the radii being searched are small relative to the size of the sphere (I.e. when searching Nome, find everything in +/- 40 miles and especially don't return Anchorage POI).. It's an end user database, so if the query takes 500ms, that's really too long. On the Q3C site, I see that your measure of speed is processing many, many rows in 20 hours, which is a whole different ballgame. :) Do you have a thought as to whether GiST is going to be faster/more efficient with Pg standard types of polygons or circles? I suppose I should just test out both, and quit wasting your time. I'll certainly repost to the list with whatever I uncover. I really do appreciate the help you've provided. Sincerely, Steve At 12:21 PM 3/5/2007, you wrote: On Mon, 5 Mar 2007, Steve Midgley wrote: Hi, First off, can I say how much I love GiST? It's already solved a few problems for me that seemed impossible to solve in real-time queries. Thanks to everyone who works on that project! Thanks, Steve ! I'm developing a geographic index based on a set of zip code boundaries. Points of interest (POI) will fall within some boundaries and not others. I need to search to find which POI are within a specified boundary. You POI is what we call ConeSearch query in astronomy. Please, take a look on Q3C algorithm available from http://q3c.sf.net. Some information http://www.sai.msu.su/~megera/wiki/SkyPixelization This is what we use in our Virtual Observatory project and we're able to work with 10^9 objects on moderate hardware. It doesn't use GiST but special pixelization scheme allow to use standard Btree. I think have two options (see below) and I'm wondering if anyone has an opinion or experience as to whether one or the other will have substantially different performance characteristics. I can obviously test when I get that far, but I'd prefer to try the anticipated faster route first, if anyone has existing experience they can share: 1) Index a series of circles of NN radius around each boundary marker (lat/long point). Run a search on POI for those that fall within any of the specified circles. 2) Index a set of polygons that mark the "minimum area" around the boundary markers in question. Run a search on POI that fall within this single polygon. The polygon will have more points, but there will be more circles to search - my understanding of GiST is limited so I'm not sure if there's a performance benefit to searching many circles or a few polygons. My tables are of this size: # of POI: 50,000 # of zip blocks (with and without regions): 217,000 # of zip blocks in a given city (and hence in a given polygon): ~5 Any thoughts or ideas? Thank you, Steve p.s. 
I could use a GIS system alongside of Postgres but performance and efficiency are key to this system, and it seems to me that raw GiST indexed SQL queries are going to be fastest and create the lowest load on the server? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [email protected], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [SQL] GiST index question: performance
Hi Peter, :) All my Pg code is written via (or handed to) an abstraction layer, and I actually write no functions or stored procedures at all. I write using Rails, so in this case it's a Ruby library called ActiveRecord which has a Postgres module that allows me to talk via "ActiveRecord-speak" or via direct Postgres sql commands. (For example, AR has no idea how to create a GiST index, so I issue that DDL statement manually using the special syntax - also AR is not always so smart about SQL queries so tricky ones I write by hand). Maybe I misunderstand Q3C completely but it looks like C code that has to be installed into the Postgres server itself - not a series of SQL functions that can implemented on an unmodified server. I think my ISP is fine with anything that gets installed via user-level privileges. Anything that requires root and/or anything that involves binary code they are more cautious about. To be fair, I'm cautious about the same things, but given Oleg's reputation and contributions to Pg, I wouldn't be so concerned about Q3C specifically. Am I ignorant of something fundamental in this conversation? I really do appreciate any education or insight here. Are C code "patches" or functions more of a risk to server stability/reliability than higher level code? Or am I speaking gibberish? Thanks, Steve At 01:01 AM 3/6/2007, Peter Eisentraut wrote: Steve Midgley wrote: > my ISP that manages my Pg SQL server is (in my interests) > concerned about installing anything non-standard (read: unstable) > onto their server. I was able to get them to install your TSearch2 > b/c it's been proven many times, but I'm hesitant to even bring up > Q3C since it's less widely deployed. How do you manage to get your own code installed under that theory? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] A form of inheritance with PostgreSQL
Hi Greg,

While not in a C++ framework, you might find that it's not too hard to
implement something similar in your system - It's called "Single Table
Inheritance." References to the Ruby on Rails implementation here:
http://wiki.rubyonrails.org/rails/pages/SingleTableInheritance

It's based on Martin Fowler's Patterns of Enterprise Architecture book -
please find references to his original patterns here:
http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html

The key, I believe, is simply adding a "type" and a "parent_id" to the
"class" table, so you can model all your types and their hierarchical
relations. Fowler's diagram is pretty clear. I think then you would store the
data in another table (or tables) and link into this inheritance structure to
establish ancestry for any piece of data (some people try to store the data
in this table too, but I think that's a mistake personally).

If I understand what you're trying to do, you can use this design pattern in
your application language to implement an inheritance scheme without any
special database features (i.e. in a SQL-standard manner).

I hope this is helpful,

Steve

At 12:28 AM 3/9/2007, [EMAIL PROTECTED] wrote:
Date: Thu, 08 Mar 2007 13:01:51 -0500
From: Greg Toombs <[EMAIL PROTECTED]>
To: [email protected]
Subject: A form of inheritance with PostgreSQL
Message-ID: <[EMAIL PROTECTED]>

Hello. I'm trying to figure out how to nicely implement a C++ class-like
system with PostgreSQL. Consider the following:

Tables Fruit, Apple, Orange

I want to design the foreign key scheme such that there are relations between
fruit and apple, and fruit and orange, that imply that apple is a fruit, and
orange is a fruit.

I don't want to eliminate the existence of Apple and Orange tables, because
there will be columns specific to both Apple and Orange; if I include these
columns in Fruit, then if Fruit is an Orange, the Apple columns will be
needlessly present in Apple rows.

The different ways of implementing this scheme that I've thought of (some
uglier than others):

- Have Fruit contain foreign keys to both Apple and Orange, and write a check
constraint in Fruit specifying that exactly one of (Apple FK, Orange FK)
needs to be non-null. The disadvantage of this method is that it isn't
exactly loosely coupled. For every other fruit type table I implemented I'd
have to go back and add a foreign key in Fruit.

- Have a foreign key in Apple to Fruit, and in Orange to Fruit; then somehow
create a constraint that imposes uniqueness on the union of foreign keys in
both Apple and Orange. To figure out what type of fruit a Fruit row is, run a
query for foreign keys in Orange and Apple matching the primary key of Fruit.
You'd also want to somehow create a constraint that the result of this query
should always return exactly one row (perhaps with a trigger?)

Any advice will be appreciated! As I'm relatively new to Postgre, I might
need some help with the actual implementation as well.

Thank you.

- Greg

---(end of broadcast)---
TIP 6: explain analyze is your friend
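A hedged SQL sketch of the pattern described above (all names are invented;
this is one way to model it, not code from the thread):

-- one row per type; parent_id expresses the inheritance hierarchy
create table fruit_type (
    id serial primary key,
    type text not null,                           -- e.g. 'fruit', 'apple', 'orange'
    parent_id integer references fruit_type(id)   -- null for the root type
);

-- data rows link to their type, which establishes their ancestry
create table fruit_item (
    id serial primary key,
    type_id integer not null references fruit_type(id),
    name text
);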
Re: [SQL] log file permissions?
Gerardo Herzig wrote:
> Hi dudes. I have my pg log file rotation configured so i have a
> psql_`today`.log. The thing is, can i configure postgres so the file
> permissions (today is 600) could by 640, so developers could login and
> tailf the logfile (without using the postgres superuser, course)?

This really belongs over in admin.

Are you logging through syslog or stdout (i.e. what are the following
settings in postgresql.conf and are they in any way overridden by your
startup scripts)?:

syslog =
syslog_facility =
syslog_ident =

If you are logging through syslog, then postgres has no control over those
permissions. You will have to look at your syslog daemon configuration.

Cheers,
Steve

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [SQL] Issue with copying data from a text file.
Hi,

I think I had the exact same problem as you do a while back and I solved it
by removing the header row and the "CSV HEADER" clause of the statement. For
the large files I had, it was easier (for me) to remove the header row than
it was to escape out all the quotes (or regen the file):

COPY deal_lines_temp_load FROM 'c:/temp/autodrs_deal_lines.txt' WITH DELIMITER AS '^';

I think the parser doesn't look for nor generate quoted rows except when CSV
is specified. It would be nice if there was a way to specify a "HEADER" row
without invoking CSV parsing rules (friendly hint to core devs!) :)

Let us all know if that works!

Steve

At 03:14 AM 3/20/2007, you wrote:
Date: Tue, 20 Mar 2007 11:25:38 +0900
From: Paul Lambert <[EMAIL PROTECTED]>
To: [email protected]
Subject: Issue with copying data from a text file.
Message-ID: <[EMAIL PROTECTED]>

I have a procedure in place that copies data from a caret delimited text file
into a table storing some information. One of the fields in the table
contains an item description which may contain item dimensions such as -
17" alloy wheels

The problem I am getting when I do my load is I believe due to the presence
of the double quotation marks giving the copy the impression that it is to
include the information following as a single text string until it gets to
the next set of double quotes. As a result, I get the following:

AutoDRS=# COPY deal_lines_temp_load FROM 'c:/temp/autodrs_deal_lines.txt'
WITH DELIMITER AS '^' CSV HEADER;
ERROR: value too long for type character varying(30)
CONTEXT: COPY deal_lines_temp_load, line 87, column order_desc: "17 5 spoke
alloy wheels.^1291.18^117.38^983.69^1291.18^^C^^

The column as you can see is defined as a 30 character field, the load
contains in this column ^17" 5 spoke alloy wheels.^

I note an option in the COPY command to specify the quote character,
defaulting to double quote. The problem being a single quote will also be
used in the data, as will other characters. Is there any way to get a copy to
have no quote character? I.e. read the file and put whatever is between the
caret characters straight into the appropriate field exactly as is.

TIA,
Paul.

--
Paul Lambert
Database Administrator
AutoLedgers

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [SQL] [pgsql-sql] Daily digest v1.2492 (19 messages)
Hi John,
It sounds like a disk-bound operation, so cpu is not maxed out. I'm not
clear on all the details of your operation but it sounds like you're
using Java to do row-by-row based inserts, selects and updates within a
transaction, from a file. This can be a very slow process if you have
many rows. The OS stats you describe fit that theory (but not
conclusively).
If you are using (pseudo-)code such as:
Open file {
Read line {
select from Pg: "select from [other_table] where val =
[line[colN]]"
exec to Pg: "insert into [table] (col1, col2, ...) values
(line[col1], line[col2]..."
}
}
You can radically speed up such a system by using the "copy"
(http://www.postgresql.org/docs/8.2/interactive/sql-copy.html) command
to load all the data at once from the file into Pg and then do
post-processing with Java/SQL to get all the fields looking right.
Doing a bulk update with a join across several tables is so much faster
than looping through them with a wrapper in Java (or other lang) you
won't believe it.
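A hedged sketch of that pattern (table, column and file names are invented;
note that COPY FROM a server-side file needs superuser rights, so \copy from
psql or COPY FROM STDIN from the client is the usual route):

-- 1. load the raw file into a staging table in one shot
create temp table staging (key_val text, col1 text, col2 text);
copy staging from '/path/to/datafile.txt';

-- 2. then do the lookups and inserts in bulk, set-based, instead of per row
insert into target_table (col1, col2, other_id)
select s.col1, s.col2, o.id
from staging s
join other_table o on o.val = s.key_val;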
I hope this helps and is on-topic for you.
Steve
At 09:38 AM 4/3/2007, [EMAIL PROTECTED] wrote:
Date: Tue, 03 Apr 2007 22:16:13 +0800
From: John Summerfield <[EMAIL PROTECTED]>
To: [email protected]
Subject: A long-running transaction
Message-ID: <[EMAIL PROTECTED]>
I have a Java (java 1.1) program that I wrote some years ago, to read
records from a text file and insert them into a PostgreSQL database.
One of the assumptions I made was that one file contained one day's
data, maybe as many as 1500 records, and I coded it to do the whole
lot
as one transaction so either a single file was loaded in its entirety,
or none of its data was.
I lost the Java code, but revived the idea and I've collected about
two
years' data using (Linux) shell scripts, and loading the data using
psql.
Then, I found the Java code on a disused hard disk:-)
I made the necessary changes for it to build in java 1.5, and used
psql
to extract data from my new database in the correct format for the old
program. This time, I have a little more data than I ever loaded at
once
before:
[EMAIL PROTECTED]:~$ wc -l testdata
6242217 testdata
[EMAIL PROTECTED]:~$ \ls -hl testdata
-rw-r--r-- 1 summer summer 285M 2007-03-28 22:32 testdata
[EMAIL PROTECTED]:~$
Now, it wouldn't surprise me if postgresql used lots of memory - but
how
much could it possibly need? My laptop, where I first tried this, has
1.25 Gbytes, so I could allow it some.
It wouldn't surprise me a lot if it used lots of memory and caused all
sorts of problems, but no, it's chugging away, still using no more RAM
than it could have had on my old Pentium 133 all those years ago.
In the process of checking it out, I've set it running on a machine
with
a AMD Sempron(tm) 2400+ running Kubuntu 6.10 (kernel is
2.6.17-6-server-xen0) and 512 Mbytes of RAM.
This is the java program:-)
summer pts/6:0.0 Thu205days 1:07 1:07
/usr/bin/gij-4.1 -cp /usr/s
It's been running five days so far, and I can see where it's up to by
attaching strace. It's reading 2k of the input file every few seconds.
Okay, clearly something's wrong, and I don't think it's all my cruddy
code.
No probs swapping:
[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        460800     456472       4328          0        860     262164
-/+ buffers/cache:      193448     267352
Swap:      1461872        284    1461588
[EMAIL PROTECTED]:~$
It is hitting the disk pretty hard now on this machine, but the
laptop's
still going too, and the disk seems to run about half the time, part
of
a second running, part idle (but the intervals are getting shorter).
It struck me as fairly curious that neither postgresql nor the
application was hogging the CPU.
Perhaps the laptop is more interesting: look at the size of the buffer
pool:
[EMAIL PROTECTED]:~> free
             total       used       free     shared    buffers     cached
Mem:       1295528    1268548      26980          0       3976     392388
-/+ buffers/cache:      872184     423344
Swap:      1941496      32656    1908840
[EMAIL PROTECTED]:~>
Again, no problem with over-use of RAM, and I'm logged on using KDE
too
and that's running fine.
It's been running a little longer here:
summer pts/2328Mar07 5days 25:12 25:11 java -cp
/home/summer/Classes/:/usr/share/p
This is Sun's Java 1.5 on OpenSUSE 10.2.
This is what suggested I should write:
[EMAIL PROTECTED]:~> procinfo
Linux 2.6.18.8-0.1-default ([EMAIL PROTECTED]) (gcc 4.1.2 20061115) #1
1CPU [Echidna.]
Memory:      Total       Used       Free     Shared    Buffers
Mem:       1295528    1271720      23808          0       3716
Swap:      1941496      32656    1908840
Bootup: Tue Mar 27 18:50:19 2007    Load avera
Re: [SQL] Question on interval
Rodrigo De León wrote:
> On 4/20/07, Wei Weng <[EMAIL PROTECTED]> wrote:
>> Hi all.
>>
>> How do I write a query that converts an integer to the interval type?
>>
>> Like convert integer 10 to INTERVAL '10 seconds'?
>>
>> The integer is a column in a table though, so it is more like convert
>> integer tbl.theInteger to INTERVAL 'tbl.theInteger seconds".
>>
>> Thanks!
>>
>> Wei
>
> select (10||' sec')::interval;

Or, if you prefer:

select 10*'1 second'::interval;

Cheers,
Steve

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
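Applied to the column from the question above (tbl and theInteger are the
names used in the original post):

select theInteger * interval '1 second' from tbl;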
[SQL] Selecting rows with "static" ordering
Hello, I have a strange problem (or one that I've never had before anyway). I am searching for a list of "id's" for a given table (these id values are generated at run-time and held statically in an application-local variable). From that application, I want to retrieve all those rows, and I want them in the order they are currently stored in that variable. So take for example this foreign application variable: ids = "3,2,5,1,4" The application then executes this sql: select * from table where id in (3,2,5,1,4) As-is, of course, the above query will return the 5 records in a semi-random (i.e. unpredictable/unreliable) order. And I don't want to just "order by id" - I want to "order by id(3,2,5,1,4)" (if you see what I mean) Is there a "neat trick" that anyone knows for pulling this off in a single query? Basically right now I'm issuing 5 queries to the backend to ensure ordering but this horribly inefficient. Any input or advice would be appreciated, Steve ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [SQL] subtract a day from the NOW function
Scott Marlowe wrote:
> Campbell, Lance wrote:
>> Michael,
>> So based on your feedback would it be better to do option A or B below?
>>
>> 1) I have a timestamp field, "some_timestamp", in table "some_table".
>> 2) I want to compare field "some_timestamp" to the current date - 1 day.
>> I need to ignore hours, minutes and seconds.
>>
> You might want to use date_trunc then:
>
> select * from sometable where date_trunc('day',tiemstampfield) >
> date_trunc('day',now() - interval '1 day');
>
> or something like that.
Beware in the "or something like that category" that PostgreSQL
considers "1 day" to be "24 hours" thus depending on whether the
timestampfield is with or without TZ and where you do your truncation
(before or after subtracting), you can end up with unexpected results in
the vicinity of DST changes:
select '2007-03-12'::timestamptz - '1 day'::interval;
?column?
2007-03-10 23:00:00-08
select '2007-03-12'::timestamp - '1 day'::interval;
?column?
-
2007-03-11 00:00:00
Especially note that truncating a timestamptz preserves the timezone
info so you will very likely need to address issues on the days that
Daylight Saving starts or ends:
select date_trunc('day',current_timestamp);
date_trunc
2007-06-07 00:00:00-07
Cheers,
Steve
---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
Re: [SQL] subtract a day from the NOW function
Michael Glaesemann wrote:
> On Jun 7, 2007, at 13:58 , Steve Crawford wrote:
>> Beware in the "or something like that category" that PostgreSQL
>> considers "1 day" to be "24 hours"
>
> Actually, recent versions of PostgreSQL take into account daylight saving
> time in accordance with the current PostgreSQL time zone setting, so
> '1 day' in the context of timestamptz +/- interval may be 23, 24, or
> 25 hours.

Interesting - thanks. That's one more thing I need to check when upgrading my
server.

If my reading is correct, there are some subtle gotchas here. If I go back
and try on a 7.4 machine it appears that interval makes a DST correction if
the interval includes a unit of "month" or greater but does not make a
correction for "week" or "day" intervals. On 8.2 I'm seeing an adjustment if
the interval includes units of "day" or greater (i.e. '1 day' and
'1 day 01:00' get adjusted but '24 hours' and '25 hours' do not).

But PG doesn't follow the same rules in subtracting timestamptz values, so
operations involving timestamps and intervals are (sometimes) not reversible:

select timestamptz '2007-11-05' - timestamptz '2007-11-04';
    ?column?
-----------------
 1 day 01:00:00

select timestamptz '2007-11-04' + interval '1 day 01:00:00';
        ?column?
------------------------
 2007-11-05 01:00:00-08

Cheers,
Steve

---(end of broadcast)---
TIP 6: explain analyze is your friend
[SQL] Increment a sequence by more than one
Hi,

I'm writing an import app in a third party language. It's going to use "copy
to" to move data from STDIN to a postgres (8.2) table. There are some
complexities though: it's going to copy the records to a "versioned" table
first, and then at a later time the records will be copied by a different
procedure to the "live" table. The live table and versioned table are
identical in terms of their field definitions. But there is no sequence
associated with the versioned table (whose primary key is "id" plus
"import_group_id", whereas the live table's pk is just "id"). So all
versioned table entries must already "know" what their id would be in the
live table. (This makes sense for other business processes we have, but it's
a bit of a problem in this instance.)

My problem: I'd like to be able to grab a block of id's from the live table's
pk sequence. So let's say my importer has 5,000 new rows to import and the
current max pk in the live table is 540,203. I'd like to be able to increment
the primary key sequence in such a way that I get a block of ids all to
myself and the sequence is reset to 545,203 with a guarantee that all the
id's between 540,203 and 545,203 are unused.

I'm guessing this can be done with a stored procedure, but if possible I'd
like to be able to run this command from my third party app without calling a
stored procedure (we try not to use stored procedures here b/c we code
entirely in this third party language - if we had to, it's possible we could
install a stored procedure though).

But since I've seen so much magic on display from people on this list, I'm
going to ask if it's possible to do this solely from PG SQL sent from a third
party language? The main tricky bit seems to be ensuring that everything is
locked so two such increment calls at the same time don't yield overlapping
blocks of ids. Is there a way to "lock" the sequence generator for the
duration of a "nextval" and "setval" call? Since pk sequence functions like
nextval cannot be rolled back, I'm guessing that "begin/end" won't do the
job? I.e.:

-- need "magic lock" statement on pk sequence here
nextval        -- returns 540203
setval(545203) -- now sequence is set to where I want it and I "own" 5000 id's
-- release magic lock here

My fallback is to just have a routine that calls "nextval" a bunch of times
and stores all the id's it gets - they may or may not be sequential but
they'll be unique. This is going to be a really slow way to get a large
number of id's of course and just seems plain wrongheaded in many ways.

Any insights? All help is appreciated and input on a better way to solve the
problem completely is of course welcome as well.

Sincerely,
Steve
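One hedged sketch of the fallback approach above, done in a single statement
rather than thousands of round trips (the sequence name is assumed; nextval
is atomic, so the returned ids are guaranteed unique even across concurrent
importers, though not necessarily contiguous):

-- fetch 5000 unique ids from the live table's pk sequence in one query
select nextval('live_table_id_seq') as id
from generate_series(1, 5000);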
