Re: [GENERAL] Asynchronous database replication

2005-09-16 Thread John DeSoi


On Sep 15, 2005, at 9:54 PM, Greg Stark wrote:

If you need data to propagate from the clients back to the server then
things get more complicated. Even then you could sidestep a lot of
headaches if you can structure the application in specific ways, such as
guaranteeing that the clients can only insert, never update records.


And even updates could be OK if the application can support the right
partitioning of the data and only does them in one place at a time. With
some kinds of field-based work it might be suitable to have global
(read-only) data along with data created in the field that is site/client
specific. As long as the data collected in the field is not being updated
on the master, it could continue to be updated in the field and synced
back to the master database.
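One common way to make field-side inserts mergeable without conflicts (purely an illustrative sketch; the table and column names below are invented, not from this thread) is to make the originating site part of every key:

```sql
-- Hypothetical sketch of site-partitioned keys: each clinic/MMU is
-- assigned a site_id, and every locally created row carries it, so rows
-- generated at different sites can never collide when synced back.
CREATE TABLE patient_visit (
    site_id     integer NOT NULL,   -- assigned once per clinic/MMU
    local_id    serial,             -- only unique within one site
    patient_id  integer NOT NULL,
    visit_time  timestamp with time zone DEFAULT now(),
    PRIMARY KEY (site_id, local_id)
);
```

With site-qualified keys, syncing back to the master is a plain append: (site_id, local_id) is globally unique by construction, so no merge logic is needed for inserts.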




John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Asynchronous database replication

2005-09-16 Thread Steve Manes

Greg Stark wrote:

My first reaction to this description was to consider some sort of model where
the master database publishes text dumps of the master database which are
regularly downloaded and loaded on the slaves. The slaves treat those tables
as purely read-only reference tables. 


If you need data to propagate from the clients back to the server then things
get more complicated. Even then you could side step a lot of headaches if you
can structure the application in specific ways, such as guaranteeing that the
clients can only insert, never update records.


It's the latter, I'm afraid.  The master actually won't be modifying or 
inserting any data itself, just publishing it for the client databases 
in its domain.  Almost all inserts/updates/deletes will occur on 
the leaf nodes, i.e. at the remote health clinics and MMUs (mobile 
medical units).  What we need to ensure is that if Patient X visits Site 
A on Monday, his records are there for a followup visit at Site B on 
Tuesday.


Even this has notable problems: for instance, Patient X may visit Site B 
before Site A has had time to replicate its current data back to the 
master and before Site B has pulled those updates.


The requirements scream ASP model except that this system needs to be 
functional for disaster management where it's likely there won't be any 
communications.  At least, that's the constraint I've been given.


This may turn out to be an issue of managing client expectations and 
some additional infrastructure investment (i.e. better satellite 
communications on the MMUs and satellite backup for the fixed clinics).


We're at the very early head-banging stages of this project now so I 
have a fairly optimistic list of requirements to resolve.   This is an 
open source project though so it would be terrific if we could build it 
non-ASP.




[GENERAL] new.id has wrong value in INSERT RULE

2005-09-16 Thread Ian McFarland
Hi,

To explain the problem as clearly as I can I've included the schema, rule,
inserts, output and explanation below.

CREATE TABLE table_a (
    id              serial      PRIMARY KEY,
    name            text,
    active          boolean     DEFAULT 't',
    date_created    timestamp   DEFAULT CURRENT_TIMESTAMP
);


CREATE TABLE table_b (
    id              serial      PRIMARY KEY,
    table_a_id      int         REFERENCES table_a ON DELETE CASCADE,
    yield           int         NOT NULL,
    active          boolean     DEFAULT 't',
    date_created    timestamp   DEFAULT CURRENT_TIMESTAMP
);

-- To debug, I've set 'yield' to the value of new.id rather than the value
-- supplied in the INSERT statement (just to show what is happening with new.id).
CREATE RULE table_b_insert_rule AS ON INSERT TO table_b
    DO (UPDATE table_b SET active = 'f', yield = new.id
        WHERE table_a_id = new.table_a_id AND id != new.id);


INSERT INTO table_a (id, name) VALUES (1, 'test1');

SELECT * FROM table_a;

 id | name  | active |        date_created
----+-------+--------+----------------------------
  1 | test1 | t      | 2005-09-16 13:23:03.620813
(1 row)

INSERT INTO table_b (table_a_id, yield) VALUES (1, '100');

SELECT * FROM table_b;
 id | table_a_id | yield | active |        date_created
----+------------+-------+--------+----------------------------
  1 |          1 |     3 | f      | 2005-09-16 13:23:46.156202
(1 row)

The yield value is set to 3 rather than the expected 1, because the value of
new.id was 3.
This is totally unexpected behaviour; any help getting to the bottom of
it is much appreciated.

It seems like table_b_id_seq is being incremented twice more than expected:
the first time you get 1, which is used as the id,
then it seems to be incremented twice more, and that's where
the value of 3 comes from.
I've checked the increment value of the sequence as shown below and it is 1,
as expected.

SELECT * FROM table_b_id_seq;

 sequence_name  | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
----------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 table_b_id_seq |          3 |            1 | 9223372036854775807 |         1 |           1 |      30 | f         | t
(1 row)


I tried changing the value of the sequence increment to 3 and retested, see
output below.

ALTER SEQUENCE table_b_id_seq INCREMENT 3;

SELECT * FROM table_b_id_seq;

 sequence_name  | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
----------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 table_b_id_seq |          3 |            3 | 9223372036854775807 |         1 |           1 |      30 | f         | t
(1 row)

INSERT INTO table_a (id, name) VALUES (2, 'test2');

SELECT * FROM table_a;
 id | name  | active |        date_created
----+-------+--------+----------------------------
  1 | test1 | t      | 2005-09-16 13:23:03.620813
  2 | test2 | t      | 2005-09-16 13:35:06.244128
(2 rows)

INSERT INTO table_b (table_a_id, yield) VALUES (2, '100');

SELECT * FROM table_b;
 id | table_a_id | yield | active |        date_created
----+------------+-------+--------+----------------------------
  1 |          1 |     3 | f      | 2005-09-16 13:23:46.156202
  6 |          2 |    12 | f      | 2005-09-16 13:35:36.843507
(2 rows)

It is clear to me that the value 6 in the id column is correct, because I've
changed the increment to 3.
However, the value of 12 in the yield column (set by yield = new.id in the
rule) can only be obtained by SELECT nextval('table_b_id_seq') (or the
postgres internal equivalent) being called twice.

I'm using (PostgreSQL) 7.4.5 and can obviously work around this problem
easily, but I want to understand what is going wrong.

thanks in advance

Ian


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?

2005-09-16 Thread Alexander Neumann
Hi,

* Tom Lane <[EMAIL PROTECTED]> wrote:
> Alexander Neumann <[EMAIL PROTECTED]> writes:
> > In short terms: The result code is PGRES_FATAL_ERROR, but the error message
> > is an empty string.
> AFAIK that shouldn't happen, unless perhaps you are completely out of
> memory in the client-side process.  What shows up in the postmaster
> error log when this occurs?

=
2005-09-01 18:19:01 [5462] LOG:  connection received: host=127.0.0.1 port=41446
2005-09-01 18:19:01 [5462] LOG:  connection authorized: user=exim 
database=pfauadmin
2005-09-01 18:19:01 [5462] LOG:  statement: SELECT domain || ' : ' FROM 
exim_virtual_domains
2005-09-01 18:19:01 [5462] LOG:  statement: SELECT sa_enable FROM 
exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR 
mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG:  statement: SELECT sa_tempreject_score FROM 
exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR 
mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG:  unexpected EOF on client connection
=

The first query succeeds, the second fails. Both work and deliver exactly
the expected results when cut and pasted from the logs (either exim or
postgresql)...

Thanks for reading,
- Alexander





Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?

2005-09-16 Thread Alexander Neumann
Hi,

* Martijn van Oosterhout  wrote:
> On Thu, Sep 15, 2005 at 03:14:10PM +0200, Alexander Neumann wrote:
> > I'm using exim4 together with postgresql.  Sometimes, data lookups fail
> > mysteriously and on the exim side nobody was able to help.
> 
> The server log said:
> 2005-09-01 18:19:01 [5462] LOG:  statement: SELECT sa_tempreject_score FROM
> exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR
> mailbox = '*') ORDER BY mailbox DESC LIMIT 1
> 2005-09-01 18:19:01 [5462] LOG:  unexpected EOF on client connection
> 
> It's not clear if the EOF was caused by exim not closing down properly
> or something else. There's nothing special about the domains you've x'd
> out?

Nope. The query works and returns the expected results when cut and pasted
from the logs (either exim or postgresql). I even sniffed the SMTP dialogue
and tried to reproduce it, without any success (it just works when I try it).

So this is either caused by a strange (but reproducible) race condition, or
it somehow depends on the connecting IP address of the other side's server.

Thanks for reading,
- Alexander




Re: [GENERAL] Help trying to write my first plpgsql function...

2005-09-16 Thread hubert depesz lubaczewski
On 9/15/05, Bjørn T Johansen <[EMAIL PROTECTED]> wrote:
> Yes, I did and I found an answer... :)

you did what? read the docs?

> But I am trying to use this function in a report designer and the result
> from the select is in the way of the real data from the fetch... Is there
> a way around this?

and?
as i said: define the function as setof something.
for example (i'm writing without testing, so there might be small mistakes)

create type srf_type_1 as ( field1 int4, field2 text);

create or replace function test () returns setof srf_type_1 as
$BODY$
declare
temprec srf_type_1%ROWTYPE;
begin
for temprec in select id as field1, username as field2 from users where is_active = true loop
return next temprec;
end loop;
return;
end;
$BODY$
language 'plpgsql';

should work as select * from test();

hubert


[GENERAL] help needed for functions

2005-09-16 Thread Nitin Ahire
Hello All,
 
I am new to the postgresql database.
I am transferring a current database from mssql to postgresql 7.4.

I am facing problems with stored procedures. Is there any way I can
transfer my existing stored procedures from mssql to postgresql?

I already read about functions & tried to implement them, but I could not
find any way to get multiple columns from a function.
I would also like help with selecting values from two tables using a
function.

Thanks
Nitin

Re: [GENERAL] help needed for functions

2005-09-16 Thread Dinesh Pandey








Yes, you can use "SETOF" for multiple records.

See examples
---------------------------------------------------------------

CREATE TABLE department(id int primary key, name text);

CREATE TABLE employee(id int primary key, name text, salary int,
    departmentid int references department);

---------------------------------------------------------------

INSERT INTO department values (1, 'Management');
INSERT INTO department values (2, 'IT');

INSERT INTO employee values (1, 'John Smith', 3, 1);
INSERT INTO employee values (2, 'Jane Doe', 5, 1);
INSERT INTO employee values (3, 'Jack Jackson', 6, 2);

---------------------------------------------------------------

CREATE OR REPLACE FUNCTION GetEmployees()
RETURNS SETOF employee
AS
    'select * from employee;'
LANGUAGE 'sql';

---------------------------------------------------------------

CREATE TYPE HOLDER AS (departmentid int, totalsalary int8);

---------------------------------------------------------------

CREATE OR REPLACE FUNCTION SqlDepartmentSalaries()
RETURNS SETOF holder
AS '
    select departmentid, sum(salary) as totalsalary
    from GetEmployees() as a group by departmentid
'
LANGUAGE 'sql';

select * from SqlDepartmentSalaries();

---------------------------------------------------------------

-- We can do the same in PL/pgSQL this way.

CREATE OR REPLACE FUNCTION PLpgSQLDepartmentSalaries()
RETURNS SETOF holder
AS '
DECLARE
    r holder%rowtype;
BEGIN
    FOR r IN select departmentid, sum(salary) as totalsalary
             from GetEmployees() group by departmentid
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END
'
LANGUAGE 'plpgsql';

Thanks
Dinesh Pandey











From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nitin Ahire
Sent: Friday, September 16, 2005 7:21 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] help needed for functions

> Hello All,
>
> I am new to postgresql database.
> I am transferring current database from mssql database to postgresql 7.4
>
> I am facing problems for stored procedures. Is their any way so that I can
> transfer my existing stored procedure from mssql to postgre ?
>
> I already read about functions & tried to implement it but I could not found
> any way to get multiple columns from a function.
> Also I would like to get help about selecting values from two tables using
> function.
>
> Thanks
> Nitin








Re: [GENERAL] help needed for functions

2005-09-16 Thread A. Kretschmer
On 16.09.2005 at 6:51:16 -0700, Nitin Ahire wrote the following:
> I am facing problems for stored procedures. Is their any way so that I
> can transfer my existing stored procedure from mssql to postgre ?

I guess: no.

>  
> I already read about functions & tried to implement it but I could not
> found any way to get multiple columns from a function.  Also I would

http://techdocs.postgresql.org/guides/SetReturningFunctions
http://www.postgresql.org/docs/current/static/xfunc-sql.html#AEN28835

Read these links!


> like to get help about selecting values from two tables using
> function.

Where exactly is the problem?


Regards, Andreas
-- 
Andreas Kretschmer(Kontakt: siehe Header)
Heynitz:  035242/47212,  D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===Schollglas Unternehmensgruppe=== 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] Question about a query plan

2005-09-16 Thread Bill Moseley

PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian
Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html

I'm just starting to look at query plans, and I'm not understanding a
few things.  I don't have that many questions, but I'm including my
examples below, so it's a bit long.


First table is "class" (as in a class taught at a school) and has an
indexed column "class_time" as timestamp(0) with time zone.


First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?


EXPLAIN ANALYZE select id from class where class_time > now();

                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
   Filter: (class_time > now())

EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;

                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on "class"  (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
   Filter: (class_time > (now())::timestamp(0) with time zone)


At first I thought the planner's choice was related to the ratio of rows the
planner was expecting to return to the total number of rows.  But
using < or > uses a seq scan, while "class_time = now()" uses an
Index Scan.


But, if I specify the timestamp then it always uses an Index Scan:

select now()::timestamp(0) with time zone;
          now
------------------------
 2005-09-16 06:44:10-07

EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';

                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Index Scan using class_class_time_index on "class"  (cost=0.00..191.17 rows=50 width=4) (actual time=66.072..66.248 rows=28 loops=1)
   Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)



Ok now on to the second question.  I have two other related tables.
First, I have a table "person" which you can guess what it holds.  And
a link table instructors (a class can have more than one instructor):

 \d instructors
  Table "public.instructors"
 Column |  Type   | Modifiers 
+-+---
 person | integer | not null
 class  | integer | not null
Indexes:
"instructors_pkey" primary key, btree (person, "class")
"instructors_class_index" btree ("class")
"instructors_person_index" btree (person)
Foreign-key constraints:
"$2" FOREIGN KEY ("class") REFERENCES "class"(id)
"$1" FOREIGN KEY (person) REFERENCES person(id)



I want to find out who is teaching classes in the future:


EXPLAIN ANALYZE select person, class from instructors 
where instructors.class in 
(select class.id from class where class_time > now());

                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Hash IN Join  (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
   Hash Cond: ("outer"."class" = "inner".id)
   ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
   ->  Hash  (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
         ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
               Filter: (class_time > now())

Perhaps I'm reading that incorrectly, but the sub-select is returning
28 rows of "class.id".  Then why is it doing a Seq Scan on instructors
instead of an index scan?  If I enumerate all 28 classes I get an
Index Scan.



Finally, not really a question, but my goal is to show a count of classes 
taught by each instructor.  Perhaps there's a better query?

EXPLAIN select person, first_name, count(class) 
from instructors, person
where instructors.class in 
(select id from class where class_time > now() ) 
AND person.id = instructors.person
group by person, first_name;
  QUERY PLAN
  
--

Re: [GENERAL] Asynchronous database replication

2005-09-16 Thread Peter Fein
Steve Manes wrote:
> Greg Stark wrote:
> 
>> My first reaction to this description was to consider some sort of
>> model where
>> the master database publishes text dumps of the master database which are
>> regularly downloaded and loaded on the slaves. The slaves treat those
>> tables
>> as purely read-only reference tables.
>> If you need data to propagate from the clients back to the server then
>> things
>> get more complicated. Even then you could side step a lot of headaches
>> if you
>> can structure the application in specific ways, such as guaranteeing
>> that the
>> clients can only insert, never update records.
> 
> 
> It's the latter, I'm afraid.  The master actually won't be modifying or
> inserting any data itself, just publishing it for the client databases
> in its domain.  Almost all data inserts/updates/deletes will occur on
> the leaf nodes, i.e. at the remote health clinics and MMUs (mobile
> medical units).  What we need to ensure is that if Patient X visits Site
> A on Monday that his records are there for a followup visit at Site B on
> Tuesday.
> 
> Even this has salient problems: for instance, Patient X visits Site B
> before Site A has had time to replicate its current data back to the
> master and Site B has pulled those updates.

What about doing updates in a peer-to-peer style?  Basically, each node
updates any others it comes in contact with (both with its local changes
and anything it's received from the master) and everyone pushes changes
back to the master when they can.  Sort of the way airplanes crossing
the ocean pass radio messages for each other. I'm assuming two things:

1) Communication between local nodes is easier / occurs more frequently than
communicating with the master.  It's easier for an MMU to make a local
call or visit a clinic than to dial the sat phone.

2) Patients travel locally.  Patient X might visit Sites A and B a day
apart, but he's unlikely to visit Site C which is a few countries away
any time soon.  Basically, I don't think you need to update all nodes
with every record immediately.

For some early-morning reason, this made me think of distributed version
control, but I'm not entirely sure how one would use it in this case.
See svk.elixus.org.


-- 
Peter Fein [EMAIL PROTECTED] 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?

2005-09-16 Thread Tom Lane
Alexander Neumann <[EMAIL PROTECTED]> writes:
>> 2005-09-01 18:19:01 [5462] LOG:  unexpected EOF on client connection

> So this is either caused by a strange (but reproduceable) race-condition or
> it somehow depends on the connecting ip address of the other side's server.

The "unexpected EOF" indicates pretty clearly that the problem is on the
client side.

regards, tom lane



Re: [GENERAL] new.id has wrong value in INSERT RULE

2005-09-16 Thread Tom Lane
"Ian McFarland" <[EMAIL PROTECTED]> writes:
> CREATE RULE table_b_insert_rule  AS ON INSERT TO table_b
>   DO (UPDATE table_b SET active = 'f', yield = new.id WHERE table_a_id
> = new.table_a_id AND id != new.id;);

Rules are macros.  Since what is substituted for new.id in this case is
a nextval() call, you have a multiple-evaluation problem.  The above is
a pretty horrid way to do things anyway --- you should probably be using
a trigger instead.
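A trigger-based version of what the rule appears to be attempting might look like this (an untested sketch against the schema posted upthread; the trigger and function names are invented):

```sql
-- In a trigger, NEW.id is the serial value already assigned to the
-- inserted row, so nextval() is not re-evaluated the way the rule's
-- macro expansion re-evaluates it.
CREATE OR REPLACE FUNCTION table_b_deactivate_old() RETURNS trigger AS '
BEGIN
    -- Deactivate older rows for the same table_a_id; the debug
    -- yield = new.id assignment from the rule is no longer needed.
    UPDATE table_b SET active = ''f''
     WHERE table_a_id = NEW.table_a_id AND id != NEW.id;
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER table_b_insert_trigger
    AFTER INSERT ON table_b
    FOR EACH ROW EXECUTE PROCEDURE table_b_deactivate_old();
```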

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Create a pg table from CSV with header rows

2005-09-16 Thread Doug Bloebaum
On 9/15/05, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 15, 2005 at 06:48:59PM -0400, Robert Fitzpatrick wrote:
> > Anyone know a package that can do this? Preferably a Unix/Linux
> > package.
>
> It would be trivial to write a script in a language like Perl to
> read the first line of a file and generate a CREATE TABLE statement
> from it, then issue a COPY command and send the rest of the file.
> Determining the columns' data types would be a different matter:
> if they weren't specified in the header then you'd have to guess
> or perhaps make them all text.

I've attached a sample implementation of Michael's suggestion.




loadCsv2.pl
Description: Binary data

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Question about a query plan

2005-09-16 Thread Peter Eisentraut
Am Freitag, 16. September 2005 16:18 schrieb Bill Moseley:
> First question is why the planner is not using an index scan when I
> use "now()" or CURRENT_TIMESTAMP?
>
>
> EXPLAIN ANALYZE select id from class where class_time > now();
>
>                        QUERY PLAN
> ---------------------------------------------------------------
>  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4)
>  (actual time=213.327..288.407 rows=28 loops=1)
>    Filter: (class_time > now())

The planner thinks your query will return 414 rows, so it thinks the 
sequential scan is faster.  In reality, your query only returns 28 rows, so 
you need to create better statistics, either by running ANALYZE or VACUUM (or 
both) or by tweaking the statistics parameters of the columns.
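If ANALYZE alone doesn't change the estimates, the per-column statistics target can be raised and statistics re-gathered (a sketch using the column names from the thread; the target of 100 is just an illustrative value):

```sql
-- Keep more histogram buckets for class_time, then re-gather statistics
-- so the planner sees the real distribution of future vs. past classes.
ALTER TABLE class ALTER COLUMN class_time SET STATISTICS 100;
ANALYZE class;

-- Re-check the plan afterwards:
EXPLAIN ANALYZE SELECT id FROM class WHERE class_time > now();
```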

> Perhaps I'm reading that incorrectly, but the sub-select is returning
> 28 rows of "class.id".  Then why is it doing a Seq Scan on instructors
> instead of an index scan?  If I innumerate all 28 classes I get an
> Index Scan.

Again, bad statistics.  It thinks that the scan on instructors is going to 
return 1308 rows, which is probably not true.

> Finally, not really a question, but my goal is to show a count of classes
> taught by each in instructor.  Perhaps there's a better query?

You could probably rewrite it as a join, but as you could see, the planner 
rewrote it as a join internally anyway.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [GENERAL] Question about a query plan

2005-09-16 Thread Thomas O'Connell


On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote:


First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?


It also used to be the case (pre-8.0; I couldn't find in the release  
notes whether this was an 8.0 or 8.1 fix) that now() and  
CURRENT_TIMESTAMP were not indexable, I think because of mutability.


For older versions of postgres, it's recommended that you determine  
the time in the client and use constant data in your query.


--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)


Re: [GENERAL] Asynchronous database replication

2005-09-16 Thread Scott Ribe
> The requirements scream ASP model except that this system needs to be
> functional for disaster management where it's likely there won't be any
> communications.  At least, that's the constraint I've been given.

I'm not an expert on this, but just kicking around the idea, the approach I
think I'd look into:

- clients don't access the database directly

- there's a middleware layer and clients make higher-level RPC-type calls
whose semantics more closely match the client functionality

- then those calls can be logged and replicated...
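A minimal sketch of such a replayable call log (all names are hypothetical; this only illustrates the idea, not a finished design):

```sql
-- Each middleware call is appended here with its serialized arguments.
-- A sync job ships unreplicated rows to the master (or to peer sites)
-- whenever a link is available and replays them in order.
CREATE TABLE rpc_log (
    id          serial PRIMARY KEY,
    site_id     integer NOT NULL,
    call_name   text NOT NULL,      -- e.g. 'register_visit'
    call_args   text NOT NULL,      -- serialized call arguments
    logged_at   timestamp with time zone DEFAULT now(),
    replicated  boolean NOT NULL DEFAULT 'f'
);
```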


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Replication

2005-09-16 Thread Scott Ribe
> How about a third choice: you can also use a proven, reliable and tested
>   replication solution that is included in the core system because the
> core system basically provides it anyway.

Sure, but that one is spelled "Sybase", not "MySQL" ;-)


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice





Re: [GENERAL] Question about a query plan

2005-09-16 Thread Bill Moseley
On Fri, Sep 16, 2005 at 04:45:57PM +0200, Peter Eisentraut wrote:
> The planner thinks your query will return 414 rows, so it thinks the 
> sequential scan is faster.  In reality, your query only retuns 28 rows, so 
> you need to create better statistics, either by running ANALYZE or VACUUM (or 
> both) or tweaking the statistics parameters of the columns.

I did wonder about the planner stats, so I had run ANALYZE on the
database with no change.  I just now ran VACUUM (and VACUUM ANALYZE),
and again see no change.  Perhaps my table is just too small for this
test.


-- 
Bill Moseley
[EMAIL PROTECTED]




Re: [GENERAL] Question about a query plan

2005-09-16 Thread Bill Moseley
On Fri, Sep 16, 2005 at 10:02:28AM -0500, Thomas O'Connell wrote:
> 
> On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote:
> 
> >First question is why the planner is not using an index scan when I
> >use "now()" or CURRENT_TIMESTAMP?
> 
> It also used to be the case (pre-8.0; I couldn't find in the release  
> notes whether this was an 8.0 or 8.1 fix) that now() and  
> CURRENT_TIMESTAMP were not indexable, I think because of mutability.
> 
> For older versions of postgres, it's recommended that you determine  
> the time in the client and use constant data in your query.

Interesting.  I have a few VIEWs that include now(), but I guess I
could adjust and pass in the date from the client.

Thanks for the tip.

-- 
Bill Moseley
[EMAIL PROTECTED]




[GENERAL] Neither column can be NULL if the column is part of the combination of primary key columns?

2005-09-16 Thread Emi Lu

Greetings,

A question about a "combination of several columns" as primary key.

CREATE TABLE test (
    col1 timestamp not null default now(),
    col2 timestamp,
    primary key (col1, col2)
);

If I set up the primary key as the *combination* of (col1, col2), col2
will be set to "NOT NULL" automatically, right?

(Although I did not specify a not-null constraint for col2.)

To allow *col2* to be NULL and make the combination (col1, col2)
unique, should I set up

unique (col1, col2) ?

Thanks,
Emi







Re: [GENERAL] Replication

2005-09-16 Thread Russ Brown

Scott Ribe wrote:

>> How about a third choice: you can also use a proven, reliable and tested
>> replication solution that is included in the core system because the
>> core system basiclly provides it anyway.
>
> Sure, but that one is spelled "Sybase", not "MySQL" ;-)




It's amazing how misunderstood my post was.

My third choice was a hypothetical future version of PostgreSQL, 
modified from its current form very slightly to include a form of 
replication 'out of the box': a couple of scripts to enable WAL log 
transfer and also a solution to the problem of WAL log delay mentioned 
by other posters.


I only mentioned MySQL because their 'out of the box' solution involves 
transferring the binlogs, which is similar to the method of transferring 
the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't 
comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't 
suggesting that they have the 'ultimate' solution. I wasn't even 
suggesting that they have a good solution. It just made me think. That's 
all.


Well, I've learned my lesson. Next time I post I'll be sure not to 
mention MySQL in any way, shape or form.


--

Russ



Re: [GENERAL] Neither column can be NULL if the column is part of the combination of primary key columns?

2005-09-16 Thread Jaime Casanova
> 
> To allow *col2* to be NULL and make record combination (col1, col2)
> unique, I should setup
> unique (col1, col2) ?
> 
> Thanks,
> Emi
> 

no, you must create a partial unique index...

create unique index indice_col2 on table(col1, col2) where col2 is not null;
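Note that the index above only covers rows where col2 is present, so any number of (col1, NULL) rows would still be allowed. If at most one NULL per col1 is wanted as well, a second partial index can cover that case (a sketch; the index names are arbitrary):

```sql
-- Uniqueness of (col1, col2) when col2 is present...
CREATE UNIQUE INDEX test_col1_col2_idx ON test (col1, col2)
    WHERE col2 IS NOT NULL;
-- ...and at most one row per col1 when col2 is NULL (which a plain
-- UNIQUE constraint cannot enforce, since NULLs never compare equal).
CREATE UNIQUE INDEX test_col1_null_idx ON test (col1)
    WHERE col2 IS NULL;
```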

-- 
Atentamente,
Jaime Casanova
(DBA: DataBase Aniquilator ;)



[GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Bill Moseley
I have a view to generate a list of instructors and a count of their
future classes.

"instructors" is a link table between "class" and "person".

CREATE VIEW future_instructor_counts
AS
SELECT  person.id AS person_id,
first_name,
last_name,
count(instructors.class) AS class_count

  FROM  class, instructors, person

 WHERE  class.id= instructors.class AND
person.id   = instructors.person
AND class_time > now()

  GROUP BY  person_id, first_name, last_name;


I have two very basic SQL questions:

1) With an aggregate function in the query, is there any way to remove
the "AND class_time > now()" so that the timestamp can be passed in the
select?  That is, I'd like to be able to do this:

select * from future_instructor_counts where class_time > now();

But class_time is not part of the VIEW, so that's not valid.  And if it
were included, then I wouldn't have an aggregate function any more - no
more grouping.
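One workaround is to wrap the view's query in a set-returning SQL function that takes the cutoff time as an argument (a sketch only; the type and function names are invented, and 7.4 requires positional parameters like $1):

```sql
CREATE TYPE instructor_count AS
    (person_id int, first_name text, last_name text, class_count int8);

CREATE OR REPLACE FUNCTION instructor_counts_since(timestamp with time zone)
RETURNS SETOF instructor_count AS '
    SELECT person.id, first_name, last_name, count(instructors.class)
      FROM class, instructors, person
     WHERE class.id  = instructors.class
       AND person.id = instructors.person
       AND class_time > $1
     GROUP BY person.id, first_name, last_name;
' LANGUAGE 'sql';
```

Usage would then be select * from instructor_counts_since(now());, with any timestamp in place of now().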


2) I think I'm missing something obvious.  I know that I need to
specify all my non-aggregate columns in the "GROUP BY", but I don't
understand why.  Really, the results are grouped only by
person.id, so why the need to specify the other columns?

And if you don't specify all the columns then Postgresql reports:

  ERROR:  column "person.id" must appear in the GROUP BY 
clause or be used in an aggregate function

Is there a reason Postgresql doesn't just add the column
automatically?  It does in other cases (like a missing table in a
join).

Thanks


-- 
Bill Moseley
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Create a pg table from CSV with header rows

2005-09-16 Thread David Fetter
On Fri, Sep 16, 2005 at 10:41:59AM -0400, Doug Bloebaum wrote:
> On 9/15/05, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> > 
> > On Thu, Sep 15, 2005 at 06:48:59PM -0400, Robert Fitzpatrick wrote:
> > > Anyone know a package that can do this? Perferrably a Unix/Linux
> > > package.
> > 
> > It would be trivial to write a script in a language like Perl to
> > read the first line of a file and generate a CREATE TABLE statement
> > from it, then issue a COPY command and send the rest of the file.
> > Determining the columns' data types would be a different matter:
> > if they weren't specified in the header then you'd have to guess
> > or perhaps make them all text.
> 
> 
> I've attached a sample implementation of Michael's suggestion.

I've attached what I hope is a slightly improved one :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!
#!/usr/bin/perl -w

use strict;
use DBI;
use Text::CSV_XS;
use File::Basename;

my $csv_file = shift or die "Usage: $0 csvfile\n";
my $fh; # Filehandle.
open $fh, '<', $csv_file # 3-argument open in case of name weirdness.
or die "$0: Can't open $csv_file for read: $!\n";

my ($table_name) = fileparse($csv_file,qr{\..*});
$table_name = quote_odd_name($table_name);
warn "table_name=$table_name\n";

my $dbh = DBI->connect(
  'dbi:Pg:dbname=dbname', 'user', 'pass', # should this be user input?
  { AutoCommit => 0 }
);
my $csv = Text::CSV_XS->new();

# create the table based on names from the header row
# columns will arbitrarily be of type TEXT, because VARCHAR(20)
# is a pretty silly limitation.

my $header = <$fh>;
$csv->parse($header) or die "$0: parse of header row failed\n";
my @columns = $csv->fields();
my $sql = "CREATE TABLE $table_name (\n  "
. join(",\n  ", map {quote_odd_name($_).' TEXT'} @columns)
. "\n)";
warn "\n$sql\n";
$dbh->do($sql);

# add rows of data using COPY
# (note: raw lines are passed straight through, so quoted or escaped
# CSV fields won't be interpreted -- COPY's delimiter parsing is not CSV-aware)
$dbh->do("COPY $table_name FROM STDIN WITH DELIMITER AS ','");
my $row_count = 0;
while(<$fh>) {
  $dbh->pg_putline($_);
  $row_count++;
}
close $fh;
$dbh->pg_endcopy;
$dbh->commit;
$dbh->disconnect;

warn "$0: Added $row_count rows to $table_name\n";

sub quote_odd_name {
my $name = shift;
my $sth = $dbh->prepare_cached('SELECT quote_ident(?)');
$sth->execute($name);
my $row = $sth->fetchrow_hashref;
$sth->finish;
return $row->{quote_ident};
}

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Asychronous database replication

2005-09-16 Thread Greg Stark
John DeSoi <[EMAIL PROTECTED]> writes:

> > If you need data to propagate from the clients back to the server then
> > things get more complicated. Even then you could side step a lot of
> > headaches if you can structure the application in specific ways, such as
> > guaranteeing that the clients can only insert, never update records.
> 
> And even updates could be OK if the application can support the right
> partitioning of the data and only do it one place at a time. With some kinds
> of field based work it might be suitable to have global (read only) data
> along with data created in the field that is site/client specific. As long
> as the data collected in the field is not being updated on the master, it
> could continue to be updated in the field and synced back to the master
> database.

Sure, though then you have to deal with what data to display on the client
end. The most recently downloaded master data or the locally updated data?
What about after you upload your local data when you're not sure whether the
master data has been reconciled? Not impossible but it would be more work.

But I find a surprisingly high fraction of applications are very amenable to
being handled as insert-only. A medical application strikes me as one where
people would be all the more likely to be happy with an insert-only model.

So instead of allowing remote users to modify data directly you only
allow them to "request" an update. Then when they look at the record it still
makes logical sense to see the old data, along with their "requested" updates.

Essentially, any replication system is based on insert-only queues. If you can
design the application around that, you avoid having to implement some sort of
mapping to hide that.
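The "request an update" idea could be sketched as an insert-only table along these lines (all names here are hypothetical):

```sql
-- Hypothetical insert-only queue of requested updates.
-- Field clients only ever INSERT here; the master applies
-- requests when the site syncs back.
CREATE TABLE update_request (
    request_id   serial PRIMARY KEY,
    record_id    integer NOT NULL,           -- row the client wants changed
    column_name  text NOT NULL,              -- field to change
    new_value    text NOT NULL,              -- proposed value
    requested_at timestamp NOT NULL DEFAULT now()
);
```

Viewing a record then means showing the old master data plus any pending rows from this queue.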

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Roger Hand
I have a view to generate a list of instructors and a count of their
> future classes.
> 
> "instructors" is a link table between "class" and "person".
> 
> CREATE VIEW future_instructor_counts
> AS
> SELECT  person.id AS person_id,
> first_name,
> last_name,
> count(instructors.class) AS class_count
> 
>   FROM  class, instructors, person
> 
>  WHERE  class.id= instructors.class AND
> person.id   = instructors.person
> AND class_time > now()
> 
>   GROUP BY  person_id, first_name, last_name;

The trick is to do the data aggregation separately, then JOIN in whatever other 
fields you want.

Something like this:

CREATE VIEW future_instructor_counts
AS
SELECT  * FROM

(SELECT
 person.id AS person_id,
 first_name,
 last_name
 FROM person) personinfo

INNER JOIN

(SELECT
 id AS instructor_id, count(class) AS class_count
 FROM instructors GROUP BY id) classcount
ON personinfo.person_id = classcount.instructor_id

INNER JOIN

(SELECT class.id AS class_id FROM class
 WHERE class_time > now() ) classes
ON classes.class_id = classcount.instructor_id;

In many cases when using aggregate functions you get just the fields you need 
from the agg function (typically an id plus the aggregate result) and JOIN with 
other tables (or even the same table) to get other info such as first_name, 
last_name, etc.

Otherwise, if you GROUP BY additional fields so you can get them in the output, 
you may be making the db do additional work.

> 1) With an aggregate function in the query, is there any way to remove
> the "AND class_time > now()" so that timestamp can be passed in the
> select?  That is, I'd like to be able to do this?
> 
> select * from instructor_counts where class_time > now();
> 
> But class_time is not part of the VIEW so that's not valid.

No problem, just make it a part of the view. See the classes section below.

CREATE VIEW future_instructor_counts
AS
SELECT  * FROM

(SELECT
 person.id AS person_id,
 first_name,
 last_name
 FROM person) personinfo

INNER JOIN

(SELECT
 id AS instructor_id, count(class) AS class_count
 FROM instructors GROUP BY id) classcount
ON personinfo.person_id = classcount.instructor_id

INNER JOIN

-- Add class_time field!
(SELECT class.id AS class_id, class_time FROM class
 WHERE class_time > now() ) classes
ON classes.class_id = classcount.instructor_id;

[Disclaimer: I've not tested this code at all. It could help if you sent table 
definitions and maybe even dummy
data via insert commands.]

>  And if it was included then I don't have an aggregate function any more - no
> more grouping.

If you do the agg function separately like this that isn't an issue. You join 
tables to get whatever fields you'd like to have in your output.
 
> 2) I think I'm missing something obvious.  I know that I need to
> specify all my non-aggregate columns in the "GROUP BY", but I don't
> understand why.  Really, the results are just grouped only by
> person.id so why the need to specify the other columns.
> 
> And if you don't specify all the columns then Postgresql reports:
> 
>   ERROR:  column "person.id" must appear in the GROUP BY 
> clause or be used in an aggregate function
> 
> Is there a reason Postgresql doesn't just add the column
> automatically?  It does in other cases (like a missing table in a
> join).

As I mention above, if you GROUP BY additional fields just to get them in the 
output, you may be making the db do additional work.

I seem to remember that a later SQL standard (i.e., after SQL-99, though I could
be wrong) allows you to specify additional fields in SELECT that are not in the
GROUP BY clause, provided they are functionally dependent on the grouped
columns. But PG isn't there yet.

-Roger

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Bill Moseley
Sent: Friday, September 16, 2005 11:30 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Setting WHERE on a VIEW with aggregate function.


I have a view to generate a list of instructors and a count of their
future classes.

"instructors" is a link table between "class" and "person".

CREATE VIEW future_instructor_counts
AS
SELECT  person.id AS person_id,
first_name,
last_name,
count(instructors.class) AS class_count

  FROM  class, instructors, person

 WHERE  class.id= instructors.class AND
person.id   = instructors.person
AND class_time > now()

  GROUP BY  person_id, first_name, last_name;


I have two very basic SQL questions:

1) With an aggregate function in the query, is there any way to remove
the "AND class_time > now()" so that timestamp can be passed in the
select?  That is, I'd like to be able to do this?

select * from inst

[GENERAL] Restoring just a table or row from a backup copy.

2005-09-16 Thread Fernando Lujan
Is there a way to do that?

Thanks in advance.

Fernando Lujan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Anyone use Eclipse?

2005-09-16 Thread Josh Berkus
People:

The Eclipse project is interested in having PostgreSQL people contribute to 
their Data Services plug in.   Do we have any java hackers in the 
community using Eclipse?  Anyone interested?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Partial dates

2005-09-16 Thread Bruno Wolff III
On Wed, Sep 14, 2005 at 00:09:58 -0400,
  Joe <[EMAIL PROTECTED]> wrote:
> 
> I'm not sure I agree with the need to "fix" or "sanitize" the data.  The 
> columns in question are used mostly for publication dates.  While you may 
> be able to find a full release date for recent books, they are generally 
> listed as published on a given year, period.  Most monthly magazines only 
> have a month-year of publication.  And of course, daily newspapers and 
> Internet articles usually have a full day-month-year.  In fact, the MySQL 
> solution didn't address quarterly or bi-monthly publications as that data 
> was only captured as one of the months in the period--as opposed to Mar/Apr 
> 2005 or First Quarter 2005 (or worse: Winter 2004). As Tom Lane argued, it 
> seems I'll have to bite the bullet and create a new datatype.  The only 
> other alternative I see would be to split the publication date into three 
> columns and that's rather ugly.

It seems like in your case the dates are best expressed as ranges and that
you could store the information you have using an earliest possible date
and a latest possible date. If there are extra constraints based on how
far apart the begin and end dates are you could implement them as well.
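The range idea might look like this (table and column names are invented for illustration):

```sql
-- Hypothetical publication table storing a date as an
-- earliest-possible / latest-possible range.
CREATE TABLE publication (
    id           integer PRIMARY KEY,
    title        text NOT NULL,
    pub_earliest date NOT NULL,   -- e.g. '2005-03-01' for "Mar/Apr 2005"
    pub_latest   date NOT NULL,   -- e.g. '2005-04-30'
    CHECK (pub_earliest <= pub_latest)
);
```

A plain year like "2005" would then be stored as the range 2005-01-01 through 2005-12-31.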

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Restoring just a table or row from a backup copy.

2005-09-16 Thread Scott Marlowe
On Fri, 2005-09-16 at 14:13, Fernando Lujan wrote:
> Is there a way to do that?

pg_restore --help

pg_restore restores a PostgreSQL database from an archive created by
pg_dump.
 
Usage:
  pg_restore [OPTION]... [FILE]

(SNIP)

  -t, --table=NAME restore named table

(SNIP)
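For example (file, table, and database names here are made up):

```
# Restore just the "usr" table from a custom-format dump into database "mydb"
pg_restore -t usr -d mydb backup.dump

# Or emit the table's SQL to stdout and pipe it through psql
pg_restore -t usr backup.dump | psql mydb
```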



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Replication

2005-09-16 Thread Scott Marlowe
On Fri, 2005-09-16 at 12:51, Russ Brown wrote:
> Scott Ribe wrote:
> >>How about a third choice: you can also use a proven, reliable and tested
> >>  replication solution that is included in the core system because the
> >>core system basically provides it anyway.
> > 
> > 
> > Sure, but that one is spelled "Sybase", not "MySQL" ;-)
> > 
> > 
> 
> It's amazing how misunderstood my post was.
> 
> My third choice was a hypothetical future version of PostgreSQL, 
> modified from its current form very slightly to include a form of 
> replication 'out of the box': a couple of scripts to enable WAL log 
> transfer and also a solution to the problem of WAL log delay mentioned 
> by other posters.
> 
> I only mentioned MySQL because their 'out of the box' solution involves 
> transferring the binlogs, which is similar to the method of transferring 
> the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't 
> comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't 
> suggesting that they have the 'ultimate' solution. I wasn't even 
> suggesting that they have a good solution. It just made me think. That's 
> all.
> 
> Well, I've learned my lesson. Next time I post I'll be sure not to 
> mention MySQL in any way, shape or form.

Actually, I would just suggest to not hold it up as an example of how
things should be done.  That would work for me.

There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of energy
demonizing PostgreSQL to make themselves look better.  There were pages
of misinformation in their documentation about how PostgreSQL was
basically crap, and MySQL did everything right, and a lot of people
spent a lot of time debunking that.

MySQL AB now plays better with others, and hasn't engaged in the kind of
character assassination they once did, but there's STILL a sore spot for
most PostgreSQL users and developers there, because they used to have to
spend a lot of energy and time explaining that what was on the MySQL
site was lies and misinformation.  A LOT of time.  And it did hurt
PostgreSQL, in terms of keeping people away from it.

So, there's an almost automatic response triggered by someone mentioning
how MySQL does things, especially if they're perceived to be holding
MySQL up as an example to the PostgreSQL community on how things should
be done.

In my original post, my main point wasn't just against MySQL, it was
against the philosophy that replication being included in the core of
a database means it's reliable or well tested.  And MySQL is a fine
example of that.  Their replication really does have a lot of issues.

So, feel free to mention MySQL, but know that mostly when it's mentioned
here, it's mentioned as an example of how things shouldn't be done.  In
terms of coding, marketing, testing, or licensing.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Duplicate Values or Not?!

2005-09-16 Thread John Seberg
I recently tried to CREATE a UNIQUE INDEX and could
not, due to duplicate values:

CREATE UNIQUE INDEX usr_login ON usr (login);

To try to find the offending row(s), I then executed
the following:

SELECT count(*), login FROM usr GROUP BY login ORDER
BY 1 DESC;

The GROUP BY didn't group anything, indicating to me
that there were no duplicate values. There were the
same number of rows in this query as a simple SELECT
count(*) FROM usr.

This tells me that Postgresql is not using the same
method for determining duplicates when GROUPING and
INDEXing.

I dig a little deeper. The best candidate I find for a
possible duplicate are caused by characters that did
not translate well. IIRC, the basis was the name Pena,
which looked like Pe?a. I'm thinking the original data
was not encoded properly, or my export didn't handle
encodings properly, etc. The two Penas used different
characters in the 3rd position, neither of which were
translated correctly.

I loaded up data from another database vendor (4th
Dimension), into a 8.0.3 Postgresql, which I had
compiled from source with the default configuration.
This was on Yellow Dog Linux 4.0.1.

I brought the same data into an 8.0.1 on Mac OS X
(binary from entropy.ch) and did NOT have this UNIQUE
INDEX failure. 

I'm sure my problems are deeper than the INDEX
failure, involving the accuracy of the conversion,
but, short term, I would like to know what is
different? They both are SQL_ASCII databases. I tried
importing into a UNICODE database, but that really was a
mess of errors (during COPY).

I realize I need to learn about encodings, my source
data, etc., but I'm looking for hints. Anybody
experienced in exporting 4th Dimension data containing
a certain amount of foreign language text?

Thanks,




__ 
Yahoo! Mail - PC Magazine Editors' Choice 2005 
http://mail.yahoo.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Marc Munro
Yesterday a client application broke immediately after we issued a
pg_ctl reload command.  The only change we had made was to pg_hba.conf
to enable trusted connections from localhost.

My question is, how should the client application be affected by such a
reload?

My impression was that the client should be totally unaware of a reload,
but reality does not bear this out.

Any ideas/informed responses will be welcomed.  I suspect that this has
uncovered a bug in our client, but without knowing what the client
experience should be, it's hard to narrow down where the bug may lie.

Thanks.

__
Marc Munro


signature.asc
Description: This is a digitally signed message part


Re: [GENERAL] Replication

2005-09-16 Thread Russ Brown

Scott Marlowe wrote:

On Fri, 2005-09-16 at 12:51, Russ Brown wrote:


Scott Ribe wrote:


How about a third choice: you can also use a proven, reliable and tested
replication solution that is included in the core system because the
core system basically provides it anyway.



Sure, but that one is spelled "Sybase", not "MySQL" ;-)




It's amazing how misunderstood my post was.

My third choice was a hypothetical future version of PostgreSQL, 
modified from its current form very slightly to include a form of 
replication 'out of the box': a couple of scripts to enable WAL log 
transfer and also a solution to the problem of WAL log delay mentioned 
by other posters.


I only mentioned MySQL because their 'out of the box' solution involves 
transferring the binlogs, which is similar to the method of transferring 
the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't 
comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't 
suggesting that they have the 'ultimate' solution. I wasn't even 
suggesting that they have a good solution. It just made me think. That's 
all.


Well, I've learned my lesson. Next time I post I'll be sure not to 
mention MySQL in any way, shape or form.



Actually, I would just suggest to not hold it up as an example of how
things should be done.  That would work for me.



I didn't!!


There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of energy
demonizing PostgreSQL to make themselves look better.  There were pages
of misinformation in their documentation about how PostgreSQL was
basically crap, and MySQL did everything right, and a lot of people
spent a lot of time debunking that.



I remember that time, and I remember being very annoyed about it. I am 
still frustrated now by people who will believe the FUD that was spread 
at the time and won't even consider PostgreSQL as a result. That is 
basically why the company I work for uses MySQL, and simply will not 
consider changing, no matter how hard I try to make it happen.



MySQL AB now plays better with others, and hasn't engaged in the kind of
character assassination they once did, but there's STILL a sore spot for
most PostgreSQL users and developers there, because they used to have to
spend a lot of energy and time explaining that what was on the MySQL
site was lies and misinformation.  A LOT of time.  And it did hurt
PostgreSQL, in terms of keeping people away from it.



Indeed. As I say above, that's why my company is staying away from it, 
despite my best efforts.



So, there's an almost automatic response triggered by someone mentioning
how MySQL does things, especially if they're perceived to be holding
MySQL up as an example to the PostgreSQL community on how things should
be done.



I've noticed that. I've been reading (and occasionally posting to) this 
list for a few years now, and it's the one and only thing about it that 
bugs me. Other lists bug me in a lot of ways due to attitudes of some of 
the major contributors, but I've always enjoyed this list greatly 
primarily *because* of the major contributors (yourself included). It's 
educational, friendly and very helpful. I learn a lot from this list and 
enjoy the discussion.


This one thing bugs me because I'm not generally an emotionally reactive 
person: I prefer to consider things fairly before responding, which is 
why I frequently don't respond at all to things if I don't think it 
would help matters. I feel that it is an extremely unwise policy to 
automatically assume that what your competitors are doing is worse than 
what you are doing, and that there's nothing you can learn from them. 
That's how you get left behind. That's not to say that there *is* a 
great deal that PostgreSQL can learn from MySQL, but one should not 
assume that there is nothing.


From my readings on this list the majority of examples of people using 
MySQL as an example of how PostgreSQL should be doing things have been 
misguided/wrong/trolling etc. However, from time to time a legitimate 
example is raised, and in those situations the responses given have 
sometimes been disappointing.



In my original post, my main point wasn't just against MySQL, it was
against the philosophy that replication being included in the core of
a database means it's reliable or well tested.  And MySQL is a fine
example of that.  Their replication really does have a lot of issues.



Indeed. But just to stress the point, I wasn't stating that the included 
replication in MySQL was any good (though it's not terrible as we're 
using it heavily in an extremely high-volume situation with few 
problems), I was just bringing up the idea of getting a decent 
replication solution included in PostgreSQL for relatively little effort.



So, feel free to mention MySQL, but know that mostly when it's mentioned
here, it's mentioned as an example of how things shouldn't be done.  In
terms of coding, marketing, testing, or l

Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Michael Fuhr
On Fri, Sep 16, 2005 at 01:28:13PM -0700, Marc Munro wrote:
> Yesterday a client application broke immediately after we issued a
> pg_ctl reload command.

How did the client break?  What behavior did it exhibit?  Were there
any errors in the server's logs?  Can you duplicate the problem?
What version of PostgreSQL are you using, and on what platform?

-- 
Michael Fuhr

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Replication

2005-09-16 Thread Matthew Terenzio


On Sep 16, 2005, at 4:30 PM, Russ Brown wrote:


just because replication is  included and
part of the core of a database, it doesn't mean that it's reliable or
well tested.


I just jumped into this thread, and this statement just triggers one 
thought.


Many Postgres end users rely on experts like those on this list for the 
best advice on how to use Postgres and what might be the best solution 
for non-core needs such as replication.


Inclusion does probably make those users feel comfort that the experts 
for that particular software or product deems it one of , if not the 
best solution for the given problem.


It's not always true and may not be right for this situation, since 
replication needs for different situations may vary widely.


But inclusion does imply some type of endorsement.

For instance I feel pretty comfortable with TSeach2 though I don't know 
much about. That fact that it comes in the contribs is an endorsement.
I'd be confused if the consensus was that there is a better, as easy to 
use and license compatible full text index available and was for some 
reason NOT included.


Of course, I'd still ask the list. . .


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Bill Moseley
On Fri, Sep 16, 2005 at 12:06:19PM -0700, Roger Hand wrote:
> > select * from instructor_counts where class_time > now();
> > 
> > But class_time is not part of the VIEW so that's not valid.
> 
> No problem, just make it a part of the view. See the classes section below.
> 
> CREATE VIEW future_instructor_counts
> AS
> SELECT  * FROM 
> 
>   (SELECT 
>person.id AS person_id,
> first_name,
> last_name) personinfo
> 
>   INNER JOIN 
> 
>   -- Add class_time field!
>   (SELECT class.id, class_time FROM class
>   WHERE class_time > now() ) classes
> 
>   INNER JOIN
> 
>   (SELECT 
>id, count(class) AS class_count 
>   FROM instructors GROUP BY id) classcount
> 
>   ON personinfo.person_id = instructors.id
>   AND classes.id = instructors.id

I couldn't get that to work -- Postgresql isn't that helpful just
reporting "ERROR:  syntax error at or near ";" at character 496" even
after adding a FROM in the first select.  So, I'm stabbing in the dark
to get it to work.

> [Disclaimer: I've not tested this code at all. It could help if you sent 
> table definitions and maybe even dummy
> data via insert commands.]

Ok -- this should be cut-n-paste:

CREATE TABLE class (
id  integer PRIMARY KEY,
class_time  timestamp(0) with time zone,
name        text
);

CREATE TABLE person (
id  integer PRIMARY KEY,
first_name  text
);

create table instructors (
person  integer NOT NULL REFERENCES person,
class   integer NOT NULL REFERENCES class,
PRIMARY KEY  (person, class)
);

INSERT INTO person (id,first_name) values (1,'Joe');
INSERT INTO person (id,first_name) values (2,'Mary');
INSERT INTO person (id,first_name) values (3,'Bob');
INSERT INTO person (id,first_name) values (4,'Cindy');

INSERT INTO class (id,name, class_time) values (1,'Math', now());
INSERT INTO class (id,name, class_time) values (2,'Math', now() + interval '1 
day');
INSERT INTO class (id,name, class_time) values (3,'Science', now());
INSERT INTO class (id,name, class_time) values (4,'PE', now() + interval '1 
day');

INSERT INTO instructors (person, class) values (1,1);  -- joe teaches math now

INSERT INTO instructors (person, class) values (1,2);  -- joe teaches math 
tomorrow
INSERT INTO instructors (person, class) values (2,2);  --   with Mary

INSERT INTO instructors (person, class) values (3,3);  -- Bob teaches science 
now
INSERT INTO instructors (person, class) values (4,3);  -- Cindy teaches science 
tomorrow

-- view

CREATE VIEW instructor_counts
AS
SELECT  person.id AS person_id,
first_name,
count(instructors.class) AS class_count

  FROM  class, instructors, person

 WHERE  class.id= instructors.class AND
person.id   = instructors.person
-- AND class_time > now()

  GROUP BY  person_id, first_name; 


select * from instructor_counts order by class_count desc;

-- Returns:

 person_id | first_name | class_count 
---++-
 1 | Joe|   2
 2 | Mary   |   1
 3 | Bob|   1
 4 | Cindy  |   1
(4 rows)

My GOAL above is to be able to add a WHERE class_time > $some_time.


Here's where I left off, which I never could get to work.
The individual selects work, but it seems like I need to say
c.class_id = i.class in addition.  But I can't even get
this to parse without syntax errors:

CREATE VIEW instructor_counts
AS
SELECT  * 

FROM

(SELECT person.id AS person_id, first_name
 FROM person) p

INNER JOIN

(SELECT class.id AS class_id, class_time
 FROM class) c

INNER JOIN

(SELECT person, count(class) AS class_count
 FROM instructors GROUP BY person) i

ON ( p.person_id = i.person);


That also looks like the selects are going to be full table scans.
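For reference, an untested sketch of the aggregate-then-join version that does parse against the schema above (the aliases are made up, and the future-only restriction happens inside the aggregate subquery rather than as a filterable class_time column):

```sql
-- Aggregate first, then join the per-person count back to person.
CREATE VIEW instructor_counts AS
SELECT p.id AS person_id,
       p.first_name,
       i.class_count
FROM person p
INNER JOIN
    (SELECT instructors.person,
            count(instructors.class) AS class_count
     FROM instructors
     INNER JOIN class ON class.id = instructors.class
     WHERE class.class_time > now()
     GROUP BY instructors.person) i
ON p.id = i.person;
```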




-- 
Bill Moseley
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Marc Munro
Michael,
It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The
breakage was that one group of connections simply stopped.  Others
continued without problem.  It is not clear exactly what was going on.

Nothing in our application logs gives us any clue to this.

As for reproducibility, it has happened before in test environments when
we have bounced the database.  This is not too shocking, as I would
expect the client to notice this :-)  It is a little more shocking when
it's a reload.  Or maybe I have simply misunderstood what reload does.

I am simply looking for clues here and don't expect definitive answers.
That's why I was a little vague.

Am I right, though, in thinking that a reload should be pretty much
invisible to the client, or will certain operations fail and require a
retry?

__
Marc

On Fri, 2005-09-16 at 14:40 -0600, Michael Fuhr wrote:
> On Fri, Sep 16, 2005 at 01:28:13PM -0700, Marc Munro wrote:
> > Yesterday a client application broke immediately after we issued a
> > pg_ctl reload command.
> 
> How did the client break?  What behavior did it exhibit?  Were there
> any errors in the server's logs?  Can you duplicate the problem?
> What version of PostgreSQL are you using, and on what platform?
> 


signature.asc
Description: This is a digitally signed message part


Re: [GENERAL] Question about a query plan

2005-09-16 Thread Martijn van Oosterhout
On Fri, Sep 16, 2005 at 10:18:19AM -0700, Bill Moseley wrote:
> On Fri, Sep 16, 2005 at 10:02:28AM -0500, Thomas O'Connell wrote:
> > For older versions of postgres, it's recommended that you determine  
> > the time in the client and use constant data in your query.
> 
> Interesting.  I have a few VIEWs that include now(), but I guess I
> could adjust and pass in the date from the client.

I sometimes use the constant 'now' instead, since it obviously isn't
mutable. It's a bit tricky because sometimes it expands when you define
the statement. You have to do 'now'::text usually and even then you
should check...
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


pgpS4f2swTZWj.pgp
Description: PGP signature


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Michael Fuhr
On Fri, Sep 16, 2005 at 02:16:29PM -0700, Marc Munro wrote:
> It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The
> breakage was that one group of connections simply stopped.  Others
> continued without problem.  It is not clear exactly what was going on.

How did the connections "stop"?  Were the connections broken, causing
queries to fail?  Or did queries block and never return?  Or something
else?  What was happening that shouldn't happen, or what wasn't
happening that should happen?

If the connections were still active but not returning, did you do
a process trace on the connection's postmaster or attach a debugger
to it to see what it was doing?

Could the timing of the problem have been coincidence?  Have you
ever seen the problem without a reload?  How often do you see the
problem after a reload?  Do you know for certain that the application
was working immediately before the reload and not working immediately
after it?

What operating system are you using?

> Nothing in our application logs gives us any clue to this.

What about the postmaster logs?

> As for reproducibility, it has happened before in test environments when
> we have bounced the database.  This is not too shocking as I would
> expect the client to notice this :-)  It is a little more shocking when
> it's a reload.  Or maybe I have simply misunderstood what reload does.

Can you reproduce the problem with a reload?  A stop and start will
terminate client connections, but a reload shouldn't.

-- 
Michael Fuhr

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Marc Munro
Michael,
Many thanks for your response; it is much appreciated.  My responses are
embedded below:

On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
> On Fri, Sep 16, 2005 at 02:16:29PM -0700, Marc Munro wrote:
> > It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The
> > breakage was that one group of connections simply stopped.  Others
> > continued without problem.  It is not clear exactly what was going on.
> 
> How did the connections "stop"?  Were the connections broken, causing
> queries to fail?  Or did queries block and never return?  Or something
> else?  What was happening that shouldn't happen, or what wasn't
> happening that should happen?

From the server side, there were simply connections (1 or 2) that
appeared idle.  From the client side it looked like a query had been
initiated but the client thread was stuck in a library call (as near as
we can tell).  This, vague though it is, is as much as I know right now.
We were unable to do much debugging as it is a production system and the
priority was to get it back up.

> If the connections were still active but not returning, did you do
> a process trace on the connection's postmaster or attach a debugger
> to it to see what it was doing?

No, time pressure prevented this.

> Could the timing of the problem have been coincidence?  Have you
> ever seen the problem without a reload?  How often do you see the
> problem after a reload?  Do you know for certain that the application
> was working immediately before the reload and not working immediately
> after it?

It *could* be coincidence, but the problem began within 5 seconds of the
reload.  Coincidence is unlikely.

> What operating system are you using?

Linux 2.4.20 smp i686

> 
> > Nothing in our application logs gives us any clue to this.
> 
> What about the postmaster logs?

Ah, now there's another story.  Unavailable I'm afraid.  Resolving that
is also on my priority list.

> > As for reproducibility, it has happened before in test environments when
> > we have bounced the database.  This is not too shocking as I would
> > expect the client to notice this :-)  It is a little more shocking when
> > it's a reload.  Or maybe I have simply misunderstood what reload does.
> 
> Can you reproduce the problem with a reload?  A stop and start will
> terminate client connections, but a reload shouldn't.

This is not currently seen as a priority (the work-around of "don't do
that" is seen as sufficient).  I'm simply hoping to get someone to say
for sure that the client app should not be able to tell that a reload
has happened.  At that point I may be able to raise the priority of this
issue.  

I would certainly like to do more investigation.  If postgresql hackers
are interested in this strange event (please tell me for sure that it
*is* strange) that may also help me to get the necessary resources to
run more tests.

Thanks again.

__
Marc Munro






Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Roger Hand
>> > I have a view to generate a list of instructors and a count of their
>> > future classes.

>> > select * from instructor_counts where class_time > now();
>> > 
>> > But class_time is not part of the VIEW so that's not valid.
>> 
>> No problem, just make it a part of the view. 
 
"No problem", I said ... famous last words.

> select * from instructor_counts order by class_count desc;
> 
> -- Returns:
> 
>  person_id | first_name | class_count 
> -----------+------------+-------------
>          1 | Joe        |           2
>          2 | Mary       |           1
>          3 | Bob        |           1
>          4 | Cindy      |           1
> (4 rows)
> 
> My GOAL above is to be able to add a WHERE class_time > $some_time.

Armed with your table DDL and sample data I can see the problem more clearly. 

Unfortunately class_time cannot be a column in the view output.  For example, 
look at the "Joe" line above ... if he teaches two classes which "class_time" 
would it show? 

Since class_time can't be a column in the view output it can't be used in a 
WHERE clause.

So it would appear to me that you won't be able to meet your goal by simply using 
a view.

However, there are other solutions you can use which may or may not be 
appropriate. I can think of three at the moment.

#1: Function Solution:

To use PL/pgSQL functions you may first need to run this at the command line:

createlang plpgsql electric


Then create a function that you can pass in a date to:

CREATE FUNCTION getclasscount("timestamp") RETURNS refcursor AS '
DECLARE
curs refcursor;
BEGIN
OPEN curs FOR
SELECT * FROM
  (SELECT person, count(class) AS class_count
 FROM instructors
INNER JOIN class ON class.id = instructors.class
WHERE class.class_time > $1
GROUP BY person
  ) classcount

  INNER JOIN

  (SELECT person.id AS person_id, first_name
 FROM person
  ) personinfo

  ON personinfo.person_id = classcount.person;
RETURN curs;
END;
' LANGUAGE 'plpgsql';

This would be the best solution if you are in control of the application source 
code. In Java, for example, it's relatively simple to call this function and 
return the result as a result set. If you're working in Java I'd be glad to 
show you some sample code.

If you really do need a view for some reason, then this wouldn't work.

#2: Simplify the Select Criteria Solution:
==========================================

A slightly less flexible approach, but one that may be workable, would be to 
add two boolean columns (with default value of '0') to the class table: 
"completed" and "current". Then once a semester you run a simple query that 
updates them. Something like:

UPDATE class SET current = '1' where class_time = '2005-09-01';
UPDATE class SET completed = '1' where class_time < '2005-09-01';

Then the view would be:

CREATE VIEW vclasscount AS
SELECT * FROM
  (SELECT person, count(class) AS class_count
 FROM instructors
INNER JOIN
  (SELECT id FROM class
WHERE class.completed = '0'
  AND class.current = '0'
  ) futureclasses
   ON futureclasses.id = instructors.class
GROUP BY person
  ) classcount

  INNER JOIN

  (SELECT person.id AS person_id, first_name
 FROM person
  ) personinfo

  ON personinfo.person_id = classcount.person;

This would be a nice simple solution, but may not be flexible enough in that 
you can't specify an ad-hoc date or date-range ... you would only see future 
class count.

#3: Use a Temp Table
====================
Again, if you have control of application logic, you could:

1) SELECT * INTO futureclasses FROM class where class_time > ?

2) Then make the view against futureclasses rather than classes.
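
Sketched out (the date literal here stands in for an application-supplied parameter; note that a temp table, and any view built on it, lives only for the current session):

```sql
-- 1) Materialize only the future classes into a temp table:
CREATE TEMP TABLE futureclasses AS
SELECT * FROM class WHERE class_time > '2005-09-16 00:00:00';

-- 2) Build the view against futureclasses instead of class.
--    (A view on a temp table is itself temporary.)
```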

Good luck!

-Roger

> Bill Moseley


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Divide a float4 by 1 - what is going on???????

2005-09-16 Thread Daniel Schuchardt

Hi people,

can anyone explain what is going on here : ?

CIMSOFT=# CREATE TEMP TABLE test (n1 FLOAT4);
CREATE TABLE
CIMSOFT=# INSERT INTO test (n1) VALUES (2.456677);
INSERT 6571521 1
CIMSOFT=# SELECT * FROM test;
   n1    
---------
 2.45668
(1 row)

CIMSOFT=# SELECT n1/1 FROM test;
     ?column?     
------------------
 2.45667695999146
(1 row)

Why do I get so many digits from a division by one? Anybody should have 
learned that everything / 1 = everything ;-)


Thnx,
Daniel

PS :

PostgreSQL 8.0.0 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 
(mingw


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Michael Fuhr
On Fri, Sep 16, 2005 at 04:34:46PM -0700, Marc Munro wrote:
> On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
> > Can you reproduce the problem with a reload?  A stop and start will
> > terminate client connections, but a reload shouldn't.
> 
> This is not currently seen as a priority (the work-around of "don't do
> that" is seen as sufficient).  I'm simply hoping to get someone to say
> for sure that the client app should not be able to tell that a reload
> has happened.  At that point I may be able to raise the priority of this
> issue.  

As far as I know clients shouldn't notice a reload (which is effected
via a SIGHUP); I just did some tests and didn't see any problems.
However, I don't know much about the inner workings of PostgreSQL
so I can't say for sure.  Maybe one of the developers will comment.

-- 
Michael Fuhr

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Divide a float4 by 1 - what is going on???????

2005-09-16 Thread Dann Corbit
Float provides 6-7 digits of precision.
I see nothing surprising down below.

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:pgsql-general-
> [EMAIL PROTECTED] On Behalf Of Daniel Schuchardt
> Sent: Friday, September 16, 2005 5:13 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Divide a float4 by 1 - what is going on???
> 
> Hi people,
> 
> can anyone explain what is going on here : ?
> 
> CIMSOFT=# CREATE TEMP TABLE test (n1 FLOAT4);
> CREATE TABLE
> CIMSOFT=# INSERT INTO test (n1) VALUES (2.456677);
> INSERT 6571521 1
> CIMSOFT=# SELECT * FROM test;
>    n1    
> ---------
>  2.45668
> (1 row)
> 
> CIMSOFT=# SELECT n1/1 FROM test;
>      ?column?     
> ------------------
>  2.45667695999146
> (1 row)
> 
> Why I get so many digits by a division with one? Anybody should have
> learned that everything / 1 = everything ;-)
> 
> Thnx,
> Daniel
> 
> PS :
> 
> PostgreSQL 8.0.0 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.2
> (mingw
> 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Divide a float4 by 1 - what is going on???????

2005-09-16 Thread Michael Fuhr
On Sat, Sep 17, 2005 at 02:12:45AM +0200, Daniel Schuchardt wrote:
> CIMSOFT=# SELECT n1/1 FROM test;
>      ?column?     
> ------------------
>  2.45667695999146
> (1 row)
> 
> Why I get so many digits by a division with one? Anybody should have 
> learned that everything / 1 = everything ;-)

Looks like the division is being done in double precision (float8)
and you're seeing the effects of an inexact representation.

test=> SELECT 2.456677::real / 1;
 ?column? 
------------------
 2.45667695999146
(1 row)

test=> SELECT 2.456677::real / 1::real;
 ?column? 
----------
  2.45668
(1 row)
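
The promotion can also be seen without any arithmetic: widening the stored
float4 up to float8 exposes the same extra digits (the exact digits may vary
slightly by platform):

```sql
-- 2.456677 has no exact 32-bit binary representation; the float4
-- value is the nearest approximation, and casting it to float8
-- merely reveals the digits that were there all along:
SELECT 2.456677::real::double precision;
```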

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Bill Moseley
On Fri, Sep 16, 2005 at 04:56:25PM -0700, Roger Hand wrote:
> So it would appear to me that you won't be able to meet your goal by simply 
> using a view.
> 
> However, there are other solutions you can use which may or may not be 
> appropriate. I can think of three at the moment.
> 
> #1: Function Solution:

I'll take a look at this.  It's about time I started to learn about
functions a bit.

> This would be the best solution if you are in control of the
> application source code. In Java, for example, it's relatively
> simple to call this function and return the result as a result set.
> If you're working in Java I'd be glad to show you some sample code.

I'm using Perl and DBI (really Class::DBI but DBI is not far away).

> #2: Simplify the Select Criteria Solution:
> =
> 
> A slightly less flexible approach, but one that may be workable, would be to 
> add two boolean columns (with default value of '0') to the class table: 
> "completed" and "current". Then once a semester you run a simple query that 
> updates them. Something like:
> 
> UPDATE class SET current = '1' where class_time = '2005-09-01';
> UPDATE class SET completed = '1' where class_time < '2005-09-01';

Classes start daily (and at different hours).  I could cron once an
hour I suppose, but I'd rather not de-normalize the data.

Maybe I can just create three views (future, recent, old) and live
with that.

The temporary table is another possibility I'll look into.

Thank you very much for spending time on this.  I really appreciate
it.


-- 
Bill Moseley
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread brew

Marc

> Yesterday a client application broke immediately after we issued a
> pg_ctl reload command.  The only change we had made was to pg_hba.conf
> to enable trusted connections from localhost.

Can you change pg_hba.conf back to what it had been prior and do a reload
again and check if the clients start working?

I've gotten confused and shot myself in the foot when setting pg_hba.conf
a few times, myself.

brew

 ==
  Strange Brew   ([EMAIL PROTECTED])
  Check out my Stock Option Covered Call website  http://www.callpix.com
 and my Musician's Online Database Exchange http://www.TheMode.com
 ==


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Question about a query plan

2005-09-16 Thread Tom Lane
"Thomas O'Connell" <[EMAIL PROTECTED]> writes:
> On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote:
>> First question is why the planner is not using an index scan when I
>> use "now()" or CURRENT_TIMESTAMP?

> It also used to be the case (pre-8.0; I couldn't find in the release  
> notes whether this was an 8.0 or 8.1 fix) that now() and  
> CURRENT_TIMESTAMP were not indexable, I think because of mutability.

8.0 fix.  It's not that they're "not indexable" per se, it's that
pre-8.0 planners punted when it came to estimating what fraction of rows
would meet a condition like "timestampcol > now()" --- and the default
estimate for such things doesn't favor an indexscan.  The 8.0 change is
to go ahead and run the function and see what value it's returning now
(pardon the pun) and then compare that to the column's statistical
histogram to derive a rows estimate.  It's entirely likely that we'll
get ragged on eventually because of cases where this procedure generates
bad estimates ... but for the moment it seems like a win.
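
One way to watch this in action (against a hypothetical table; the plans you
get depend entirely on your data distribution, so this is only a sketch):

```sql
-- On 8.0+ both forms can consult the column histogram; pre-8.0 the
-- now() form fell back to a default selectivity estimate and often
-- chose a sequential scan over the index.
CREATE TABLE event (ts timestamp);
CREATE INDEX event_ts_idx ON event (ts);
ANALYZE event;

EXPLAIN SELECT * FROM event WHERE ts > now();
EXPLAIN SELECT * FROM event WHERE ts > '2005-09-16 00:00:00';
```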

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.

2005-09-16 Thread Tom Lane
Bill Moseley <[EMAIL PROTECTED]> writes:
> I couldn't get that to work -- Postgresql isn't that helpful just
> reporting "ERROR:  syntax error at or near ";" at character 496" even
> after adding a FROM in the first select.  So, I'm stabbing in the dark
> to get it to work.

[ not directly on topic, but... ]  I think the answer to that is "use a
newer version of Postgres".  8.0 and up provide fairly specific pointers
for syntax errors.  Silly example:

regression=# select foo, bar
regression-# , baz,
regression-# zip zap zot
regression-# blah blah;
ERROR:  syntax error at or near "zap" at character 28
LINE 3: zip zap zot
^
regression=# 

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] pg_ctl reload breaks our client

2005-09-16 Thread Tom Lane
Michael Fuhr <[EMAIL PROTECTED]> writes:
> On Fri, Sep 16, 2005 at 04:34:46PM -0700, Marc Munro wrote:
>> On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
>> This is not currently seen as a priority (the work-around of "don't do
>> that" is seen as sufficient).  I'm simply hoping to get someone to say
>> for sure that the client app should not be able to tell that a reload
>> has happened.  At that point I may be able to raise the priority of this
>> issue.  

> As far as I know clients shouldn't notice a reload (which is effected
> via a SIGHUP); I just did some tests and didn't see any problems.

Existing client connections should not be able to notice a reload that
changes pg_hba.conf or pg_ident.conf; however they definitely *should*
notice a reload that changes postgresql.conf (at least for parameters
that aren't overridden by other cases, such as a SET in the current
session).  So the blanket statement Marc is making is simply wrong.

Whether there is a bug here is impossible to say given the limited
amount of information provided.  I'd not expect a reload to cause
an existing connection to become totally dysfunctional, which is
what Marc seems to be claiming ... but without more evidence or
a test case, there's not much to be done.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Duplicate Values or Not?!

2005-09-16 Thread Greg Stark
John Seberg <[EMAIL PROTECTED]> writes:

> I recently tried to CREATE a UNIQUE INDEX and could
> not, due to duplicate values:
> 
> CREATE UNIQUE INDEX usr_login ON usr (login);
> 
> To try to find the offending row(s), I then executed
> the following:
> 
> SELECT count(*), login FROM usr GROUP BY login ORDER
> BY 1 DESC;
> 
> The GROUP BY didn't group anything, indicating to me
> that there were no duplicate values. There were the
> same number of rows in this query as a simple SELECT
> count(*) FROM usr.
> 
> This tells me that Postgresql is not using the same
> method for determining duplicates when GROUPING and
> INDEXing.

You might try running the GROUP BY query after doing:

set enable_hashagg = false;
select ...

With that false it would have to sort the results which should be exactly the
same code as the index is using. I think.
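
Separately, the original query lists every login, duplicated or not; a HAVING
clause narrows the output to the actual offenders (a sketch against the same
usr table):

```sql
SET enable_hashagg = false;  -- force sort-based grouping, as suggested

SELECT count(*), login
FROM usr
GROUP BY login
HAVING count(*) > 1          -- only logins that appear more than once
ORDER BY 1 DESC;
```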


That doesn't really answer the rest of your questions. The short of it is that
setting the encoding doesn't magically make your data encoded in that
encoding. If your client sends it one encoding but claims it's unicode then
Postgres will happily store it in a UNICODE database and it'll be garbage.

Maybe someone else will have more specific advice on that front.


-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq