Re: [GENERAL] Asynchronous database replication
On Sep 15, 2005, at 9:54 PM, Greg Stark wrote:

> If you need data to propagate from the clients back to the server then things
> get more complicated. Even then you could side step a lot of headaches if you
> can structure the application in specific ways, such as guaranteeing that the
> clients can only insert, never update records.

And even updates could be OK if the application can support the right partitioning of the data and only do it one place at a time. With some kinds of field-based work it might be suitable to have global (read-only) data along with data created in the field that is site/client specific. As long as the data collected in the field is not being updated on the master, it could continue to be updated in the field and synced back to the master database.

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Asynchronous database replication
Greg Stark wrote:

> My first reaction to this description was to consider some sort of model where
> the master database publishes text dumps of the master database which are
> regularly downloaded and loaded on the slaves. The slaves treat those tables
> as purely read-only reference tables.
>
> If you need data to propagate from the clients back to the server then things
> get more complicated. Even then you could side step a lot of headaches if you
> can structure the application in specific ways, such as guaranteeing that the
> clients can only insert, never update records.

It's the latter, I'm afraid. The master actually won't be modifying or inserting any data itself, just publishing it for the client databases in its domain. Almost all data inserts/updates/deletes will occur on the leaf nodes, i.e. at the remote health clinics and MMUs (mobile medical units). What we need to ensure is that if Patient X visits Site A on Monday that his records are there for a followup visit at Site B on Tuesday.

Even this has salient problems: for instance, Patient X visits Site B before Site A has had time to replicate its current data back to the master and Site B has pulled those updates.

The requirements scream ASP model except that this system needs to be functional for disaster management where it's likely there won't be any communications. At least, that's the constraint I've been given. This may turn out to be an issue of managing client expectations and some add'l infrastructure investment (i.e. better satellite communications on the MMUs and satellite backup for the fixed clinics). We're at the very early head-banging stages of this project now so I have a fairly optimistic list of requirements to resolve. This is an open source project though, so it would be terrific if we could build it non-ASP.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] new.id has wrong value in INSERT RULE
Hi,

To explain the problem as clearly as I can, I've included the schema, rule, inserts, output and explanation below.

CREATE TABLE table_a (
    id            serial PRIMARY KEY,
    name          text,
    active        boolean DEFAULT 't',
    date_created  timestamp DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE table_b (
    id            serial PRIMARY KEY,
    table_a_id    int REFERENCES table_a ON DELETE CASCADE,
    yield         int NOT NULL,
    active        boolean DEFAULT 't',
    date_created  timestamp DEFAULT CURRENT_TIMESTAMP
);

-- to debug I've set 'yield' to be the value of new.id rather than the value
-- supplied in the insert statement (this is just to show what is happening with new.id)
CREATE RULE table_b_insert_rule AS ON INSERT TO table_b
DO (UPDATE table_b SET active = 'f', yield = new.id
    WHERE table_a_id = new.table_a_id AND id != new.id;);

INSERT INTO table_a (id, name) VALUES (1, 'test1');
SELECT * FROM table_a;
 id | name  | active |        date_created
----+-------+--------+----------------------------
  1 | test1 | t      | 2005-09-16 13:23:03.620813
(1 row)

INSERT INTO table_b (table_a_id, yield) VALUES (1, '100');
SELECT * FROM table_b;
 id | table_a_id | yield | active |        date_created
----+------------+-------+--------+----------------------------
  1 |          1 |     3 | f      | 2005-09-16 13:23:46.156202
(1 row)

The yield value is set to 3 rather than 1 as expected, because the value of new.id was 3. This is totally unexpected behaviour; any help on getting to the bottom of this is much appreciated. It seems like table_b_id_seq is being incremented twice more than expected: the first nextval gives 1, which is used as the id, then it seems to be incremented twice more, and that's where the value of 3 comes from. I've checked the increment value of the sequence as shown below and it is 1 as expected.

SELECT * FROM table_b_id_seq;
 sequence_name  | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
----------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 table_b_id_seq |          3 |            1 | 9223372036854775807 |         1 |           1 |      30 | f         | t
(1 row)

I tried changing the value of the sequence increment to 3 and retested; see output below.

ALTER SEQUENCE table_b_id_seq INCREMENT 3;
SELECT * FROM table_b_id_seq;
 sequence_name  | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
----------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 table_b_id_seq |          3 |            3 | 9223372036854775807 |         1 |           1 |      30 | f         | t
(1 row)

INSERT INTO table_a (id, name) VALUES (2, 'test2');
SELECT * FROM table_a;
 id | name  | active |        date_created
----+-------+--------+----------------------------
  1 | test1 | t      | 2005-09-16 13:23:03.620813
  2 | test2 | t      | 2005-09-16 13:35:06.244128
(2 rows)

INSERT INTO table_b (table_a_id, yield) VALUES (2, '100');
SELECT * FROM table_b;
 id | table_a_id | yield | active |        date_created
----+------------+-------+--------+----------------------------
  1 |          1 |     3 | f      | 2005-09-16 13:23:46.156202
  6 |          2 |    12 | f      | 2005-09-16 13:35:36.843507
(2 rows)

It is clear to me that the value 6 in the id column is correct because I've changed the increment to 3. However, the value of 12 in the yield column (set by yield = new.id in the rule) can only be obtained by SELECT nextval('table_b_id_seq') (or the postgres internal equivalent) being called twice.

I'm using PostgreSQL 7.4.5 and can obviously work around this problem easily, but I want to understand what is going wrong.

thanks in advance
Ian

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?
Hi,

* Tom Lane <[EMAIL PROTECTED]> wrote:
> Alexander Neumann <[EMAIL PROTECTED]> writes:
> > In short terms: The result code is PGRES_FATAL_ERROR, but the error message
> > is an empty string.
> AFAIK that shouldn't happen, unless perhaps you are completely out of
> memory in the client-side process. What shows up in the postmaster
> error log when this occurs?

2005-09-01 18:19:01 [5462] LOG: connection received: host=127.0.0.1 port=41446
2005-09-01 18:19:01 [5462] LOG: connection authorized: user=exim database=pfauadmin
2005-09-01 18:19:01 [5462] LOG: statement: SELECT domain || ' : ' FROM exim_virtual_domains
2005-09-01 18:19:01 [5462] LOG: statement: SELECT sa_enable FROM exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG: statement: SELECT sa_tempreject_score FROM exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG: unexpected EOF on client connection

The first query succeeds, the second fails. Both work and deliver exactly the results as expected when cut&pasted from the logs (either exim or postgresql)...

Thanks for reading,
- Alexander
Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?
Hi,

* Martijn van Oosterhout wrote:
> On Thu, Sep 15, 2005 at 03:14:10PM +0200, Alexander Neumann wrote:
> > I'm using exim4 together with postgresql. Sometimes, data lookups fail
> > mysteriously and on the exim side nobody was able to help.
>
> The server log said:
> 2005-09-01 18:19:01 [5462] LOG: statement: SELECT sa_tempreject_score FROM exim_virtual_addresses WHERE domain = '.org' AND (mailbox = 'x' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
> 2005-09-01 18:19:01 [5462] LOG: unexpected EOF on client connection
>
> It's not clear if the EOF was caused by exim not closing down properly
> or something else. There's nothing special about the domains you've x'd
> out?

Nope. The query works and returns the expected results when cut&pasted from the logs (either exim or postgresql). I even sniffed the smtp dialogue and tried to reproduce it, without any success (it just works when I try it). So this is either caused by a strange (but reproducible) race condition, or it somehow depends on the connecting IP address of the other side's server.

Thanks for reading,
- Alexander
Re: [GENERAL] Help trying to write my first plpgsql function...
On 9/15/05, Bjørn T Johansen <[EMAIL PROTECTED]> wrote:
> Yes, I did and I found an answer... :)

you did what? read the docs?

> But I am trying to use this function in a report designer and the result
> from the select is in the way of the real data from the fetch... Is there
> a way around this?

and? as i said: define the function as setof something. for example (i'm writing without testing, so there might be small mistakes):

create type srf_type_1 as (
    field1 int4,
    field2 text
);

create or replace function test () returns setof srf_type_1 as
$BODY$
declare
    temprec srf_type_1%ROWTYPE;
begin
    for temprec in
        select id as field1, username as field2
          from users
         where is_active = true
    loop
        return next temprec;
    end loop;
    return;
end;
$BODY$ language 'plpgsql';

should work as: select * from test();

hubert
[GENERAL] help needed for functions
Hello All,

I am new to the PostgreSQL database. I am transferring a current database from MSSQL to PostgreSQL 7.4, and I am facing problems with stored procedures. Is there any way that I can transfer my existing stored procedures from MSSQL to PostgreSQL?

I have already read about functions and tried to implement them, but I could not find any way to get multiple columns from a function. I would also like help with selecting values from two tables using a function.

Thanks
Nitin

__
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Re: [GENERAL] help needed for functions
Yes, you can use "SETOF" for multiple records. See the examples below.

---
CREATE TABLE department(id int primary key, name text);
CREATE TABLE employee(id int primary key, name text, salary int, departmentid int references department);
---
INSERT INTO department values (1, 'Management');
INSERT INTO department values (2, 'IT');
INSERT INTO employee values (1, 'John Smith', 3, 1);
INSERT INTO employee values (2, 'Jane Doe', 5, 1);
INSERT INTO employee values (3, 'Jack Jackson', 6, 2);
---
CREATE OR REPLACE FUNCTION GetEmployees() RETURNS SETOF employee AS
'select * from employee;' LANGUAGE 'sql';
---
CREATE TYPE HOLDER AS (departmentid int, totalsalary int8);
---
CREATE or replace FUNCTION SqlDepartmentSalaries() RETURNS SETOF holder AS '
    select departmentid, sum(salary) as totalsalary
      from GetEmployees() as a
     group by departmentid
' LANGUAGE 'sql';

select * from SqlDepartmentSalaries();
---
--We can do the same in PL/pgSQL in this way.
CREATE OR REPLACE FUNCTION PLpgSQLDepartmentSalaries() RETURNS SETOF holder AS '
DECLARE
    r holder%rowtype;
BEGIN
    FOR r in select departmentid, sum(salary) as totalsalary from GetEmployees() group by departmentid
    LOOP
        return next r;
    END LOOP;
    RETURN;
END
' LANGUAGE 'plpgsql';
---

Thanks
Dinesh Pandey
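The PL/pgSQL version above is called the same way as the SQL one; a minimal usage example:

select * from PLpgSQLDepartmentSalaries();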
Re: [GENERAL] help needed for functions
On 16.09.2005 at 6:51:16 -0700, Nitin Ahire wrote the following:
> I am facing problems for stored procedures. Is there any way that I
> can transfer my existing stored procedures from mssql to postgres?

I guess: no.

> I already read about functions & tried to implement it but I could not
> find any way to get multiple columns from a function. Also I would

http://techdocs.postgresql.org/guides/SetReturningFunctions
http://www.postgresql.org/docs/current/static/xfunc-sql.html#AEN28835

Read these links!

> like to get help about selecting values from two tables using
> function.

Where exactly is the problem?

Regards, Andreas
--
Andreas Kretschmer (contact: see header)
Heynitz: 035242/47212, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
===Schollglas Unternehmensgruppe===

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
[GENERAL] Question about a query plan
PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian

Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html

I'm just starting to look at query plans, and I'm not understanding a few things. I don't have that many questions, but I'm including my examples below, so it's a bit long.

The first table is "class" (as in a class taught at a school) and has an indexed column "class_time" of type timestamp(0) with time zone.

First question is why the planner is not using an index scan when I use "now()" or CURRENT_TIMESTAMP?

EXPLAIN ANALYZE select id from class where class_time > now();
                                        QUERY PLAN
 Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
   Filter: (class_time > now())

EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;
                                        QUERY PLAN
 Seq Scan on "class"  (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
   Filter: (class_time > (now())::timestamp(0) with time zone)

At first I thought the planner's choice was related to the ratio of rows the planner was expecting to return to the total number of rows. But using < or > uses a seq scan, while if I do "class_time = now()" it uses an Index Scan. And if I specify the timestamp as a literal then it always uses an Index Scan:

select now()::timestamp(0) with time zone;
          now
------------------------
 2005-09-16 06:44:10-07

EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';
                                        QUERY PLAN
 Index Scan using class_class_time_index on "class"  (cost=0.00..191.17 rows=50 width=4) (actual time=66.072..66.248 rows=28 loops=1)
   Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)

Ok, now on to the second question. I have two other related tables. First, I have a table "person", which you can guess what it holds, and a link table "instructors" (a class can have more than one instructor):

\d instructors
 Table "public.instructors"
 Column |  Type   | Modifiers
--------+---------+-----------
 person | integer | not null
 class  | integer | not null
Indexes:
    "instructors_pkey" primary key, btree (person, "class")
    "instructors_class_index" btree ("class")
    "instructors_person_index" btree (person)
Foreign-key constraints:
    "$2" FOREIGN KEY ("class") REFERENCES "class"(id)
    "$1" FOREIGN KEY (person) REFERENCES person(id)

I want to find out who is teaching classes in the future:

EXPLAIN ANALYZE select person, class from instructors where instructors.class in (select class.id from class where class_time > now());
                                        QUERY PLAN
 Hash IN Join  (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
   Hash Cond: ("outer"."class" = "inner".id)
   ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
   ->  Hash  (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
         ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
               Filter: (class_time > now())

Perhaps I'm reading that incorrectly, but the sub-select is returning 28 rows of "class.id". Then why is it doing a Seq Scan on instructors instead of an index scan? If I enumerate all 28 classes I get an Index Scan.

Finally, not really a question, but my goal is to show a count of classes taught by each instructor. Perhaps there's a better query?

EXPLAIN select person, first_name, count(class) from instructors, person where instructors.class in (select id from class where class_time > now()) AND person.id = instructors.person group by person, first_name;
                                        QUERY PLAN
Re: [GENERAL] Asynchronous database replication
Steve Manes wrote:
> Greg Stark wrote:
>> My first reaction to this description was to consider some sort of model where
>> the master database publishes text dumps of the master database which are
>> regularly downloaded and loaded on the slaves. The slaves treat those tables
>> as purely read-only reference tables.
>> If you need data to propagate from the clients back to the server then things
>> get more complicated. Even then you could side step a lot of headaches if you
>> can structure the application in specific ways, such as guaranteeing that the
>> clients can only insert, never update records.
>
> It's the latter, I'm afraid. The master actually won't be modifying or
> inserting any data itself, just publishing it for the client databases
> in its domain. Almost all data inserts/updates/deletes will occur on
> the leaf nodes, i.e. at the remote health clinics and MMUs (mobile
> medical units). What we need to ensure is that if Patient X visits Site
> A on Monday that his records are there for a followup visit at Site B on
> Tuesday.
>
> Even this has salient problems: for instance, Patient X visits Site B
> before Site A has had time to replicate its current data back to the
> master and Site B has pulled those updates.

What about doing updates in a peer-to-peer style? Basically, each node updates any others it comes in contact with (both with its local changes and anything it's received from the master) and everyone pushes changes back to the master when they can. Sort of the way airplanes crossing the ocean pass radio messages for each other.

I'm assuming two things:

1) Communication b/w local nodes is easier / occurs more frequently than communicating with the master. It's easier for an MMU to make a local call or visit a clinic than dial the sat phone.

2) Patients travel locally. Patient X might visit Sites A and B a day apart, but he's unlikely to visit Site C which is a few countries away any time soon.

Basically, I don't think you need to update all nodes with every record immediately.

For some early-morning reason, this made me think of distributed version control, but I'm not entirely sure how one would use it in this case. See svk.elixus.org.

--
Peter Fein  [EMAIL PROTECTED]  773-575-0694
Basically, if you're not a utopianist, you're a schmuck. -J. Feldman

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Strange Bug in exim4 postgresql lookup code or libpq?
Alexander Neumann <[EMAIL PROTECTED]> writes:
>> 2005-09-01 18:19:01 [5462] LOG: unexpected EOF on client connection

> So this is either caused by a strange (but reproducible) race condition or
> it somehow depends on the connecting ip address of the other side's server.

The "unexpected EOF" indicates pretty clearly that the problem is on the client side.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] new.id has wrong value in INSERT RULE
"Ian McFarland" <[EMAIL PROTECTED]> writes:
> CREATE RULE table_b_insert_rule AS ON INSERT TO table_b
> DO (UPDATE table_b SET active = 'f', yield = new.id
>     WHERE table_a_id = new.table_a_id AND id != new.id;);

Rules are macros. Since what is substituted for new.id in this case is a nextval() call, you have a multiple-evaluation problem. The above is a pretty horrid way to do things anyway --- you should probably be using a trigger instead.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Create a pg table from CSV with header rows
On 9/15/05, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 15, 2005 at 06:48:59PM -0400, Robert Fitzpatrick wrote:
> > Anyone know a package that can do this? Perferrably a Unix/Linux
> > package.
>
> It would be trivial to write a script in a language like Perl to
> read the first line of a file and generate a CREATE TABLE statement
> from it, then issue a COPY command and send the rest of the file.
> Determining the columns' data types would be a different matter:
> if they weren't specified in the header then you'd have to guess
> or perhaps make them all text.

I've attached a sample implementation of Michael's suggestion.

[attachment: loadCsv2.pl]

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] Question about a query plan
On Friday, 16 September 2005 16:18, Bill Moseley wrote:
> First question is why the planner is not using an index scan when I
> use "now()" or CURRENT_TIMESTAMP?
>
> EXPLAIN ANALYZE select id from class where class_time > now();
>  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
>    Filter: (class_time > now())

The planner thinks your query will return 414 rows, so it thinks the sequential scan is faster. In reality, your query only returns 28 rows, so you need to create better statistics, either by running ANALYZE or VACUUM (or both) or by tweaking the statistics parameters of the columns.

> Perhaps I'm reading that incorrectly, but the sub-select is returning
> 28 rows of "class.id". Then why is it doing a Seq Scan on instructors
> instead of an index scan? If I enumerate all 28 classes I get an
> Index Scan.

Again, bad statistics. It thinks that the scan on instructors is going to return 1308 rows, which is probably not true.

> Finally, not really a question, but my goal is to show a count of classes
> taught by each instructor. Perhaps there's a better query?

You could probably rewrite it as a join, but as you could see, the planner rewrote it as a join internally anyway.

--
Peter Eisentraut http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: explain analyze is your friend
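As a concrete sketch of the statistics tuning Peter mentions (table and column names taken from the thread; the statistics target of 100 is only an illustrative value):

-- Refresh the planner's row-count estimates for the table
ANALYZE class;

-- Optionally collect a more detailed histogram for the filtered column,
-- then re-analyze so the new target takes effect
ALTER TABLE class ALTER COLUMN class_time SET STATISTICS 100;
ANALYZE class;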
Re: [GENERAL] Question about a query plan
On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote:
> First question is why the planner is not using an index scan when I
> use "now()" or CURRENT_TIMESTAMP?

It also used to be the case (pre-8.0; I couldn't find in the release notes whether this was an 8.0 or 8.1 fix) that now() and CURRENT_TIMESTAMP were not indexable, I think because of mutability. For older versions of postgres, it's recommended that you determine the time in the client and use constant data in your query.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Strategic Open Source: Open Your i™
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Asynchronous database replication
> The requirements scream ASP model except that this system needs to be
> functional for disaster management where it's likely there won't be any
> communications. At least, that's the constraint I've been given.

I'm not an expert on this, but just kicking around the idea, the approach I think I'd look into:

- clients don't access the database directly
- there's a middleware layer and clients make higher-level RPC-type calls whose semantics more closely match the client functionality
- then those calls can be logged and replicated...

--
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [GENERAL] Replication
> How about a third choice: you can also use a proven, reliable and tested
> replication solution that is included in the core system because the
> core system basically provides it anyway.

Sure, but that one is spelled "Sybase", not "MySQL" ;-)

--
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Question about a query plan
On Fri, Sep 16, 2005 at 04:45:57PM +0200, Peter Eisentraut wrote:
> The planner thinks your query will return 414 rows, so it thinks the
> sequential scan is faster. In reality, your query only returns 28 rows, so
> you need to create better statistics, either by running ANALYZE or VACUUM
> (or both) or tweaking the statistics parameters of the columns.

I did wonder about the planner stats, so I had run ANALYZE on the database with no change. I just now ran VACUUM (and VACUUM ANALYZE), and again see no change. Perhaps my table is just too small for this test.

--
Bill Moseley
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [GENERAL] Question about a query plan
On Fri, Sep 16, 2005 at 10:02:28AM -0500, Thomas O'Connell wrote:
> On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote:
> > First question is why the planner is not using an index scan when I
> > use "now()" or CURRENT_TIMESTAMP?
>
> It also used to be the case (pre-8.0; I couldn't find in the release
> notes whether this was an 8.0 or 8.1 fix) that now() and
> CURRENT_TIMESTAMP were not indexable, I think because of mutability.
>
> For older versions of postgres, it's recommended that you determine
> the time in the client and use constant data in your query.

Interesting. I have a few VIEWs that include now(), but I guess I could adjust and pass in the date from the client. Thanks for the tip.

--
Bill Moseley
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
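One way to avoid baking now() into a view on these older releases is a set-returning SQL function that takes the cutoff time as a parameter supplied by the client. A minimal sketch against the "class" table from this thread (the function name is invented):

CREATE FUNCTION classes_after(timestamptz) RETURNS SETOF class AS
    'SELECT * FROM class WHERE class_time > $1'
LANGUAGE sql STABLE;

-- The client passes the timestamp it computed:
SELECT id FROM classes_after('2005-09-16 06:44:10-07');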
[GENERAL] Neither column can be NULL if the column is part of the combination of primary key columns?
Greetings,

A question about a "combination of several columns" as primary key.

A table test (
    col1 timestamp not null default now(),
    col2 timestamp,
    primary key (col1, col2)
);

If I set up the primary key as the *combination* of (col1, col2), col2 will be set up as "NOT NULL" automatically, right? (Although I did not specify a not null constraint for col2.)

To allow *col2* to be NULL and still make the combination (col1, col2) unique, I should set up unique (col1, col2)?

Thanks,
Emi

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Replication
Scott Ribe wrote:
>> How about a third choice: you can also use a proven, reliable and tested
>> replication solution that is included in the core system because the
>> core system basically provides it anyway.
>
> Sure, but that one is spelled "Sybase", not "MySQL" ;-)

It's amazing how misunderstood my post was.

My third choice was a hypothetical future version of PostgreSQL, modified from its current form very slightly to include a form of replication 'out of the box': a couple of scripts to enable WAL log transfer and also a solution to the problem of WAL log delay mentioned by other posters.

I only mentioned MySQL because their 'out of the box' solution involves transferring the binlogs, which is similar to the method of transferring the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't suggesting that they have the 'ultimate' solution. I wasn't even suggesting that they have a good solution. It just made me think. That's all.

Well, I've learned my lesson. Next time I post I'll be sure not to mention MySQL in any way, shape or form.

--
Russ

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Neither column can be NULL if the column is part of the combination of primary key columns?
> To allow *col2* to be NULL and still make the combination (col1, col2)
> unique, I should set up unique (col1, col2)?
>
> Thanks,
> Emi

no, you must create a partial unique index...

create unique index indice_col2 on table(col1, col2) where col2 is not null;

--
Atentamente,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 6: explain analyze is your friend
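For illustration, here is how that might look against the "test" table from the question (index names invented; note that a plain UNIQUE constraint treats NULLs as distinct, so rows with a NULL col2 are never considered duplicates of each other):

-- Enforce uniqueness of (col1, col2) for rows where col2 is present
create unique index test_col1_col2_key on test (col1, col2) where col2 is not null;

-- Optionally also keep col1 unique among the rows where col2 is NULL
create unique index test_col1_nullcol2_key on test (col1) where col2 is null;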
[GENERAL] Setting WHERE on a VIEW with aggregate function.
I have a view to generate a list of instructors and a count of their future classes. "instructors" is a link table between "class" and "person".

CREATE VIEW future_instructor_counts
AS
    SELECT person.id AS person_id,
           first_name,
           last_name,
           count(instructors.class) AS class_count
      FROM class, instructors, person
     WHERE class.id = instructors.class
       AND person.id = instructors.person
       AND class_time > now()
  GROUP BY person_id, first_name, last_name;

I have two very basic SQL questions:

1) With an aggregate function in the query, is there any way to remove the "AND class_time > now()" so that the timestamp can be passed in the select? That is, I'd like to be able to do this:

    select * from instructor_counts where class_time > now();

But class_time is not part of the VIEW so that's not valid. And if it was included then I don't have an aggregate function any more - no more grouping.

2) I think I'm missing something obvious. I know that I need to specify all my non-aggregate columns in the "GROUP BY", but I don't understand why. Really, the results are just grouped only by person.id, so why the need to specify the other columns? And if you don't specify all the columns then PostgreSQL reports:

    ERROR: column "person.id" must appear in the GROUP BY clause or be used in an aggregate function

Is there a reason PostgreSQL doesn't just add the column automatically? It does in other cases (like a missing table in a join).

Thanks

--
Bill Moseley
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Create a pg table from CSV with header rows
On Fri, Sep 16, 2005 at 10:41:59AM -0400, Doug Bloebaum wrote:
> On 9/15/05, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> > On Thu, Sep 15, 2005 at 06:48:59PM -0400, Robert Fitzpatrick wrote:
> > > Anyone know a package that can do this? Perferrably a Unix/Linux
> > > package.
> >
> > It would be trivial to write a script in a language like Perl to
> > read the first line of a file and generate a CREATE TABLE statement
> > from it, then issue a COPY command and send the rest of the file.
> > Determining the columns' data types would be a different matter:
> > if they weren't specified in the header then you'd have to guess
> > or perhaps make them all text.
>
> I've attached a sample implementation of Michael's suggestion.

I've attached what I hope is a slightly improved one :)

Cheers,
D
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778
Remember to vote!

#!/usr/bin/perl -w
use DBI;
use Text::CSV_XS;
use File::Basename;

my $csv_file = shift or die "Usage: $0 \n";
my $fh; # Filehandle.
open $fh, '<', $csv_file # 3-argument open in case of name weirdness.
    or die "$0: Can't open $csv_file for read: $!\n";

my ($table_name) = fileparse($csv_file, qr{\..*});
$table_name = quote_odd_name($table_name);
warn "table_name=$table_name\n";

my $dbh = DBI->connect(
    'dbi:Pg:dbname=dbname', 'user', 'pass', # should this be user input?
    { Auto_commit => 0 }
);

my $csv = Text::CSV_XS->new();

# create the table based on names from the header row
# columns will arbitrarily be of type TEXT, because VARCHAR(20)
# is a pretty silly limitation.
my $header = <$fh>;
$csv->parse($header) or die "$0: parse of header row failed\n";
my @columns = $csv->fields();
my $sql = "CREATE TABLE $table_name (\n    "
        . join(",\n    ", map { quote_odd_name($_) . ' TEXT' } @columns)
        . "\n)";
warn "\n$sql\n";
$dbh->do($sql);

# add rows of data using COPY
$dbh->do("COPY $table_name FROM STDIN WITH DELIMITER AS ','");
my $row_count = 0;
while (<$fh>) {
    $dbh->pg_putline($_);
    $row_count++;
}
close $fh;
$dbh->pg_endcopy;

$dbh->commit;
$dbh->disconnect;
warn "$0: Added $row_count rows to $table_name\n";

sub quote_odd_name {
    my $name = shift;
    my $sth = $dbh->prepare_cached('SELECT quote_ident(?)');
    $sth->execute($name);
    my $row = $sth->fetchrow_hashref;
    $sth->finish;
    return $row->{quote_ident};
}

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Asynchronous database replication
John DeSoi <[EMAIL PROTECTED]> writes:
>> If you need data to propagate from the clients back to the server then things
>> get more complicated. Even then you could side step a lot of headaches if you
>> can structure the application in specific ways, such as guaranteeing that the
>> clients can only insert, never update records.
>
> And even updates could be OK if the application can support the right
> partitioning of the data and only do it one place at a time. With some kinds
> of field based work it might be suitable to have global (read only) data along
> with data created in the field that is site/client specific. As long as the
> data collected in the field is not being updated on the master, it could
> continue to be updated in the field and synced back to the master database.

Sure, though then you have to deal with what data to display on the client end. The most recently downloaded master data or the locally updated data? What about after you upload your local data when you're not sure whether the master data has been reconciled? Not impossible but it would be more work.

But I find a surprisingly high fraction of applications are very amenable to being handled as insert-only. A medical application strikes me as something someone is all the more likely to be happy with an insert-only model. So instead of allowing remote users to modify data directly, you only allow them to "request" an update. Then when they look at the record it still makes logical sense to see the old data, along with their "requested" updates.

Essentially, any replication system is based on insert-only queues. If you can design the application around that, you avoid having to implement some sort of mapping to hide that.

--
greg

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
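To make the insert-only "request an update" idea concrete, here is a minimal sketch of such a queue table (purely illustrative; none of these table or column names come from the thread):

CREATE TABLE patient_update_request (
    id            serial PRIMARY KEY,
    patient_id    integer NOT NULL,              -- record the request refers to
    origin_site   text NOT NULL,                 -- clinic or MMU that captured it
    requested_at  timestamptz NOT NULL DEFAULT now(),
    field_name    text NOT NULL,                 -- which attribute the client wants changed
    new_value     text,                          -- proposed value, as text
    applied       boolean NOT NULL DEFAULT false -- set by the master once reconciled
);

-- Clients only ever INSERT into this table; the master applies and marks rows.
INSERT INTO patient_update_request (patient_id, origin_site, field_name, new_value)
VALUES (42, 'site_a', 'phone_number', '555-0100');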
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
> I have a view to generate a list of instructors and a count of their
> future classes.
>
> "instructors" is a link table between "class" and "person".
>
> CREATE VIEW future_instructor_counts
> AS
>     SELECT person.id AS person_id,
>            first_name,
>            last_name,
>            count(instructors.class) AS class_count
>       FROM class, instructors, person
>      WHERE class.id = instructors.class AND
>            person.id = instructors.person
>        AND class_time > now()
>   GROUP BY person_id, first_name, last_name;

The trick is to do the data aggregation separately, then JOIN in whatever other fields you want. Something like this:

CREATE VIEW future_instructor_counts
AS
SELECT * FROM

  (SELECT
     person.id AS person_id,
     first_name,
     last_name) personinfo

  INNER JOIN

  (SELECT class.id FROM class
    WHERE class_time > now() ) classes

  INNER JOIN

  (SELECT
     id, count(class) AS class_count
   FROM instructors GROUP BY id) classcount

  ON personinfo.person_id = instructors.id
 AND classes.id = instructors.id

In many cases when using aggregate functions you get just the fields you need from the agg function (typically an id plus the aggregate result) and JOIN with other tables (or even the same table) to get other info such as first_name, last_name, etc. Otherwise, if you GROUP BY additional fields so you can get them in the output, you may be making the db do additional work.

> 1) With an aggregate function in the query, is there any way to remove
> the "AND class_time > now()" so that the timestamp can be passed in the
> select? That is, I'd like to be able to do this:
>
>     select * from instructor_counts where class_time > now();
>
> But class_time is not part of the VIEW so that's not valid.

No problem, just make it a part of the view. See the classes section below.

CREATE VIEW future_instructor_counts
AS
SELECT * FROM

  (SELECT
     person.id AS person_id,
     first_name,
     last_name) personinfo

  INNER JOIN

  -- Add class_time field!
  (SELECT class.id, class_time FROM class
    WHERE class_time > now() ) classes

  INNER JOIN

  (SELECT
     id, count(class) AS class_count
   FROM instructors GROUP BY id) classcount

  ON personinfo.person_id = instructors.id
 AND classes.id = instructors.id

[Disclaimer: I've not tested this code at all. It could help if you sent table definitions and maybe even dummy data via insert commands.]

> And if it was included then I don't have an aggregate function any more - no
> more grouping.

If you do the agg function separately like this that isn't an issue. You join tables to get whatever fields you'd like to have in your output.

> 2) I think I'm missing something obvious. I know that I need to
> specify all my non-aggregate columns in the "GROUP BY", but I don't
> understand why. Really, the results are just grouped only by
> person.id, so why the need to specify the other columns?
>
> And if you don't specify all the columns then Postgresql reports:
>
>     ERROR: column "person.id" must appear in the GROUP BY
>     clause or be used in an aggregate function
>
> Is there a reason Postgresql doesn't just add the column
> automatically? It does in other cases (like a missing table in a
> join).

As I mention above, if you GROUP BY additional fields just to get them in the output, you may be making the db do additional work. I seem to remember that in a later SQL standard (ie, after SQL-99, but I could be wrong) it allows you to specify additional fields in SELECT that are not in the GROUP BY clause. But PG isn't there yet.
-Roger
[GENERAL] Restoring just a table or row from a backup copy.
Is there a way to do that? Thanks in advance. Fernando Lujan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Anyone use Eclipse?
People: The Eclipse project is interested in having PostgreSQL people contribute to their Data Services plug in. Do we have any java hackers in the community using Eclipse? Anyone interested? -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Partial dates
On Wed, Sep 14, 2005 at 00:09:58 -0400, Joe <[EMAIL PROTECTED]> wrote:
> I'm not sure I agree with the need to "fix" or "sanitize" the data. The
> columns in question are used mostly for publication dates. While you may
> be able to find a full release date for recent books, they are generally
> listed as published on a given year, period. Most monthly magazines only
> have a month-year of publication. And of course, daily newspapers and
> Internet articles usually have a full day-month-year. In fact, the MySQL
> solution didn't address quarterly or bi-monthly publications as that data
> was only captured as one of the months in the period--as opposed to Mar/Apr
> 2005 or First Quarter 2005 (or worse: Winter 2004). As Tom Lane argued, it
> seems I'll have to bite the bullet and create a new datatype. The only
> other alternative I see would be to split the publication date into three
> columns and that's rather ugly.

It seems like in your case the dates are best expressed as ranges, and that you could store the information you have using an earliest possible date and a latest possible date. If there are extra constraints based on how far apart the begin and end dates are, you could implement them as well.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
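A minimal sketch of that range idea (table and column names invented for illustration): each publication date becomes a pair of bounding dates, so a month-only date covers the whole month and an exact date collapses to a one-day range.

CREATE TABLE publication (
    id            serial PRIMARY KEY,
    title         text NOT NULL,
    -- earliest and latest dates the item could have been published
    pub_earliest  date NOT NULL,
    pub_latest    date NOT NULL,
    CHECK (pub_earliest <= pub_latest)
);

-- A "Mar/Apr 2005" issue
INSERT INTO publication (title, pub_earliest, pub_latest)
VALUES ('Some Bi-monthly', '2005-03-01', '2005-04-30');

-- An article with a full day-month-year
INSERT INTO publication (title, pub_earliest, pub_latest)
VALUES ('Some Daily Article', '2005-09-16', '2005-09-16');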
Re: [GENERAL] Restoring just a table or row from a backup copy.
On Fri, 2005-09-16 at 14:13, Fernando Lujan wrote:
> Is there a way to do that?

pg_dumpall --help

pg_restore restores a PostgreSQL database from an archive created by pg_dump.

Usage: pg_restore [OPTION]... [FILE]
(SNIP)
  -t, --table=NAME         restore named table
(SNIP)

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] Replication
On Fri, 2005-09-16 at 12:51, Russ Brown wrote:
> Scott Ribe wrote:
> >> How about a third choice: you can also use a proven, reliable and tested
> >> replication solution that is included in the core system because the
> >> core system basically provides it anyway.
> >
> > Sure, but that one is spelled "Sybase", not "MySQL" ;-)
>
> It's amazing how misunderstood my post was.
>
> My third choice was a hypothetical future version of PostgreSQL,
> modified from its current form very slightly to include a form of
> replication 'out of the box': a couple of scripts to enable WAL log
> transfer and also a solution to the problem of WAL log delay mentioned
> by other posters.
>
> I only mentioned MySQL because their 'out of the box' solution involves
> transferring the binlogs, which is similar to the method of transferring
> the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't
> comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't
> suggesting that they have the 'ultimate' solution. I wasn't even
> suggesting that they have a good solution. It just made me think. That's
> all.
>
> Well, I've learned my lesson. Next time I post I'll be sure not to
> mention MySQL in any way, shape or form.

Actually, I would just suggest to not hold it up as an example of how things should be done. That would work for me.

There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of energy demonizing PostgreSQL to make themselves look better. There were pages of misinformation in their documentation about how PostgreSQL was basically crap, and MySQL did everything right, and a lot of people spent a lot of time debunking that.

MySQL AB now plays better with others, and hasn't engaged in the kind of character assassination they once did, but there's STILL a sore spot for most PostgreSQL users and developers there, because they used to have to spend a lot of energy and time explaining that what was on the MySQL site was lies and misinformation. A LOT of time. And it did hurt PostgreSQL, in terms of keeping people away from it.

So, there's an almost automatic response triggered by someone mentioning how MySQL does things, especially if they're perceived to be holding MySQL up as an example to the PostgreSQL community on how things should be done.

In my original post, my main point wasn't just against MySQL, it was against the philosophy that just because replication is included and part of the core of a database, it doesn't mean that it's reliable or well tested. And MySQL is a fine example of that. Their replication really does have a lot of issues.

So, feel free to mention MySQL, but know that mostly when it's mentioned here, it's mentioned as an example of how things shouldn't be done. In terms of coding, marketing, testing, or licensing.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Duplicate Values or Not?!
I recently tried to CREATE a UNIQUE INDEX and could not, due to duplicate values:

CREATE UNIQUE INDEX usr_login ON usr (login);

To try to find the offending row(s), I then executed the following:

SELECT count(*), login FROM usr GROUP BY login ORDER BY 1 DESC;

The GROUP BY didn't group anything, indicating to me that there were no duplicate values. There were the same number of rows in this query as in a simple SELECT count(*) FROM usr. This tells me that PostgreSQL is not using the same method for determining duplicates when GROUPing and INDEXing.

I dug a little deeper. The best candidates I found for a possible duplicate are caused by characters that did not translate well. IIRC, the basis was the name Pena, which looked like Pe?a. I'm thinking the original data was not encoded properly, or my export didn't handle encodings properly, etc. The two Penas used different characters in the 3rd position, neither of which was translated correctly.

I loaded up data from another database vendor (4th Dimension) into an 8.0.3 PostgreSQL, which I had compiled from source with the default configuration. This was on Yellow Dog Linux 4.0.1. I brought the same data into an 8.0.1 on Mac OS X (binary from entropy.ch) and did NOT have this UNIQUE INDEX failure.

I'm sure my problems are deeper than the INDEX failure, involving the accuracy of the conversion, but, short term, I would like to know what is different. They are both SQL_ASCII databases. I tried importing into a UNICODE database, but that was really a mess of errors (during COPY).

I realize I need to learn about encodings, my source data, etc., but I'm looking for hints. Anybody experienced in exporting 4th Dimension data containing a certain amount of foreign language text?

Thanks,

__
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
[GENERAL] pg_ctl reload breaks our client
Yesterday a client application broke immediately after we issued a pg_ctl reload command. The only change we had made was to pg_hba.conf, to enable trusted connections from localhost.

My question is, how should the client application be affected by such a reload? My impression was that the client should be totally unaware of a reload, but reality does not bear this out. Any ideas/informed responses will be welcomed.

I suspect that this has uncovered a bug in our client, but without knowing what the client experience should be, it's hard to narrow down where the bug may lie.

Thanks.

__
Marc Munro
Re: [GENERAL] Replication
Scott Marlowe wrote:
> On Fri, 2005-09-16 at 12:51, Russ Brown wrote:
> > Scott Ribe wrote:
> > >> How about a third choice: you can also use a proven, reliable and tested
> > >> replication solution that is included in the core system because the
> > >> core system basically provides it anyway.
> > >
> > > Sure, but that one is spelled "Sybase", not "MySQL" ;-)
> >
> > It's amazing how misunderstood my post was.
> >
> > My third choice was a hypothetical future version of PostgreSQL,
> > modified from its current form very slightly to include a form of
> > replication 'out of the box': a couple of scripts to enable WAL log
> > transfer and also a solution to the problem of WAL log delay mentioned
> > by other posters.
> >
> > I only mentioned MySQL because their 'out of the box' solution involves
> > transferring the binlogs, which is similar to the method of transferring
> > the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't
> > comparing, I wasn't suggesting MySQL is better than PostgreSQL. I wasn't
> > suggesting that they have the 'ultimate' solution. I wasn't even
> > suggesting that they have a good solution. It just made me think. That's
> > all.
> >
> > Well, I've learned my lesson. Next time I post I'll be sure not to
> > mention MySQL in any way, shape or form.
>
> Actually, I would just suggest to not hold it up as an example of how
> things should be done. That would work for me.

I didn't!!

> There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of energy
> demonizing PostgreSQL to make themselves look better. There were pages
> of misinformation in their documentation about how PostgreSQL was
> basically crap, and MySQL did everything right, and a lot of people
> spent a lot of time debunking that.

I remember that time, and I remember being very annoyed about it. I am still frustrated now by people who will believe the FUD that was spread at the time and won't even consider PostgreSQL as a result. That is basically why the company I work for uses MySQL, and simply will not consider changing, no matter how hard I try to make it happen.

> MySQL AB now plays better with others, and hasn't engaged in the kind of
> character assassination they once did, but there's STILL a sore spot for
> most PostgreSQL users and developers there, because they used to have to
> spend a lot of energy and time explaining that what was on the MySQL site
> was lies and misinformation. A LOT of time. And it did hurt PostgreSQL,
> in terms of keeping people away from it.

Indeed. As I say above, that's why my company is staying away from it, despite my best efforts.

> So, there's an almost automatic response triggered by someone mentioning
> how MySQL does things, especially if they're perceived to be holding
> MySQL up as an example to the PostgreSQL community on how things should
> be done.

I've noticed that. I've been reading (and occasionally posting to) this list for a few years now, and it's the one and only thing about it that bugs me. Other lists bug me in a lot of ways due to attitudes of some of the major contributors, but I've always enjoyed this list greatly primarily *because* of the major contributors (yourself included). It's educational, friendly and very helpful. I learn a lot from this list and enjoy the discussion. This one thing bugs me because I'm not generally an emotionally reactive person: I prefer to consider things fairly before responding, which is why I frequently don't respond at all to things if I don't think it would help matters.

I feel that it is an extremely unwise policy to automatically assume that what your competitors are doing is worse than what you are doing, and that there's nothing you can learn from them. That's how you get left behind. That's not to say that there *is* a great deal that PostgreSQL can learn from MySQL, but one should not assume that there is nothing. From my readings on this list, the majority of examples of people using MySQL as an example of how PostgreSQL should be doing things have been misguided/wrong/trolling etc. However, from time to time a legitimate example is raised, and in those situations the responses given have sometimes been disappointing.

> In my original post, my main point wasn't just against MySQL, it was
> against the philosophy that just because replication is included and
> part of the core of a database, it doesn't mean that it's reliable or
> well tested. And MySQL is a fine example of that. Their replication
> really does have a lot of issues.

Indeed. But just to stress the point, I wasn't stating that the included replication in MySQL was any good (though it's not terrible, as we're using it heavily in an extremely high-volume situation with few problems), I was just bringing up the idea of getting a decent replication solution included in PostgreSQL for relatively little effort.

> So, feel free to mention MySQL, but know that mostly when it's mentioned
> here, it's mentioned as an example of how things shouldn't be done. In
> terms of coding, marketing, testing, or licensing.
Re: [GENERAL] pg_ctl reload breaks our client
On Fri, Sep 16, 2005 at 01:28:13PM -0700, Marc Munro wrote: > Yesterday a client application broke immediately after we issued a > pg_ctl reload command. How did the client break? What behavior did it exhibit? Were there any errors in the server's logs? Can you duplicate the problem? What version of PostgreSQL are you using, and on what platform? -- Michael Fuhr ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Replication
On Sep 16, 2005, at 4:30 PM, Russ Brown wrote:
> just because replication is included and part of the core of a database,
> it doesn't mean that it's reliable or well tested.

I just jumped into this thread, and this statement just triggers one thought. Many Postgres end users rely on experts like those on this list for the best advice on how to use Postgres and what might be the best solution for non-core needs such as replication. Inclusion probably does make those users feel comfort that the experts for that particular software or product deem it one of, if not the, best solution for the given problem. It's not always true and may not be right for this situation, since replication needs for different situations may vary widely. But inclusion does imply some type of endorsement.

For instance, I feel pretty comfortable with TSearch2 though I don't know much about it. The fact that it comes in contrib is an endorsement. I'd be confused if the consensus was that there is a better, as easy to use and license-compatible full text index available and it was for some reason NOT included. Of course, I'd still ask the list. . .

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
On Fri, Sep 16, 2005 at 12:06:19PM -0700, Roger Hand wrote:
> > select * from instructor_counts where class_time > now();
> >
> > But class_time is not part of the VIEW so that's not valid.
>
> No problem, just make it a part of the view. See the classes section below.
>
> CREATE VIEW future_instructor_counts
> AS
> SELECT * FROM
>
>   (SELECT
>      person.id AS person_id,
>      first_name,
>      last_name) personinfo
>
>   INNER JOIN
>
>   -- Add class_time field!
>   (SELECT class.id, class_time FROM class
>     WHERE class_time > now() ) classes
>
>   INNER JOIN
>
>   (SELECT
>      id, count(class) AS class_count
>    FROM instructors GROUP BY id) classcount
>
>   ON personinfo.person_id = instructors.id
>  AND classes.id = instructors.id

I couldn't get that to work -- PostgreSQL isn't that helpful, just reporting "ERROR: syntax error at or near ";" at character 496", even after adding a FROM in the first select. So I'm stabbing in the dark to get it to work.

> [Disclaimer: I've not tested this code at all. It could help if you sent
> table definitions and maybe even dummy data via insert commands.]

Ok -- this should be cut-n-paste:

CREATE TABLE class (
    id          integer PRIMARY KEY,
    class_time  timestamp(0) with time zone,
    name        text
);

CREATE TABLE person (
    id          integer PRIMARY KEY,
    first_name  text
);

create table instructors (
    person      integer NOT NULL REFERENCES person,
    class       integer NOT NULL REFERENCES class,
    PRIMARY KEY (person, class)
);

INSERT INTO person (id,first_name) values (1,'Joe');
INSERT INTO person (id,first_name) values (2,'Mary');
INSERT INTO person (id,first_name) values (3,'Bob');
INSERT INTO person (id,first_name) values (4,'Cindy');

INSERT INTO class (id,name, class_time) values (1,'Math', now());
INSERT INTO class (id,name, class_time) values (2,'Math', now() + interval '1 day');
INSERT INTO class (id,name, class_time) values (3,'Science', now());
INSERT INTO class (id,name, class_time) values (4,'PE', now() + interval '1 day');

INSERT INTO instructors (person, class) values (1,1); -- joe teaches math now
INSERT INTO instructors (person, class) values (1,2); -- joe teaches math tomorrow
INSERT INTO instructors (person, class) values (2,2); -- with Mary
INSERT INTO instructors (person, class) values (3,3); -- Bob teaches science now
INSERT INTO instructors (person, class) values (4,3); -- Cindy teaches science tomorrow

-- view
CREATE VIEW instructor_counts
AS
    SELECT person.id AS person_id,
           first_name,
           count(instructors.class) AS class_count
      FROM class, instructors, person
     WHERE class.id = instructors.class
       AND person.id = instructors.person
       -- AND class_time > now()
  GROUP BY person_id, first_name;

select * from instructor_counts order by class_count desc;

-- Returns:
 person_id | first_name | class_count
-----------+------------+-------------
         1 | Joe        |           2
         2 | Mary       |           1
         3 | Bob        |           1
         4 | Cindy      |           1
(4 rows)

My GOAL above is to be able to add a WHERE class_time > $some_time.

Here's where I left off, which I never could get to work. The individual selects work, but it seems like I need to say c.class_id = i.class in addition. But I can't even get this without syntax errors:

CREATE VIEW instructor_counts
AS
SELECT * FROM
    (SELECT person.id AS person_id, first_name FROM person) p
    INNER JOIN
    (SELECT class.id AS class_id, class_time FROM class) c
    INNER JOIN
    (SELECT person, count(class) AS class_count FROM instructors GROUP BY person) i
    ON ( p.person_id = i.person);

That also looks like the selects are going to be full table scans.

--
Bill Moseley
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: explain analyze is your friend
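One pattern that sidesteps the aggregate-in-a-view problem, sketched against the tables above (view name invented, untested on 7.4): keep the view un-aggregated and expose class_time, then filter and count in the outer query.

CREATE VIEW instructor_classes AS
    SELECT p.id   AS person_id,
           p.first_name,
           c.id   AS class_id,
           c.class_time
      FROM person p
      JOIN instructors i ON i.person = p.id
      JOIN class c       ON c.id = i.class;

-- Filter on any time you like at query time, then aggregate:
SELECT person_id, first_name, count(*) AS class_count
  FROM instructor_classes
 WHERE class_time > now()
 GROUP BY person_id, first_name;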
Re: [GENERAL] pg_ctl reload breaks our client
Michael,

It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The breakage was that one group of connections simply stopped.  Others continued without problem.  It is not clear exactly what was going on.  Nothing in our application logs gives us any clue to this.

As for reproducibility, it has happened before in test environments when we have bounced the database.  This is not too shocking as I would expect the client to notice this :-)  It is a little more shocking when it's a reload.  Or maybe I have simply misunderstood what reload does.

I am simply looking for clues here and don't expect definitive answers.  That's why I was a little vague.  Am I right, though, in thinking that a reload should be pretty much invisible to the client, or will certain operations fail and require a retry?

__
Marc

On Fri, 2005-09-16 at 14:40 -0600, Michael Fuhr wrote:
> On Fri, Sep 16, 2005 at 01:28:13PM -0700, Marc Munro wrote:
> > Yesterday a client application broke immediately after we issued a pg_ctl reload command.
>
> How did the client break?  What behavior did it exhibit?  Were there any errors in the server's logs?  Can you duplicate the problem?  What version of PostgreSQL are you using, and on what platform?
Re: [GENERAL] Question about a query plan
On Fri, Sep 16, 2005 at 10:18:19AM -0700, Bill Moseley wrote:
> On Fri, Sep 16, 2005 at 10:02:28AM -0500, Thomas O'Connell wrote:
> > For older versions of postgres, it's recommended that you determine the time in the client and use constant data in your query.
>
> Interesting.  I have a few VIEWs that include now(), but I guess I could adjust and pass in the date from the client.

I sometimes use the constant 'now' instead, since it obviously isn't mutable.  It's a bit tricky because sometimes it expands when you define the statement.  You usually have to write 'now'::text, and even then you should check...

--
Martijn van Oosterhout   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
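A rough illustration of the expansion gotcha described above, borrowing the class table from the other thread as a stand-in.  This is an untested sketch; as noted above, the behaviour is worth verifying on your own server version.

-- 'now' coerced straight to a timestamp type becomes a constant when the
-- view is defined, so the cutoff is frozen at CREATE VIEW time:
CREATE VIEW frozen_cutoff AS
    SELECT * FROM class WHERE class_time > 'now'::timestamptz;

-- keeping the literal as text postpones the conversion until the query
-- actually runs, which is the 'now'::text trick mentioned above:
CREATE VIEW moving_cutoff AS
    SELECT * FROM class WHERE class_time > 'now'::text::timestamptz;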
Re: [GENERAL] pg_ctl reload breaks our client
On Fri, Sep 16, 2005 at 02:16:29PM -0700, Marc Munro wrote:
> It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The breakage was that one group of connections simply stopped.  Others continued without problem.  It is not clear exactly what was going on.

How did the connections "stop"?  Were the connections broken, causing queries to fail?  Or did queries block and never return?  Or something else?  What was happening that shouldn't happen, or what wasn't happening that should happen?

If the connections were still active but not returning, did you do a process trace on the connection's postmaster or attach a debugger to it to see what it was doing?

Could the timing of the problem have been coincidence?  Have you ever seen the problem without a reload?  How often do you see the problem after a reload?  Do you know for certain that the application was working immediately before the reload and not working immediately after it?

What operating system are you using?

> Nothing in our application logs gives us any clue to this.

What about the postmaster logs?

> As for reproducibility, it has happened before in test environments when we have bounced the database.  This is not too shocking as I would expect the client to notice this :-)  It is a little more shocking when it's a reload.  Or maybe I have simply misunderstood what reload does.

Can you reproduce the problem with a reload?  A stop and start will terminate client connections, but a reload shouldn't.

--
Michael Fuhr

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
Re: [GENERAL] pg_ctl reload breaks our client
Michael,

Many thanks for your response; it is much appreciated.  My responses are embedded below.

On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
> On Fri, Sep 16, 2005 at 02:16:29PM -0700, Marc Munro wrote:
> > It is Postgres 7.3.6.  The client is a multi-threaded C++ client.  The breakage was that one group of connections simply stopped.  Others continued without problem.  It is not clear exactly what was going on.
>
> How did the connections "stop"?  Were the connections broken, causing queries to fail?  Or did queries block and never return?  Or something else?  What was happening that shouldn't happen, or what wasn't happening that should happen?

From the server side, there were simply connections (1 or 2) that appeared idle.  From the client side it looked like a query had been initiated but the client thread was stuck in a library call (as near as we can tell).  This, vague though it is, is as much as I know right now.  We were unable to do much debugging as it is a production system and the priority was to get it back up.

> If the connections were still active but not returning, did you do a process trace on the connection's postmaster or attach a debugger to it to see what it was doing?

No, time pressure prevented this.

> Could the timing of the problem have been coincidence?  Have you ever seen the problem without a reload?  How often do you see the problem after a reload?  Do you know for certain that the application was working immediately before the reload and not working immediately after it?

It *could* be coincidence, but the problem began within 5 seconds of the reload.  Coincidence is unlikely.

> What operating system are you using?

Linux 2.4.20 smp i686

> > Nothing in our application logs gives us any clue to this.
>
> What about the postmaster logs?

Ah, now there's another story.  Unavailable, I'm afraid.  Resolving that is also on my priority list.

> > As for reproducibility, it has happened before in test environments when we have bounced the database.  This is not too shocking as I would expect the client to notice this :-)  It is a little more shocking when it's a reload.  Or maybe I have simply misunderstood what reload does.
>
> Can you reproduce the problem with a reload?  A stop and start will terminate client connections, but a reload shouldn't.

This is not currently seen as a priority (the work-around of "don't do that" is seen as sufficient).  I'm simply hoping to get someone to say for sure that the client app should not be able to tell that a reload has happened.  At that point I may be able to raise the priority of this issue.

I would certainly like to do more investigation.  If the postgresql hackers are interested in this strange event (please tell me for sure that it *is* strange), that may also help me to get the necessary resources to run more tests.

Thanks again.

__
Marc Munro
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
>> > I have a view to generate a list of instructors and a count of their future classes.
>> > select * from instructor_counts where class_time > now();
>> >
>> > But class_time is not part of the VIEW so that's not valid.
>>
>> No problem, just make it a part of the view.

"No problem", I said ... famous last words.

> select * from instructor_counts order by class_count desc;
>
> -- Returns:
>
>  person_id | first_name | class_count
> -----------+------------+-------------
>          1 | Joe        |           2
>          2 | Mary       |           1
>          3 | Bob        |           1
>          4 | Cindy      |           1
> (4 rows)
>
> My GOAL above is to be able to add a WHERE class_time > $some_time.

Armed with your table ddl and sample data I can see the problem more clearly.

Unfortunately class_time cannot be a column in the view output.  For example, look at the "Joe" line above ... if he teaches two classes, which "class_time" would it show?

Since class_time can't be a column in the view output it can't be used in a WHERE clause.

So it would appear to me that you won't be able to meet your goal by simply using a view.

However, there are other solutions you can use which may or may not be appropriate.  I can think of three at the moment.

#1: Function Solution:
======================

To use functions you may first need to run this at the command line:

createlang plpgsql electric

Then create a function that you can pass in a date to:

CREATE FUNCTION getclasscount("timestamp") RETURNS "refcursor" AS '
DECLARE
    curs refcursor;
BEGIN
    OPEN curs FOR
        SELECT * FROM
            (SELECT person, count(class) AS class_count
               FROM instructors
              INNER JOIN class ON class.id = instructors.class
              WHERE class.class_time > $1
              GROUP BY person
            ) classcount
            INNER JOIN
            (SELECT person.id AS person_id, first_name
               FROM person
            ) personinfo
            ON personinfo.person_id = classcount.person;

    RETURN curs;
END;
' LANGUAGE 'plpgsql';

This would be the best solution if you are in control of the application source code.  In Java, for example, it's relatively simple to call this function and return the result as a result set.  If you're working in Java I'd be glad to show you some sample code.

If you really do need a view for some reason, then this wouldn't work.

#2: Simplify the Select Criteria Solution:
==========================================

A slightly less flexible approach, but one that may be workable, would be to add two boolean columns (with default value of '0') to the class table: "completed" and "current".  Then once a semester you run a simple query that updates them.  Something like:

UPDATE class SET current = '1' where class_time = '2005-09-01';
UPDATE class SET completed = '1' where class_time < '2005-09-01';

Then the view would be:

CREATE VIEW vclasscount AS
SELECT * FROM
    (SELECT person, count(class) AS class_count
       FROM instructors
      INNER JOIN
            (SELECT id FROM class
              WHERE class.completed = '0' AND class.current = '0') futureclasses
         ON futureclasses.id = instructors.class
      GROUP BY person
    ) classcount
    INNER JOIN
    (SELECT person.id AS person_id, first_name
       FROM person
    ) personinfo
    ON personinfo.person_id = classcount.person;

This would be a nice simple solution, but may not be flexible enough in that you can't specify an ad-hoc date or date range ... you would only see the future class count.

#3: Use a Temp Table
====================

Again, if you have control of application logic, you could:

1) SELECT * INTO futureclasses FROM class WHERE class_time > ?

2) Then make the view against futureclasses rather than class.

Good luck!

-Roger

> Bill Moseley

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
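Since solution #1 returns a refcursor, the caller has to fetch from it inside the same transaction.  A rough, untested sketch of what that looks like from psql; the portal name is generated by the server, so substitute whatever name the SELECT actually returns:

BEGIN;
SELECT getclasscount('2005-09-16 00:00');
-- the call returns a cursor name such as "<unnamed portal 1>"; fetch from it:
FETCH ALL IN "<unnamed portal 1>";
COMMIT;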
[GENERAL] Divide a float4 by 1 - what is going on???????
Hi people,

can anyone explain what is going on here:

CIMSOFT=# CREATE TEMP TABLE test (n1 FLOAT4);
CREATE TABLE
CIMSOFT=# INSERT INTO test (n1) VALUES (2.456677);
INSERT 6571521 1
CIMSOFT=# SELECT * FROM test;
   n1
---------
 2.45668
(1 row)

CIMSOFT=# SELECT n1/1 FROM test;
     ?column?
------------------
 2.45667695999146
(1 row)

Why do I get so many digits from a division by one?  Anybody should have learned that everything / 1 = everything ;-)

Thnx,
Daniel

PS:

PostgreSQL 8.0.0 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
Re: [GENERAL] pg_ctl reload breaks our client
On Fri, Sep 16, 2005 at 04:34:46PM -0700, Marc Munro wrote:
> On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
> > Can you reproduce the problem with a reload?  A stop and start will terminate client connections, but a reload shouldn't.
>
> This is not currently seen as a priority (the work-around of "don't do that" is seen as sufficient).  I'm simply hoping to get someone to say for sure that the client app should not be able to tell that a reload has happened.  At that point I may be able to raise the priority of this issue.

As far as I know clients shouldn't notice a reload (which is effected via a SIGHUP); I just did some tests and didn't see any problems.  However, I don't know much about the inner workings of PostgreSQL so I can't say for sure.  Maybe one of the developers will comment.

--
Michael Fuhr

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Divide a float4 by 1 - what is going on???????
Float provides 6-7 digits of precision.  I see nothing surprising below.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:pgsql-general-
> [EMAIL PROTECTED] On Behalf Of Daniel Schuchardt
> Sent: Friday, September 16, 2005 5:13 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Divide a float4 by 1 - what is going on???
>
> Hi people,
>
> can anyone explain what is going on here:
>
> CIMSOFT=# CREATE TEMP TABLE test (n1 FLOAT4);
> CREATE TABLE
> CIMSOFT=# INSERT INTO test (n1) VALUES (2.456677);
> INSERT 6571521 1
> CIMSOFT=# SELECT * FROM test;
>    n1
> ---------
>  2.45668
> (1 row)
>
> CIMSOFT=# SELECT n1/1 FROM test;
>      ?column?
> ------------------
>  2.45667695999146
> (1 row)
>
> Why do I get so many digits from a division by one?  Anybody should have learned that everything / 1 = everything ;-)
>
> Thnx,
> Daniel
>
> PS:
>
> PostgreSQL 8.0.0 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Divide a float4 by 1 - what is going on???????
On Sat, Sep 17, 2005 at 02:12:45AM +0200, Daniel Schuchardt wrote:
> CIMSOFT=# SELECT n1/1 FROM test;
>      ?column?
> ------------------
>  2.45667695999146
> (1 row)
>
> Why do I get so many digits from a division by one?  Anybody should have learned that everything / 1 = everything ;-)

Looks like the division is being done in double precision (float8) and you're seeing the effects of an inexact representation.

test=> SELECT 2.456677::real / 1;
     ?column?
------------------
 2.45667695999146
(1 row)

test=> SELECT 2.456677::real / 1::real;
 ?column?
----------
 2.45668
(1 row)

--
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
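If the extra digits are a nuisance, an untested sketch of two ways around them, reusing the temp table from the original post: keep both operands in the same float type, or switch to numeric, which does exact decimal arithmetic instead of a binary approximation.

SELECT n1 / 1::real FROM test;      -- stays in float4, so it should print 2.45668 as above
SELECT n1::numeric / 1 FROM test;   -- exact decimal arithmetic from here on, no binary rounding noise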
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
On Fri, Sep 16, 2005 at 04:56:25PM -0700, Roger Hand wrote:
> So it would appear to me that you won't be able to meet your goal by simply using a view.
>
> However, there are other solutions you can use which may or may not be appropriate.  I can think of three at the moment.
>
> #1: Function Solution:

I'll take a look at this.  It's about time I started to learn about functions a bit.

> This would be the best solution if you are in control of the application source code.  In Java, for example, it's relatively simple to call this function and return the result as a result set.  If you're working in Java I'd be glad to show you some sample code.

I'm using Perl and DBI (really Class::DBI but DBI is not far away).

> #2: Simplify the Select Criteria Solution:
> ==========================================
>
> A slightly less flexible approach, but one that may be workable, would be to add two boolean columns (with default value of '0') to the class table: "completed" and "current".  Then once a semester you run a simple query that updates them.  Something like:
>
> UPDATE class SET current = '1' where class_time = '2005-09-01';
> UPDATE class SET completed = '1' where class_time < '2005-09-01';

Classes start daily (and at different hours).  I could cron once an hour I suppose, but I'd rather not de-normalize the data.

Maybe I can just create three views (future, recent, old) and live with that.  The temporary table is another possibility I'll look into.

Thank you very much for spending time on this.  I really appreciate it.

--
Bill Moseley
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
Re: [GENERAL] pg_ctl reload breaks our client
Marc,

> Yesterday a client application broke immediately after we issued a pg_ctl reload command.  The only change we had made was to pg_hba.conf to enable trusted connections from localhost.

Can you change pg_hba.conf back to what it had been before, do a reload again, and check whether the clients start working?  I've gotten confused and shot myself in the foot when setting pg_hba.conf a few times, myself.

brew

==========================
Strange Brew ([EMAIL PROTECTED])
Check out my Stock Option Covered Call website  http://www.callpix.com
and my Musician's Online Database Exchange  http://www.TheMode.com
==========================

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Question about a query plan
"Thomas O'Connell" <[EMAIL PROTECTED]> writes: > On Sep 16, 2005, at 9:18 AM, Bill Moseley wrote: >> First question is why the planner is not using an index scan when I >> use "now()" or CURRENT_TIMESTAMP? > It also used to be the case (pre-8.0; I couldn't find in the release > notes whether this was an 8.0 or 8.1 fix) that now() and > CURRENT_TIMESTAMP were not indexable, I think because of mutability. 8.0 fix. It's not that they're "not indexable" per se, it's that pre-8.0 planners punted when it came to estimating what fraction of rows would meet a condition like "timestampcol > now()" --- and the default estimate for such things doesn't favor an indexscan. The 8.0 change is to go ahead and run the function and see what value it's returning now (pardon the pun) and then compare that to the column's statistical histogram to derive a rows estimate. It's entirely likely that we'll get ragged on eventually because of cases where this procedure generates bad estimates ... but for the moment it seems like a win. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
Bill Moseley <[EMAIL PROTECTED]> writes:
> I couldn't get that to work -- PostgreSQL isn't that helpful here, just reporting "ERROR: syntax error at or near ";" at character 496" even after adding a FROM in the first select.  So I'm stabbing in the dark to get it to work.

[ not directly on topic, but... ]  I think the answer to that is "use a newer version of Postgres".  8.0 and up provide fairly specific pointers for syntax errors.  Silly example:

regression=# select foo, bar
regression-# , baz,
regression-# zip zap zot
regression-# blah blah;
ERROR:  syntax error at or near "zap" at character 28
LINE 3: zip zap zot
            ^
regression=#

        regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] pg_ctl reload breaks our client
Michael Fuhr <[EMAIL PROTECTED]> writes:
> On Fri, Sep 16, 2005 at 04:34:46PM -0700, Marc Munro wrote:
>> On Fri, 2005-09-16 at 17:10 -0600, Michael Fuhr wrote:
>> This is not currently seen as a priority (the work-around of "don't do that" is seen as sufficient).  I'm simply hoping to get someone to say for sure that the client app should not be able to tell that a reload has happened.  At that point I may be able to raise the priority of this issue.

> As far as I know clients shouldn't notice a reload (which is effected via a SIGHUP); I just did some tests and didn't see any problems.

Existing client connections should not be able to notice a reload that changes pg_hba.conf or pg_ident.conf; however they definitely *should* notice a reload that changes postgresql.conf (at least for parameters that aren't overridden by other cases, such as a SET in the current session).  So the blanket statement Marc is making is simply wrong.

Whether there is a bug here is impossible to say given the limited amount of information provided.  I'd not expect a reload to cause an existing connection to become totally dysfunctional, which is what Marc seems to be claiming ... but without more evidence or a test case, there's not much to be done.

        regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
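One way to see which side of that line a given parameter falls on is to watch it from an open session across a reload.  A small sketch, using enable_seqscan only because it is an old parameter that can be set per session; any session-settable parameter would do:

-- in an existing session, before the reload:
SHOW enable_seqscan;        -- shows the value from postgresql.conf (or the built-in default)
SET enable_seqscan = off;   -- session-level override

-- edit postgresql.conf and run "pg_ctl reload", then, in the same session:
SHOW enable_seqscan;        -- still off: the session SET overrides the reloaded file value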
Re: [GENERAL] Duplicate Values or Not?!
John Seberg <[EMAIL PROTECTED]> writes:
> I recently tried to CREATE a UNIQUE INDEX and could not, due to duplicate values:
>
> CREATE UNIQUE INDEX usr_login ON usr (login);
>
> To try to find the offending row(s), I then executed the following:
>
> SELECT count(*), login FROM usr GROUP BY login ORDER BY 1 DESC;
>
> The GROUP BY didn't group anything, indicating to me that there were no duplicate values.  There were the same number of rows in this query as a simple SELECT count(*) FROM usr.
>
> This tells me that Postgresql is not using the same method for determining duplicates when GROUPING and INDEXing.

You might try running the GROUP BY query after doing:

set enable_hashagg = false;
select ...

With that false it would have to sort the results, which should be exactly the same code as the index is using.  I think.

That doesn't really answer the rest of your questions.  The short of it is that setting the encoding doesn't magically make your data encoded in that encoding.  If your client sends it one encoding but claims it's unicode then Postgres will happily store it in a UNICODE database and it'll be garbage.  Maybe someone else will have more specific advice on that front.

--
greg

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
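To make that suggestion concrete, a rough sketch of the sort-based duplicate check, reusing the usr table from the original post; the HAVING clause narrows the output to logins that really do repeat:

SET enable_hashagg = false;

SELECT login, count(*)
  FROM usr
 GROUP BY login
HAVING count(*) > 1
 ORDER BY count(*) DESC;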