Re: [GENERAL] JDBC in PostgreSql for Linux
You've got your url wrong - it should be "://" after postgresql instead of "@". I hope it helps...

Dima

Kallol Nandi wrote:

Hi,

This is the code that I am using for the native JDBC driver to connect to PostgreSql on Linux. BTW the version of Postgres is 7.2.2 and the jar file is jdbc7.1-1.2.jar.

  import java.io.*;
  import java.sql.*;
  import java.util.Properties;
  import java.util.Vector;
  import org.postgresql.Driver;

  public class JDBCConnection {
      public static void main(String args[]) {
          System.out.println("Testing");
          try {
              String server = "10.100.102.31";
              String port = "5432";
              String database = "notes";
              String url = "jdbc:postgresql@" + server + ":" + port + ":" + database;
              String userid = "postgres";
              String password = "postgres";
              Class.forName("org.postgresql.Driver");
              System.out.println("Successful 1 ! ");
              Connection con = DriverManager.getConnection(url, userid, password);
              Statement stmt = con.createStatement();
              ResultSet rset = stmt.executeQuery("Select count(*) as c from testtable");
              rset.next();
              int currval = rset.getInt("c");
              System.out.println("Successful 2 ! ");
              System.out.println(currval);
              con.close();
          } catch (Exception ex) {
              System.out.println("Error problem!");
              System.out.println(ex.getMessage());
              ex.printStackTrace();
              return;
          }
      }
  }

I am getting the error mentioned in my earlier mail in the following line:

  Connection con = DriverManager.getConnection(url, userid, password);

The error is:

  Driver not found for URL: jdbc:[EMAIL PROTECTED]:5432:notes
  java.sql.SQLException: Driver not found for URL: jdbc:[EMAIL PROTECTED]:5432:notes
     at 0x4028115f: java.lang.Throwable.Throwable(java.lang.String) (/usr/lib/libgcj.so.3)
     at 0x402740d2: java.lang.Exception.Exception(java.lang.String) (/usr/lib/libgcj.so.3)
     at 0x40316294: java.sql.SQLException.SQLException(java.lang.String, java.lang.String, int) (/usr/lib/libgcj.so.3)
     at 0x40316244: java.sql.SQLException.SQLException(java.lang.String) (/usr/lib/libgcj.so.3)
     at 0x40316102: java.sql.DriverManager.getConnection(java.lang.String, java.util.Properties) (/usr/lib/libgcj.so.3)
     at 0x4031603a: java.sql.DriverManager.getConnection(java.lang.String, java.lang.String, java.lang.String) (/usr/lib/libgcj.so.3)
     at 0x4039d347: ffi_call_SYSV (/usr/lib/libgcj.so.3)
     at 0x4039d307: ffi_raw_call (/usr/lib/libgcj.so.3)
     at 0x40248528: _Jv_InterpMethod.continue1(_Jv_InterpMethodInvocation) (/usr/lib/libgcj.so.3)
     at 0x40248e34: _Jv_InterpMethod.run(ffi_cif, void, ffi_raw, _Jv_InterpMethodInvocation) (/usr/lib/libgcj.so.3)
     at 0x40246424: _Jv_InterpMethod.run_normal(ffi_cif, void, ffi_raw, void) (/usr/lib/libgcj.so.3)
     at 0x4039d1bc: ?? (??:0)
     at 0x4025b308: gnu.gcj.runtime.FirstThread.call_main() (/usr/lib/libgcj.so.3)
     at 0x402c60b1: gnu.gcj.runtime.FirstThread.run() (/usr/lib/libgcj.so.3)
     at 0x40267fdc: _Jv_ThreadRun(java.lang.Thread) (/usr/lib/libgcj.so.3)
     at 0x4023478c: _Jv_RunMain(java.lang.Class, byte const, int, byte const, boolean) (/usr/lib/libgcj.so.3)
     at 0x08048900: ?? (??:0)
     at 0x420158d4: ?? (??:0)
     at 0x080486c1: ?? (??:0)

Thanks and Regards,
Kallol.

Thomas Kellerer wrote (Friday, June 20, 2003):

Kallol Nandi wrote:

I am running a Java application on Linux which connects to the Postgresql on Linux using the JDBC-ODBC bridge. But this is the error I am getting:

I have also tried running a Java application using the Native JDBC Driver.
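For what it's worth, the URL format the PostgreSQL JDBC driver expects is jdbc:postgresql://host:port/database - note both the "://" and the "/" (not ":") before the database name. So a corrected line for the code above would presumably look like:

  String url = "jdbc:postgresql://" + server + ":" + port + "/" + database;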
Re: [GENERAL] Backwards index scan
If you make an opclass that orders in the reverse order, you can use that opclass in creating the index (which effectively can give you an index like (x, y desc) by using the new opclass on y). There was some talk recently about whether we should provide such opclasses as builtins or contrib items.

Ah! Nice :-) I did not think about it... Thanks a lot for the hint!

Dima
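For the int columns in question, such a reverse-ordering opclass might look like this (a minimal sketch, assuming the CREATE OPERATOR CLASS syntax introduced in 7.3; the function and opclass names are invented, and btint4cmp is the stock int4 btree comparison function):

  create function int4_rev_cmp(int4, int4) returns int4
      as 'select -btint4cmp($1, $2)' language sql immutable;

  -- fill the btree strategy slots with the *opposite* operators, so the
  -- index treats "greater" as "less" and therefore sorts descending
  create operator class int4_desc_ops for type int4 using btree as
      operator 1 > ,
      operator 2 >= ,
      operator 3 = ,
      operator 4 <= ,
      operator 5 < ,
      function 1 int4_rev_cmp(int4, int4);

  -- which gives the effect of an (x, y desc) index:
  create index huge_table_idx on huge_table (x, y int4_desc_ops);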
[GENERAL] Backwards index scan
I am not sure if this is really a bug, but it certainly looks like one to me... I have a table that looks something like this:

  create table huge_table (x int, y int);
  create index huge_table_idx on huge_table (x, y);

It contains about 80 million rows... I am trying to get those rows that have a particular value for x, ordered by y in descending order (and I am looking to get just the first few), so I am running a query like:

  declare mycursor cursor for select * from huge_table where x=10 order by x desc, y desc;
  fetch 10 from mycursor;

This query takes 10 to 20 minutes! This is because there are *lots* (a few million) of matches for x=10, and _bt_first() scans through them *all* sequentially to get to the last one. Now, if I change the query to look like:

  declare mycursor cursor for select * from huge_table where x > 9 and x < 11 order by x desc, y desc;

(which is the same thing), then fetch 10 from mycursor; returns right away (under a second), just as I expected.

I understand that with the generic approach to operators in postgres it is probably not very feasible to try and teach _bt_first() to handle this situation automatically (it would need to know how to get the next/previous value for every indexable type)... I guess that could be done by adding another kind of strategy to pg_amop, for example... Another way to work around this would be to allow an ordering spec to be part of CREATE INDEX (I know that Informix does that, for example) - so that I could do

  create index huge_table_idx on huge_table (x, y desc);

... and then

  select * from huge_table where x=10 order by x, y desc;

would not require a backwards scan to begin with. Can something like this be done? What do you think?

Thanks!

Dima
Re: [GENERAL] Postgresql "FIFO" Tables, How-To ?
Ouch, this means that for every insert we would have to trigger a procedure which will:

  COUNT
  IF > Limit DELETE OLDEST

This would be pretty damn resource-intensive on a table with millions of records, wouldn't it?

You can keep the count in a table on the side, and have it updated by the same trigger (after insert or delete)...

Dima
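A minimal sketch of that side-table idea (the table, function and trigger names are all invented, the log table is assumed to be called "log", and the 7.3-style "returns trigger" is used - older releases spell it "returns opaque"):

  create table log_count (n bigint not null);
  insert into log_count values (0);

  create function log_count_trig() returns trigger as '
  begin
      if tg_op = ''INSERT'' then
          update log_count set n = n + 1;
      else  -- DELETE
          update log_count set n = n - 1;
      end if;
      return null;  -- the return value is ignored for AFTER triggers
  end;
  ' language 'plpgsql';

  create trigger log_count_t after insert or delete on log
      for each row execute procedure log_count_trig();

Checking the limit then becomes a one-row read of log_count instead of a count(*) over millions of rows; the price is that every writer serializes on that single row.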
Re: [GENERAL] Postgresql "FIFO" Tables, How-To ?
Sean Chittenden wrote:

store 10mil+ syslog messages - this might not be the right tool. I'm just mentioning it because perhaps the way rrd keeps track of wrap-around might be a good way to implement this in postgres.

Hmm. Using the cycling feature of a sequence, couldn't you create a trigger which either inserts (if, e.g., the value of the trigger is not there) or updates (if the value of the trigger is there)? I'm not sure how to do it efficiently, but I haven't thought about it very much.

I use this very approach.

  CREATE SEQUENCE syslog_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 25 CYCLE;

  CREATE TABLE syslog (
      id  INT NOT NULL,
      msg TEXT NOT NULL
  );

  CREATE UNIQUE INDEX syslog_id_udx ON syslog(id);

  CREATE FUNCTION syslog_ins(TEXT) RETURNS INT EXTERNAL SECURITY DEFINER AS '
  DECLARE
      a_msg ALIAS FOR $1;
      v_id  syslog.id%TYPE;
  BEGIN
      v_id := NEXTVAL(''syslog_id_seq''::TEXT);
      PERFORM TRUE FROM syslog WHERE id = v_id;
      IF FOUND THEN
          UPDATE syslog SET msg = a_msg WHERE id = v_id;
      ELSE
          INSERT INTO syslog (id, msg) VALUES (v_id, a_msg);
      END IF;
      RETURN v_id;
  END;
  ' LANGUAGE 'plpgsql';

I believe you can save one query by replacing the 'if exists then update else insert' part with just 'delete unconditionally, then insert'.

Though this is the inefficient way of doing this. If you wanted to be really slick about it and incur some upfront disk space, populate the table with your 25 rows of bogus data, empty strings, then use the following instead to save yourself a SELECT (which is only of use for the first 25 syslog msgs - then it becomes a given after the sequence wraps):

  CREATE FUNCTION syslog_ins(TEXT) RETURNS INT EXTERNAL SECURITY DEFINER AS '
  DECLARE
      a_msg ALIAS FOR $1;
      v_id  syslog.id%TYPE;
  BEGIN
      v_id := NEXTVAL(''syslog_id_seq''::TEXT);
      UPDATE syslog SET msg = a_msg WHERE id = v_id;
      RETURN v_id;
  END;
  ' LANGUAGE 'plpgsql';

... or you could have another sequence (with no limit, and no cycle) to count the number of inserts - you'd then increment both in the trigger, and, if the insert count is greater than the limit, you'd update, else insert.

... or you could do it with the single sequence still, if you get rid of the limit and cycle, and just do: if nextval >= limit then update ... where id = nextval % limit, else insert.

Dima
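The body of that last single-sequence variant might look like this (a sketch only - the sequence name is invented, the 25-slot limit mirrors the example above, and the modulo is shifted by one so that the computed id stays in 1..25 when the counter passes a multiple of 25):

  v_seq := NEXTVAL(''syslog_seq''::TEXT);  -- plain sequence: no MAXVALUE, no CYCLE
  IF v_seq > 25 THEN
      -- the table is already full: overwrite the oldest slot
      UPDATE syslog SET msg = a_msg WHERE id = ((v_seq - 1) % 25) + 1;
  ELSE
      INSERT INTO syslog (id, msg) VALUES (v_seq, a_msg);
  END IF;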
Re: [GENERAL] 2 connections 1 transaction
The short answer is - there is no way you can do it. Different connections in postgres (and in every other DB engine I've heard of) can never share the same transaction.

As far as I can see, the only way to do what you want is to rethink your architecture so that the clients never talk directly to the database, and always go through the app server... or, the other way around - get rid of the app server, and just have every client go to the database directly. You can't have it both ways.

Dima

Daniel Schuchardt wrote:

Hi @ all,

Our software consists of a client-side app and an application server. Every client owns a direct connection to the PSql server, and for every client the application server also creates a connection to the PSql server. The problem is that it is necessary that the client and the application server are in the same transaction. But how do I tell connection x to be in the same transaction as connection y?

Thanks for help,

Daniel
Re: [GENERAL] FATAL 2: open of /var/lib/pgsql/data/pg_clog/0EE3
Tom Lane wrote:

Proves nothing, since ANALYZE only touches a random sample of the rows.

Ok, I understand... Thanks.

If you get that behavior with VACUUM, or a full-table SELECT (say, "SELECT count(*) FROM foo"), then it'd be interesting.

I never got it with select - only with vacuum and/or analyze... Are you suggesting it should/could happen with select, or was that just meant to be an example of a full table scan? I just did select count(*) from that table, and it worked...

What range of file names do you actually have in pg_clog/, anyway?

Well ... *today* there seem to be files between and 00EC. Is that range supposed to stay the same or does it vary? ... because the problem I had happened yesterday, and I have restarted the server since then...

Thanks!

Dima
Re: [GENERAL] FATAL 2: open of /var/lib/pgsql/data/pg_clog/0EE3
Tom Lane wrote:

Dmitry Tkach <[EMAIL PROTECTED]> writes:

Well ... *today* there seem to be files between and 00EC. Is that range supposed to stay the same or does it vary?

It will vary, but not quickly --- each file represents 1 million transactions. If the problem is erratic with VACUUM or SELECT COUNT(*), then the only speculation I have is flaky hardware: you must be reading different xids from the table at different times.

Oops... I just got it again - right after doing that select count(*), that *worked*, I tried to analyze the same table, and got:

  FATAL 2: open of /var/lib/pgsql/data/pg_clog/0980 failed: No such file or directory

I tried it again, right away, and got the same error, complaining about 02B0 this time. I tried select count(*) again, and it worked. I restarted the server, did analyze, and it worked too...

Any ideas?
Re: [GENERAL] Limited varchar, unlimited varchar, or text?
After looking at the docs on the character datatypes I noticed that if you don't specify a limit on the varchar type it will accept strings of any length. If that's the case, what's the difference between it and text?

Actually, I'd like to know this too :-) I think there is no difference really... But what confuses me is - why are there two completely separate types? Is it just to keep the standards happy? Or is there some hidden difference in the behaviour? For example, there used to be a 'datetime' type in 7.2 that was just an alias for timestamp without time zone - so that:

  create table times (t timestamp without time zone, d datetime);
  \d times
   Column |            Type             | Modifiers
  --------+-----------------------------+-----------
   t      | timestamp without time zone |
   d      | timestamp without time zone |

But if I try the same thing with text and varchar, I get two different types - text and character varying... Could somebody who knows shed some light on this?

Thanks!

Dima
Re: [GENERAL] Limited varchar, unlimited varchar, or text?
Curtis Hawthorne wrote:

Hi,

I'm setting up a table for a new project and have a question about choosing a data type for one of the columns. It will be for a username that is retrieved from an LDAP server. I know that I'll want to use either varchar or text. The problem with using varchar is I don't know for sure how long the username may be, and I don't like just picking a large number for the limit and hoping I don't need to change it in the future.

After looking at the docs on the character datatypes I noticed that if you don't specify a limit on the varchar type it will accept strings of any length. If that's the case, what's the difference between it and text? According to the page there's no performance difference between the types, so I would lean towards using unlimited varchar or text to avoid having an arbitrary limit, but are there any other hidden problems with using these types? If not, which one should I use?

No :-) Just use text. The only difference between text and varchar(10) is that the latter will not let you insert strings longer than 10 characters.

Dima
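A quick way to see that one difference (a throwaway example - the table name is invented, and repeat() just builds a 20-character string):

  create table t (a text, b varchar(10));
  insert into t values (repeat('x', 20), 'short');  -- succeeds: text is unbounded
  insert into t values ('short', repeat('x', 20));  -- fails: value too long for varchar(10)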
Re: [GENERAL] Postgres unique index checking and atomic transactions
Greg Stark wrote:

So I have to adjust a primary key by adding one to every existing record. Obviously this isn't a routine operation, my data model isn't that messed up. It's a one-time manual operation. However when I tried to do the equivalent of:

  update tab set pk = pk + 1

I got

  ERROR: Cannot insert a duplicate key into unique index tab_pkey

Is that right? Obviously after completing the query there would be no duplicate keys. Is this a case where I would need deferred constraints to allow this? Even for immediate constraints, shouldn't a single sql update be able to go ahead as long as it leaves things in a consistent state?

I tend to agree with you that that's how it should be... I don't know what the standards have to say about it though. You cannot have unique constraints deferred either - only FKs - because uniqueness is checked right when you attempt to insert the key into the index, and that cannot wait until the end of the transaction, because then your current transaction would not be able to use that index (it would be nice to be able to postpone the insertion till the end of the statement though - for performance reasons - but that's not the way it works) :-(

The good news though is that, if you drop (or disable) your pk index before the update, and recreate (reindex) it afterwards, your update statement should actually perform better...

Dima
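A sketch of that workaround (the default constraint name tab_pkey is assumed, and the ALTER TABLE forms are per 7.3; on older servers, where the primary key is just a unique index, DROP INDEX tab_pkey would take the place of DROP CONSTRAINT):

  begin;
  alter table tab drop constraint tab_pkey;
  update tab set pk = pk + 1;
  alter table tab add constraint tab_pkey primary key (pk);
  commit;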
Re: [GENERAL] Query analyse
The first query is able to use the index on nr_proponente, because the condition involves that column directly; the second query is not, because the index only contains the values of nr_proponente, not the results of trunc(...). Try replacing that condition with a range test on the raw column - since trunc(pa.nr_proponente/10,0)*10 = op.nr_proponente just means that pa.nr_proponente falls in the width-10 bucket starting at op.nr_proponente, something like:

  pa.nr_proponente >= op.nr_proponente and pa.nr_proponente < op.nr_proponente + 10

I hope it helps...

Dima

Elielson Fontanezi wrote:

Good morning!

First of all, my environment is:

  Linux netlab142.prodam 2.4.8-26mdk #1 Sun Sep 23 17:06:39 CEST 2001 i686 unknown
  pg_ctl (PostgreSQL) 7.2.1

I would like some suggestions on how to speed up a query. Both of the queries below are identical except that one of them uses the *trunc* function. You can see that the trunc function drives the query response time way up in the second query. That shouldn't happen. Only because of a trunc function? What can I do in that case? What is happening?

Sure, there are indexes:

  CREATE INDEX idx_proposta_2 ON proposta USING btree (in_situacao_proposta);
  CREATE INDEX idx_proposta_4 ON proposta USING btree (nr_proponente);

And pa.nr_proponente is the fk and op.nr_proponente is the pk. These are the queries:

1st: That is ok.

  DEBUG: query: select pa.nr_projeto, pa.dc_denom_projeto, pa.nr_proponente, pa.dc_coordenador, op.dc_proponente
  from proposta pa inner join orgao_proponente op on (pa.nr_proponente = op.nr_proponente)
  where pa.in_situacao_proposta <> 'E' ORDER BY 1 DESC;

  DEBUG: QUERY STATISTICS
  ! system usage stats:
  ! 0.015904 elapsed 0.00 user 0.02 system sec
  ! [0.01 user 0.02 sys total]
  ! 0/0 [0/0] filesystem blocks in/out
  ! 143/42 [353/172] page faults/reclaims, 0 [0] swaps
  ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
  ! 0/0 [0/0] voluntary/involuntary context switches
  ! postgres usage stats:
  ! Shared blocks: 88 read, 0 written, buffer hit rate = 89.19 %
  ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
  ! Direct blocks: 0 read, 0 written

2nd: But I need to use the trunc function:

  DEBUG: query: select pa.nr_projeto, pa.dc_denom_projeto, pa.nr_proponente, pa.dc_coordenador, op.dc_proponente
  from proposta pa inner join orgao_proponente op on (trunc(pa.nr_proponente/10,0)*10 = op.nr_proponente)
  where pa.in_situacao_proposta <> 'E' ORDER BY 1 DESC;

  DEBUG: QUERY STATISTICS
  ! system usage stats:
  ! 104.665005 elapsed 10.09 user 0.42 system sec
  ! [10.10 user 0.42 sys total]
  ! 0/0 [0/0] filesystem blocks in/out
  ! 141/50 [352/180] page faults/reclaims, 0 [0] swaps
  ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
  ! 0/0 [0/0] voluntary/involuntary context switches
  ! postgres usage stats:
  ! Shared blocks: 7408 read, 0 written, buffer hit rate = 13.23 %
  ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
  ! Direct blocks: 0 read, 0 written
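The second query rewritten with that range condition (a sketch, assuming op.nr_proponente values really are the multiples of 10 that the trunc() expression produces - this lets the planner use idx_proposta_4 instead of evaluating trunc() for every row):

  select pa.nr_projeto, pa.dc_denom_projeto, pa.nr_proponente, pa.dc_coordenador, op.dc_proponente
  from proposta pa
  inner join orgao_proponente op
     on pa.nr_proponente >= op.nr_proponente
    and pa.nr_proponente <  op.nr_proponente + 10
  where pa.in_situacao_proposta <> 'E'
  order by 1 desc;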
Re: [GENERAL] CREATE TABLE with REFERENCE
kay-uwe.genz wrote:

Hi @ all,

I have a little problem with two tables and FOREIGN KEYs. I've read about this a long time ago, but don't remember where. Well, I hope you can help me. I've created two TABLEs, "countries" and "cities". "countries" has a column "capital" that REFERENCEs "cities". "cities" has a column "country" REFERENCing "countries", where I save the country the city is placed in. And now PG couldn't create the TABLEs, because the referenced table doesn't exist at the time of creation. Is there another method of creating them than running ALTER TABLE on the first table after the second exists?

No. But what's wrong with ALTER TABLE?

Second question. Is there a method of INSERTing INTO both tables VALUES without grouping them in the same transaction?

No (assuming that you are talking about inserting a new country and a capital at the same time, and that the country's capital column cannot be null). But what's wrong with transactions?

Dima
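A sketch of both answers together (column and constraint names are invented - the FK added with ALTER TABLE is made deferrable, so both rows can go in one transaction and the check runs at commit):

  create table countries (
      id      int primary key,
      capital int            -- will reference cities; FK added below
  );

  create table cities (
      id      int primary key,
      country int not null references countries
  );

  alter table countries
      add constraint countries_capital_fk
      foreign key (capital) references cities
      deferrable initially deferred;

  begin;
  insert into countries values (1, 10);  -- city 10 doesn't exist yet: ok, the check is deferred
  insert into cities    values (10, 1);
  commit;                                -- deferred FK verified here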
Re: [GENERAL] CREATE TABLE with REFERENCE
Jonathan Bartlett wrote:

Why not just drop the "references" clause? I mean, the point of having transactions is to guarantee integrity within a transaction; if you're not going to have that, why even bother with the clause?

Quite the opposite - the point is to guarantee the integrity *outside* the transaction. You can set the constraints to be 'deferred', so that the referential integrity only gets verified at the time you commit your transaction - this way you can allow 'temporary' violations of the constraints inside your transactions, while still being guaranteed that all the data that actually gets committed satisfies all of your constraints.

Most of my databases don't even use "references", just because I like the flexibility, and I have multitable keys (keys that can refer to rows from multiple tables).

Not much to brag about :-)

Dima
Re: [GENERAL] Can trigger return value
Deepa K wrote:

Hi,

Thanks for your prompt reply. I think I didn't explain the problem clearly. Actually, the problem is when a client (from an application like java) tries to access the server database which is in the network. How could I solve the problem? Does 'RAISE EXCEPTION' solve the above problem? If so, please explain.

Your application will receive an exception with an error message after it tries to do something that is disallowed by your trigger.

I hope it helps...

Dima
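For example (a sketch - the function, trigger and table names are all invented):

  create function no_delete() returns trigger as '
  begin
      raise exception ''deletes are not allowed on %'', tg_relname;
      return null;  -- never reached
  end;
  ' language 'plpgsql';

  create trigger no_delete_t before delete on important_table
      for each row execute procedure no_delete();

On the Java side this arrives as a java.sql.SQLException carrying that message - which is the closest thing to a "return value" a trigger can hand back to a client.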
Re: [GENERAL] CREATE TABLE with REFERENCE
Jonathan Bartlett wrote:

In the few instances where I go the other way, it's limited to 2 or 3 tables, and I do separate joins combined with a UNION.

If you can combine your queries with a union, your table layouts must be very similar, if not identical. Why not put everything into the same table then, and just have an FK between that table and the notes?

When you delete an object, how do you make sure that the notes that refer to it get deleted too?

I only soft-delete items.

What do you mean by soft-delete? Leaving orphaned notes behind? Bad idea...

When you insert a note, how do you know the object it is referring to exists?

Because it is always added from my note_html code, which only works for working objects.

How can you be sure? What if you get hit by a bus, and another admin, who is not as knowledgeable as you are, goes to the db and runs an insert with plain sql? What if, while one connection runs your 'note_html' code, another one deletes the object you are annotating?

Let's say, theoretically, somehow some data got in there which wasn't attached to anything. First of all, the only way that could happen is if there was some really broken code,

Not necessarily. This can easily happen with concurrent transactions (see above). Also, even if it was indeed only possible because of broken code, you are not saying that your code is bug-free, are you? Even if it was, postgres is not, and your filesystem and OS are not either. If the database crashes in the middle of your insert, you'll end up having inconsistent data.

but second of all, what would the harm be? Obviously it's a bug, but constraints don't prevent bugs totally either.

Constraints don't prevent bugs (nothing does). They *do* prevent data corruption though, and ensure the consistency of your data. If I understand your statement ('what would the harm be') correctly, and you just don't care about your data consistency, then, I guess, you are right - you don't need any constraints... but, in that case, I don't think you need a transactional database to begin with. If all you want from the database is being able to run queries, you are better off running grep on a bunch of text files, or with some lightweight sql tool, like mysql or SQLite - either of those will perform a lot better, because they do not bother with the overhead of having to care about your data consistency and concurrent access.

When you insert a new object, how can you be sure there is no object in another table with the same id?

We all use the same sequence.

Right. What if somebody forgets to use that sequence? What if you load your database from a backup and forget to reinit the sequence?

The common way to do this kind of thing is (depending on the application, and the particular objects' properties) either to merge your five tables into one (possibly adding an object_type column), or to split your notes table into five (one for each object table), and then make the notes reference the appropriate object.

Yes, but the tables have NOTHING to do with each other.

If that was the case, you would not be able to combine them with a union, as you said you do...

I'm not going to merge my Payments table with my Sponsors table. That would just be nuts.

No, it would not. Application logic has nothing to do with your database schema. You need to design the schema to ensure effectiveness and reliability. Then you design your application on top of it, which handles the business logic.
From the database perspective, there is no difference between payments and sponsors, as long as both have the same (or similar) sets of attributes. Interpreting those attributes is not the database's job.

Splitting the notes table would be pointless. Why do it?

Because that would make it possible to use the constraints. Also, if one adopts your earlier point, it can also be argued that it is equally 'nuts' to have notes about Payments stored together with notes about Sponsors. Those notes have just as much to do with each other as the objects they annotate :-) If you insist that Payments must be separate from Sponsors, the same exact argument should be applied to their respective notes.

The way I have it set up now, it takes _1 line of code_ to add note-taking capabilities to my forms.

It would *still* be one line of code with either of the approaches I suggested. Your code doesn't really have to be affected at all (although, I think, it would really benefit from adding an object_type argument to your note_html() function, but even that is not necessary).

Why would I want to abandon that just to clutter up my schema?

You don't want either of those (abandon, or clutter) :-) You want that same one line of code, working against a properly designed and normalized sql schema, that lets you rely on the database to ensure your data consistency and access efficiency. Then, if I want to enhance the note_html int
Re: [GENERAL] CREATE TABLE with REFERENCE
Jonathan Bartlett wrote:

NOTE - after writing all this, I did think of a possible solution, but I'm not sure if PG can handle it. If I made a table called "object" with one column, the object_id, and then had EVERY table inherit from this table, then I could have my constraints set up against this master table.

Uh... yeah, that would be a great way to do that *if* the "inheritance" emulation in sql worked that way... Unfortunately, it does not :-( That's exactly the reason I called it 'half-baked' in one of the earlier messages... When you inherit table B from table A, B will have all the columns A has, but setting up an FK on table C against A won't work, because the FK will check for the key to be present in A *itself*, not in A or any of its inherited children, as one would expect...

What you *could* do, though, if you really wanted, is to kinda emulate that inheritance on your own, with something like:

  create table A (
      id   serial primary key,
      type text not null
  );
  create unique index a_idx on A (id, type);

  create table B1 (
      id    int primary key,
      type  text not null default 'b1',
      stuff text,
      foreign key (id, type) references A (id, type)
          on delete cascade on update cascade
          deferrable initially deferred
  );

  create table B2 (
      id    int primary key,
      type  text not null default 'b2',
      stuff text,
      foreign key (id, type) references A (id, type)
          on delete cascade on update cascade
          deferrable initially deferred
  );

  create table C (
      id int not null references A
          on delete cascade on update cascade
          initially deferred,
      note text
  );

... now you can insert notes for either B1 or B2 into C. This kinda works, but just seems like too much trouble to go through - it would be nice if the 'inheritance' support could do something like that for you automatically, but, since it doesn't, I'd rather stick with the good old 'plain sql' solution - get rid of A, merge B1 and B2 together (just B), and make C reference B.

If you can combine your queries with a union, your table layouts must be very similar if not identical. Why not put everything into the same table then, and just have an FK between that table and the notes?

No, there are _parts_ that are very similar. I don't know where this whole "table-combining" kick came from, but I've usually found that it ends in a disaster.

Then you need to normalize your schema first - extract those "very similar" parts and put them into a single table, and make your notes table reference that one; then create specialized table(s) on the side, containing the columns that differ between the objects, and make them reference your 'master' table too.

What do you mean by soft-delete? Leaving orphaned notes behind? Bad idea...

I have a boolean flag that says "active". I don't ever actually purge data. There are times when it is useful to come back in and look at what's been "deleted". From the user-interface standpoint it has been deleted, but we can still go back in and retrieve records after they are gone.

Well... That's the 'GUI delete'... Sooner or later you will want to do the 'real' delete - either to remove an object that was created by mistake and should not be there at all, or simply to clean up your database and remove stuff that has been sitting there for years, being 'inactive'.

How can you be sure? What if you get hit by a bus, and another admin, who is not as knowledgeable as you are, goes to the db and runs an insert with plain sql?

They should read the documentation. I am sure, they will...
*after* they screw up the database, and begin wondering what's wrong with it :-)

If they want to run an insert with SQL, they should at least be smart about it :) If someone's too stupid to read documentation, they are going to screw up the database no matter what.

Not really... If your database schema is thoughtfully designed, one has to be *really* smart to be able to screw something up.

Also, even if it was indeed only possible because of broken code, you are not saying that your code is bug-free, are you?

If it's not bug-free, having a good database schema isn't going to save me.

No, it is not going to save *you*, but it *is* going to save your *data*.

Even if it was, postgres is not, and your filesystem and OS are not either. If the database crashes in the middle of your insert, you'll end up having inconsistent data.

Transactions will handle that one.

No, they won't, unless you actually use them :-)

If I understand your statement ('what would the harm be') correctly, and you just don't care about your data consistency, then, I guess, you are right - you don't need any constraints... but, in that case, I don't

You are missing the point. There are many things that must be balanced:

  * Ease of programming / speed of development
  * Correctness of code
  * Consistency of data

I can get much further with my system on all three points than I can wi
[GENERAL] Weird query plan
Hi, everybody!

Here is a weird problem I ran into... I have two huge (80 million rows each) tables (a and b), with id as a PK on both of them, and also an FK from b referencing a. When I try to run a query like:

  select * from a, b where a.id >= 7901288 and a.id = b.id limit 1;

the query takes *forever*. If I do

  select * from a, b where b.id >= 7901288 and a.id = b.id limit 1;

then it returns right away. The query plan looks identical in both cases:

  Limit (cost=0.00..12.51 rows=1 width=8)
    -> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
       -> Index Scan using b_pkey on b (cost=0.00..375410773.29 rows=80740598 width=4)
       -> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)

... which makes me think that it decides to use b as the outer table in both cases (which would obviously make it suck in the first one)... :-(

This happens on 7.2.4... I have a 7.3 database with the same schema, but it is not populated with data, so I could not test it on 7.3... I looked at 7.3's query plans though, and they look better to me:

  Limit (cost=0.00..4.97 rows=1 width=8)
    -> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
       -> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
            Index Cond: (id >= 7901288)
       -> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
            Index Cond: (a.id = "outer".id)

in the second case, and

  Limit (cost=0.00..4.97 rows=1 width=8)
    -> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
       -> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
            Index Cond: (id >= 7901288)
       -> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
            Index Cond: ("outer".id = b.id)

in the first case... (looks like it does swap them around as I expected)... Do you know of anything that got fixed between 7.2.4 and 7.3 related to this problem?

I also noticed that changing a,b to b,a in the from clause doesn't affect anything... and (what's even more weird) even using an explicit join doesn't help:

  explain select a.duns from a natural join b dm where a.id >= 7901288 limit 1;
  NOTICE: QUERY PLAN:
  Limit (cost=0.00..12.78 rows=1 width=8)
    -> Nested Loop (cost=0.00..1023061272.15 rows=80049919 width=8)
       -> Index Scan using b_pkey on b (cost=0.00..380070641.01 rows=81786784 width=4)
       -> Index Scan using a_pkey on a (cost=0.00..6.86 rows=1 width=4)

:-(

Any ideas? Thanks a lot!

Dima
Re: [GENERAL] Weird query plan
Hmmm... Indeed. I tried it in 7.2.4 on a couple of empty tables, and it does do the right thing... Also, I have another copy (not an exact copy, but identical schema and similar content... though about twice smaller) of the original database... I tried my query on it, and it works right too. So, there must be something wrong with that particular database, I suppose... Any ideas what I should look at?

Thanks a lot!

Dima

Tom Lane wrote:

Dmitry Tkach <[EMAIL PROTECTED]> writes:

The query plan looks identical in both cases:

  Limit (cost=0.00..12.51 rows=1 width=8)
    -> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
       -> Index Scan using b_pkey on b (cost=0.00..375410773.29 rows=80740598 width=4)
       -> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)

... which makes me think that it decides to use b as the outer table for both cases (which would obviously make it suck in the first one)... :-(

That's what it says, all right, which seems odd to me. Are you sure you looked at the right plans?

This happens on 7.2.4... I have a 7.3 database with the same schema, but it is not populated with data, so I could not test it on 7.3...

I could not reproduce a problem on 7.2.4. I get (using toy tables, and suppressing the planner's urge to use a mergejoin instead)

  lo=# explain select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
  NOTICE: QUERY PLAN:
  Limit (cost=0.00..4.97 rows=1 width=8)
    -> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
       -> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
       -> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
  EXPLAIN
  lo=# explain select * from a, b where b.id >= 7901288 and a.id=b.id limit 1;
  NOTICE: QUERY PLAN:
  Limit (cost=0.00..4.97 rows=1 width=8)
    -> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
       -> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
       -> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
  EXPLAIN

which looks like the right thing.

regards, tom lane
Re: [GENERAL] Weird query plan
Tom Lane wrote:

Dmitry Tkach <[EMAIL PROTECTED]> writes:

Also, I have another copy (not an exact copy, but identical schema, and similar content... but about twice smaller) of the original database... I tried my query on it, and it works right too. So, there must be something wrong with that particular database I suppose...

Hmm. Take a look at the pg_stats statistics for the id columns in each case. Could the ones for the misbehaving tables be out of whack somehow? I'm wondering for example if the planner discounted the >= condition because it thought it would match all the rows.

Well... It *does* match (almost) all the rows (there are about a million rows before that key, and the remaining 79 mil after)... The stats look in sync with that:

for a:

  stavalues1 | {1000488,33495482,69111011,99286820,129611281,204441828,331968789,508451171,782660252,869480434,989787700}

for b:

  stavalues1 | {1008692,54892364,110119463,192551141,300490851,389609207,465139533,570442801,706876547,849087358,989851076}

(The key in the criteria was 7901288 - somewhere in the first bucket.)

*But* isn't my 'limit' clause supposed to affect that decision? I mean, even though the filter isn't very selective, it should still speed up getting the *first* match...

Thanks!

Dima

P.S. I also tried to look at the stats of that other database I mentioned... The stats for b look similar:

  stavalues1 | {1028104,25100079,50685614,78032989,105221902,135832793,199827486,611968165,807597786,884897604,969971779}

But the stats for a are just *not there at all* (is that even possible?) Could it be the reason why it works on that database (because it uses the default stats instead of the real thing)?
Re: [GENERAL] Weird query plan
P.S. I also tried to look at the stats of that other database I mentioned... The stats for b look similar:

  stavalues1 | {1028104,25100079,50685614,78032989,105221902,135832793,199827486,611968165,807597786,884897604,969971779}

But the stats for a are just *not there at all* (is that even possible?) Could it be the reason why it works on that database (because it uses the default stats instead of the real thing)?

I ran 'analyze a;' on that database... I now have the same problem with it, but the other way around - the query with a condition on a runs quickly, and the one with a condition on b does not... and the query plans are the same, and have a as the outer table... The new stats for a look like:

  stavalues1 | {1003284,61663485,126262679,211106732,300624079,392709544,469196539,572479496,697890767,842087009,989170923}

... but actually, I never tried it with a condition on b on that database before the analyze :-( I just tried the "a - variant" (the one that wasn't working originally), and it worked... Now, I tried deleting all the entries for a from pg_statistic, and running the b-variant... and it still doesn't work. So, it probably has nothing to do with that analyze I ran...

Dima
[GENERAL] vacuum full failing in 7.3
Hi, everybody!

I am getting a weird failure, trying to vacuum a table in 7.3 - it says "ERROR: Index pg_toast_89407_index is not a btree". Does it ring a bell to anyone? Any ideas what's wrong? Is my database screwed up? I just created it today... I tried dropping and recreating it... and it seems to be working now. But still, if at all possible, I would appreciate any ideas as to what it could be that caused this problem in the first place...

Thanks a lot!

Dima

P.S. For what it's worth, here is the stack to the place in the code where the badness happens:

  #0  elog (lev=20, fmt=0x81b5f9b "Index %s is not a btree") at elog.c:114
  #1  0x0807e18a in _bt_getroot (rel=0x60eba740, access=1) at nbtpage.c:127
  #2  0x08080dce in _bt_endpoint (scan=0x82a7830, dir=ForwardScanDirection) at nbtsearch.c:932
  #3  0x08080ad8 in _bt_first (scan=0x82a7830, dir=ForwardScanDirection) at nbtsearch.c:686
  #4  0x0807ef3d in btbulkdelete (fcinfo=0xbfffd240) at nbtree.c:627
  #5  0x08163aa3 in OidFunctionCall3 (functionId=332, arg1=1626056512, arg2=135047316, arg3=0) at fmgr.c:1275
  #6  0x0807b470 in index_bulk_delete (indexRelation=0x60eba740, callback=0x80ca894 , callback_state=0x0) at indexam.c:579
  #7  0x080ca65c in scan_index (indrel=0x60eba740, num_tuples=0) at vacuum.c:2612
  #8  0x080c7f8b in full_vacuum_rel (onerel=0x60ebdfa0, vacstmt=0x82a0768) at vacuum.c:948
  #9  0x080c7e24 in vacuum_rel (relid=89410, vacstmt=0x82a0768, expected_relkind=116 't') at vacuum.c:827
  #10 0x080c7e65 in vacuum_rel (relid=89407, vacstmt=0x82a0768, expected_relkind=114 'r') at vacuum.c:850
  #11 0x080c7765 in vacuum (vacstmt=0x82a0768) at vacuum.c:290
  #12 0x081198da in pg_exec_query_string (query_string=0x82a0538, dest=Remote, parse_context=0x8294e30) at postgres.c:789
  #13 0x0811a9f1 in PostgresMain (argc=4, argv=0xbfffd770, username=0x8258849 "postgres") at postgres.c:2013
[GENERAL] Two rules on a view do not like each other :-(
Hi, everybody!

Here is a weird problem I ran into with 7.3.4. This is the complete test case:

  rapidb=# select version();
                                  version
  -------------------------------------------------------------
   PostgreSQL 7.3.4 on i686-pc-linux-gnu, compiled by GCC 2.96
  (1 row)

  rapidb=# create table test (x int primary key, y int);
  NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'test_pkey' for table 'test'
  CREATE TABLE
  rapidb=# create view test_proxy as select * from test;
  CREATE VIEW
  rapidb=# create rule new_test_proxy as on insert to test_proxy do delete from test where x=new.x;
  CREATE RULE
  rapidb=# create rule new_test as on insert to test_proxy do instead insert into test values (new.x, new.y);
  CREATE RULE
  rapidb=# insert into test_proxy values (1,1);
  INSERT 663399483 1
  rapidb=# select * from test;
   x | y
  ---+---
  (0 rows)

I create a table "test" and a view "test_proxy", then create two ON INSERT rules on test_proxy - the first rule deletes the row with the same PK as the one being inserted from test (so that I don't need to check for it beforehand if I want to replace the row); the second - INSTEAD - rule just does the insert on the actual table. The problem is that the new row seems to NEVER get inserted - the last two commands insert a row into test_proxy, and then look at it - the table is empty!

This used to work in 7.2:

  rapidb=# select version();
                                      version
  ---------------------------------------------------------------------
   PostgreSQL 7.2.4 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
  (1 row)

  rapidb=# create table test (x int, y int);
  CREATE
  rapidb=# create view test_proxy as select * from test;
  CREATE
  rapidb=# create rule new_test as on insert to test_proxy do instead insert into test values (new.x, new.y);
  CREATE
  rapidb=# create rule new_test_proxy as on insert to test_proxy do delete from test where x=new.x;
  CREATE
  rapidb=# insert into test_proxy values (1,1);
  INSERT 0 0
  rapidb=# select * from test;
   x | y
  ---+---
   1 | 1
  (1 row)

Does anyone have any idea what is going on here? I suspect my problem is that the rules get executed in the wrong order - so that the row gets inserted first, and then deleted right away... Is that right? If so, was this change from 7.2.4 done intentionally, or is it a bug? If the former, is there any way (a config option or something) to get the old behaviour back?

Thanks a lot for your help!

Dima
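The suspicion about ordering appears to be right: as of 7.3, multiple rules on the same table and event are applied in alphabetical order by rule name (per the CREATE RULE documentation), and "new_test" sorts before "new_test_proxy", so the INSTEAD insert runs first and the delete then removes the new row. Renaming the rules so that the delete sorts first should restore the old behaviour - a sketch (the rule names here are invented):

  create rule r1_del_test as on insert to test_proxy do delete from test where x=new.x;
  create rule r2_ins_test as on insert to test_proxy do instead insert into test values (new.x, new.y);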