[GENERAL] PostgreSQL Docs for Paper (benchmark and contrib)
Hello, I’m writing a paper on PostgreSQL. I’m looking for: 1) benchmark comparisons of PostgreSQL against other databases 2) Detailed descriptions (or project web sites) of all the ‘contrib’ projects (like dbsize, tsearch, xml, etc). Could someone point me in the right direction? Thank you! ~ Troy Campano ~
[GENERAL] 7.5 Windows version
Is there any chance I can get a pre-release executable of the Windows native Postgresql so I can start doing some application development? I do not intend to deploy a buggy postgres, but even if it goes boom occasionally it will allow me to build the app so that when everything is finalized I'm months ahead on development. It's important to note that I mainly use Borland compilers so unless this will compile in c++ builder, I'll need executables. = "We'll do the undoable, work the unworkable, scrute the inscrutable and have a long, hard look at the ineffable to see whether it might not be effed after all" ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] 7.5 Windows version
Hello Jonathan, Take a look at URL: http://techer.pascal.free.fr/postgis/psqlwin/ That French guy has a binary version of PG for Windows. The zip is at URL: http://techer.pascal.free.fr/postgis/psqlwin.zip Note that I did not try it! Good luck! Bernard On Tuesday 22 June 2004 15:13, Jonathan Barnhart wrote: > Is there any chance I can get a pre-release executable of the Windows > native Postgresql so I can start doing some application development? I > do not intend to deploy a buggy postgres, but even if it goes boom > occasionally it will allow me to build the app so that when everything > is finalized I'm months ahead on development. > > It's important to note that I mainly use Borland compilers so unless > this will compile in c++ builder, I'll need executables. > > = > "We'll do the undoable, work the unworkable, scrute the inscrutable and > have a long, hard look at the ineffable to see whether it might not be effed after all" > > ---(end of broadcast)--- > TIP 6: Have you searched our list archives? > >http://archives.postgresql.org ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] More psql problems... >.
Hi all, I have got to say that my first foray into PostgreSQL is becoming a very maddening experience... I am sure it is my own fault for not knowing very much, but it seems that everything I have tried so far to improve performance has in fact made it a lot worse. Now my program dies after roughly 300 seconds of processing directories, and updates take literally 10 times longer than inserts (which are themselves very slow). I am sorry for whining... I've been trying to figure this out non-stop for nearly two weeks... Anyway, I moved my backup program to another dedicated machine (an AMD Athlon 1.2GHz (1700+) with 512MB RAM and a Seagate Barracuda 7200.7, 2MB buffer ATA/100 IDE drive). As it stands now I have increased shmmax to 128MB and in 'postgresql.conf' I dropped max_connections to 10 and upped shared_buffers to 4096. What is happening now is that the program does an 'ls' (system call) to get a list of the files and directories starting at the root of a mounted partition. These are read into an array which perl then processes one at a time. The 'ls' value is searched for in the database and if it doesn't exist, the values are inserted. If they do exist, they are updated (at 1/10th the speed). If the file is in fact a directory, perl jumps into it and again reads in its contents into another array and processes them one at a time. It will do this until all files or directories on the partition have been processed. My previous question was performance based, now I just need to get the darn thing working again. Like I said, after ~300 seconds perl dies. If I disable auto-commit then it dies the first time it runs an insert. (This is all done on the same table, 'file_dir'.) If I add a 'commit' before each select then a bunch of selects will work (a few dozen) and then it dies anyway. Does this sound at all like a common problem? Thanks for reading my gripe. Madison PS - PostgreSQL 7.4 on Fedora Core 2; indexes on the three columns I search, and my SELECT, UPDATE and INSERT calls are prepared. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] PostgreSQL Docs for Paper (benchmark and contrib)
Troy Campano wrote: Hello, I'm writing a paper on PostgreSQL. I'm looking for: 1) benchmark comparisons of PostgreSQL against other databases You might want to start here: http://osdb.sourceforge.net/ Or if you go here, http://pgfoundry.org/ on the right-hand side is a project to start doing the TPC-W tests. Most (all?) commercial databases won't let you publish benchmarks, and the only freely available results tend to be on expensive hardware heavily tuned. 2) Detailed descriptions (or project web sites) of all the 'contrib' projects (like dbsize, tsearch, xml, etc). New project home for PG-related stuff is at pgfoundry (above) - you might also be interested in: PgAdmin http://www.pgadmin.org/ PgPool ftp://ftp.sra.co.jp/pub/cmd/postgres/pgpool/pgpool-2.0.tar.gz Slony http://www.slony.org All the contrib modules have README files in the source distribution. That should be enough to get you started :-) -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] 7.5 Windows version
Jonathan Barnhart wrote: Is there any chance I can get a pre-release executable of the Windows native Postgresql so I can start doing some application development? I do not intend to deploy a buggy postgres, but even if it goes boom occasionally it will allow me to build the app so that when everything is finalized I'm months ahead on development. You might want to start with the cygwin version at http://www.cygwin.com Start using that today perhaps, and move over to 7.5 when it goes into Beta. It's important to note that I mainly use Borland compilers so unless this will compile in c++ builder, I'll need executables. You'll probably want to check the mailing list archive for the Win32 developers: http://www.postgresql.org/lists.html I *think* there's a nightly build available, but I don't know how usable the windows port is as a development target yet. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] More psql problems... >.
On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote: > What is happening now is that the program does an 'ls' (system call) > to get a list of the files and directories starting at the root of a > mounted partition. These are read into an array which perl then > processes one at a time. the 'ls' value is searched for in the database > and if it doesn't exist, the values are inserted. If they do exist, they > are updated (at 1/10th the speed). If the file is in fact a directory > perl jumps into it and again reads in it's contents into another array > and processes the one at a time. It will do this until all files or > directories on the partition have been processed. So you read the entire filesystem again and again? Sounds like a horrible idea to me. Have you tried using the mtimes, etc? > My previous question was performance based, now I just need to get > the darn thing working again. Like I said, after ~300 seconds perl dies. Out of memory? If you save your whole filesystem in a Perl array you are going to consume a lot of memory. This is, of course, not Postgres related, so I'm not sure why you are asking here. -- Alvaro Herrera () Jajaja! Solo hablaba en serio! ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
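For illustration, a rough SQL-level sketch of the mtime idea. The file_dir table name comes from the earlier message, but the column layout and the literal values here are invented for the example, not taken from the thread: record the mtime seen on the last scan and only rewrite rows whose mtime has actually changed.

CREATE TABLE file_dir (
    file_path   text PRIMARY KEY,
    file_mtime  timestamp with time zone,
    file_size   bigint
);

-- on a rescan, skip rows whose on-disk mtime matches the stored one
UPDATE file_dir
   SET file_mtime = '2004-06-22 15:52:03-04',   -- value read from stat() on the file
       file_size  = 12345
 WHERE file_path  = '/mnt/backup/somefile'
   AND file_mtime IS DISTINCT FROM '2004-06-22 15:52:03-04';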
Re: [GENERAL] More psql problems... >.
Alvaro Herrera wrote: On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote: What is happening now is that the program does an 'ls' (system call) to get a list of the files and directories starting at the root of a mounted partition. These are read into an array which perl then processes one at a time. The 'ls' value is searched for in the database and if it doesn't exist, the values are inserted. If they do exist, they are updated (at 1/10th the speed). If the file is in fact a directory, perl jumps into it and again reads in its contents into another array and processes them one at a time. It will do this until all files or directories on the partition have been processed. So you read the entire filesystem again and again? Sounds like a horrible idea to me. Have you tried using the mtimes, etc? I haven't heard of 'mtimes' before, I'll google for it now. My previous question was performance based, now I just need to get the darn thing working again. Like I said, after ~300 seconds perl dies. Out of memory? If you save your whole filesystem in a Perl array you are going to consume a lot of memory. This is, of course, not Postgres related, so I'm not sure why you are asking here. Running just the perl portion which reads and parses the file system works fine and fast. It isn't until I make the DB calls that everything breaks. I know that the DB will slow things down, but the amount of performance loss I am seeing and the flat out breaking of the program can't be reasonable. Besides, PostgreSQL should be able to handle 250,000 SELECTs followed by an UPDATE or INSERT for each on an AMD Athlon 1700+ with 512MB RAM, shouldn't it? Besides, the program is dying after 5 minutes when the calls are being committed automatically, so the work being done shouldn't be filling any memory, should it? Madison ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
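For what it's worth, the usual first remedies for this kind of workload are to wrap the whole scan in a single transaction and to reuse prepared statements. A rough SQL-level sketch of the idea (the file_dir table name comes from the thread; the column names are invented):

BEGIN;

-- prepare once, execute many times (Perl DBI's prepare() achieves the same thing)
PREPARE find_file (text) AS
    SELECT 1 FROM file_dir WHERE file_path = $1;
PREPARE add_file (text, bigint) AS
    INSERT INTO file_dir (file_path, file_size) VALUES ($1, $2);

EXECUTE find_file('/mnt/backup/somefile');
EXECUTE add_file('/mnt/backup/somefile', 12345);
-- ... repeat for every path found by the scan ...

COMMIT;

Committing once at the end, instead of once per statement, avoids a disk flush for every row and is usually the single biggest win for bulk loads.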
Re: [GENERAL] 7.5 Windows version
On Tue, 22 Jun 2004 12:13:01 -0700, Jonathan Barnhart wrote: > Is there any chance I can get a pre-release executable of the Windows > native Postgresql so I can start doing some application development? I do > not intend to deploy a buggy postgres, but even if it goes boom > occasionally it will allow me to build the app so that when everything is > finalized I'm months ahead on development. Have a look at: http://momjian.postgresql.org/main/writings/pgsql/win32.html I have no idea how usable this is (just tracking the progress), but there's a binary snapshot. Holger ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Problem while OS-Update
Tom Lane wrote: Heiko Pohl <[EMAIL PROTECTED]> writes: -bash-2.05b$ /usr/local/pgsql/bin/postmaster -D /var/lib/pgsql.old/data FATAL 2: The database cluster was initialized with LC_COLLATE 'de_DE.UTF-8' and LC_CTYPE 'de_DE.UTF-8', but the server was compiled without locale support. It looks like you need to initdb or recompile. You need configure --enable-locale, or some such switch. Got it!!! :- I have my data now running on the 7.4-server and nothing is lost. Great! Thank You very much! Heiko Pohl ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[GENERAL] Connection gets into state where all queries fail
OK, I'm using a pool of worker threads that each have a connection (from my own software, no PHP or anything like that involved), and a connection is in a state where all queries fail. Looking back through the log I find the first entry where this connection has an error (there are successful queries on it prior, so it's not the first query): 2004-06-21 14:51:19 [5589] LOG: query: begin; set constraints all deferred; insert into "PatientCall_Step" ("id", "PatientCallId", "HandledByStaffId", "AssignedToStaffId", "DoneWhen", "Summary", "Notes", "ContactIsAnxious", "IsMedicallyUrgent", "PageMessageId", "AssignToNurseTakingCalls", "AssignNextToNurseTakingCalls") values (7991, 7774, 944557, 297199, '2004-06-21 19:43:00.000-00', '...', '...', 'f', 'f', 7992, 'f', 'f'); insert into "PageMessage" ("id", "FromStaffId", "ToStaffId", "PagerNum", "PagerMessage", "Source") values (7992, 944557, 297199, '7872', 'bogus value', 'PatientCall'); commit; 2004-06-21 14:51:19 [5589] ERROR: value too long for type character varying(80) I've removed the actual varchar values, because the data is confidential. Suffice it to say that there's a hole in the UI, and where the above says 'bogus value' there was indeed a value too long for the column, and the value seems to have had an embedded CR toward the end. So the error message from pg is correct. But the very next query on that process is a simple one, and it fails like so: 2004-06-21 14:51:58 [5589] LOG: query: select * from "PatientCall" where "PatientId" =' 534824'; 2004-06-21 14:51:58 [5589] ERROR: current transaction is aborted, queries ignored until end of transaction block In fact every transaction since then (nearly 300) on that same process has failed with the same error. This is 7.3.2 running on OS X (I've tested 7.4 and intend to upgrade RSN.) So I don't understand the errors. I even tried copying the initial query from the log and pasting into psql, got the error, and was able to continue making queries without further errors. Note that this is not a perfect way to test, because the "embedded CR" might have been switched from CR to LF or vice versa at some point in the logging/downloading/copying/pasting process. - Is there something I need to do after an error like this, in order to get the connection back to a usable state? - Is this a bug? - Is there anything I can do to provide more information? I still have the server running right now, but I'm going to have to reboot soon because this is causing problems. -- Scott Ribe [EMAIL PROTECTED] http://www.killerbytes.com/ (303) 665-7007 voice ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] More psql problems... >.
On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote: > Hi all, > > I have got to say that my first foray into postgresSQL is becoming a > very madening experience... I am sure it is my own fault for not knowing > very much but it seems that everything I have tried so far to improve > performance has in fact made it a lot worse. Now my program dies after > roughly 300 seconds of processing directories and updates take literally > 10 time longer than inserts (which are themselves very slow). > > My previous question was performance based, now I just need to get > the darn thing working again. Like I said, after ~300 seconds perl dies. > If I disable auto-commit then it dies the first time it runs an insert. > (this is all done on the same table; 'file_dir'). If I add a 'commit' > before each select than a bunch of selects will work (a few dozen) and > then it dies anyway. What is "dying"? Do you get an error message? From your message there is nowhere near enough information to give you any good answers. Have you run VACUUM [FULL|ANALYZE] recently? Are your indexes being used? If you really want help, post your Perl code, queries, database schema, example output, anything that might indicate what your actual problem is. We are, unfortunately, not psychic. We run PostgreSQL with several simultaneous users on several million rows of data doing a few hundred thousand queries a day and the database is not the bottleneck. -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpWh4riyu2BU.pgp Description: PGP signature
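For concreteness, the two checks suggested above can be run like this (assuming the poster's file_dir table and an invented file_path column):

VACUUM ANALYZE file_dir;

EXPLAIN ANALYZE
SELECT * FROM file_dir WHERE file_path = '/mnt/backup/somefile';
-- an "Index Scan using ..." node means the index is being used;
-- "Seq Scan on file_dir" means it is not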
[GENERAL] flock user defined function
I'm trying to write two C language user defined functions, lockfile() and unlockfile(), that call flock using LOCK_EX and LOCK_UN respectively. If I call lockfile from a first psql process it returns successfully. Calling lockfile from a second psql process blocks. However, when I call unlockfile from the first psql process, the second process still blocks. The lockfile call from the second psql process doesn't return until I kill the first psql process. Any suggestions? Thanks in advance. Chris Goughnour

PG_FUNCTION_INFO_V1(lockFile);
Datum lockFile(PG_FUNCTION_ARGS){
    text *t=PG_GETARG_TEXT_P(0);
    char *path=palloc(VARSIZE(t)-VARHDRSZ+1);
    int fileHandle,status;

    /* copy the text argument into a NUL-terminated C string */
    memcpy((void *)path,(void *)VARDATA(t),VARSIZE(t)-VARHDRSZ);
    path[VARSIZE(t)-VARHDRSZ]=0;

    /* open the file and take an exclusive flock() lock on it;
       the descriptor is never closed or returned to the caller */
    fileHandle=open((const char *)path,O_RDONLY);
    if(fileHandle==-1){
        PG_RETURN_INT32(-1);
    }
    if(flock(fileHandle,LOCK_EX)==-1){
        PG_RETURN_INT32(-1);
    }
    PG_RETURN_INT32(0);
}

PG_FUNCTION_INFO_V1(unlockFile);
Datum unlockFile(PG_FUNCTION_ARGS){
    text *t=PG_GETARG_TEXT_P(0);
    char *path=palloc(VARSIZE(t)-VARHDRSZ+1);
    int fileHandle;

    /* copy the text argument into a NUL-terminated C string */
    memcpy((void *)path,(void *)VARDATA(t),VARSIZE(t)-VARHDRSZ);
    path[VARSIZE(t)-VARHDRSZ]=0;

    /* this opens a *new* descriptor on the same path and applies LOCK_UN
       to it; the descriptor opened by lockFile() is left untouched */
    fileHandle=open((const char *)path,O_RDONLY);
    if(fileHandle==-1){
        PG_RETURN_INT32(-1);
    }
    if(flock(fileHandle,LOCK_UN)==-1){
        PG_RETURN_INT32(-1);
    }
    PG_RETURN_INT32(0);
}

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] [ANNOUNCE] == PostgreSQL Weekly News - June 22nd 2004
Don't forget the auto-vacuum daemon! Martijn van Oosterhout wrote: Crikey! Tablespaces, Win32, nested transactions and PITR. Almost worth a version 8 :) On Tue, Jun 22, 2004 at 12:37:39AM -0400, Robert Treat wrote: == PostgreSQL Weekly News - June 22nd 2004 == ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] flock user defined function
On Tue, Jun 22, 2004 at 02:49:27PM -0700, Chris Goughnour wrote: > I'm trying to write two C language user defined functions, lockfile() and > unlockfile(), that call flock using LOCK_EX and LOCK_UN respectively. If I > call lockfile from a first psql process it returns successfully. Calling > lockfile from a second psql process blocks. However, when I call unlockfile > from the first psql process, the second process still blocks. The lockfile > call from the second psql proccess doesn't return until I kill the first > psql process. > Any suggestions? Thanks in advance. > Chris Goughnour Where do you close the file? That might cause some issues. -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpqwWNVqzhIL.pgp Description: PGP signature
Re: [GENERAL] Connection gets into state where all queries fail
On Tue, Jun 22, 2004 at 03:06:39PM -0600, Scott Ribe wrote: > OK, I'm using a pool of worker threads that each have a connection (from my > own software, no PHP or anything like that involved), and a connection is in > a state where all queries fail. Looking back through the log I find the > first entry where this connection has an error (there are successful queries > on it prior, so it's not the first query): > Your problem is that you have an error within a transaction and as the error message states: > 2004-06-21 14:51:58 [5589] ERROR: current transaction is aborted, queries > ignored until end of transaction block All your queries will be ignored until you complete the transaction, either with a commit or a rollback. I find the message very clear; how do you think it could be reworded to be clearer? > - Is there something I need to do after an error like this, in order to get > the connection back to a usable state? COMMIT or ROLLBACK > - Is this a bug? In your program, yes. Although I find it interesting that your commit is ignored after the error. Just send it as a separate query. > - Is there anything I can do to provide more information? I still have the > server running right now, but I'm going to have to reboot soon because this > is causing problems. Just add a transaction commit or rollback after the error. -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgp6dsBptC7JQ.pgp Description: PGP signature
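A self-contained psql illustration of the behaviour described above (the table and the over-long value are invented for the example):

CREATE TABLE t (v varchar(5));

BEGIN;
INSERT INTO t VALUES ('this value is far too long');
-- ERROR:  value too long for type character varying(5)
SELECT 1;
-- ERROR:  current transaction is aborted, queries ignored until end of transaction block
ROLLBACK;   -- a COMMIT issued here would likewise just end the aborted transaction
SELECT 1;   -- works again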
Re: [GENERAL] Connection gets into state where all queries fail
>> OK, I'm using a pool of worker threads that each have a connection (from my >> own software, no PHP or anything like that involved), and a connection is in >> a state where all queries fail. Looking back through the log I find the >> first entry where this connection has an error (there are successful queries >> on it prior, so it's not the first query): >> > > Your problem is that you have an error within a transaction and as the > error message states: > >> 2004-06-21 14:51:58 [5589] ERROR: current transaction is aborted, queries >> ignored until end of transaction block > > All your queries will be ignored until you complete the transaction, > either with a commit or a rollback. I find the message very clear, how > do you think it could be reworded to be more clear? What's not clear is why I should be told this when the invalid query ended with a commit. >> - Is there something I need to do after an error like this, in order to get >> the connection back to a usable state? > > COMMIT or ROLLBACK > >> - Is this a bug? > > In your program, yes. Although I find it interesting that your commit > is ignored after the error. Just send it as a seperate query. Yes, exactly. I've never seen a commit be ignored like this--believe me I've had other erroneous queries, and the connections never got borked like this. >> - Is there anything I can do to provide more information? I still have the >> server running right now, but I'm going to have to reboot soon because this >> is causing problems. > > Just add a transaction commit or rollback after the error. I'll certainly add such code as a fail-safe, but I'd still like to understand more fully what has happened. Can I really be 100% sure this would keep the connection usable, given that the 1 commit already somehow failed to end the transaction block? I can certainly understand that a commit submitted by itself might be recognized where for some reason the original one was skipped over, but I'd still like to know more about what happened. -- Scott Ribe [EMAIL PROTECTED] http://www.killerbytes.com/ (303) 665-7007 voice ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] Connection gets into state where all queries fail
On Tue, Jun 22, 2004 at 04:30:15PM -0600, Scott Ribe wrote: > > All your queries will be ignored until you complete the transaction, > > either with a commit or a rollback. I find the message very clear, how > > do you think it could be reworded to be more clear? > > What's not clear is why I should be told this when the invalid query ended > with a commit. Hmm, it only happens when you send the whole lot in a single query string. That's not recommended IIRC. > > In your program, yes. Although I find it interesting that your commit > > is ignored after the error. Just send it as a seperate query. > > Yes, exactly. I've never seen a commit be ignored like this--believe me I've > had other erroneous queries, and the connections never got borked like this. Well, it's definitely repeatable: $ perl -MPg -e '$db = Pg::connectdb("dbname=kleptog"); $db->exec("begin; select error; commit;"); $db->exec("select 1"); print $db->errorMessage;' ERROR: current transaction is aborted, queries ignored until end of transaction block > I'll certainly add such code as a fail-safe, but I'd still like to > understand more fully what has happened. Can I really be 100% sure this > would keep the connection usable, given that the 1 commit already somehow > failed to end the transaction block? I can certainly understand that a > commit submitted by itself might be recognized where for some reason the > original one was skipped over, but I'd still like to know more about what > happened. If you send each query in a separate request, it'll work. I don't know if the current behaviour is really a bug... -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpiNPNMjIFpv.pgp Description: PGP signature
Re: [GENERAL] flock user defined function
Chris Goughnour <[EMAIL PROTECTED]> writes: > Any suggestions? Thanks in advance. I believe locks are associated with file descriptors (what you're miscalling a handle). The unlock function cannot release a lock that is held via a different descriptor. What it needs to be doing is closing the descriptor that lockFile opened. This would also solve the rather serious descriptor-leak problem you've got. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] flock user defined function
Martijn van Oosterhout <[EMAIL PROTECTED]> writes: > On Tue, Jun 22, 2004 at 02:49:27PM -0700, Chris Goughnour wrote: >> I'm trying to write two C language user defined functions, lockfile() and >> unlockfile(), that call flock using LOCK_EX and LOCK_UN respectively. If I >> call lockfile from a first psql process it returns successfully. Calling >> lockfile from a second psql process blocks. However, when I call unlockfile >> from the first psql process, the second process still blocks. The lockfile >> call from the second psql proccess doesn't return until I kill the first >> psql process. >> Any suggestions? Thanks in advance. >> Chris Goughnour > > > > Where do you close the file? That might cause some issues. Yeah, it's generally best not to call LOCK_UN at all, but just to close the file (which will release the locks). Otherwise, if you are using stdio, you can get a situation where the file is unlocked but its stdio buffer has not been flushed, leading to the corruption you were trying to avoid by locking the file... -Doug ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[GENERAL] Planner choice & tuning
The planner is choosing a sequential scan for my query. I am trying to understand why, since it is clearly not the fastest choice, and what the proper tuning dial is to adjust here. Here's the query:

select count(1) from measurement where measure_id = 98;

TIA. Here are the details:

% psql -c "\d measurement"
                                Table "public.measurement"
   Column   |           Type           |                         Modifiers
------------+--------------------------+--------------------------------------------------------------
 id         | integer                  | not null default nextval('public.measurement_id_seq'::text)
 host_id    | integer                  | not null
 measure_id | integer                  | not null
 value      | double precision         | not null
 duration   | double precision         | not null default 0.0
 moment     | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
 source_id  | integer                  | not null default 1
Indexes: measurement_duration_idx btree (duration),
         measurement_host_idx btree (host_id),
         measurement_measure_idx btree (measure_id),
         measurement_moment_idx btree (moment),
         measurement_source_idx btree (source_id),
         measurement_value_idx btree (value)
Foreign Key constraints: $2 FOREIGN KEY (measure_id) REFERENCES measure(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $1 FOREIGN KEY (source_id) REFERENCES measurement_source(id) ON UPDATE NO ACTION ON DELETE NO ACTION

% psql -c "analyze measurement"
ANALYZE

% psql -c "select count(1) from measurement"
  count
---------
 1800866
(1 row)

% psql -c "select count(1) from measurement where measure_id = 98"
 count
-------
 38862
(1 row)

% time psql -c "set enable_seqscan=no; explain analyze select count(1) from measurement where measure_id = 98"
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=215208.66..215208.66 rows=1 width=0) (actual time=904.58..904.58 rows=1 loops=1)
   ->  Index Scan using measurement_measure_idx on measurement  (cost=0.00..215062.64 rows=58408 width=0) (actual time=0.17..843.78 rows=38866 loops=1)
         Index Cond: (measure_id = 98)
 Total runtime: 904.77 msec
(4 rows)

real    0m1.298s
user    0m0.010s
sys     0m0.000s

% time psql -c "explain analyze select count(1) from measurement where measure_id = 98"
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=97857.09..97857.09 rows=1 width=0) (actual time=2493.90..2493.90 rows=1 loops=1)
   ->  Seq Scan on measurement  (cost=0.00..97711.07 rows=58408 width=0) (actual time=12.94..2430.08 rows=38866 loops=1)
         Filter: (measure_id = 98)
 Total runtime: 2494.11 msec
(4 rows)

real    0m2.885s
user    0m0.000s
sys     0m0.000s

This seems to be saying the planner thinks it's less expensive to do the sequential scan, but why?

Including pg_stats data in case it is relevant here.

% psql -c "select * from pg_stats where tablename = 'measurement' and attname = 'measure_id'"
 schemaname |  tablename  |  attname   | null_frac | avg_width | n_distinct |             most_common_vals              |                           most_common_freqs                           |            histogram_bounds             | correlation
------------+-------------+------------+-----------+-----------+------------+--------------------------------------------+------------------------------------------------------------------------+-----------------------------------------+-------------
 public     | measurement | measure_id |         0 |         4 |         52 | {104,108,113,124,106,110,101,107,112,109} | {0.084,0.0556667,0.052,0.05,0.048,0.047,0.038,0.036,0.034,0.0326667}    | {23,36,39,43,85,89,100,111,120,122,128} |    0.232087
(1 row)

% psql -c "select name, setting from pg_settings where name like 'random%'"
       name       | setting
------------------+---------
 random_page_cost | 4
(1 row)

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] flock user defined function
Thanks. Yeah, I figured that out after Martijn's response. I'm just returning the file descriptor from lockFile, passing it to unlockFile and closing the descriptor there. It works fine now. Thanks for edifying a clueless novice such as myself. :-) > Chris Goughnour <[EMAIL PROTECTED]> writes: >> Any suggestions? Thanks in advance. > > I believe locks are associated with file descriptors (what you're > miscalling a handle). The unlock function cannot release a lock > that is held via a different descriptor. What it needs to be doing > is closing the descriptor that lockFile opened. This would also > solve the rather serious descriptor-leak problem you've got. > > regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] Connection gets into state where all queries fail
Martijn van Oosterhout <[EMAIL PROTECTED]> writes: > On Tue, Jun 22, 2004 at 04:30:15PM -0600, Scott Ribe wrote: >> What's not clear is why I should be told this when the invalid query ended >> with a commit. > Hmm, it only happens when you send the whole lot in a single query > strings. That's not recommended IIRC. When you send multiple commands in a single query string, the entire string is thrown away after an error. That's how it's always worked, and I believe it's well-documented. There are apps that (mis?)use this fact. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Planner choice & tuning
On Tue, 2004-06-22 at 16:47, Ed L. wrote: > The planner is choosing a sequential scan for my query. I am trying > to understand why since it is clearly not the fastest choice, and > what the proper tuning dial is to adjust here. Here's the query: > > select count(1) from measurement where measure_id = 98; > > [table definition, EXPLAIN ANALYZE output and pg_stats details snipped] I'd say your random_page_cost is too high for your setup. While there's a slight misguess by the planner in the number of rows (38k versus 58k), it's not that big. Try values between 1.2 and 2.0. Most larger setups do well with values in that range.
Re: [GENERAL] More psql problems... >.
On Tue, 2004-06-22 at 13:52, Madison Kelly wrote: > Hi all, > >I have got to say that my first foray into postgresSQL is becoming a > very madening experience... I am sure it is my own fault for not knowing > very much but it seems that everything I have tried so far to improve > performance has in fact made it a lot worse. Now my program dies after > roughly 300 seconds of processing directories and updates take literally > 10 time longer than inserts (which are themselves very slow). /SNIP >My previous question was performance based, now I just need to get > the darn thing working again. Like I said, after ~300 seconds perl dies. > If I disable auto-commit then it dies the first time it runs an insert. > (this is all done on the same table; 'file_dir'). If I add a 'commit' > before each select than a bunch of selects will work (a few dozen) and > then it dies anyway. Is it exactly 300 seconds? Sounds like a timeout somewhere to me. Does perl have one of those? Or maybe your OS / Shell combo do, or something like it? Just guessing here. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Planner choice & tuning
"Ed L." <[EMAIL PROTECTED]> writes: > This seems to be saying the planner thinks its less expensive to do the > sequential scan, but why? Because it thinks it's less expensive ;-) There are a couple issues here: one is overestimation of the number of rows matching the query. That is a statistical issue and should be fixable by increasing the statistics target for the column. With an accurate rows estimate the seqscan cost estimate would not change but the indexscan cost estimate would decrease approximately proportionally. Allowing for the row estimation error, the indexscan cost estimate is still about 4x what it ought to be, which means that you could bring the estimated cost (for this query anyway) in line with reality by setting random_page_cost = 1. Note however that we are defining "reality" as "the results of this one single experiment". You should realize in particular that the table is probably fully cached in memory during your test, which is a scenario in which random_page_cost actually is 1.0 (given reasonable assumptions about the behavior of the kernel's cache, anyway). If you optimize for this case then you are likely to be pessimizing the behavior for larger tables that don't fit in memory. My suggestion would be to lower random_page_cost to 3 or so, which would be enough to tip the decision to indexscan for this case, though not to make the estimated cost really truly correct. You should however run a larger set of experiments before doing anything, and realize that any random_page_cost setting is a compromise because the model doesn't take all the variables into account. You can find much more about this issue in the pgsql-performance archives. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] More psql problems... >.
Scott Marlowe wrote: My previous question was performance based, now I just need to get the darn thing working again. Like I said, after ~300 seconds perl dies. If I disable auto-commit then it dies the first time it runs an insert. (this is all done on the same table; 'file_dir'). If I add a 'commit' before each select than a bunch of selects will work (a few dozen) and then it dies anyway. Is it exactly 300 seconds? Sounds like a timeout somewhere to me. Does perl have one of those? Or maybe your OS / Shell combo do, or something like it? Just guessing here. Hi Scott, Thanks for replying. In my frustration I missed the obvious; the web-browser was timing out (web front end)... Darn that makes me feel foolish! Anyway, I went for a walk to relax and calm down and now I can get back to working on performance. Madison ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[GENERAL] Deleting all but one row of a list of non-uniques
Hi all

I have a table, for simplicity's sake containing one field, called unid. For example, select unid, oid from table gives me something like this:

   unid   |   oid
----------+----------
 XNO24ORK | 40276607
 XNPGJDPU | 40276673 *
 XNPGJDPU | 40276674 *
 XNXAAVQ2 | 40277583
 ZAZAFAA4 | 40276600
 ZAZV5UG4 | 40276446
 ZD66A1LL | 40277162
 ZDXZ27RS | 40277454
 ZEKRT3GM | 40277739 *
 ZEKRT3GM | 40277740 *
 ZEKRT3GM | 40277741 *

(I've marked the rows with duplicated unids with the * to the right.)

I'd like to delete rows in such a way that one (and any one) row for each unid remains, and all other duplicates are removed. Does anyone have any ideas that may help me here please? Selecting distinct is not helpful, as in reality there are other fields which contain data like timestamps that will differ; I don't mind which one remains, but I'd need to work with them on selects to the table afterwards.

I've tried:

delete from table where oid in
  (select p1.oid from table p1, table p2
   where p1.oid != p2.oid and p1.unid = p2.unid);

which only works in a few cases - I suspect where there are only two rows with the same unid. Is it even possible?

Alternatively, can I get the \copy command to ignore attempts to insert duplicated rows into a UNIQUE column instead of aborting the copy? Not sure if any of the options that can be supplied to the table at creation time for unique will help here.

Thanks.

Ciao
Zak

-- http://www.carfolio.com/    Searchable database of 10 000+ car specs

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Deleting all but one row of a list of non-uniques
On Wed, Jun 23, 2004 at 02:17:50AM +0200, Zak McGregor wrote: > Hi all > > > I have a table, for simplicity's sake containing one field, called unid. > (I've marked the rows with duplicated unids with the * to the right) > > I'd like to delete rows in such a way that one (and any one) row for each unid > remains, and all other duplicates are removed. Does anyone have any ideas that > may help me here please? The usual scheme I use is something like: (not perfect SQL) delete from table where exists (select from table where this.id < that.id). Hope this helps, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgparxoRVLE86.pgp Description: PGP signature
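Filled in for this particular case, that pattern might read as follows, keeping the row with the smallest oid for each unid ("mytable" stands in for the real table name):

DELETE FROM mytable
 WHERE oid NOT IN (SELECT min(oid)
                     FROM mytable
                    GROUP BY unid);

or, in the EXISTS form sketched above:

DELETE FROM mytable
 WHERE EXISTS (SELECT 1
                 FROM mytable t2
                WHERE t2.unid = mytable.unid
                  AND t2.oid  < mytable.oid);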
Re: [GENERAL] Connection gets into state where all queries fail
From Martijn van Oosterhout: > Hmm, it only happens when you send the whole lot in a single query > strings. That's not recommended IIRC. And from Tom Lane: > When you send multiple commands in a single query string, the entire > string is thrown away after an error. That's how it's always worked, > and I believe it's well-documented. There are apps that (mis?)use this > fact. OK, so I think I understand now. I'd earlier seen errors in multi-statement strings and this problem didn't happen. But those errors were with constraints that were deferred, so the error wasn't generated until the commit was being processed, so the transaction block was terminated. While this error (string too long for varchar column) happens as soon as the insert is handled and the commit is never seen. And as for: > Will, it's definitly repeatable: I suppose psql sends either a commit or rollback after the query generates the error, maybe after every query, so my attempt to use it to check this wasn't a valid test. Thanks for the help. -- Scott Ribe [EMAIL PROTECTED] http://www.killerbytes.com/ (303) 665-7007 voice ---(end of broadcast)--- TIP 8: explain analyze is your friend
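To make the difference concrete, a small sketch with throw-away table names: a deferred constraint failure is only raised by the COMMIT itself (which still ends the transaction block), whereas the varchar overflow is raised immediately, so a commit sent later in the same query string is never executed.

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (parent_id int REFERENCES parent(id) DEFERRABLE INITIALLY IMMEDIATE,
                     note      varchar(5));

-- deferred check: the error comes out of the COMMIT, and the block is closed either way
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
INSERT INTO child VALUES (999, 'ok');    -- no such parent yet, but the check is deferred
COMMIT;                                  -- ERROR raised here; the transaction ends anyway

-- immediate check: the error comes out of the INSERT itself
BEGIN;
INSERT INTO child VALUES (1, 'far too long for five');   -- ERROR raised now
COMMIT;   -- if this was part of the same query string, it is discarded with the rest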
Re: [GENERAL] Deleting all but one row of a list of non-uniques
Try this. Create a temp table with a list of the duplicate unids, e.g. (using "yourtable" for the table name):

create temp table duplicates as
  select min(oid) as oid_val, unid
  from yourtable
  group by unid
  having count(*) > 1;

Then isolate the unwanted rows:

update yourtable
  set unid = null
  from duplicates
  where yourtable.unid = duplicates.unid
  and oid_val <> yourtable.oid;

Then delete them:

delete from yourtable where unid is null;

Thanks Andrew

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Zak McGregor Sent: Wednesday, 23 June 2004 10:18 AM To: [EMAIL PROTECTED] Subject: [GENERAL] Deleting all but one row of a list of non-uniques [original message quoted in full; snipped]

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Connection gets into state where all queries fail
On Tue, Jun 22, 2004 at 07:02:28PM -0600, Scott Ribe wrote: > I suppose psql sends either a commit or rollback after the query generates > the error, maybe after every query, so my attempt to use it to check this > wasn't a valid test. Nope, psql breaks the statements on ';' and sends each query individually, so the issue never comes up. -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgph2q83tAM8J.pgp Description: PGP signature