[GENERAL] PostgreSQL Docs for Paper (benchmark and contrib)

2004-06-22 Thread Troy Campano

Hello,

I'm writing a paper on PostgreSQL.

I'm looking for:

1) benchmark comparisons of PostgreSQL against other databases
2) detailed descriptions (or project web sites) of all the 'contrib'
   projects (like dbsize, tsearch, xml, etc.)

Could someone point me in the right direction?

Thank you!

~ Troy Campano ~

[GENERAL] 7.5 Windows version

2004-06-22 Thread Jonathan Barnhart
Is there any chance I can get a pre-release executable of the Windows
native PostgreSQL so I can start doing some application development?  I
do not intend to deploy a buggy postgres, but even if it goes boom
occasionally it will allow me to build the app, so that when everything
is finalized I'm months ahead on development.

It's important to note that I mainly use Borland compilers, so unless
this will compile in C++ Builder, I'll need executables.

=
"We'll do the undoable, work the unworkable, scrute the inscrutable and have a long, 
hard look at the ineffable to see whether it might not be effed after all"



Re: [GENERAL] 7.5 Windows version

2004-06-22 Thread Bernard Clement
Hello Jonathan,

Take a look at this URL: http://techer.pascal.free.fr/postgis/psqlwin/

That French guy has a binary version of PG for Windows.

The zip is at: http://techer.pascal.free.fr/postgis/psqlwin.zip

Note that I have not tried it!

Good luck!

Bernard

On Tuesday 22 June 2004 15:13, Jonathan Barnhart wrote:
> Is there any chance I can get a pre-release executable of the Windows
> native PostgreSQL so I can start doing some application development?  I
> do not intend to deploy a buggy postgres, but even if it goes boom
> occasionally it will allow me to build the app, so that when everything
> is finalized I'm months ahead on development.
>
> It's important to note that I mainly use Borland compilers, so unless
> this will compile in C++ Builder, I'll need executables.
>
> =
> "We'll do the undoable, work the unworkable, scrute the inscrutable and
> have a long, hard look at the ineffable to see whether it might not be
> effed after all"




[GENERAL] More psql problems... >.

2004-06-22 Thread Madison Kelly
Hi all,
  I have got to say that my first foray into PostgreSQL is becoming a 
very maddening experience... I am sure it is my own fault for not knowing 
very much, but it seems that everything I have tried so far to improve 
performance has in fact made it a lot worse. Now my program dies after 
roughly 300 seconds of processing directories, and updates take literally 
10 times longer than inserts (which are themselves very slow).

  I am sorry for whining... I've been trying to figure this out non-stop 
for nearly two weeks...

  Anyway, I moved my backup program to another dedicated machine (an 
AMD Athlon 1.2GHz (1700+) with 512MB RAM and a Seagate Barracuda 7200.7, 
2MB buffer ATA/100 IDE drive). As it stands now I have increased shmmax 
to 128MB and in the 'postgresql.conf' I dropped max_connections to 10 
and upped shared_buffers to 4096.

  What is happening now is that the program does an 'ls' (system call) 
to get a list of the files and directories starting at the root of a 
mounted partition. These are read into an array which perl then 
processes one at a time. Each 'ls' value is searched for in the database 
and, if it doesn't exist, the values are inserted. If they do exist, they 
are updated (at 1/10th the speed). If the file is in fact a directory, 
perl jumps into it, again reads its contents into another array, and 
processes them one at a time. It will do this until all files and 
directories on the partition have been processed.
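
  In rough outline the loop looks like this -- a simplified, untested 
sketch (readdir here instead of the 'ls' system call, and the column 
names are made up; only 'file_dir' is the real table name):

use strict;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=backup", "", "",
                       { AutoCommit => 1, RaiseError => 1 });

# Prepared once, executed once per file.
my $sel = $dbh->prepare("SELECT 1 FROM file_dir WHERE file_path = ?");
my $ins = $dbh->prepare("INSERT INTO file_dir (file_path, file_size) VALUES (?, ?)");
my $upd = $dbh->prepare("UPDATE file_dir SET file_size = ? WHERE file_path = ?");

sub scan_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    for my $name (@entries) {
        my $path = "$dir/$name";
        my $size = -s $path;
        $sel->execute($path);
        if ($sel->fetchrow_array) {
            $upd->execute($size, $path);   # seen before: update (the slow path)
        } else {
            $ins->execute($path, $size);   # new entry: insert
        }
        $sel->finish;
        scan_dir($path) if -d $path && !-l $path;   # recurse into subdirectories
    }
}

scan_dir("/mnt/backup");   # root of the mounted partition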

  My previous question was performance-based; now I just need to get 
the darn thing working again. Like I said, after ~300 seconds perl dies. 
If I disable auto-commit then it dies the first time it runs an insert 
(this is all done on the same table, 'file_dir'). If I add a 'commit' 
before each select then a bunch of selects will work (a few dozen) and 
then it dies anyway.

  Does this sound at all like a common problem? Thanks for reading my 
gripe.

Madison
PS - PostgreSQL 7.4 on Fedora Core 2; indexes on the three columns I 
search, and my SELECT, UPDATE and INSERT calls are prepared.



Re: [GENERAL] PostgreSQL Docs for Paper (benchmark and contrib)

2004-06-22 Thread Richard Huxton
Troy Campano wrote:
Hello, 

I'm writing a paper on PostgreSQL.
 

I'm looking for:
1)   benchmark comparisons of PostgreSQL against other databases
You might want to start here: http://osdb.sourceforge.net/
Or if you go here, http://pgfoundry.org/ on the right-hand side is a 
project to start doing the TPC-W tests.

Most (all?) commercial databases won't let you publish benchmarks, and 
the only freely available results tend to come from expensive, heavily 
tuned hardware.

2)   Detailed descriptions (or project web sites) of all the 'contrib'
projects (like dbsize, tsearch, xml, etc).
New project home for PG-related stuff is at pgfoundry (above) - you 
might also be interested in:

PgAdmin
  http://www.pgadmin.org/
PgPool
  ftp://ftp.sra.co.jp/pub/cmd/postgres/pgpool/pgpool-2.0.tar.gz
Slony
  http://www.slony.org
All the contrib stuff has a README file in the source distribution. That 
should be enough to get you started :-)

--
  Richard Huxton
  Archonet Ltd


Re: [GENERAL] 7.5 Windows version

2004-06-22 Thread Richard Huxton
Jonathan Barnhart wrote:
Is there any chance I can get a pre-release executable of the Windows
native PostgreSQL so I can start doing some application development?  I
do not intend to deploy a buggy postgres, but even if it goes boom
occasionally it will allow me to build the app, so that when everything
is finalized I'm months ahead on development.
You might want to start with the Cygwin version at http://www.cygwin.com
Start using that today, perhaps, and move over to 7.5 when it goes into beta.
It's important to note that I mainly use Borland compilers, so unless
this will compile in C++ Builder, I'll need executables.
You'll probably want to check the mailing list archives for the Win32 
developers: http://www.postgresql.org/lists.html
I *think* there's a nightly build available, but I don't know how usable 
the Windows port is as a development target yet.

--
  Richard Huxton
  Archonet Ltd


Re: [GENERAL] More psql problems... >.

2004-06-22 Thread Alvaro Herrera
On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote:

>   What is happening now is that the program does an 'ls' (system call) 
> to get a list of the files and directories starting at the root of a 
> mounted partition. These are read into an array which perl then 
> processes one at a time. Each 'ls' value is searched for in the database 
> and, if it doesn't exist, the values are inserted. If they do exist, they 
> are updated (at 1/10th the speed). If the file is in fact a directory, 
> perl jumps into it, again reads its contents into another array, and 
> processes them one at a time. It will do this until all files and 
> directories on the partition have been processed.

So you read the entire filesystem again and again?  Sounds like a
horrible idea to me.  Have you tried using the mtimes, etc?
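
For example, something like this (an untested sketch; it assumes you also
store the mtime in a column, which your schema may not have):

use strict;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=backup", "", "", { RaiseError => 1 });
my $sel = $dbh->prepare("SELECT mtime FROM file_dir WHERE file_path = ?");

# True if the file's mtime matches what we stored on the last scan.
sub unchanged {
    my ($path) = @_;
    my $mtime = (stat($path))[9];    # field 9 of stat() is the mtime
    $sel->execute($path);
    my ($db_mtime) = $sel->fetchrow_array;
    $sel->finish;
    return defined $db_mtime && $db_mtime == $mtime;
}

# In the scan loop: next if unchanged($path); -- skips the expensive UPDATE.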


>   My previous question was performance based, now I just need to get 
> the darn thing working again. Like I said, after ~300 seconds perl dies. 

Out of memory?  If you save your whole filesystem in a Perl array you
are going to consume a lot of memory.  This is, of course, not Postgres
related, so I'm not sure why you are asking here.

-- 
Alvaro Herrera ()
Jajaja! Solo hablaba en serio!




Re: [GENERAL] More psql problems... >.

2004-06-22 Thread Madison Kelly
Alvaro Herrera wrote:
On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote:

 What is happening now is that the program does an 'ls' (system call) 
to get a list of the files and directories starting at the root of a 
mounted partition. These are read into an array which perl then 
processes one at a time. Each 'ls' value is searched for in the database 
and, if it doesn't exist, the values are inserted. If they do exist, they 
are updated (at 1/10th the speed). If the file is in fact a directory, 
perl jumps into it, again reads its contents into another array, and 
processes them one at a time. It will do this until all files and 
directories on the partition have been processed.

So you read the entire filesystem again and again?  Sounds like a
horrible idea to me.  Have you tried using the mtimes, etc?
  I haven't heard of 'mtimes' before; I'll google for it now.
 My previous question was performance based, now I just need to get 
the darn thing working again. Like I said, after ~300 seconds perl dies. 

Out of memory?  If you save your whole filesystem in a Perl array you
are going to consume a lot of memory.  This is, of course, not Postgres
related, so I'm not sure why you are asking here.
  Running just the perl portion which reads and parses the file system 
works fine and fast. It isn't until I make the DB calls that everything 
breaks. I know that the DB will slow things down, but the amount of 
performance loss I am seeing and the flat-out breaking of the program 
can't be reasonable.

  Besides, PostgreSQL should be able to handle 250,000 SELECTs 
followed by an UPDATE or INSERT for each on an AMD Athlon 1700+ with 
512MB RAM, shouldn't it? Also, the program is dying after 5 minutes 
even when the calls are being committed automatically, so the work being 
done shouldn't be filling any memory, should it?

Madison


Re: [GENERAL] 7.5 Windows version

2004-06-22 Thread Holger Hoffstaette
On Tue, 22 Jun 2004 12:13:01 -0700, Jonathan Barnhart wrote:

> Is there any chance I can get a pre-release executable of the Windows
> native PostgreSQL so I can start doing some application development?  I do
> not intend to deploy a buggy postgres, but even if it goes boom
> occasionally it will allow me to build the app, so that when everything is
> finalized I'm months ahead on development.

Have a look at:
http://momjian.postgresql.org/main/writings/pgsql/win32.html

I have no idea how usable this is (just tracking the progress), but
there's a binary snapshot.

Holger





Re: [GENERAL] Problem while OS-Update

2004-06-22 Thread Heiko Pohl

Tom Lane wrote:
Heiko Pohl <[EMAIL PROTECTED]> writes:
 

-bash-2.05b$ /usr/local/pgsql/bin/postmaster -D /var/lib/pgsql.old/data
FATAL 2:  The database cluster was initialized with LC_COLLATE 
'de_DE.UTF-8' and
   LC_CTYPE 'de_DE.UTF-8', but the server was compiled without 
locale support.
   It looks like you need to initdb or recompile.
   

You need configure --enable-locale, or some such switch.
 

Got it!!!  :-)
I have my data now running on the 7.4 server and nothing is lost. Great! 
Thank you very much!

Heiko Pohl


[GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Scott Ribe
OK, I'm using a pool of worker threads that each have a connection (from my
own software, no PHP or anything like that involved), and a connection is in
a state where all queries fail. Looking back through the log I find the
first entry where this connection has an error (there are successful queries
on it prior, so it's not the first query):


2004-06-21 14:51:19 [5589]   LOG:  query: begin; set constraints all
deferred;
insert into "PatientCall_Step"
("id", "PatientCallId", "HandledByStaffId", "AssignedToStaffId", "DoneWhen",
"Summary", "Notes", "ContactIsAnxious", "IsMedicallyUrgent",
"PageMessageId", "AssignToNurseTakingCalls", "AssignNextToNurseTakingCalls")
values (7991, 7774, 944557, 297199, '2004-06-21 19:43:00.000-00', '...',
'...', 'f', 'f', 7992, 'f', 'f');
insert into "PageMessage"
("id", "FromStaffId", "ToStaffId", "PagerNum", "PagerMessage", "Source")
values (7992, 944557, 297199, '7872', 'bogus value', 'PatientCall');
commit;

2004-06-21 14:51:19 [5589]   ERROR:  value too long for type character
varying(80)


I've removed the actual varchar values, because the data is confidential.
Suffice it to say that there's a hole in the UI, and where the above says
'bogus value' there was indeed a value too long for the column, and the
value seems to have had an embedded CR toward the end. So the error message
from pg is correct.

But the very next query on that process is a simple one, and it fails like
so:


2004-06-21 14:51:58 [5589]   LOG:  query: select * from "PatientCall" where
"PatientId" =' 534824';

2004-06-21 14:51:58 [5589]   ERROR:  current transaction is aborted, queries
ignored until end of transaction block


In fact every transaction since then (nearly 300) on that same process has
failed with the same error. This is 7.3.2 running on OS X (I've tested 7.4
and intend to upgrade RSN.)

So I don't understand the errors. I even tried copying the initial query
from the log and pasting into psql, got the error, and was able to continue
making queries without further errors. Note that this is not a perfect way
to test, because the "embedded CR" might have been switched from CR to LF or
vice versa at some point in the logging/downloading/copying/pasting process.

- Is there something I need to do after an error like this, in order to get
the connection back to a usable state?

- Is this a bug?

- Is there anything I can do to provide more information? I still have the
server running right now, but I'm going to have to reboot soon because this
is causing problems.


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice




Re: [GENERAL] More psql problems... >.

2004-06-22 Thread Martijn van Oosterhout
On Tue, Jun 22, 2004 at 03:52:03PM -0400, Madison Kelly wrote:
> Hi all,
> 
>   I have got to say that my first foray into PostgreSQL is becoming a 
> very maddening experience... I am sure it is my own fault for not knowing 
> very much, but it seems that everything I have tried so far to improve 
> performance has in fact made it a lot worse. Now my program dies after 
> roughly 300 seconds of processing directories, and updates take literally 
> 10 times longer than inserts (which are themselves very slow).


> 
>   My previous question was performance-based; now I just need to get 
> the darn thing working again. Like I said, after ~300 seconds perl dies. 
> If I disable auto-commit then it dies the first time it runs an insert 
> (this is all done on the same table, 'file_dir'). If I add a 'commit' 
> before each select then a bunch of selects will work (a few dozen) and 
> then it dies anyway.

What is "dying"? Do you get an error message? From your message there
is nowhere near enough information to give you any good answers. Have
you run VACUUM [FULL|ANALYZE] recently? Are your indexes being used? If
you really want help, post your Per code, queries, database schema,
exmaple output, anything that might indicate what your actual problem.
We are, unfortunatly, not psychic.

We run PostgreSQL with several simultaneous users on several million
rows of data doing a few hundred thousand queries a day and the
database is not the bottleneck.

-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.




[GENERAL] flock user defined function

2004-06-22 Thread Chris Goughnour
I'm trying to write two C language user-defined functions, lockFile() and
unlockFile(), that call flock() using LOCK_EX and LOCK_UN respectively.  If I
call lockFile from a first psql process it returns successfully.  Calling
lockFile from a second psql process blocks. However, when I call unlockFile
from the first psql process, the second process still blocks.  The lockFile
call from the second psql process doesn't return until I kill the first
psql process.
Any suggestions? Thanks in advance.
Chris Goughnour

PG_FUNCTION_INFO_V1(lockFile);
Datum lockFile(PG_FUNCTION_ARGS){
    text *t = PG_GETARG_TEXT_P(0);
    char *path = palloc(VARSIZE(t) - VARHDRSZ + 1);
    int fileHandle, status;
    memcpy((void *)path, (void *)VARDATA(t), VARSIZE(t) - VARHDRSZ);
    path[VARSIZE(t) - VARHDRSZ] = 0;
    fileHandle = open((const char *)path, O_RDONLY);
    if (fileHandle == -1) {
        PG_RETURN_INT32(-1);
    }
    if (flock(fileHandle, LOCK_EX) == -1) {
        PG_RETURN_INT32(-1);
    }
    PG_RETURN_INT32(0);
}

PG_FUNCTION_INFO_V1(unlockFile);
Datum unlockFile(PG_FUNCTION_ARGS){
    text *t = PG_GETARG_TEXT_P(0);
    char *path = palloc(VARSIZE(t) - VARHDRSZ + 1);
    int fileHandle;
    memcpy((void *)path, (void *)VARDATA(t), VARSIZE(t) - VARHDRSZ);
    path[VARSIZE(t) - VARHDRSZ] = 0;
    fileHandle = open((const char *)path, O_RDONLY);
    if (fileHandle == -1) {
        PG_RETURN_INT32(-1);
    }
    if (flock(fileHandle, LOCK_UN) == -1) {
        PG_RETURN_INT32(-1);
    }
    PG_RETURN_INT32(0);
}




Re: [GENERAL] [ANNOUNCE] == PostgreSQL Weekly News - June 22nd 2004

2004-06-22 Thread Rick Gigger
Don't forget the auto-vacuum daemon!
Martijn van Oosterhout wrote:
Crikey! Tablespaces, Win32, nested transactions and PITR. Almost worth
a version 8 :)
On Tue, Jun 22, 2004 at 12:37:39AM -0400, Robert Treat wrote:
== PostgreSQL Weekly News - June 22nd 2004 ==




Re: [GENERAL] flock user defined function

2004-06-22 Thread Martijn van Oosterhout
On Tue, Jun 22, 2004 at 02:49:27PM -0700, Chris Goughnour wrote:
> I'm trying to write two C language user defined functions, lockfile() and
> unlockfile(), that call flock using LOCK_EX and LOCK_UN respectively.  If I
> call lockfile from a first psql process it returns successfully.  Calling
> lockfile from a second psql process blocks. However, when I call unlockfile
> from the first psql process, the second process still blocks.  The lockfile
> call from the second psql process doesn't return until I kill the first
> psql process.
> Any suggestions? Thanks in advance.
> Chris Goughnour



Where do you close the file? That might cause some issues.
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Martijn van Oosterhout
On Tue, Jun 22, 2004 at 03:06:39PM -0600, Scott Ribe wrote:
> OK, I'm using a pool of worker threads that each have a connection (from my
> own software, no PHP or anything like that involved), and a connection is in
> a state where all queries fail. Looking back through the log I find the
> first entry where this connection has an error (there are successful queries
> on it prior, so it's not the first query):
> 

Your problem is that you have an error within a transaction and as the
error message states:

> 2004-06-21 14:51:58 [5589]   ERROR:  current transaction is aborted, queries
> ignored until end of transaction block

All your queries will be ignored until you complete the transaction,
either with a commit or a rollback. I find the message very clear; how
do you think it could be reworded to be clearer?

> - Is there something I need to do after an error like this, in order to get
> the connection back to a usable state?

COMMIT or ROLLBACK

> - Is this a bug?

In your program, yes. Although I find it interesting that your commit
is ignored after the error. Just send it as a separate query.

> - Is there anything I can do to provide more information? I still have the
> server running right now, but I'm going to have to reboot soon because this
> is causing problems.

Just add a transaction commit or rollback after the error.
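
In DBI terms that looks something like this (an untested sketch; the
table and column names are made up):

use strict;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=test", "", "",
                       { AutoCommit => 0, RaiseError => 1 });

my $too_long = 'x' x 200;   # deliberately longer than a varchar(80)
eval {
    $dbh->do("INSERT INTO mytable (note) VALUES (?)", undef, $too_long);
    $dbh->commit;
};
if ($@) {
    warn "query failed: $@";
    $dbh->rollback;   # ends the aborted transaction block; the
}                     # connection is usable again afterwards
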
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Scott Ribe
>> OK, I'm using a pool of worker threads that each have a connection (from my
>> own software, no PHP or anything like that involved), and a connection is in
>> a state where all queries fail. Looking back through the log I find the
>> first entry where this connection has an error (there are successful queries
>> on it prior, so it's not the first query):
>> 
> 
> Your problem is that you have an error within a transaction and as the
> error message states:
> 
>> 2004-06-21 14:51:58 [5589]   ERROR:  current transaction is aborted, queries
>> ignored until end of transaction block
> 
> All your queries will be ignored until you complete the transaction,
> either with a commit or a rollback. I find the message very clear; how
> do you think it could be reworded to be clearer?

What's not clear is why I should be told this when the invalid query ended
with a commit.

>> - Is there something I need to do after an error like this, in order to get
>> the connection back to a usable state?
> 
> COMMIT or ROLLBACK
> 
>> - Is this a bug?
> 
> In your program, yes. Although I find it interesting that your commit
> is ignored after the error. Just send it as a separate query.

Yes, exactly. I've never seen a commit be ignored like this--believe me I've
had other erroneous queries, and the connections never got borked like this.

>> - Is there anything I can do to provide more information? I still have the
>> server running right now, but I'm going to have to reboot soon because this
>> is causing problems.
> 
> Just add a transaction commit or rollback after the error.

I'll certainly add such code as a fail-safe, but I'd still like to
understand more fully what has happened. Can I really be 100% sure this
would keep the connection usable, given that the one commit already somehow
failed to end the transaction block? I can certainly understand that a
commit submitted by itself might be recognized where for some reason the
original one was skipped over, but I'd still like to know more about what
happened.


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Martijn van Oosterhout
On Tue, Jun 22, 2004 at 04:30:15PM -0600, Scott Ribe wrote:
> > All your queries will be ignored until you complete the transaction,
> > either with a commit or a rollback. I find the message very clear; how
> > do you think it could be reworded to be clearer?
> 
> What's not clear is why I should be told this when the invalid query ended
> with a commit.

Hmm, it only happens when you send the whole lot in a single query
string. That's not recommended, IIRC.

> > In your program, yes. Although I find it interesting that your commit
> > is ignored after the error. Just send it as a separate query.
> 
> Yes, exactly. I've never seen a commit be ignored like this--believe me I've
> had other erroneous queries, and the connections never got borked like this.

Well, it's definitely repeatable:

$ perl -MPg -e '$db = Pg::connectdb("dbname=kleptog");
$db->exec("begin; select error; commit;");
$db->exec("select 1"); 
print $db->errorMessage;' 
ERROR:  current transaction is aborted, queries ignored until end of
transaction block

> I'll certainly add such code as a fail-safe, but I'd still like to
> understand more fully what has happened. Can I really be 100% sure this
> would keep the connection usable, given that the 1 commit already somehow
> failed to end the transaction block? I can certainly understand that a
> commit submitted by itself might be recognized where for some reason the
> original one was skipped over, but I'd still like to know more about what
> happened.

If you send each query in a separate request, it'll work. I don't know
if the current behaviour is really a bug...

-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.




Re: [GENERAL] flock user defined function

2004-06-22 Thread Tom Lane
Chris Goughnour <[EMAIL PROTECTED]> writes:
> Any suggestions? Thanks in advance.

I believe locks are associated with file descriptors (what you're
miscalling a handle).  The unlock function cannot release a lock
that is held via a different descriptor.  What it needs to be doing
is closing the descriptor that lockFile opened.  This would also
solve the rather serious descriptor-leak problem you've got.
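
You can see the same semantics from Perl, whose flock() wraps the same
call (an untested sketch; the path is arbitrary):

use Fcntl qw(:flock);

# The lock belongs to this open descriptor ...
open(my $fh, '<', '/tmp/lockfile') or die "open: $!";
flock($fh, LOCK_EX) or die "flock: $!";

# ... so closing the descriptor is what releases it; no LOCK_UN needed.
close($fh);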

regards, tom lane



Re: [GENERAL] flock user defined function

2004-06-22 Thread Doug McNaught
Martijn van Oosterhout <[EMAIL PROTECTED]> writes:

> On Tue, Jun 22, 2004 at 02:49:27PM -0700, Chris Goughnour wrote:
>> I'm trying to write two C language user defined functions, lockfile() and
>> unlockfile(), that call flock using LOCK_EX and LOCK_UN respectively.  If I
>> call lockfile from a first psql process it returns successfully.  Calling
>> lockfile from a second psql process blocks. However, when I call unlockfile
>> from the first psql process, the second process still blocks.  The lockfile
>> call from the second psql process doesn't return until I kill the first
>> psql process.
>> Any suggestions? Thanks in advance.
>> Chris Goughnour
>
> 
>
> Where do you close the file? That might cause some issues.

Yeah, it's generally best not to call LOCK_UN at all, but just to
close the file (which will release the locks).  Otherwise, if you are
using stdio, you can get a situation where the file is unlocked but
its stdio buffer has not been flushed, leading to the corruption you
were trying to avoid by locking the file...

-Doug



[GENERAL] Planner choice & tuning

2004-06-22 Thread Ed L.

The planner is choosing a sequential scan for my query.  I am trying
to understand why, since it is clearly not the fastest choice, and 
what the proper tuning dial is to adjust here.  Here's the query:

select count(1) from measurement where measure_id = 98;

TIA.  Here are the details:

% psql -c "\d measurement"
                                  Table "public.measurement"
   Column   |           Type           |                          Modifiers
------------+--------------------------+--------------------------------------------------------------
 id         | integer                  | not null default nextval('public.measurement_id_seq'::text)
 host_id    | integer                  | not null
 measure_id | integer                  | not null
 value      | double precision         | not null
 duration   | double precision         | not null default 0.0
 moment     | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
 source_id  | integer                  | not null default 1
Indexes: measurement_duration_idx btree (duration),
         measurement_host_idx btree (host_id),
         measurement_measure_idx btree (measure_id),
         measurement_moment_idx btree (moment),
         measurement_source_idx btree (source_id),
         measurement_value_idx btree (value)
Foreign Key constraints: $2 FOREIGN KEY (measure_id) REFERENCES measure(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $1 FOREIGN KEY (source_id) REFERENCES measurement_source(id) ON UPDATE NO ACTION ON DELETE NO ACTION

% psql -c "analyze measurement"
ANALYZE

% psql -c "select count(1) from measurement"
  count  
-
 1800866
(1 row)


% psql -c "select count(1) from measurement where measure_id = 98"
 count 
---
 38862
(1 row)

% time psql -c "set enable_seqscan=no; explain analyze select count(1) from 
measurement where measure_id = 98"
   QUERY PLAN  
 

 Aggregate  (cost=215208.66..215208.66 rows=1 width=0) (actual time=904.58..904.58 
rows=1 loops=1)
   ->  Index Scan using measurement_measure_idx on measurement  (cost=0.00..215062.64 
rows=58408 width=0) (actual time=0.17..843.78 rows=38866 loops=1)
 Index Cond: (measure_id = 98)
 Total runtime: 904.77 msec
(4 rows)


real0m1.298s
user0m0.010s
sys 0m0.000s

% time psql -c "explain analyze select count(1) from measurement where measure_id = 98"
   QUERY PLAN  
  
-
 Aggregate  (cost=97857.09..97857.09 rows=1 width=0) (actual time=2493.90..2493.90 
rows=1 loops=1)
   ->  Seq Scan on measurement  (cost=0.00..97711.07 rows=58408 width=0) (actual 
time=12.94..2430.08 rows=38866 loops=1)
 Filter: (measure_id = 98)
 Total runtime: 2494.11 msec
(4 rows)


real0m2.885s
user0m0.000s
sys 0m0.000s


This seems to be saying the planner thinks it's less expensive to do the
sequential scan, but why?

Including pg_stats data in case it is relevant here.

% psql -c "select * from pg_stats where tablename = 'measurement' and attname = 
'measure_id'"
 schemaname |  tablename  |  attname   | null_frac | avg_width | n_distinct |  
   most_common_vals  |most_common_freqs
 |histogram_bounds | correlation 
+-++---+---++---+--+-+-
 public | measurement | measure_id | 0 | 4 | 52 | 
{104,108,113,124,106,110,101,107,112,109} | 
{0.084,0.0556667,0.052,0.05,0.048,0.047,0.038,0.036,0.034,0.0326667} | 
{23,36,39,43,85,89,100,111,120,122,128} |0.232087
(1 row)

% psql -c "select name, setting from pg_settings where name like 'random%'"
   name   | setting 
--+-
 random_page_cost | 4
(1 row)






Re: [GENERAL] flock user defined function

2004-06-22 Thread Chris Goughnour
Thanks. Yeah, I figured that out after Martijn's response. I'm just
returning the file descriptor from lockFile, passing it to unlockFile, and
closing the descriptor there.  It works fine now. Thanks for edifying a
clueless novice such as myself. :-)
> Chris Goughnour <[EMAIL PROTECTED]> writes:
>> Any suggestions? Thanks in advance.
> 
> I believe locks are associated with file descriptors (what you're
> miscalling a handle).  The unlock function cannot release a lock
> that is held via a different descriptor.  What it needs to be doing
> is closing the descriptor that lockFile opened.  This would also
> solve the rather serious descriptor-leak problem you've got.
> 
> regards, tom lane




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Tom Lane
Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> On Tue, Jun 22, 2004 at 04:30:15PM -0600, Scott Ribe wrote:
>> What's not clear is why I should be told this when the invalid query ended
>> with a commit.

> Hmm, it only happens when you send the whole lot in a single query
> strings. That's not recommended IIRC.

When you send multiple commands in a single query string, the entire
string is thrown away after an error.  That's how it's always worked,
and I believe it's well-documented.  There are apps that (mis?)use this
fact.
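
For instance, resending the statements from the example upthread one
exec() at a time behaves the way you'd expect (untested sketch):

$ perl -MPg -e '$db = Pg::connectdb("dbname=kleptog");
$db->exec("begin");
$db->exec("select error");   # fails and aborts the transaction
$db->exec("rollback");       # sent as its own query, so it is not thrown away
$db->exec("select 1");       # succeeds again
print $db->errorMessage;'    # prints nothing: the last query worked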

regards, tom lane



Re: [GENERAL] Planner choice & tuning

2004-06-22 Thread Scott Marlowe
On Tue, 2004-06-22 at 16:47, Ed L. wrote:
> The planner is choosing a sequential scan for my query.  I am trying
> to understand why since it is clearly not the fastest choice, and 
> what the proper tuning dial is to adjust here.  Here's the query:
> 
>   select count(1) from measurement where measure_id = 98;
> 
> TIA.  Here are the details:
> 
> % psql -c "\d measurement"
>  Table "public.measurement"
>Column   |   Type   |  Modifiers  
> 
> +--+-
>  id | integer  | not null default 
> nextval('public.measurement_id_seq'::text)
>  host_id| integer  | not null
>  measure_id | integer  | not null
>  value  | double precision | not null
>  duration   | double precision | not null default 0.0
>  moment | timestamp with time zone | not null default 
> ('now'::text)::timestamp(6) with time zone
>  source_id  | integer  | not null default 1
> Indexes: measurement_duration_idx btree (duration),
>  measurement_host_idx btree (host_id),
>  measurement_measure_idx btree (measure_id),
>  measurement_moment_idx btree (moment),
>  measurement_source_idx btree (source_id),
>  measurement_value_idx btree (value)
> Foreign Key constraints: $2 FOREIGN KEY (measure_id) REFERENCES measure(id) ON 
> UPDATE NO ACTION ON DELETE NO ACTION,
>  $1 FOREIGN KEY (source_id) REFERENCES 
> measurement_source(id) ON UPDATE NO ACTION ON DELETE NO ACTION
> 
> % psql -c "analyze measurement"
> ANALYZE
> 
> % psql -c "select count(1) from measurement"
>   count  
> -
>  1800866
> (1 row)
> 
> 
> % psql -c "select count(1) from measurement where measure_id = 98"
>  count 
> ---
>  38862
> (1 row)
> 
> % time psql -c "set enable_seqscan=no; explain analyze select count(1) from 
> measurement where measure_id = 98"
>QUERY PLAN
>
> 
>  Aggregate  (cost=215208.66..215208.66 rows=1 width=0) (actual time=904.58..904.58 
> rows=1 loops=1)
>->  Index Scan using measurement_measure_idx on measurement  
> (cost=0.00..215062.64 rows=58408 width=0) (actual time=0.17..843.78 rows=38866 
> loops=1)
>  Index Cond: (measure_id = 98)
>  Total runtime: 904.77 msec
> (4 rows)
> 
> 
> real0m1.298s
> user0m0.010s
> sys 0m0.000s
> 
> % time psql -c "explain analyze select count(1) from measurement where measure_id = 
> 98"
>QUERY PLAN
> 
> -
>  Aggregate  (cost=97857.09..97857.09 rows=1 width=0) (actual time=2493.90..2493.90 
> rows=1 loops=1)
>->  Seq Scan on measurement  (cost=0.00..97711.07 rows=58408 width=0) (actual 
> time=12.94..2430.08 rows=38866 loops=1)
>  Filter: (measure_id = 98)
>  Total runtime: 2494.11 msec
> (4 rows)
> 
> 
> real0m2.885s
> user0m0.000s
> sys 0m0.000s
> 
> 
> This seems to be saying the planner thinks its less expensive to do the
> sequential scan, but why?
> 
> Including pg_stats data in case it is relevant here.
> 
> % psql -c "select * from pg_stats where tablename = 'measurement' and attname = 
> 'measure_id'"
>  schemaname |  tablename  |  attname   | null_frac | avg_width | n_distinct |
>  most_common_vals  |
> most_common_freqs |histogram_bounds  
>| correlation 
> +-++---+---++---+--+-+-
>  public | measurement | measure_id | 0 | 4 | 52 | 
> {104,108,113,124,106,110,101,107,112,109} | 
> {0.084,0.0556667,0.052,0.05,0.048,0.047,0.038,0.036,0.034,0.0326667} | 
> {23,36,39,43,85,89,100,111,120,122,128} |0.232087
> (1 row)
> 
> % psql -c "select name, setting from pg_settings where name like 'random%'"
>name   | setting 
> --+-
>  random_page_cost | 4
> (1 row)

I'd say your random_page_cost is too high for your setup.  While there's
a slight misguess by the planner on the number of rows (38k versus 58k),
it's not that big.  Try values between 1.2 and 2.0.  Most larger se

Re: [GENERAL] More psql problems... >.

2004-06-22 Thread Scott Marlowe
On Tue, 2004-06-22 at 13:52, Madison Kelly wrote:
> Hi all,
> 
> I have got to say that my first foray into PostgreSQL is becoming a 
> very maddening experience... I am sure it is my own fault for not knowing 
> very much, but it seems that everything I have tried so far to improve 
> performance has in fact made it a lot worse. Now my program dies after 
> roughly 300 seconds of processing directories, and updates take literally 
> 10 times longer than inserts (which are themselves very slow).

/SNIP

> My previous question was performance-based; now I just need to get 
> the darn thing working again. Like I said, after ~300 seconds perl dies. 
> If I disable auto-commit then it dies the first time it runs an insert 
> (this is all done on the same table, 'file_dir'). If I add a 'commit' 
> before each select then a bunch of selects will work (a few dozen) and 
> then it dies anyway.

Is it exactly 300 seconds?  Sounds like a timeout somewhere to me.  Does
perl have one of those?  Or maybe your OS / shell combo does, or something
like it?  Just guessing here.
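
(Perl doesn't impose one by default, but an alarm() somewhere in the
stack would produce exactly this behaviour -- an untested sketch of the
pattern to look for:)

use strict;

sub do_slow_work { sleep 400 }    # stand-in for the directory scan

local $SIG{ALRM} = sub { die "timed out\n" };
eval {
    alarm(300);   # deliver SIGALRM after 300 seconds
    do_slow_work();
    alarm(0);     # cancel the timer if we finish in time
};
print "caught: $@" if $@;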




Re: [GENERAL] Planner choice & tuning

2004-06-22 Thread Tom Lane
"Ed L." <[EMAIL PROTECTED]> writes:
> This seems to be saying the planner thinks its less expensive to do the
> sequential scan, but why?

Because it thinks it's less expensive ;-)

There are a couple issues here: one is overestimation of the number of
rows matching the query.  That is a statistical issue and should be
fixable by increasing the statistics target for the column.  With an
accurate rows estimate the seqscan cost estimate would not change but
the indexscan cost estimate would decrease approximately proportionally.

Allowing for the row estimation error, the indexscan cost estimate is
still about 4x what it ought to be, which means that you could bring the
estimated cost (for this query anyway) in line with reality by setting
random_page_cost = 1.

Note however that we are defining "reality" as "the results of this one
single experiment".  You should realize in particular that the table is
probably fully cached in memory during your test, which is a scenario
in which random_page_cost actually is 1.0 (given reasonable assumptions
about the behavior of the kernel's cache, anyway).  If you optimize for
this case then you are likely to be pessimizing the behavior for larger
tables that don't fit in memory.

My suggestion would be to lower random_page_cost to 3 or so, which would
be enough to tip the decision to indexscan for this case, though not to
make the estimated cost really truly correct.  You should however run a
larger set of experiments before doing anything, and realize that any
random_page_cost setting is a compromise because the model doesn't take
all the variables into account.

You can find much more about this issue in the pgsql-performance archives.
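
Both knobs can be tried without touching postgresql.conf; for instance
(a sketch using DBI, though the same statements work directly from psql;
the dbname is a placeholder):

use strict;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "", "", { RaiseError => 1 });

$dbh->do("SET random_page_cost = 3");   # affects only this session

# A larger statistics sample for the skewed column, then re-gather stats:
$dbh->do("ALTER TABLE measurement ALTER COLUMN measure_id SET STATISTICS 100");
$dbh->do("ANALYZE measurement");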

regards, tom lane



Re: [GENERAL] More psql problems... >.

2004-06-22 Thread Madison Kelly
Scott Marlowe wrote:
  My previous question was performance-based; now I just need to get 
the darn thing working again. Like I said, after ~300 seconds perl dies. 
If I disable auto-commit then it dies the first time it runs an insert 
(this is all done on the same table, 'file_dir'). If I add a 'commit' 
before each select then a bunch of selects will work (a few dozen) and 
then it dies anyway.

Is it exactly 300 seconds?  Sounds like a timeout somewhere to me.  Does
perl have one of those?  Or maybe your OS / shell combo does, or something
like it?  Just guessing here.
Hi Scott,
  Thanks for replying. In my frustration I missed the obvious: the 
web browser was timing out (web front end)... Darn, that makes me feel 
foolish! Anyway, I went for a walk to relax and calm down, and now I can 
get back to working on performance.

  Madison


[GENERAL] Deleting all but one row of a list of non-uniques

2004-06-22 Thread Zak McGregor
Hi all


I have a table, for simplicity's sake containing one field, called unid. 


for example, select unid, oid from table gives me something like this:
 
   unid   |   oid
----------+----------
 XNO24ORK | 40276607
 XNPGJDPU | 40276673 *
 XNPGJDPU | 40276674 *
 XNXAAVQ2 | 40277583
 ZAZAFAA4 | 40276600
 ZAZV5UG4 | 40276446
 ZD66A1LL | 40277162
 ZDXZ27RS | 40277454
 ZEKRT3GM | 40277739 *
 ZEKRT3GM | 40277740 *
 ZEKRT3GM | 40277741 *

(I've marked the rows with duplicated unids with the * to the right)

I'd like to delete rows in such a way that one (and any one) row for each unid
remains, and all other duplicates are removed. Does anyone have any ideas that
may help me here please?

Selecting distinct is not helpful, as in reality there are other fields which
contain data like timestamps that will differ; I don't mind which one
remains, but I'd need to work with them on selects to the table afterwards.

 I've tried:

 delete from table where oid in (select p1.oid from table p1, table p2 where
p1.oid != p2.oid and p1.unid=p2.unid);

which only works in a few cases - I suspect where there are only two rows with
the same unid. Is it even possible?

Alternatively, can I get the \copy command to ignore attempts to insert
duplicated rows into a UNIQUE column instead of aborting the copy? Not sure if
any of the options that can be supplied to the table at creation time for unique
will help here.

Thanks.

Ciao

Zak

--

http://www.carfolio.com/        Searchable database of 10 000+ car specs




Re: [GENERAL] Deleting all but one row of a list of non-uniques

2004-06-22 Thread Martijn van Oosterhout
On Wed, Jun 23, 2004 at 02:17:50AM +0200, Zak McGregor wrote:
> Hi all
> 
> 
> I have a table, for simplicity's sake containing one field, called unid. 



> (I've marked the rows with duplicated unids with the * to the right)
> 
> I'd like to delete rows in such a way that one (and any one) row for each unid
> remains, and all other duplicates are removed. Does anyone have any ideas that
> may help me here please?

The usual scheme I use is something like this ("table" standing in for
your table name; it keeps the row with the lowest oid for each unid):

delete from table
 where exists (select 1 from table t
                where t.unid = table.unid
                  and t.oid < table.oid);

Hope this helps,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Scott Ribe
From Martijn van Oosterhout:

> Hmm, it only happens when you send the whole lot in a single query
> strings. That's not recommended IIRC.

And from Tom Lane:

> When you send multiple commands in a single query string, the entire
> string is thrown away after an error.  That's how it's always worked,
> and I believe it's well-documented.  There are apps that (mis?)use this
> fact.

OK, so I think I understand now. I'd earlier seen errors in multi-statement
strings where this problem didn't happen. But those errors were with
constraints that were deferred, so the error wasn't generated until the
commit was being processed, and the transaction block was terminated. This
error (string too long for a varchar column) happens as soon as the
insert is handled, and the commit is never seen.
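
That difference is easy to demonstrate (an untested sketch, assuming a
foreign key declared DEFERRABLE; the table names are made up):

use strict;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=test", "", "",
                       { AutoCommit => 0, RaiseError => 1 });

# Assumes: child.parent_id REFERENCES parent(id) DEFERRABLE.
$dbh->do("SET CONSTRAINTS ALL DEFERRED");
$dbh->do("INSERT INTO child (parent_id) VALUES (999)");   # no error yet

eval { $dbh->commit };                # the FK is checked here, at commit,
print "failed at commit: $@" if $@;   # so the transaction block still ends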

And as for:

> Well, it's definitely repeatable:

I suppose psql sends either a commit or rollback after the query generates
the error, maybe after every query, so my attempt to use it to check this
wasn't a valid test.

Thanks for the help.


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice




Re: [GENERAL] Deleting all but one row of a list of non-uniques

2004-06-22 Thread Andrew Bartley
Try this.

Create a temp table with a list of the duplicate unids (using "mytable"
as a placeholder for your table name):

create temp table duplicates as
select min(oid) as oid_val, unid
from mytable
group by unid having count(*) > 1;

Then isolate the unwanted rows:

update mytable
set unid = null
from duplicates
where mytable.unid = duplicates.unid
and duplicates.oid_val <> mytable.oid;

Then delete them:

delete from mytable where unid is null;

Thanks

Andrew

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Zak McGregor
Sent: Wednesday, 23 June 2004 10:18 AM
To: [EMAIL PROTECTED]
Subject: [GENERAL] Deleting all but one row of a list of non-uniques


Hi all


I have a table, for simplicity's sake containing one field, called unid.


for example, select unid, oid from table gives me something like this:

   unid   |   oid
----------+----------
 XNO24ORK | 40276607
 XNPGJDPU | 40276673 *
 XNPGJDPU | 40276674 *
 XNXAAVQ2 | 40277583
 ZAZAFAA4 | 40276600
 ZAZV5UG4 | 40276446
 ZD66A1LL | 40277162
 ZDXZ27RS | 40277454
 ZEKRT3GM | 40277739 *
 ZEKRT3GM | 40277740 *
 ZEKRT3GM | 40277741 *

(I've marked the rows with duplicated unids with the * to the right)

I'd like to delete rows in such a way that one (and any one) row for each
unid remains, and all other duplicates are removed. Does anyone have any
ideas that may help me here please?

Selecting distinct is not helpful, as in reality there are other fields
which contain data like timestamps that will differ; I don't mind which
one remains, but I'd need to work with them on selects to the table
afterwards.

 I've tried:

 delete from table where oid in (select p1.oid from table p1, table p2 where
p1.oid != p2.oid and p1.unid=p2.unid);

which only works in a few cases - I suspect where there are only two rows
with the same unid. Is it even possible?

Alternatively, can I get the \copy command to ignore attempts to insert
duplicated rows into a UNIQUE column instead of aborting the copy? Not
sure if any of the options that can be supplied to the table at creation
time for unique will help here.

Thanks.

Ciao

Zak

--

http://www.carfolio.com/        Searchable database of 10 000+ car specs




Re: [GENERAL] Connection gets into state where all queries fail

2004-06-22 Thread Martijn van Oosterhout
On Tue, Jun 22, 2004 at 07:02:28PM -0600, Scott Ribe wrote:
> I suppose psql sends either a commit or rollback after the query generates
> the error, maybe after every query, so my attempt to use it to check this
> wasn't a valid test.

Nope, psql breaks the statements on ';' and sends each query
individually, so the issue never comes up.
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

