Re: [HACKERS] Problem with ControlFileData structure being ABI dependent

2007-12-07 Thread Dave Page

Dave Page wrote:

but the CRC is still different for some as-yet unknown reason...


Unknown because I wasn't fully grokking what it was a CRC of. Everything 
looks good now :-)


/D

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Problem with ControlFileData structure being ABI dependent

2007-12-07 Thread Dave Page
Tom Lane wrote:
> AFAIK, time_t is a Unix-ism, so it's pretty unlikely to be used in the
> APIs of anything on Windows.

Oh, it is.

> I guess my advice would be to see if we can define _USE_32BIT_TIME_T
> in port/win32.h and make it go away that way.  It'd definitely be nice
> if MSVC and Mingw builds weren't binary-incompatible.

The attached patch defines it in the MSVC project files along with the
other API-config related macros. It fixes all the offsets so they match
mingw, but the CRC is still different for some as-yet unknown reason...

VC++ with patch:

WARNING:  sizeof(ControlFileData)=392
WARNING:  offsetof(state)=16
WARNING:  offsetof(time)=20
WARNING:  offsetof(checkPoint)=24
WARNING:  offsetof(prevCheckPoint)=32
WARNING:  offsetof(checkPointCopy)=40
WARNING:  offsetof(minRecoveryPoint)=76
WARNING:  offsetof(maxAlign)=84
WARNING:  offsetof(floatFormat)=88
WARNING:  offsetof(blcksz)=96
WARNING:  offsetof(relseg)=100
WARNING:  offsetof(xlog)=104
WARNING:  offsetof(xlog)=108
WARNING:  offsetof(nameDataLen)=112
WARNING:  offsetof(indexMaxKeys)=116
WARNING:  offsetof(toast)=120
WARNING:  offsetof(enableIntTimes)=124
WARNING:  offsetof(localeBuflen)=128
WARNING:  offsetof(lc)=132
WARNING:  offsetof(lc)=260
WARNING:  offsetof(crc)=388
WARNING:  crc = 2265484434

Mingw:

WARNING:  sizeof(ControlFileData)=392
WARNING:  offsetof(state)=16
WARNING:  offsetof(time)=20
WARNING:  offsetof(checkPoint)=24
WARNING:  offsetof(prevCheckPoint)=32
WARNING:  offsetof(checkPointCopy)=40
WARNING:  offsetof(minRecoveryPoint)=76
WARNING:  offsetof(maxAlign)=84
WARNING:  offsetof(floatFormat)=88
WARNING:  offsetof(blcksz)=96
WARNING:  offsetof(relseg)=100
WARNING:  offsetof(xlog)=104
WARNING:  offsetof(xlog)=108
WARNING:  offsetof(nameDataLen)=112
WARNING:  offsetof(indexMaxKeys)=116
WARNING:  offsetof(toast)=120
WARNING:  offsetof(enableIntTimes)=124
WARNING:  offsetof(localeBuflen)=128
WARNING:  offsetof(lc)=132
WARNING:  offsetof(lc)=260
WARNING:  offsetof(crc)=388
WARNING:  crc = 3643130025

/D
Index: src/tools/msvc/Project.pm
===
RCS file: /projects/cvsroot/pgsql/src/tools/msvc/Project.pm,v
retrieving revision 1.14
diff -u -r1.14 src/tools/msvc/Project.pm
--- Project.pm	21 Aug 2007 15:10:41 -	1.14
+++ Project.pm	7 Dec 2007 11:14:29 -
@@ -489,7 +489,7 @@
 	ConfigurationType="$cfgtype" UseOfMFC="0" ATLMinimizesCRunTimeLibraryUsage="FALSE" CharacterSet="2" WholeProgramOptimization="$p->{wholeopt}">
 	
---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] shared_buffers, wal_buffers, WAL files, data files

2007-12-07 Thread Guillaume Lelarge
Alvaro Herrera a écrit :
> Guillaume Lelarge wrote:
>> Tom Lane a écrit :
>>> Guillaume Lelarge <[EMAIL PROTECTED]> writes:
 I try to answer a simple question : what happens when I do a simple
 "INSERT" on a just started PostgreSQL server.
 From what I understand with the INSERT statement, here is what happens :
  * backend loads first (and only) block from footable file into a shared
buffer
  * it modifies this block on the shared buffer, and sets it as dirty
>>> Right, and it also makes a WAL log entry about this action.
>>>
>> The WAL log entry is made on the wal buffers (in memory). As soon as
>> this statement is commited (in my example, it's right now, but in a
>> BEGIN ... COMMIT statement, at COMMIT time), the wal buffer is flushed
>> on WAL files. It can be flushed before if wal buffer is not big enough
>> to contain all the current transactions. Am I right ?
> 
> That's correct.  WAL buffers are obviously shared; when one transaction
> commits it will flush not only its own entries, but also those that any
> other transaction could have written.
> 

OK, thanks. I'll probably have more questions but I need to think a bit
more about all your answers.

Anyways, thanks.

Regards.


-- 
Guillaume.
 http://www.postgresqlfr.org
 http://dalibo.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Problem with ControlFileData structure being ABI depe ndent

2007-12-07 Thread Gregory Stark
"Magnus Hagander" <[EMAIL PROTECTED]> writes:

>> > I guess my advice would be to see if we can define _USE_32BIT_TIME_T
>> > in port/win32.h and make it go away that way.  It'd definitely be nice
>> > if MSVC and Mingw builds weren't binary-incompatible.
>> 
>> The attached patch defines it in the MSVC project files along with the
>> other API-config related macros. It fixes all the offsets so they match
>> mingw, but the CRC is still different for some as-yet unknown reason...
>
> Is the project file really the proper place? Consider an add-on module -
> wouldn't we want the setting to go in that one as well? meaning we'd have it 
> in
> a header file instead?

For an add-on module which is actually using time_t to interface with Postgres
it had better define _USE_... itself or else it risks having some files use
them and some not.

An alternative is leaving it in the project file but putting something like
this in c.h:

#ifdef WIN32
#ifndef _USE_32BIT_TIME_T
#error "Postgres uses 32 bit time_t add #define _USE_32BIT_TIME_T on Windows
#endif
#endif

For modules which *do* use time_t this is safer. However for modules which
don't use time_t it'll be an unnecessary hassle.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Problem with ControlFileData structure being ABI depe ndent

2007-12-07 Thread Magnus Hagander

> > I guess my advice would be to see if we can define _USE_32BIT_TIME_T
> > in port/win32.h and make it go away that way.  It'd definitely be nice
> > if MSVC and Mingw builds weren't binary-incompatible.
> 
> The attached patch defines it in the MSVC project files along with the
> other API-config related macros. It fixes all the offsets so they match
> mingw, but the CRC is still different for some as-yet unknown reason...

Is the project file really the proper place? Consider an add-on module - 
wouldn't we want the setting to go in that one as well? meaning we'd have it in 
a header file instead?

/Magnus



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Possible PostgreSQL 8.3beta4 bug with MD5 authentication in psql?

2007-12-07 Thread Mark Cave-Ayland
Hi everyone,

I think that I may have found a minor bug with PostgreSQL 8.3beta4 with
respect to md5 authentication. I actually discovered this on Win32, but
it appears that the behaviour is the same under Linux too.

As part of the PostGIS install under Win32, I have a few scripts that
check for the existence of a particular database by doing the following:

psql -d adatabase -h localhost -c "SELECT version();"

By checking the psql exit code, it is fairly easy to see whether this
failed, and if so display the contents of stdout for the user. The
problem I have is that under PostgreSQL 8.3beta4, if the database
doesn't exist then I get an extra password prompt which breaks the
install scripts as they run in the background :(

To recreate this is fairly easy:

1. Temporarily rename any .pgpass files so they aren't found by libpq
2. Stop the PostgreSQL 8.3 server
3. Change pg_hba.conf so that local connections are disabled, but
connections to 127.0.0.1 are allowed with md5 authentication
4. Restart the PostgreSQL server
5. Open up a shell and do the following:

[EMAIL PROTECTED]:~$ export PGPASSWORD=mypass
[EMAIL PROTECTED]:~$ psql -h localhost -d postgres -c "SELECT version();"
  version 
---
 PostgreSQL 8.3beta2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
4.0.3 (Ubuntu 4.0.3-1ubuntu5)
(1 row)


So far so good. But now try with a database that doesn't exist:

[EMAIL PROTECTED]:~$ psql -h localhost -d doesntexist -c "SELECT
version();"
Password:
psql: FATAL:  database "doesntexist" does not exist


H. So even though PGPASSWORD is set (and the command works if the
database exists within the cluster), if I specify a non-existent
database then I still get prompted for a password.

I've run the same test against PostgreSQL 8.2.5 and the test works in
that I don't get prompted for a password the second time. So the
behaviour has changed between versions, but I wanted to check that it
wasn't a deliberate change before looking deeper.


Many thanks,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Problem with ControlFileData structure being ABI depe ndent

2007-12-07 Thread Dave Page

Tom Lane wrote:

Dave Page <[EMAIL PROTECTED]> writes:

Gregory Stark wrote:

An alternative is leaving it in the project file but putting something like
this in c.h:


Put it in win32.h, please.  c.h shouldn't get cluttered with
platform-specific kluges when there's no need for it.

Is there a good reason not to just #define _USE_32BIT_TIME_T in win32.h?


Yeah, the fact that addons may then end up partially compiled with and 
partially without it being defined. It we just have it error as Greg 
suggested, then it will force the authors to define it themselves, and 
if they get that wrong it's their fault not ours.


/D

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Problem with ControlFileData structure being ABI depe ndent

2007-12-07 Thread Tom Lane
Dave Page <[EMAIL PROTECTED]> writes:
> Gregory Stark wrote:
>> An alternative is leaving it in the project file but putting something like
>> this in c.h:

Put it in win32.h, please.  c.h shouldn't get cluttered with
platform-specific kluges when there's no need for it.

Is there a good reason not to just #define _USE_32BIT_TIME_T in win32.h?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Problem with ControlFileData structure being ABI dependent

2007-12-07 Thread Trevor Talbot
On 12/7/07, Dave Page <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:

> > AFAIK, time_t is a Unix-ism, so it's pretty unlikely to be used in the
> > APIs of anything on Windows.

> Oh, it is.

It's confined to the C Runtime libraries, not part of the Windows API
proper. (Three exceptions: IP Helper uses the type, but the docs
contain a warning; a DHCP API that came along after the change to
64bit; an obsoleted Wbem class library for C++.)

The CRT has been causing compatibility problems in mixed-tools
projects for years. I find Microsoft's incessant changes so irritating
that I go out of my way to avoid using it in any project I intend to
deploy. It's just one of those things you end up having to deal with
somehow :(

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Possible PostgreSQL 8.3beta4 bug with MD5 authentication in psql?

2007-12-07 Thread Tom Lane
Dave Page <[EMAIL PROTECTED]> writes:
> Just to add a note to that - when running it in the same shell from 
> which I started the server with messages going to stdout, it seemed 
> clear that it trys to connect once using PGPASSWORD, then when that 
> fails, it prompts for the password instead, and then tries to connect 
> with that and fails a second time.

Hmmm ... it seems the problem is that we've defined
PQconnectionUsedPassword in such a way that it returns true (causing a
prompt) regardless of whether the reason for the connection failure was
a bad password or not.  We might need to reconsider that API.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Possible PostgreSQL 8.3beta4 bug with MD5 authentication in psql?

2007-12-07 Thread Dave Page

Mark Cave-Ayland wrote:

H. So even though PGPASSWORD is set (and the command works if the
database exists within the cluster), if I specify a non-existent
database then I still get prompted for a password.


Just to add a note to that - when running it in the same shell from 
which I started the server with messages going to stdout, it seemed 
clear that it trys to connect once using PGPASSWORD, then when that 
fails, it prompts for the password instead, and then tries to connect 
with that and fails a second time.


/D

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_dump and BINARY mode COPY

2007-12-07 Thread Hubert FONGARNAND
I use often pgdump... and we have many bytea on our database...
Without binary COPY, backup is very slow and takes 4x more space on the
network wire...

It would be great to have an option (--binary) with eventually a WARNING
message (that your backup could be not portable across architecture...)
but i think that 99% of the time we don't change between x86 or ppc

please think for users...
Hubert

Le jeudi 06 décembre 2007 à 18:32 +, Simon Riggs a écrit :

> On Thu, 2007-12-06 at 13:22 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > On Thu, 2007-12-06 at 12:43 -0500, Tom Lane wrote:
> > >> To preserve platform and version independence of the dump.
> > 
> > > Only if there is a risk. Since a common route to upgrade is a
> > > dump/restore back onto same box it seems worth having an option to do
> > > this.
> > 
> > I think you missed the point about version independence.  That would be
> > exactly the scenario where a version-to-version change in binary format
> > --- for instance, "money" widening to 64 bits in 8.3 --- would burn you.
> 
> OK
> 
> > Furthermore, this doesn't seem like a scenario where performance ought
> > to be the primary concern.  How often do you do major version upgrades?
> 
> Not often enough.
> 
> If it was faster, we would rely on more people actually doing it.
> 

_
Ce message et les éventuels documents joints peuvent contenir des informations 
confidentielles.
Au cas où il ne vous serait pas destiné, nous vous remercions de bien vouloir 
le supprimer et en aviser immédiatement l'expéditeur. Toute utilisation de ce 
message non conforme à sa destination, toute diffusion ou publication, totale 
ou partielle et quel qu'en soit le moyen est formellement interdite.
Les communications sur internet n'étant pas sécurisées, l'intégrité de ce 
message n'est pas assurée et la société émettrice ne peut être tenue pour 
responsable de son contenu.



Re: [DOCS] [HACKERS] Uniform policy for author credits in contrib module documentation?

2007-12-07 Thread Joshua D. Drake

Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
Historically, the only way to troubleshoot some of the contrib modules 
was to e-mail the author.  If people think that we've reached a standard 
that everything in /contrib is now well-documented and supported by the 
general community, the I suppose we don't need contact information.  I'm 
not so sure myself.


I do not think that we should encourage people to mail the authors first
rather than pgsql-bugs.  For one thing, a lot of those addresses are
dead, and some of the ones that aren't don't respond especially fast.

If the community-at-large can't handle a bug, we certainly have enough
institutional memory to try to contact the original author, even if that
address isn't in the SGML docs.



Perhaps the at a minimum the email goes in the commit?

Joshua D. Drake



regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings




---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Problem with ControlFileData structure being ABI depe ndent

2007-12-07 Thread Dave Page

Gregory Stark wrote:

"Magnus Hagander" <[EMAIL PROTECTED]> writes:


I guess my advice would be to see if we can define _USE_32BIT_TIME_T
in port/win32.h and make it go away that way.  It'd definitely be nice
if MSVC and Mingw builds weren't binary-incompatible.

The attached patch defines it in the MSVC project files along with the
other API-config related macros. It fixes all the offsets so they match
mingw, but the CRC is still different for some as-yet unknown reason...

Is the project file really the proper place? Consider an add-on module -
wouldn't we want the setting to go in that one as well? meaning we'd have it in
a header file instead?


For an add-on module which is actually using time_t to interface with Postgres
it had better define _USE_... itself or else it risks having some files use
them and some not.


Yeah, that was roughly my thinking having run into exactly that problem 
whilst playing around earlier. We also have other 'api config' macros in 
project files (though I appreciate those aren't changing ABI related 
stuff) so it seemed a natural place.


My other thought was that many of the addons this is likely to affect 
will be packaged as additional contrib modules, so they will 
automatically pickup the macro if dropped in /contrib and built from there.



An alternative is leaving it in the project file but putting something like
this in c.h:

#ifdef WIN32
#ifndef _USE_32BIT_TIME_T
#error "Postgres uses 32 bit time_t add #define _USE_32BIT_TIME_T on Windows
#endif
#endif

For modules which *do* use time_t this is safer. However for modules which
don't use time_t it'll be an unnecessary hassle.


Yeah, that might be useful. We could narrow the scope of platforms 
affected by using _MSC_VER instead of WIN32.


/D


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Possible PostgreSQL 8.3beta4 bug with MD5 authentication in psql?

2007-12-07 Thread Mark Cave-Ayland
On Fri, 2007-12-07 at 11:03 -0500, Tom Lane wrote:

> Hmmm ... it seems the problem is that we've defined
> PQconnectionUsedPassword in such a way that it returns true (causing a
> prompt) regardless of whether the reason for the connection failure was
> a bad password or not.  We might need to reconsider that API.

Right. I think it depends on the interpretation of the
PQconnectionUsedPassword function. If it should simply return whether or
not the connection used a password or not (as it does now), then you
could argue that it should be psql which should incorporate an
additional check to determine whether the attempt was cancelled due to
an invalid database name.

On first glance, libpq appears to return just CONNECTION_BAD and an
error message, so I'm not sure whether we can easily detect the
difference between an incorrect password and an invalid database name :(
Is there an additional status code in PGconn that can be used to
determine the exact cause of connection failure?


ATB,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Uniform policy for author credits in contrib module documentation?

2007-12-07 Thread Selena Deckelmann
On Dec 7, 2007 9:03 AM, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
>
>
> Joshua D. Drake wrote:
> > Tom Lane wrote:
> >> Josh Berkus <[EMAIL PROTECTED]> writes:
> >>> Historically, the only way to troubleshoot some of the contrib
> >>> modules was to e-mail the author.  If people think that we've
> >>> reached a standard that everything in /contrib is now
> >>> well-documented and supported by the general community, the I
> >>> suppose we don't need contact information.  I'm not so sure myself.
> >>
> >> I do not think that we should encourage people to mail the authors first
> >> rather than pgsql-bugs.  For one thing, a lot of those addresses are
> >> dead, and some of the ones that aren't don't respond especially fast.
> >>
> >> If the community-at-large can't handle a bug, we certainly have enough
> >> institutional memory to try to contact the original author, even if that
> >> address isn't in the SGML docs.
> >>
> >
> > Perhaps the at a minimum the email goes in the commit?
> >
>
> I don't see any reason, unless we're going to start doing that for all
> contributions. 'contrib' is a serious misnomer anyway, and there's no
> reason to think in general that the original author is specially
> responsible for any of it. I think Tom's point is entirely valid.

I think it is totally appropriate to replace the email address contact
information with a link to pgsql-bugs.

But there are reasons other than bugfixing to contact the original
author of a patch or contrib packages.  Some of those could include:

* collaboration on a professional, hobby or academic research level
* journalism/books written about PostgreSQL
* academic or historical research into the development of PostgreSQL
* job prospects
* socializing

For those reasons, I think it would be a huge loss to the community to
remove the credit sections or to prevent their inclusion in future
documentation. Maintaining them leaves a breadcrumb trail that
otherwise would be lost in mailing list threads and commit logs that
are very difficult for a person without specialized knowledge to
navigate. Why make contacting people hard?

Some of the contrib features are likely targets for future research
and development (for example: tsearch, HOT, pl/lolcode - not contrib,
but awesome!) and in that context, the specific people involved are
important.

Contrib documentation will get folded into the main docs eventually
(and at that point, the credits are removed). But the record of that
evolution is easily accessed, without any special knowledge of a
revision control system, or mailing list culture.

And I realize that the logical extension of what I am saying is a
research and documentation project about the people who all have
contributed to the development of PostgreSQL.

-selena

-- 
Selena Deckelmann
PDXPUG - Portland PostgreSQL Users Group
http://pugs.postgresql.org/pdx
http://www.chesnok.com/daily

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [DOCS] "distributed checkpoint"

2007-12-07 Thread Simon Riggs
On Thu, 2007-12-06 at 19:43 -0800, David Fetter wrote:
> On Thu, Dec 06, 2007 at 07:19:44PM -0800, Joshua D. Drake wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> > 
> > On Thu, 06 Dec 2007 20:44:49 -0500
> > Tom Lane <[EMAIL PROTECTED]> wrote:
> > 
> > > Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > > > Am I the only one who finds the phrase "distributed checkpointing"
> > > > a bit awkward?  Would it be better if we used "time-distributed
> > > > checkpointing" instead?
> > > 
> > > Yeah, "distributed" has a bunch of connotations that are wrong for
> > > this purpose.
> > > 
> > > I spent a bit of time with a thesaurus but didn't come up with
> > > anything that seemed le mot juste.  Best I could do was "spread
> > > checkpoint" or "time-extended checkpoint".  Anybody have a better
> > > idea?
> > 
> > balanced
> > gradual
> > extended (I see you mention time-extended but wouldn't time be implicit
> > based on the actual docs and thus we only need extended?)
> 
> How about "smoothed?"

Agreed

"Smoothed" makes a lot of sense for me. We used to have a checkpoint
spike, now we don't. 

Perhaps we should say something like "time extended checkpoints provide
smoother (transaction?) response times" 

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Andrew Dunstan wrote:
> 
> 
> Simon Riggs wrote:
> >
> > - Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
> > change to
> > "Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"
> >
> >
> >   
> 
> I think we need to qualify this, or it could be quite misleading. 
> perhaps add "that don't affect indexed columns" or something like that.

This is covered in the detailed text below the item:

  
   Heap-Only Tuples (HOT) accelerate space reuse for
   UPDATEs (Pavan Deolasee, with ideas from many others)
  

  
   UPDATEs and DELETEs leave dead tuples
   behind, as do failed INSERTs.  Previously only
   VACUUM could reclaim space taken by dead tuples. With
   HOT dead tuple space can be reclaimed at the time of
   UPDATE or INSERT if no changes are made to
   indexed columns.  This allows for more consistent performance.
   Also, HOT avoids adding duplicate index entries.

I just updated the wording so the index issue is clearer.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> We could always tighten this up a bit by listing the alignment of a
> handful of built-in data types but I suppose there will always be
> holes in this area anyways.

In theory yeah, but the note in pg_control.h still applies to every
platform I've heard of:

 * This data is used to check for hardware-architecture compatibility of
 * the database and the backend executable.  We need not check endianness
 * explicitly, since the pg_control version will surely look wrong to a
 * machine of different endianness, but we do need to worry about MAXALIGN
 * and floating-point format.  (Note: storage layout nominally also
 * depends on SHORTALIGN and INTALIGN, but in practice these are the same
 * on all architectures of interest.)

The main risk we are taking is in the assumption that int64 and float8
have the same alignment requirement, ie DOUBLEALIGN.  Which is probably
a fairly safe thing in reality.  Also, we've so far avoided using either
type in the system catalogs, which takes away one of the possible
failure modes (that the C compiler's alignment of struct fields might
vary from what we think the type needs).

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Argh, As soon as I sent it I realized even that's not right. The
> alignments of our datums is specified by pg_type and doesn't vary by
> platform. There has been some thought of making 8-byte data types like
> bigint pass-by-value on 64-bit machines in which case we would want to
> put that boolean flag in the control file as well.

And that's *still* not right, because whether a type is pass-by-value
has again got exactly 0 to do with what the bits look like on disk.

The only 32-vs-64-bit difference that has any impact on the on-disk
layout is MAXALIGN, which tends to be different in the two cases
(though there are exceptions).

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Joshua D. Drake

Gregory Stark wrote:

"Gregory Stark" <[EMAIL PROTECTED]> writes:


"Tom Lane" <[EMAIL PROTECTED]> writes:


Josh Berkus <[EMAIL PROTECTED]> writes:
I've just noticed that pg_controldata doesn't say anything about whether 
the database is 64-bit or 32-bit.

That's because there is no such concept.

I think the relevant concept is whether Datum is 32-bit or 64-bit wide.


Oh, I see what you meant now. Datum is a purely in-memory concept, it doesn't
actually reach disk.

We could always tighten this up a bit by listing the alignment of a handful of
built-in data types but I suppose there will always be holes in this area
anyways.


What are we gaining here? A 32bit postmaster can't start a 64bit one... 
Running select version() will tell you if you are running 64 or 32bit...


What am I missing?

Joshua D. Drake







---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Gregory Stark
"Gregory Stark" <[EMAIL PROTECTED]> writes:

> "Tom Lane" <[EMAIL PROTECTED]> writes:
>
>> Josh Berkus <[EMAIL PROTECTED]> writes:
>>> I've just noticed that pg_controldata doesn't say anything about whether 
>>> the database is 64-bit or 32-bit.
>>
>> That's because there is no such concept.
>
> I think the relevant concept is whether Datum is 32-bit or 64-bit wide.

Oh, I see what you meant now. Datum is a purely in-memory concept, it doesn't
actually reach disk.

We could always tighten this up a bit by listing the alignment of a handful of
built-in data types but I suppose there will always be holes in this area
anyways.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Josh Berkus <[EMAIL PROTECTED]> writes:
>> I've just noticed that pg_controldata doesn't say anything about whether 
>> the database is 64-bit or 32-bit.
>
> That's because there is no such concept.

I think the relevant concept is whether Datum is 32-bit or 64-bit wide.

My first thought was that surely the structure would be massively different on
a 64-bit architecture though? But actually I'm not sure that's true. There's
an Oid in the checkpoint record which is an "unsigned int" so at least ILP64
architectures would be distinguished but I can't find any "long" members or
pointers so an LP64 architecture actually would have the same member sizes as
a 32-bit architecture.

So if there's an LP64 architecture which has the same maxalign (presumably
64-bit for doubles) as its 32-bit cousin then it's actually possible we
wouldn't notice?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [DOCS] "distributed checkpoint"

2007-12-07 Thread Jonah H. Harris
On Dec 7, 2007 10:25 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> > "Smoothed" makes a lot of sense for me. We used to have a checkpoint
> > spike, now we don't.
>
> To be accurate, there used to be a huge and unavoidable spike, now there's
> a control that aims to make it smaller.  The problem hasn't completely
> gone away yet.

Agreed.

> "Distributed checkpoints prevent I/O spikes during checkpoints"
>
> It's good this came up, because that is factually wrong;

Agreed.

> "Checkpoint writes can be spread over a longer time period to smooth the
> I/O spike during each checkpoint"

Sounds good to me.

> It's got spread, it's got smooth, and if I could have worked "silky" in
> there too I would have.

:)

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
499 Thornall Street, 2nd Floor  | [EMAIL PROTECTED]
Edison, NJ 08837| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Josh Berkus

Folks,

I've just noticed that pg_controldata doesn't say anything about whether 
the database is 64-bit or 32-bit.  Since the data files can be 
incompatible based on word length, shouldn't that be in pg_controldata?


--Josh Berkus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:
> On Fri, 2007-12-07 at 16:21 -0500, Bruce Momjian wrote:
> > Simon Riggs wrote:
> > > Maybe the importance of the patches that were removed wasn't clear
> > > enough, so let me explain my viewpoint. On another part of this thread I
> > > summarised the feedback from others to a list of features that were
> > > definitely user noticeable. The list was:
> > > 
> > > - Merge Join performance has been substantially improved when low number
> > > of duplicate join keys exist on the outer side of the join (Simon, Greg)
> > 
> > Most users don't know if they are using mergejoins or not, nor are they
> > going to do anything differently now that the feature is in, so that is
> > why I don't see a need to mention it.
> 
> Wow, what an amazing claim. Please post that to -performance!

I am talking the average user, not someone subscribed to performance.

> But if you applied that test, another half of the performance features
> should go also. As would many others in various other sections. I want
> them all to stay, so I won't point the finger at others. And why did we
> mention outer joins were tuned in 8.2?

This is starting to sound quite petty.  I have already outlined the
logic I used.  Now if you think the logic I am using is invalid we can
discuss it.  If there are specific items I have added/removed that are
wrong based on that logic, be specific.

Frankly, based on the amount of time I might expend dealing with
everyone's "why isn't my item in there" I might suggest just removing
names completely just to avoid the discussion.

> The release notes are already 33 pages on Firefox's Print Preview, so a
> couple of extra sentences won't change anything.

I have to add entries based on some coherent logic, not based on a
request because then the release notes aren't consistent in their
limited coverage.  And if I add everyone's two lines it will be 66 pages
with 33 pages of useful information.

For example, why add the mergejoin item and not the many other optimizer
improvements in every release?  Is that useful information for the
average user?  Couldn't we just tell them "Postgres is faster" and leave
it at that.  If they are curious why they can ask or find out themselves.

If you want to collect all the optimizer items for 8.3 and put them in a
list we can link to them from the web site.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Stefan Kaltenbrunner wrote:
> Bruce Momjian wrote:
> > Stefan Kaltenbrunner wrote:
> >>> Let me give you the criteria I use for the release notes.  The release
> >>> notes try to document all changes visible to the average user in a way
> >>> that is understandable to the average user.
> >> hmm I'm not so sure about that - there are a number of changes to psql
> >> that are not mentioned for example - like the addition of various new
> >> backslashcommands (\dF, \dFp, \dFt ,...) as part of the main tsearch2
> >> commit or a number of tab-complete improvements/additions (for tsearch
> >> as well as some other stuff).
> > 
> > Yea, but when you say you are adding tsearch there is the assumption
> > that psql and pg_dump and tab completion will be updated to reflect
> > that.
> 
> hmm well - tab.complete support was not done for a number of new
> SQL-level stuff with the main commits in the last few releases and was
> mostly added by different developers (like me) either in the same
> release cycle or even later.
> Anyway this "implicitly included in the main commit" adds a few
> interesting gotchas like having a "Build Options" section that has stuff
> like "--enable-profiling" and "--with-system-tzdata" but
> "--with-ossp-uuid" is in the contrib section and "--with-libxml" is
> missing (mostly because implicitly assumed as part of SQL/XML) as is
> "--with-gssapi"(part of the GSSAPI feature?) and "--with-libxslt (no idea ?)

These are new user-visible flags, not something like tab support which
you just kind of expect to work for every SQL command, and if it doesn't,
it is a bug.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:
> On Fri, 2007-12-07 at 16:21 -0500, Bruce Momjian wrote:
> > Simon Riggs wrote:
> > > Maybe the importance of the patches that were removed wasn't clear
> > > enough, so let me explain my viewpoint. On another part of this thread I
> > > summarised the feedback from others to a list of features that were
> > > definitely user noticeable. The list was:
> > > 
> > > - Merge Join performance has been substantially improved when low number
> > > of duplicate join keys exist on the outer side of the join (Simon, Greg)
> > 
> > Most users don't know if they are using mergejoins or not, nor are they
> > going to do anything differently now that the feature is in, so that is
> > why I don't see a need to mention it.
> 
> Wow, what an amazing claim. Please post that to -performance!
> 
> But if you applied that test, another half of the performance features
> should go also. As would many others in various other sections. I want
> them all to stay, so I won't point the finger at others. And why did we
> mention outer joins were tuned in 8.2?

Because outer joins are user-visible.

> The release notes are already 33 pages on Firefox's Print Preview, so a
> couple of extra sentences won't change anything.
> 
> > > - Large I/O reduction during recovery when full_page_writes = on
> > > (Heikki)
> > 
> > Again, a speedup, but not something that impacts people to behave
> > differently or see different output.
> 
> Same comment.
> 
> > > > If people are concerned about the unfairness, and I understand that, the
> > > > best solution is not to add more items to the release notes to be more
> > > > fair, but to remove all names from release note items.
> > > 
> > > That makes no sense, but it would benefit people that wrote fewer
> > > patches, I guess.
> > 
> > Yep, kind of illogical but it is fair.
> 
> If we're at the stage of saying logic is irrelevant, then I'll never
> persuade you, probably of anything, ever... 

You said above yourself you didn't want the names removed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Stefan Kaltenbrunner
Bruce Momjian wrote:
> Stefan Kaltenbrunner wrote:
>>> Let me give you the criteria I use for the release notes.  The release
>>> notes try to document all changes visible to the average user in a way
>>> that is understandable to the average user.
>> hmm I'm not so sure about that - there are a number of changes to psql
>> that are not mentioned for example - like the addition of various new
>> backslashcommands (\dF, \dFp, \dFt ,...) as part of the main tsearch2
>> commit or a number of tab-complete improvements/additions (for tsearch
>> as well as some other stuff).
> 
> Yea, but when you say you are adding tsearch there is the assumption
> that psql and pg_dump and tab completion will be updated to reflect
> that.

hmm well - tab.complete support was not done for a number of new
SQL-level stuff with the main commits in the last few releases and was
mostly added by different developers (like me) either in the same
release cycle or even later.
Anyway this "implicitly included in the main commit" adds a few
interesting gotchas like having a "Build Options" section that has stuff
like "--enable-profiling" and "--with-system-tzdata" but
"--with-ossp-uuid" is in the contrib section and "--with-libxml" is
missing (mostly because implicitly assumed as part of SQL/XML) as is
"--with-gssapi"(part of the GSSAPI feature?) and "--with-libxslt (no idea ?)

> 
>> While looking at this i noted a "psql's \d command and pg_dump have been
>> enhanced" in the Object Manipulation Changes" Chapter which looks a bit
>> missplaced/useless and probably should be removed or clarified and moved
>> to the psql part.
> 
> I don't see that anywhere.  Are you looking at the most recent version:
> 
>   http://momjian.us/main/writings/pgsql/sgml/release-8-3.html

hmm yeah sorry - I was looking at the beta3 version we have on the main
website seems to be fixed in the current code.


Stefan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Simon Riggs
On Fri, 2007-12-07 at 16:21 -0500, Bruce Momjian wrote:
> Simon Riggs wrote:
> > Maybe the importance of the patches that were removed wasn't clear
> > enough, so let me explain my viewpoint. On another part of this thread I
> > summarised the feedback from others to a list of features that were
> > definitely user noticeable. The list was:
> > 
> > - Merge Join performance has been substantially improved when low number
> > of duplicate join keys exist on the outer side of the join (Simon, Greg)
> 
> Most users don't know if they are using mergejoins or not, nor are they
> going to do anything differently now that the feature is in, so that is
> why I don't see a need to mention it.

Wow, what an amazing claim. Please post that to -performance!

But if you applied that test, another half of the performance features
should go also. As would many others in various other sections. I want
them all to stay, so I won't point the finger at others. And why did we
mention outer joins were tuned in 8.2?

The release notes are already 33 pages on Firefox's Print Preview, so a
couple of extra sentences won't change anything.

> > - Large I/O reduction during recovery when full_page_writes = on
> > (Heikki)
> 
> Again, a speedup, but not something that impacts people to behave
> differently or see different output.

Same comment.

> > > If people are concerned about the unfairness, and I understand that, the
> > > best solution is not to add more items to the release notes to be more
> > > fair, but to remove all names from release note items.
> > 
> > That makes no sense, but it would benefit people that wrote fewer
> > patches, I guess.
> 
> Yep, kind of illogical but it is fair.

If we're at the stage of saying logic is irrelevant, then I'll never
persuade you, probably of anything, ever... 

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Andrew Dunstan wrote:
> >> I still think this needs to be qualified either way. As it stands it's 
> >> quite misleading. Many update scenarios will not benefit one whit from 
> >> HOT updates.
> >> 
> >
> > Doesn't the detail description qualify it enought?  The heading isn't
> > suppose to have all the information or it would be unreadable.
> >
> >   
> 
> If you don't want to be more specific I'd say "certain updates" or "some 
> updates" or something similar, just some flag to say it's not all of them.

Good idea.  I added "most":

   Heap-Only Tuples (HOT) accelerate space reuse for most
   UPDATEs (Pavan Deolasee, with ideas from many others)

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Simon Riggs
On Fri, 2007-12-07 at 12:33 -0500, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Fri, 2007-11-30 at 09:49 +, Gregory Stark wrote:
> > > "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > > 
> > > > If people understand there aren't 13 performance improvements there are
> > > > at *least* 19+ that is a positive message to help people decide to
> > > > upgrade. 
> > > 
> > > Frankly I think the release notes are already too long. 
> > 
> > So why do we have stuff in there that the users will never see?

Thanks for your reply. 

> Which release note items?

Most of the stuff in Source Code would fall into that category. I don't
advocate removing those items, but I don't see the argument that space
is so tight in the release notes that we have to remove important
performance items but keep those.

> > We already have a release summary, so why summarise *some* of the detail
> > as well, but not all of it???
> > 
> > I see no reason to diminish yours, Heikki's or my own contributions, all
> > of which were in the area of performance, which people do care about.
> > None of the ones I mentioned were trivial patches, nor were their
> > effects small. 
> 
> I totally agree that we are unfair in how we give attribution in the
> release notes.  

I do understand that the release notes are there to inform the user and
not directly to give credit. 

Some important items have been removed from the release notes. It took
me a whole month to notice, but I did eventually notice because I'm
familiar with my own work, as well as that of people working on closely
related items. I have, when I have been aware of them, pointed out
patches produced by others that I thought were missing.

> There is no weight given to patch difficulty and people
> who produce user-invisible changes are much less likely to be mentioned
> in the release notes.

The reward system drives the outputs. If trivial feature additions are
what we reward, then that's what we'll get. That's not important right
now and discussing that is not why I started this thread.

> I don't see any way to fix this that would not harm the release notes
> themselves.  As I mentioned in an earlier email the release notes are
> designed to highlight user-visible changes in a user-understandable way.
> The mentioning of people who wrote the patches is merely a side-effect
> of that to give some credit, but it is a side-effect, not the main
> reason we mention something in the release notes.

Perhaps we are talking about different things. I'm discussing whether
something is important and you seem to be imagining that I only want to
see the phrase "(Simon)" lots of times. If that was the case, it would
have been very simple to arrange, yet I seem to have elected the most
difficult route to doing that. I could easily have hoovered up a few
more trivial changes if that was my line of thinking. So it clearly
wasn't.

Maybe the importance of the patches that were removed wasn't clear
enough, so let me explain my viewpoint. On another part of this thread I
summarised the feedback from others to a list of features that were
definitely user noticeable. The list was:

- Merge Join performance has been substantially improved when low number
of duplicate join keys exist on the outer side of the join (Simon, Greg)

- Large I/O reduction during recovery when full_page_writes = on
(Heikki)

- WAL file switch performance has been improved. Recovery startup now
refers to the last checkpoint time, which may be anything up to the
checkpoint_timeout interval before a database crash. (Simon, Tom)

The last one seems important to me, but I can see that might be TMI for
some people, though I feel we should document it somewhere. The other
side of that is that many people know about those response time spikes
and they will be very keen to know their cause was identified and
removed.

Those items have resulted in important performance gains, not just a few
percentage points. The first one can be 50% or more, the second one is
100% gain and the last one reduced response time spikes on busy systems
by something like a second at switch time. I wouldn't dare to mention
these things if the effects were small, but they are massive gains. 

> If people are concerned about the unfairness, and I understand that, the
> best solution is not to add more items to the release notes to be more
> fair, but to remove all names from release note items.

That makes no sense, but it would benefit people that wrote fewer
patches, I guess.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> >   
> >> - Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
> >> change to
> >> "Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"
> >> 
> >
> > I used the original text because it tries to explain _how_ HOT improves
> > performance.  The item that has the descriptive text explains how the
> > space reuse works.  A generic "improve performance" doesn't seem like an
> > improvement.
> >
> >   
> >   
> I still think this needs to be qualified either way. As it stands it's 
> quite misleading. Many update scenarios will not benefit one whit from 
> HOT updates.

Doesn't the detail description qualify it enought?  The heading isn't
suppose to have all the information or it would be unreadable.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Josh Berkus wrote:
> Greg,
> 
> > Frankly I think the release notes are already too long. People who judge a
> > release by counting the number of items in the release notes are not worth
> > appeasing. Including every individual lock removed or code path optimized
> > will only obscure the important points on which people should be judging
> > the relevance of the release to them. Things like smoothing checkpoint i/o
> > which could be removing a show-stopper problem for them.
> 
> I disagree.  For people who want a quick summary of the major user-facing 
> things changed we'll have multiple sources:  (a) the announcement, (b) the 
> press features list, (c) the Feature-Version matrix.  The Release notes 
> should have a *complete* list of changes.
> 
> Why?  Because we don't use a bug/feature tracker.  So a user trying to figure 
> out "hey, was my issue XXX fixed so that I should upgrade?" has *no other 
> source* than the Release notes to look at, except CVS history.  And if we 
> start asking sysadmins and application vendors to read the CVS history, we're 
> gonna simply push them towards other DBMSes which have this information more 
> clearly.
> 
> If we want to shorten the release notes, then we should adopt an issue 
> tracker.

We do mention bug fixes in the release notes if they affect more than a
few users.  For rare bugs the original bug submitter is told what
release will have the fix as part of the bug fxing discussion.  If it
wasn't fixed right away and became a TODO item that item is removed as
part of the release.  (I just did that for 8.3.)

It is true we don't have a tracker but I have not seen a major demand
for it, or at least not enough for someone to actually do the work
required to list _all_ fixes/changes in a release.  The job isn't that
hard, maybe a few days work for someone experienced.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:
> On Fri, 2007-11-30 at 09:49 +, Gregory Stark wrote:
> > "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > 
> > > If people understand there aren't 13 performance improvements there are
> > > at *least* 19+ that is a positive message to help people decide to
> > > upgrade. 
> > 
> > Frankly I think the release notes are already too long. 
> 
> So why do we have stuff in there that the users will never see?

Which release note items?

> We already have a release summary, so why summarise *some* of the detail
> as well, but not all of it???
> 
> I see no reason to diminish yours, Heikki's or my own contributions, all
> of which were in the area of performance, which people do care about.
> None of the ones I mentioned were trivial patches, nor were their
> effects small. 

I totally agree that we are unfair in how we give attribution in the
release notes.  There is no weight given to patch difficulty and people
who produce user-invisible changes are much less likely to be mentioned
in the release notes.

I don't see any way to fix this that would not harm the release notes
themselves.  As I mentioned in an earlier email the release notes are
designed to highlight user-visible changes in a user-understandable way.
The mentioning of people who wrote the patches is merely a side-effect
of that to give some credit, but it is a side-effect, not the main
reason we mention something in the release notes.

If people are concerned about the unfairness, and I understand that, the
best solution is not to add more items to the release notes to be more
fair, but to remove all names from release note items.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:
> On Fri, 2007-11-30 at 06:31 +, Simon Riggs wrote:
> 
> > I also notice that two performance features have disappeared from the
> > release notes. (Presumably they have been removed from source). Both of
> > them have changes that can be seen by users, so can't see why we would
> > want them removed.
> 
> Wow, just realised 3 of Heikki's performance patches aren't mentioned
> either:
> 
> - CheckpointStartLock removal
> 
> - I/O reduction during recovery
> 
> - Tuning of Visibility code
> 
> I'm not sure what the rationale is for not mentioning these things.
> They're at least as important, if not more so, than mentioning minor
> source code changes.

The source code changes are more _visible_, I think, meaning they often
require programmers to adjust their code.

> If people understand there aren't 13 performance improvements there are
> at *least* 19+ that is a positive message to help people decide to
> upgrade. 

Frankly I think most people expact an X% improvement in every Postgres
release.  I don't see how mentioning 19 vs. 13 items is going to change
the general understanding that you should upgrade to get better
performance.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [DOCS] [HACKERS] Uniform policy for author credits in contrib module documentation?

2007-12-07 Thread Andrew Dunstan



Joshua D. Drake wrote:

Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
Historically, the only way to troubleshoot some of the contrib 
modules was to e-mail the author.  If people think that we've 
reached a standard that everything in /contrib is now 
well-documented and supported by the general community, the I 
suppose we don't need contact information.  I'm not so sure myself.


I do not think that we should encourage people to mail the authors first
rather than pgsql-bugs.  For one thing, a lot of those addresses are
dead, and some of the ones that aren't don't respond especially fast.

If the community-at-large can't handle a bug, we certainly have enough
institutional memory to try to contact the original author, even if that
address isn't in the SGML docs.



Perhaps the at a minimum the email goes in the commit?



I don't see any reason, unless we're going to start doing that for all 
contributions. 'contrib' is a serious misnomer anyway, and there's no 
reason to think in general that the original author is specially 
responsible for any of it. I think Tom's point is entirely valid.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [DOCS] "distributed checkpoint"

2007-12-07 Thread Greg Smith

On Fri, 7 Dec 2007, Simon Riggs wrote:


"Smoothed" makes a lot of sense for me. We used to have a checkpoint
spike, now we don't.


To be accurate, there used to be a huge and unavoidable spike, now there's 
a control that aims to make it smaller.  The problem hasn't completely 
gone away yet.


With that in mind, let me start over.  Here's what's in the release notes 
right now:


"Distributed checkpoints prevent I/O spikes during checkpoints"

It's good this came up, because that is factually wrong; while the average 
case is much better some OS-dependant aspects of the spike (what happens 
at fsync) are certainly still there.  I think it's easier to rewrite this 
whole thing so it's technically accurate rather than a simple fix of the 
wording, something like this:


"Checkpoint writes can be spread over a longer time period to smooth the 
I/O spike during each checkpoint"


It's got spread, it's got smooth, and if I could have worked "silky" in 
there too I would have.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> I've just noticed that pg_controldata doesn't say anything about whether 
> the database is 64-bit or 32-bit.

That's because there is no such concept.

> Since the data files can be 
> incompatible based on word length, shouldn't that be in pg_controldata?

They can't be incompatible "based on word length".  They can be
incompatible based on MAXALIGN ... which *is* reported.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Andrew Dunstan



Bruce Momjian wrote:

Andrew Dunstan wrote:
  
I still think this needs to be qualified either way. As it stands it's 
quite misleading. Many update scenarios will not benefit one whit from 
HOT updates.



Doesn't the detail description qualify it enought?  The heading isn't
suppose to have all the information or it would be unreadable.

  
  
If you don't want to be more specific I'd say "certain updates" or "some 
updates" or something similar, just some flag to say it's not all of them.



Good idea.  I added "most":

   Heap-Only Tuples (HOT) accelerate space reuse for most
   UPDATEs (Pavan Deolasee, with ideas from many others)
  


But that's not true either. For example, in my current $dayjob app not 
one significant update will benefit - we have an index rich environment. 
You have no basis for saying "most" that I can see. We really should not 
be in the hyp business in our release notes - that job belongs to the 
commercial promoters ;-)


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Stefan Kaltenbrunner wrote:
> > Let me give you the criteria I use for the release notes.  The release
> > notes try to document all changes visible to the average user in a way
> > that is understandable to the average user.
> 
> hmm I'm not so sure about that - there are a number of changes to psql
> that are not mentioned for example - like the addition of various new
> backslashcommands (\dF, \dFp, \dFt ,...) as part of the main tsearch2
> commit or a number of tab-complete improvements/additions (for tsearch
> as well as some other stuff).

Yea, but when you say you are adding tsearch there is the assumption
that psql and pg_dump and tab completion will be updated to reflect
that.

> While looking at this i noted a "psql's \d command and pg_dump have been
> enhanced" in the Object Manipulation Changes" Chapter which looks a bit
> missplaced/useless and probably should be removed or clarified and moved
> to the psql part.

I don't see that anywhere.  Are you looking at the most recent version:

http://momjian.us/main/writings/pgsql/sgml/release-8-3.html

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Stefan Kaltenbrunner
Bruce Momjian wrote:
> Simon Riggs wrote:
> 
> [ Sorry for my delay in replying to this.]
> 
>> Few proposals
>>
>> - Can we say "smoothed" rather than "distributed" checkpoints?
>> "Smoothed checkpoints greatly reduce checkpoint I/O spikes"
> 
> Agreed.  Changed.
> 
>> - Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
>> change to
>> "Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"
> 
> I used the original text because it tries to explain _how_ HOT improves
> performance.  The item that has the descriptive text explains how the
> space reuse works.  A generic "improve performance" doesn't seem like an
> improvement.
> 
>> I also notice that two performance features have disappeared from the
>> release notes. (Presumably they have been removed from source). Both of
>> them have changes that can be seen by users, so can't see why we would
>> want them removed.
>>
>> - Merge Join performance has been substantially improved by ring buffer
>> which avoids materializing the previous sort step. (Simon, Greg)
>>
>> More info, not for release notes:
>> The materialization of the prior sort step would generally double the
>> time taken for the sort, so avoiding this effectively gives a 50%
>> performance gain on sorts that are part of large merge joins.
>>
>>
>> - WAL file switches don't update controlfile any longer. Recovery now
>> refers to the last checkpoint time, which may be many minutes earlier
>> than time previously mentioned. (Simon, Tom)
>>
>> More info, not for release notes:
>> WAL file switches were performed holding important LWLocks, so this
>> improves scalability on high end systems as well as reducing response
>> time spikes under heavy load on all kinds of hardware.
> 
> Let me give you the criteria I use for the release notes.  The release
> notes try to document all changes visible to the average user in a way
> that is understandable to the average user.

hmm I'm not so sure about that - there are a number of changes to psql
that are not mentioned for example - like the addition of various new
backslashcommands (\dF, \dFp, \dFt ,...) as part of the main tsearch2
commit or a number of tab-complete improvements/additions (for tsearch
as well as some other stuff).
While looking at this i noted a "psql's \d command and pg_dump have been
enhanced" in the Object Manipulation Changes" Chapter which looks a bit
missplaced/useless and probably should be removed or clarified and moved
to the psql part.



Stefan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pg_controldata doesn't report 64/32bit?

2007-12-07 Thread Gregory Stark
"Gregory Stark" <[EMAIL PROTECTED]> writes:

> Oh, I see what you meant now. Datum is a purely in-memory concept, it doesn't
> actually reach disk.
>
> We could always tighten this up a bit by listing the alignment of a handful of
> built-in data types but I suppose there will always be holes in this area
> anyways.

Argh, As soon as I sent it I realized even that's not right. The alignments of
our datums is specified by pg_type and doesn't vary by platform. There has
been some thought of making 8-byte data types like bigint pass-by-value on
64-bit machines in which case we would want to put that boolean flag in the
control file as well. But until that happens there's nothing to see here.

Sorry for the noise arguing with myself.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:
> On Fri, 2007-12-07 at 12:33 -0500, Bruce Momjian wrote:
> > Simon Riggs wrote:
> > > On Fri, 2007-11-30 at 09:49 +, Gregory Stark wrote:
> > > > "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > > > 
> > > > > If people understand there aren't 13 performance improvements there 
> > > > > are
> > > > > at *least* 19+ that is a positive message to help people decide to
> > > > > upgrade. 
> > > > 
> > > > Frankly I think the release notes are already too long. 
> > > 
> > > So why do we have stuff in there that the users will never see?
> 
> Thanks for your reply. 
> 
> > Which release note items?
> 
> Most of the stuff in Source Code would fall into that category. I don't
> advocate removing those items, but I don't see the argument that space
> is so tight in the release notes that we have to remove important
> performance items but keep those.

I just looked at that section and see the need for each one.  Which
items exactly?

> > > We already have a release summary, so why summarise *some* of the detail
> > > as well, but not all of it???
> > > 
> > > I see no reason to diminish yours, Heikki's or my own contributions, all
> > > of which were in the area of performance, which people do care about.
> > > None of the ones I mentioned were trivial patches, nor were their
> > > effects small. 
> > 
> > I totally agree that we are unfair in how we give attribution in the
> > release notes.  
> 
> I do understand that the release notes are there to inform the user and
> not directly to give credit. 
> 
> Some important items have been removed from the release notes. It took
> me a whole month to notice, but I did eventually notice because I'm
> familiar with my own work, as well as that of people working on closely
> related items. I have, when I have been aware of them, pointed out
> patches produced by others that I thought were missing.
> 
> > There is no weight given to patch difficulty and people
> > who produce user-invisible changes are much less likely to be mentioned
> > in the release notes.
> 
> The reward system drives the outputs. If trivial feature additions are
> what we reward, then that's what we'll get. That's not important right
> now and discussing that is not why I started this thread.

Franky I don't think the release note mention is a significant part of
why people work on patches.  If we remove names entirely, for example, I
don't think we would see any change in activity.  The fact that this
issue hasn't come up since I started in 1996 confirms that.

> > I don't see any way to fix this that would not harm the release notes
> > themselves.  As I mentioned in an earlier email the release notes are
> > designed to highlight user-visible changes in a user-understandable way.
> > The mentioning of people who wrote the patches is merely a side-effect
> > of that to give some credit, but it is a side-effect, not the main
> > reason we mention something in the release notes.
> 
> Perhaps we are talking about different things. I'm discussing whether
> something is important and you seem to be imagining that I only want to
> see the phrase "(Simon)" lots of times. If that was the case, it would
> have been very simple to arrange, yet I seem to have elected the most
> difficult route to doing that. I could easily have hoovered up a few
> more trivial changes if that was my line of thinking. So it clearly
> wasn't.
> 
> Maybe the importance of the patches that were removed wasn't clear
> enough, so let me explain my viewpoint. On another part of this thread I
> summarised the feedback from others to a list of features that were
> definitely user noticeable. The list was:
> 
> - Merge Join performance has been substantially improved when low number
> of duplicate join keys exist on the outer side of the join (Simon, Greg)

Most users don't know if they are using mergejoins or not, nor are they
going to do anything differently now that the feature is in, so that is
why I don't see a need to mention it.

> - Large I/O reduction during recovery when full_page_writes = on
> (Heikki)

Again, a speedup, but not something that impacts people to behave
differently or see different output.

> - WAL file switch performance has been improved. Recovery startup now
> refers to the last checkpoint time, which may be anything up to the
> checkpoint_timeout interval before a database crash. (Simon, Tom)

The cput switch performance is the same issue as above.  When you say
"recovery startup" are you talking about the log message?

> The last one seems important to me, but I can see that might be TMI for
> some people, though I feel we should document it somewhere. The other
> side of that is that many people know about those response time spikes
> and they will be very keen to know their cause was identified and
> removed.

True.  We already mention some improvements in that area so I don't see
how mentioning something else that is kind of vague is going to help.

> Those items have resulted 

Re: [HACKERS] Release Note Changes

2007-12-07 Thread Andrew Dunstan



Bruce Momjian wrote:

Andrew Dunstan wrote:
  

Bruce Momjian wrote:

  
  

- Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
change to
"Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"



I used the original text because it tries to explain _how_ HOT improves
performance.  The item that has the descriptive text explains how the
space reuse works.  A generic "improve performance" doesn't seem like an
improvement.

  
  
  
I still think this needs to be qualified either way. As it stands it's 
quite misleading. Many update scenarios will not benefit one whit from 
HOT updates.



Doesn't the detail description qualify it enought?  The heading isn't
suppose to have all the information or it would be unreadable.

  


If you don't want to be more specific I'd say "certain updates" or "some 
updates" or something similar, just some flag to say it's not all of them.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Fri, 7 Dec 2007 14:47:22 -0500 (EST)
Bruce Momjian <[EMAIL PROTECTED]> wrote:

> It is true we don't have a tracker but I have not seen a major demand
> for it,

Are you kidding? 

> or at least not enough for someone to actually do the work
> required to list _all_ fixes/changes in a release.  The job isn't that
> hard, maybe a few days work for someone experienced.

O.k. I would buy this..

Joshua D. Drake



> 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHWaVGATb/zqfZUUQRAlhoAKCNRp5eCwa71y70d8pkEKWR4eKscQCfa/4Z
UK47SO6aqelh0f9atYcyZmA=
=x8b6
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Simon Riggs wrote:

[ Sorry for my delay in replying to this.]

> Few proposals
> 
> - Can we say "smoothed" rather than "distributed" checkpoints?
> "Smoothed checkpoints greatly reduce checkpoint I/O spikes"

Agreed.  Changed.

> - Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
> change to
> "Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"

I used the original text because it tries to explain _how_ HOT improves
performance.  The item that has the descriptive text explains how the
space reuse works.  A generic "improve performance" doesn't seem like an
improvement.

> I also notice that two performance features have disappeared from the
> release notes. (Presumably they have been removed from source). Both of
> them have changes that can be seen by users, so can't see why we would
> want them removed.
> 
> - Merge Join performance has been substantially improved by ring buffer
> which avoids materializing the previous sort step. (Simon, Greg)
> 
> More info, not for release notes:
> The materialization of the prior sort step would generally double the
> time taken for the sort, so avoiding this effectively gives a 50%
> performance gain on sorts that are part of large merge joins.
> 
> 
> - WAL file switches don't update controlfile any longer. Recovery now
> refers to the last checkpoint time, which may be many minutes earlier
> than time previously mentioned. (Simon, Tom)
> 
> More info, not for release notes:
> WAL file switches were performed holding important LWLocks, so this
> improves scalability on high end systems as well as reducing response
> time spikes under heavy load on all kinds of hardware.

Let me give you the criteria I use for the release notes.  The release
notes try to document all changes visible to the average user in a way
that is understandable to the average user.

The above items are probably neither visible (except faster) nor
understandable.  Now of course we change change that criteria but that
is going to need a larger discussion.

One idea that would allow these to be included is a "geek" section of
the release notes that has items that would not be understandable by the
average user, e.g. optimizer improvements, locking improvements.  It
would be kind of like "Postgres is faster in this release in some
obscure ways, and this is why".  Of course the section would have to be
labeled clearly and it does open us up to the release notes being less
user-friendly.

Such a section seems to be more of a supplying a curiosity rather than
useful information, though.

I will address the issue of giving people credit for work in my next
email.

The good news is that we can keep adjusting the release notes until 8.3
final.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Bruce Momjian
Usama Dar wrote:
> i agree that release notes should not be too long, but may be there should
> be (if there isn't one already) something like a "change log" where people
> can find out all the changes done from the previous release, if they are
> intrested ?

Right now only the CVS logs provide more detailed information.  At some
point perhaps we should have something that summarizes the CVS logs.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Release Note Changes

2007-12-07 Thread Andrew Dunstan



Bruce Momjian wrote:
  

- Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs
change to
"Heap-Only Tuples (HOT) improve performance of frequent UPDATEs"



I used the original text because it tries to explain _how_ HOT improves
performance.  The item that has the descriptive text explains how the
space reuse works.  A generic "improve performance" doesn't seem like an
improvement.

  
  
I still think this needs to be qualified either way. As it stands it's 
quite misleading. Many update scenarios will not benefit one whit from 
HOT updates.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [DOCS] "distributed checkpoint"

2007-12-07 Thread Hannu Krosing

Ühel kenal päeval, R, 2007-12-07 kell 18:22, kirjutas Simon Riggs:
> On Thu, 2007-12-06 at 19:43 -0800, David Fetter wrote:
> > On Thu, Dec 06, 2007 at 07:19:44PM -0800, Joshua D. Drake wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > > 
> > > On Thu, 06 Dec 2007 20:44:49 -0500
> > > Tom Lane <[EMAIL PROTECTED]> wrote:
> > > 
> > > > Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > > > > Am I the only one who finds the phrase "distributed checkpointing"
> > > > > a bit awkward?  Would it be better if we used "time-distributed
> > > > > checkpointing" instead?
> > > > 
> > > > Yeah, "distributed" has a bunch of connotations that are wrong for
> > > > this purpose.
> > > > 
> > > > I spent a bit of time with a thesaurus but didn't come up with
> > > > anything that seemed le mot juste.  Best I could do was "spread
> > > > checkpoint" or "time-extended checkpoint".  Anybody have a better
> > > > idea?
> > > 
> > > balanced
> > > gradual
> > > extended (I see you mention time-extended but wouldn't time be implicit
> > > based on the actual docs and thus we only need extended?)
> > 
> > How about "smoothed?"
> 
> Agreed
> 
> "Smoothed" makes a lot of sense for me. We used to have a checkpoint
> spike, now we don't. 

wide checkpoints ? 

provide wide and low spikes :)

or even background checpoints ?

> Perhaps we should say something like "time extended checkpoints provide
> smoother (transaction?) response times" 





---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Possible PostgreSQL 8.3beta4 bug with MD5 authentication in psql?

2007-12-07 Thread Tom Lane
Mark Cave-Ayland <[EMAIL PROTECTED]> writes:
> On Fri, 2007-12-07 at 11:03 -0500, Tom Lane wrote:
>> Hmmm ... it seems the problem is that we've defined
>> PQconnectionUsedPassword in such a way that it returns true (causing a
>> prompt) regardless of whether the reason for the connection failure was
>> a bad password or not.  We might need to reconsider that API.

> Right. I think it depends on the interpretation of the
> PQconnectionUsedPassword function. If it should simply return whether or
> not the connection used a password or not (as it does now), then you
> could argue that it should be psql which should incorporate an
> additional check to determine whether the attempt was cancelled due to
> an invalid database name.

I think the problem is that we've tried to make PQconnectionUsedPassword
serve two masters.  What it was originally designed for was to provide
a way for dblink to determine whether the user had provided an
authentication token, or was trying to impersonate the postgres user
on a remote connection.  For that purpose its behavior is fine.
However, we (I think this is my fault :-() also tried to use it to
replace the ugly technique of checking for a specific error message
string to decide whether to prompt for a password.  This example shows
that that doesn't work.

I can see three ways to proceed:

1. Revert the changes that removed dependencies on PQnoPasswordSupplied.
This is ugly but might be the safest solution for 8.3 --- we can always
revisit the issue later.

2. Try to adjust PQconnectionUsedPassword so that what it reports after
successful connection (which is what dblink cares about) isn't
necessarily defined the same as what it says after a failed connection.
This seems pretty ugly and nonintuitive.

3. Invent another libpq function, maybe PQconnectionNeedsPassword,
that does the right thing for the password-checking tests.

I don't think that leaving it as-is is acceptable --- if we do that,
we'll be encouraging client apps to adopt a broken API.

Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match