Re: [BUGS] BUG #3504: Some listening sessions never return from writing, problems ensue

2007-08-10 Thread Peter Koczan
On 8/9/07, Peter Koczan <[EMAIL PROTECTED]> wrote:
> On 8/6/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Peter Koczan" <[EMAIL PROTECTED]> writes:
> > > Here's my theory (and feel free to tell me that I'm full of
> > > it)...somehow, a lot of notifies happened at once, or in a very short
> > > period of time, to the point where the app was still processing
> > > notifies when the timer clicked off another second. The connection (or
> > > app, or perl module) never marked those notifies as being processed, or
> > > never updated its timestamp of when it finished, so when the next
> > > notify came around, it tried to reprocess the old data (or data since
> > > the last time it finished), and yet again couldn't finish. Lather,
> > > rinse, repeat. In sum, it might be that trying to call pg_notifies
> > > while processing notifies tickles a race condition and tricks the
> > > connection into thinking it's in a bad state.
> >
> > Hmm.  Is the app trying to do this processing inside an interrupt
> > service routine (a/k/a signal handler)?  If so, and if the ISR can
> > interrupt itself, then you've got a problem because you'll be doing
> > reentrant calls of libpq, which it doesn't support.  You can only make
> > that work if the handler blocks further occurrences of its signal until
> > it finishes.
> >
>
> I'm not entirely sure if this answers your question, but here's what I
> found out from the primary maintainer of the app. Note that
> update_reqs is the function calling pg_notifies. If there's more
> information I can provide or another test we can run, please let me
> know.
>
> --- BEGIN MESSAGE ---
> I just checked and the timer won't interrupt update_reqs, so we'll
> have to look for another solution.  Anyway, update_reqs doesn't do
> anything with the database except for checking for a notify, so I
> don't see where it can be interrupted to cause DB problems.
> --- END MESSAGE ---
>
> I also found out that one notify gets sent per action (not per batch
> of actions), so if n requests get resolved at once, n notifies are
> sent, not 1. In theory this could mitigate this problem, but I don't
> know how easy it is at this point. Still, it doesn't explain how or
> why the client's recv-q isn't getting cleared.
>
> Hope this helps.
>

On our end, we changed the code in the function calling
pg_notifies to use an if statement rather than a while (that way it
only updates once per second instead of continuously as long as there
are pending async notifies).

I looked more closely at the docs for DBD::Pg, and the pg_notifies
call grabs *all* pending async notifies and returns them in a hash,
not just one at a time. So, what was happening before was that if a
new notify came through while processing the previous notifies, the
code would reprocess. Lather, rinse, repeat. I think that if the
program is still waiting for pg_notifies when the timer interrupts it
again, the client ends up calling pg_notifies while still waiting for
something. I think this is what gets the listening connection into the
bad state.

In theory this change should mitigate the "notify interrupt" behavior
on our end, but, again, why the client's recv-q is filling up is as
yet unexplained.
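
To make the if-versus-while difference concrete, here is a minimal simulation in Python. It is only a sketch: FakeConnection, fetch_all_notifies and make_refresh are hypothetical names invented for illustration, not part of DBD::Pg or libpq, and the sketch assumes (as described above) that a single fetch returns every pending notify.

```python
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for a listening connection (not DBD::Pg)."""

    def __init__(self):
        self.pending = deque()

    def notify(self, payload):
        # Simulates the server delivering one async notify.
        self.pending.append(payload)

    def fetch_all_notifies(self):
        # Assumption: one call drains everything that is pending.
        batch = list(self.pending)
        self.pending.clear()
        return batch


def tick_with_while(conn, refresh):
    """Old behaviour: keep refreshing while new notifies keep arriving."""
    refreshes = 0
    while conn.fetch_all_notifies():
        refresh()
        refreshes += 1
    return refreshes


def tick_with_if(conn, refresh):
    """New behaviour: at most one refresh per timer tick."""
    if conn.fetch_all_notifies():
        refresh()
        return 1
    return 0


def make_refresh(conn, arrivals_during_refresh):
    """Build a refresh callback during which some late notifies arrive."""
    remaining = [arrivals_during_refresh]

    def refresh():
        if remaining[0] > 0:
            remaining[0] -= 1
            conn.notify("late")

    return refresh


if __name__ == "__main__":
    old = FakeConnection()
    old.notify("first")
    print("while:", tick_with_while(old, make_refresh(old, 3)))

    new = FakeConnection()
    new.notify("first")
    print("if:", tick_with_if(new, make_refresh(new, 3)), "left:", len(new.pending))
```

With the while form, every notify that arrives during a refresh triggers another refresh within the same tick; with the if form, the late notify simply waits for the next tick.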

Peter

P.S. In src/backend/commands/async.c, somewhere between lines 910 and
981 (set_ps_display calls) is where the code gets interrupted. How and
why, I don't know.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [BUGS] failed to re-find parent key in "..." for deletion target page

2007-08-10 Thread Heikki Linnakangas
Any news on this?

Tom Lane wrote:
> Brian Hirt <[EMAIL PROTECTED]> writes:
>> basement_dev=# VACUUM FULL developer_name;
>> ERROR:  failed to re-find parent key in  
>> "developer_name_developer_name" for deletion target page 163
> 
> Oh dear, I thought we'd fixed that.
> 
> Can you send me copies of this index file ... and preferably the
> underlying table too?
> 
> You should be able to get out of it by REINDEXing, but we need to find
> the cause first.
> 
>   regards, tom lane

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[BUGS] BUG #3530: Can't start as service if sb created 'c:\program' file

2007-08-10 Thread

The following bug has been logged online:

Bug reference:  3530
Logged by:  
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.1
Operating system:   Windows XP
Description:        Can't start as service if sb created 'c:\program' file
Details: 

I've installed PostgreSQL in 'c:\program files'. Some buggy program created
a file named 'c:\program', and guess what? PostgreSQL doesn't start!
This error is due to Windows's stupid rules for resolving executable names.
You've added the registry entry
ImagePath=c:\program files\Postgre SQL\bin\pg_ctl.exe ...
Now Windows tries to execute c:\program; if that is not found, it tries
c:\program files\Postgre, and so on, until it reaches the end of the string.
When registering the service, pg_ctl.exe should quote this name:
ImagePath="c:\program files\Postgre SQL\bin\pg_ctl.exe".
But even this doesn't solve the problem: pg_ctl.exe starts postgres.exe, and
you probably haven't quoted the path in the CreateProcess invocation either.
So there are at least two bugs caused by missing "..." in pg_ctl.exe.

If the server is started manually (using postgres.exe -D "...") rather than
with net start, everything works OK, so postgres.exe probably doesn't have
this bug.
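
The lookup described above can be sketched in Python. This is a simplified model and not the actual Windows algorithm (the real rules are in the CreateProcess documentation and also involve extensions such as .exe). The function names are hypothetical; the path is the one written in this report.

```python
def unquoted_candidates(command_line):
    """Prefixes of an unquoted command line, cut at each space.

    Simplified model of the lookup described in the report; the real
    Windows rules also consider extensions such as .exe.
    """
    tokens = command_line.split(" ")
    return [" ".join(tokens[:i]) for i in range(1, len(tokens) + 1)]


def first_match(command_line, existing_files):
    """Return the first candidate that exists on 'disk', or None."""
    for candidate in unquoted_candidates(command_line):
        if candidate in existing_files:
            return candidate
    return None


def quote_path(exe_path):
    """Wrap the path in double quotes so spaces are not split points."""
    return '"' + exe_path + '"'


# Path as written in the report.
PG_CTL = r"c:\program files\Postgre SQL\bin\pg_ctl.exe"

if __name__ == "__main__":
    for candidate in unquoted_candidates(PG_CTL):
        print("would try:", candidate)
    # Normal case: only pg_ctl.exe exists.
    print(first_match(PG_CTL, {PG_CTL}))
    # A stray 'c:\program' file is found first.
    print(first_match(PG_CTL, {PG_CTL, r"c:\program"}))
    print("quoted ImagePath:", quote_path(PG_CTL))
```

In this model the stray file wins because it is the shortest prefix that exists; quoting removes the ambiguity by making the whole path a single token.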



[BUGS] BUG #3532: Can't rollup array of arrays

2007-08-10 Thread James William Pye

The following bug has been logged online:

Bug reference:  3532
Logged by:  James William Pye
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.2
Operating system:   FreeBSD
Description:        Can't rollup array of arrays
Details: 

Dunno about the spec, but I would think this should work:

[EMAIL PROTECTED]/tinman[]=# SELECT array(select array[table_name,
column_name] FROM information_schema.columns LIMIT 10);
ERROR:  could not find array type for datatype character varying[]

[EMAIL PROTECTED]/tinman[]=# SELECT version();
                                            version
------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on i386-portbld-freebsd6.2, compiled by GCC cc (GCC) 3.4.6 [FreeBSD] 20060305
(1 row)

The expectation is the production of an array like:
'{{table_foo,column_bar},{table_foo,column_bar2},...}'.

No? (yeah, it may be more of a feature request than a bug)



Re: [BUGS] BUG #3512: buggy install + no manual support

2007-08-10 Thread Magnus Hagander
Again, please don't drop the list CC.


Jim Oak wrote:
> --- Magnus Hagander <[EMAIL PROTECTED]> wrote:
> 
>> (please don't drop the list CC)
>>
>>
>> Jim Oak wrote:
>>> --- Magnus Hagander <[EMAIL PROTECTED]> wrote:
>>>
>>>> What error message did you get exactly?
>>> "could not connect to database postgres: server closed
>>> the connection unexpectedly This probably means the
>>> server terminated abnormally before or while
>>> processing the request."
>> This should not be directly related to the port, it's something else
>> that's broken. You need to check the server logs (eventlog + pg_log
>> directory).
> 
> yes go figure it out ... for unix/linux install there
> are min requirements in manual (gcc installed) ...
> what's min kernel that supports pgsql 8 but for win32
> there is nothing about that winxp sp1 is not
> supported (why not?) yes i saw in port compilation
> desc they used mingw on wxpsp2 but why it couldn't
> work on sp1

It works fine on SP1. I'd certainly recommend SP2 in all cases though -
nobody should really be using XP without it.


> part of pg_log: (well other logs are empty for some
> reason and last part of text repeats few times in this
> first log)
> 
> 2007-08-05 22:20:29 LOG:  checkpoint record is at
> 0/487970
> 2007-08-05 22:20:29 LOG:  redo record is at 0/487970;
> undo record is at 0/0; shutdown TRUE
> 2007-08-05 22:20:29 LOG:  next transaction ID: 0/595;
> next OID: 10820
> 2007-08-05 22:20:29 LOG:  next MultiXactId: 1; next
> MultiXactOffset: 0
> 2007-08-05 22:20:30 LOG:  could not receive data from
> client: An operation was attempted on something that
> is not a socket.

If you search the archives, you will find that this is typical of buggy
antivirus or firewall software. Do you have any such software installed
on your server?


> As I stated above for some unknown reason i have no
> problems installing it on the other machine with sp2?
> 
> Is that cause ms changed some libs that's msvcc relies
> on or it's just my old (crappy, decaying, shitty)
> winsp1 install.

It shouldn't be, it's most likely something else installed on that machine.


>> If it's the db password, change your pg_hba config to trust, log in,
>> change the password, and then change pg_hba back to md5. If you search
>> the archives, there should be detailed instructions.
> 
> Thx i'll try it (cause im interested to figure it out)
> ... i already try to mess around postgresql.conf w/o
> any progress ... but why you pointing me to pg_hba
> isn't that only to set trusted IPs nothing to do with
> passwords.

No. pg_hba tells which authentication method is used for which IPs. See
http://www.postgresql.org/docs/current/static/auth-pg-hba-conf.html.


> and localhost is already set to trust
> 
> host    all    all    127.0.0.1/32    md5

No, that clearly sets localhost to md5.
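
To show how that line reads, here is a small Python sketch that picks the authentication method out of a pg_hba.conf record. It is an illustration only: hba_method is a hypothetical helper, and it assumes just the "local" form and the "host ... CIDR-address method" form, ignoring the separate IP-mask form and any trailing options.

```python
def hba_method(line):
    """Return the auth method field of a pg_hba.conf record, or None.

    Assumed layouts (simplified):
      local  database  user  method
      host   database  user  CIDR-address  method
    """
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None  # blank line or comment
    if fields[0] == "local":
        return fields[3] if len(fields) > 3 else None
    return fields[4] if len(fields) > 4 else None


if __name__ == "__main__":
    # The line quoted above: the last field is the method.
    print(hba_method("host    all    all    127.0.0.1/32    md5"))
    # What a trust entry for localhost would look like instead.
    print(hba_method("host    all    all    127.0.0.1/32    trust"))
```

The address field only selects which connections the record applies to; the field after it decides whether a password is required.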

//Magnus
