Fujii Masao wrote:
> On Tue, Mar 23, 2010 at 7:56 AM, Bruce Momjian wrote:
> > Sorry, release notes updated:
> >
> > Add
> > linkend="functions-recovery-info-table">pg_last_xlog_receive_location()
> > and pg_last_xlog_replay_location(), which
On Tue, Mar 23, 2010 at 7:56 AM, Bruce Momjian wrote:
> Sorry, release notes updated:
>
> Add
> linkend="functions-recovery-info-table">pg_last_xlog_receive_location()
> and pg_last_xlog_replay_location(), which
> can be used to monitor standby
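The two functions named above return WAL locations as text such as '0/2FFF000'. A monitoring script can convert those into byte positions to express lag in raw bytes, as suggested later in this thread. A minimal sketch, assuming the (high 32 bits)/(low 32 bits) encoding used by later pg_lsn arithmetic; the helper names are illustrative, not part of any PostgreSQL API:

```python
def lsn_to_bytes(lsn):
    """Convert a WAL location like '16/B374D848' into an absolute
    byte position, assuming high-32/low-32 encoding."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def replay_lag_bytes(receive_lsn, replay_lsn):
    """Bytes received and synced by walreceiver but not yet replayed."""
    return lsn_to_bytes(receive_lsn) - lsn_to_bytes(replay_lsn)


# Two locations 0x1000 bytes apart:
print(replay_lag_bytes("0/3000000", "0/2FFF000"))  # prints 4096
```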
Simon Riggs wrote:
> On Thu, 2010-01-14 at 17:33 +0900, Fujii Masao wrote:
>
> > I added two new functions;
> >
> > (1) pg_last_xlog_receive_location() reports the last WAL location received
> > and synced by walreceiver. If streaming replication is still in progress
> > this will increas
On Thu, 2010-01-14 at 17:33 +0900, Fujii Masao wrote:
> I added two new functions;
>
> (1) pg_last_xlog_receive_location() reports the last WAL location received
> and synced by walreceiver. If streaming replication is still in progress
> this will increase monotonically. If streaming rep
On Sun, Jan 17, 2010 at 8:53 AM, Josh Berkus wrote:
> * amount of *time* since last successful archive (this would be a good
> trigger for alerts)
> * number of failed archive attempts
> * number of archive files awaiting processing (presumably monitored by
> the slave)
> * last archive file proce
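The "time since last successful archive" metric on this list can be approximated from the filesystem alone. A hedged sketch, assuming the usual pg_xlog/archive_status layout in which the archiver renames a segment's .ready marker to .done once archive_command succeeds:

```python
import os
import time


def seconds_since_last_archive(data_dir):
    """Seconds since a WAL segment was last archived, inferred from
    the mtime of the newest .done marker in pg_xlog/archive_status.
    Returns None when nothing has been archived yet."""
    status_dir = os.path.join(data_dir, "pg_xlog", "archive_status")
    done = [os.path.join(status_dir, name)
            for name in os.listdir(status_dir)
            if name.endswith(".done")]
    if not done:
        return None
    return time.time() - max(os.path.getmtime(p) for p in done)
```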
> I'd happily write a patch to handle all that if I thought it would be
> accepted. I fear that the whole approach will be considered a bit too
> hackish and get rejected on that basis though. Not really sure of a
> "right" way to handle this though. Anything better is going to be more
> compli
Kevin Grittner wrote:
Stefan Kaltenbrunner wrote:
Kevin Grittner wrote:
Right, we don't want to give the monitoring software an OS login
for the database servers, for security reasons.
depending on what you exactly mean by that I do have to wonder how
you monitor more complex stuff (or stuf
Greg Smith wrote:
Stefan Kaltenbrunner wrote:
Another popular question is "how far behind real-time is the archiver
process?" You can do this right now by duplicating the same xlog
file name scanning and sorting that the archiver does in your own
code, looking for .ready files. It would be
Stefan Kaltenbrunner wrote:
Another popular question is "how far behind real-time is the archiver
process?" You can do this right now by duplicating the same xlog
file name scanning and sorting that the archiver does in your own
code, looking for .ready files. It would be simpler if you cou
Stefan Kaltenbrunner wrote:
> Kevin Grittner wrote:
>> Right, we don't want to give the monitoring software an OS login
>> for the database servers, for security reasons.
>
> depending on what you exactly mean by that I do have to wonder how
> you monitor more complex stuff (or stuff that requi
Kevin Grittner wrote:
Greg Smith wrote:
In many of the more secure environments I've worked in (finance,
defense), there is *no* access to the database server beyond what
comes out of port 5432 without getting a whole separate team of
people involved. If the DBA can write a simple monitorin
Greg Smith wrote:
Stefan Kaltenbrunner wrote:
Greg Smith wrote:
The other popular request that keeps popping up here is providing an
easy way to see how backlogged the archive_command is, to make it
easier to monitor for out of disk errors that might prove
catastrophic to replication.
I
Greg Smith wrote:
> In many of the more secure environments I've worked in (finance,
> defense), there is *no* access to the database server beyond what
> comes out of port 5432 without getting a whole separate team of
> people involved. If the DBA can write a simple monitoring program
> thems
Stefan Kaltenbrunner wrote:
Greg Smith wrote:
The other popular request that keeps popping up here is providing an
easy way to see how backlogged the archive_command is, to make it
easier to monitor for out of disk errors that might prove
catastrophic to replication.
I tend to disagree -
Greg Smith wrote:
> to make it easier to monitor for out of disk errors that might
> prove catastrophic to replication.
We handle that with the fsutil functions (in pgfoundry). This can
actually measure free space on each volume. These weren't portable
enough to include in core, but maybe th
Greg Smith wrote:
Fujii Masao wrote:
"I'm thinking something like pg_standbys_xlog_location() [on the primary] which
returns
one row per standby servers, showing pid of walsender, host name/
port number/user OID of the standby, the location where the standby
has written/flushed WAL. DBA can mea
On Thu, 2010-01-14 at 23:07 -0500, Greg Smith wrote:
> pg_last_archived_xlogfile() text: Get the name of the last file the
> archive_command [tried to|successfully] archived since the server was
> started. If archiving is disabled or no xlog files have become ready
> to archive since startup, a
Fujii Masao wrote:
"I'm thinking something like pg_standbys_xlog_location() [on the primary] which
returns
one row per standby servers, showing pid of walsender, host name/
port number/user OID of the standby, the location where the standby
has written/flushed WAL. DBA can measure the gap from t
On Wed, Jan 13, 2010 at 5:47 PM, Greg Smith wrote:
> The pieces are coming together...summary:
Thanks for the summary!
> -Also add pg_standbys_xlog_location() on the master: while they could live
> without it, this really helps out the "alert/monitor" script writer whose use
> cases keep popp
Stefan Kaltenbrunner wrote:
so is there an actual concrete proposal of _what_ internals to expose?
The pieces are coming together...summary:
-Status quo: really bad, but could probably ship anyway because
existing PITR is no better and people manage to use it
-Add slave pg_current_xlog_lo
On Tue, Jan 12, 2010 at 10:16 AM, Bruce Momjian wrote:
> I am concerned that knowledge of this new read-only replication user
> would have to be spread all over the backend code, which is really not
> something we should be doing at this stage in 8.5 development. I am
> also thinking such a speci
On Tue, Jan 12, 2010 at 10:59 PM, Tom Lane wrote:
> Fujii Masao writes:
>> I'm not sure whether poll(2) should be called for this purpose. But
>> poll(2) and select(2) seem to often come together in the existing code.
>> We should follow such custom?
>
> Yes. poll() is usually more efficient, so
Simon Riggs wrote:
> On Tue, 2010-01-12 at 15:42 -0500, Tom Lane wrote:
> > Bruce Momjian writes:
> > > The final commit-fest is in 5 days --- this is not the time for design
> > > discussion and feature additions.
> >
> > +10 --- the one reason I can see for deciding to bounce SR is that there
>
> However, it's probably a better thing to simply expose a way to query
> how much extra log data we have, in raw form (bytes or pages). From
> this, an administration script could take appropriate action.
Also: I think we could release without having this facility. We did
with PITR, after all
> I guess the slightly more ambitious performance monitoring bits that
> Simon was suggesting may cross the line as being too late to implement
> now though (depends on how productive the people actually coding on this
> are I guess), and certainly the ideas thrown out for implementing any
> smart
On Tuesday 12 January 2010 17:37:11 Simon Riggs wrote:
> There is not much sense being talked here. I have asked for sufficient
> monitoring to allow us to manage it in production, which is IMHO the
> minimum required to make it shippable. This is a point I have mentioned
> over the course of many
On Monday 11 January 2010 23:24:24 Greg Smith wrote:
> Fujii Masao wrote:
> > On Mon, Jan 11, 2010 at 5:36 PM, Craig Ringer
> >
> > wrote:
> >> Personally, I'd be uncomfortable enabling something like that without
> >> _both_ an admin alert _and_ the ability to refresh the slave's base
> >> backup
Stefan Kaltenbrunner wrote:
> The database needs to provide very basic information like "we are
> 10min behind in replication" or "3 wal files behind" - the
> decision if any of that is an actual issue or not should be left
> to the actual monitoring system.
+1
-Kevin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
Greg Smith wrote:
Bruce Momjian wrote:
Right, so what is the risk of shipping without any fancy monitoring?
You can monitor the code right now by watching the output shown in the
ps display and by trolling the database logs. If I had to I could build
a whole monitoring system out of thos
On Tue, 2010-01-12 at 17:41 -0500, Greg Smith wrote:
> Bruce Momjian wrote:
> > Right, so what is the risk of shipping without any fancy monitoring?
> >
>
> You can monitor the code right now by watching the output shown in the
> ps display and by trolling the database logs. If I had to I cou
Bruce Momjian wrote:
Right, so what is the risk of shipping without any fancy monitoring?
You can monitor the code right now by watching the output shown in the
ps display and by trolling the database logs. If I had to I could build
a whole monitoring system out of those components, it wo
On Tue, 2010-01-12 at 15:42 -0500, Tom Lane wrote:
> Bruce Momjian writes:
> > The final commit-fest is in 5 days --- this is not the time for design
> > discussion and feature additions.
>
> +10 --- the one reason I can see for deciding to bounce SR is that there
> still seem to be design discus
> Right, so what is the risk of shipping without any fancy monitoring?
We add monitoring in 9.1. er, 8.6.
--Josh Berkus
On Tue, 2010-01-12 at 16:34 -0500, Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian writes:
> > > The final commit-fest is in 5 days --- this is not the time for design
> > > discussion and feature additions.
> >
> > +10 --- the one reason I can see for deciding to bounce SR is that there
Stefan Kaltenbrunner wrote:
> >> Let's get a reasonable feature set implemented and then come back in 8.6
> >> to improve it. For example, there is no need for a special
> >> 'replication' user (just use super-user), and monitoring should be
> >> minimal until we have field experience of exactly w
Tom Lane wrote:
> Bruce Momjian writes:
> > The final commit-fest is in 5 days --- this is not the time for design
> > discussion and feature additions.
>
> +10 --- the one reason I can see for deciding to bounce SR is that there
> still seem to be design discussions going on. It is WAY TOO LATE
Simon Riggs wrote:
On Tue, 2010-01-12 at 15:11 -0500, Bruce Momjian wrote:
Stefan Kaltenbrunner wrote:
Simon Riggs wrote:
On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
Fujii Masao wrote:
On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
I don't think anybody can deploy th
On Tue, 2010-01-12 at 15:11 -0500, Bruce Momjian wrote:
> Stefan Kaltenbrunner wrote:
> > Simon Riggs wrote:
> > > On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
> > >> Fujii Masao wrote:
> > >>> On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith
> > >>> wrote:
> > I don't think any
Bruce Momjian writes:
> The final commit-fest is in 5 days --- this is not the time for design
> discussion and feature additions.
+10 --- the one reason I can see for deciding to bounce SR is that there
still seem to be design discussions going on. It is WAY TOO LATE for
that folks. It's time
Marko Kreen writes:
> FYI: on PL/Proxy we use poll() exclusively and on platforms
> that dont have it (win32) we emulate poll() with select():
Yeah, maybe. At the time we started adding poll() support there were
enough platforms with only select() that it didn't make sense to impose
any sort of
On Tue, Jan 12, 2010 at 3:11 PM, Bruce Momjian wrote:
> The final commit-fest is in 5 days --- this is not the time for design
Actually just over 2 days at this point...
...Robert
Stefan Kaltenbrunner wrote:
> Simon Riggs wrote:
> > On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
> >> Fujii Masao wrote:
> >>> On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
> I don't think anybody can deploy this feature without at least some very
> basic monitori
On 1/12/10, Tom Lane wrote:
> Fujii Masao writes:
> > I'm not sure whether poll(2) should be called for this purpose. But
> > poll(2) and select(2) seem to often come together in the existing code.
> > We should follow such custom?
>
>
> Yes. poll() is usually more efficient, so it's preferre
Simon Riggs wrote:
On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
Fujii Masao wrote:
On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
I don't think anybody can deploy this feature without at least some very
basic monitoring here. I like the basic proposal you made back in S
Magnus Hagander writes:
> On Tue, Jan 12, 2010 at 08:22, Heikki Linnakangas
> wrote:
>> Maybe we should just change the existing pg_current_xlog_location()
>> function to return that when recovery is in progress. It currently
>> throws an error during hot standby.
> Not sure. I don't really like
On Tue, Jan 12, 2010 at 15:13, Andrew Dunstan wrote:
> Tom Lane wrote:
>> Fujii Masao writes:
>>> I'm not sure whether poll(2) should be called for this purpose. But
>>> poll(2) and select(2) seem to often come together in the existing code.
>>> We should follow such custom?
Tom Lane wrote:
Fujii Masao writes:
I'm not sure whether poll(2) should be called for this purpose. But
poll(2) and select(2) seem to often come together in the existing code.
We should follow such custom?
Yes. poll() is usually more efficient, so it's preferred, but not all
platfo
Fujii Masao writes:
> I'm not sure whether poll(2) should be called for this purpose. But
> poll(2) and select(2) seem to often come together in the existing code.
> We should follow such custom?
Yes. poll() is usually more efficient, so it's preferred, but not all
platforms have it. (On the ot
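The pattern Tom describes — prefer poll(2) where it exists, fall back to select(2) elsewhere — can be illustrated in a few lines. Python's select module mirrors both system calls; this is only a sketch of the pattern, not the walsender code:

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Block until sock is readable or timeout_ms elapses.
    Uses poll(2) where available (usually the more efficient call),
    falling back to select(2) on platforms without it."""
    if hasattr(select, "poll"):
        p = select.poll()
        p.register(sock.fileno(), select.POLLIN)
        return bool(p.poll(timeout_ms))
    ready, _, _ = select.select([sock], [], [], timeout_ms / 1000.0)
    return bool(ready)
```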
On Tue, Jan 12, 2010 at 08:22, Heikki Linnakangas
wrote:
> Greg Smith wrote:
>> I don't think anybody can deploy this feature without at least some very
>> basic monitoring here. I like the basic proposal you made back in
>> September for adding a pg_standbys_xlog_location to replace what you
>>
On Tue, Jan 12, 2010 at 4:22 PM, Heikki Linnakangas
wrote:
> It would be more straightforward to have a function in the standby to
> return the current replay location. It feels more logical to poll the
> standby to get the status of the standby, instead of indirectly from the
> master. Besides, t
Heikki Linnakangas wrote:
Greg Smith wrote:
I don't think anybody can deploy this feature without at least some very
basic monitoring here. I like the basic proposal you made back in
September for adding a pg_standbys_xlog_location to replace what you
have to get from ps right now:
http://a
On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
> Fujii Masao wrote:
> > On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
> >> I don't think anybody can deploy this feature without at least some very
> >> basic monitoring here. I like the basic proposal you made back in
> >> Sep
Fujii Masao wrote:
On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
I don't think anybody can deploy this feature without at least some very
basic monitoring here. I like the basic proposal you made back in September
for adding a pg_standbys_xlog_location to replace what you have to get from
Greg Smith wrote:
> I don't think anybody can deploy this feature without at least some very
> basic monitoring here. I like the basic proposal you made back in
> September for adding a pg_standbys_xlog_location to replace what you
> have to get from ps right now:
> http://archives.postgresql.org
On Tue, Jan 12, 2010 at 1:24 PM, Greg Smith wrote:
> It's impossible for the database to have any idea whatsoever how people are
> going to want to be alerted. Provide functions to monitor things like
> replication lag, like the number of segments queued up to feed to
> archive_command, and let p
On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith wrote:
> I don't think anybody can deploy this feature without at least some very
> basic monitoring here. I like the basic proposal you made back in September
> for adding a pg_standbys_xlog_location to replace what you have to get from
> ps right now:
Fujii Masao wrote:
On Mon, Jan 11, 2010 at 5:36 PM, Craig Ringer
wrote:
Personally, I'd be uncomfortable enabling something like that without _both_
an admin alert _and_ the ability to refresh the slave's base backup without
admin intervention.
What feature do you specifically need as
Fujii Masao wrote:
On Sun, Jan 10, 2010 at 8:17 PM, Simon Riggs wrote:
What could happen is that the standby could slowly lag behind master. We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.
I agree that the statistical information about replicat
On Mon, Jan 11, 2010 at 5:36 PM, Craig Ringer
wrote:
> Personally, I'd be uncomfortable enabling something like that without _both_
> an admin alert _and_ the ability to refresh the slave's base backup without
> admin intervention.
What feature do you specifically need as an alert? Just writing
t
On Sun, Jan 10, 2010 at 8:17 PM, Simon Riggs wrote:
> What could happen is that the standby could slowly lag behind master. We
> don't have any way of monitoring that, as yet. Setting ps display is not
> enough here.
I agree that the statistical information about replication activity is
very usef
On Sat, Jan 9, 2010 at 4:25 PM, Heikki Linnakangas
wrote:
> I don't think we need all that, a simple select() should be enough.
> Though I must admit I'm not very familiar with select/poll().
I'm not sure whether poll(2) should be called for this purpose. But
poll(2) and select(2) seem to often c
Simon Riggs wrote:
> > * I don't think we should require superuser rights for replication.
> > Although you see all WAL and potentially all data in the system through
> > that, a standby doesn't need any write access to the master, so it would
> > be good practice to create a dedicated account with
On 9/01/2010 6:20 AM, Josh Berkus wrote:
On 1/8/10 1:16 PM, Heikki Linnakangas wrote:
* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of
> Currently there is no way of knowing what the average/current transit
> time is on replication, no way of knowing what is happening if we go
> idle etc.. Those things need to be included because they are not
> otherwise accessible. Cars need windows, not just a finely tuned engine.
Like I said,
On Sun, 2010-01-10 at 12:10 -0800, Josh Berkus wrote:
> > We need monitoring anywhere we have a max_* parameter. Otherwise we
> > won't know how close we are to disaster until we hit the limit and
> > things break down. Otherwise we will have to set parameters by trial and
> > error, or set them so
> We need monitoring anywhere we have a max_* parameter. Otherwise we
> won't know how close we are to disaster until we hit the limit and
> things break down. Otherwise we will have to set parameters by trial and
> error, or set them so high they are meaningless.
I agree.
Thing is, though, we h
On Sun, 2010-01-10 at 18:40 +0200, Heikki Linnakangas wrote:
> > We
> > don't have any way of monitoring that, as yet. Setting ps display is not
> > enough here.
>
> Yeah, monitoring would be nice too. But what I was wondering is whether
> we need some way of stopping that from filling the disk i
Simon Riggs wrote:
> On Fri, 2010-01-08 at 14:20 -0800, Josh Berkus wrote:
>> On 1/8/10 1:16 PM, Heikki Linnakangas wrote:
>>> * A standby that connects to master, initiates streaming, and then sits
>>> idle without stalls recycling of old WAL files in the master. That will
>>> eventually lead to a
On Fri, 2010-01-08 at 14:20 -0800, Josh Berkus wrote:
> On 1/8/10 1:16 PM, Heikki Linnakangas wrote:
> > * A standby that connects to master, initiates streaming, and then sits
> > idle without stalls recycling of old WAL files in the master. That will
> > eventually lead to a full disk in master.
On Fri, 2010-01-08 at 23:16 +0200, Heikki Linnakangas wrote:
> * I removed the feature that archiver was started during recovery. The
> idea of that was to enable archiving from a standby server, to relieve
> the master server of that duty, but I found it annoying because it
> causes trouble if th
Fujii Masao wrote:
> On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
> wrote:
>> * If there's no WAL to send, walsender doesn't notice if the client has
>> closed connection already. This is the issue Fujii reported already.
>> We'll need to add a select() call to the walsender main loop to che
On Sat, Jan 9, 2010 at 10:38 AM, Greg Stark wrote:
>> * Need to add comments somewhere to note that ReadRecord depends on the
>> fact that a WAL record is always send as whole, never split across two
>> messages.
>
> What happens in the case of the very large records Tom was describing
> recently.
On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
wrote:
> I've gone through the patch in detail now. Here's my list of remaining
> issues:
Great! Thanks a lot!
> * If there's no WAL to send, walsender doesn't notice if the client has
> closed connection already. This is the issue Fujii reporte
On Fri, Jan 8, 2010 at 9:16 PM, Heikki Linnakangas
wrote:
> * We still have a related issue, though: if standby is configured to
> archive to the same location as master (as it always is on my laptop,
> where I use the postgresql.conf of the master unmodified in the server),
> right after failove
On 1/8/10 1:16 PM, Heikki Linnakangas wrote:
> * A standby that connects to master, initiates streaming, and then sits
> idle without stalls recycling of old WAL files in the master. That will
> eventually lead to a full disk in master. Do we need some kind of a
> emergency valve on that?