On 2014-01-23 13:56:49 +0100, Simon Riggs wrote:
> IMHO we need to resolve the deadlock inherent in the
> disk-full/WALlock-up/checkpoint situation. My view is that can be
> solved in a similar way to the way the buffer pin deadlock was
> resolved for Hot Standby.
I don't think that approach works
On 23 January 2014 01:19, Jim Nasby wrote:
> On 1/21/14, 6:46 PM, Andres Freund wrote:
>>
>> On 2014-01-21 16:34:45 -0800, Peter Geoghegan wrote:
>>>
>>> >On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund wrote:
> >I personally think this isn't worth complicating the code for.
>>>
>>>
On 2014-01-22 18:19:25 -0600, Jim Nasby wrote:
> On 1/21/14, 6:46 PM, Andres Freund wrote:
> >On 2014-01-21 16:34:45 -0800, Peter Geoghegan wrote:
> >>>On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund wrote:
> >I personally think this isn't worth complicating the code for.
> >>>
> >>>You'
On 1/21/14, 6:46 PM, Andres Freund wrote:
On 2014-01-21 16:34:45 -0800, Peter Geoghegan wrote:
>On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund wrote:
> >I personally think this isn't worth complicating the code for.
>
>You're probably right. However, I don't see why the bar has to be very
>hi
On 22 January 2014 14:25, Simon Riggs wrote:
> On 22 January 2014 13:14, Heikki Linnakangas wrote:
>> On 01/22/2014 02:10 PM, Simon Riggs wrote:
>>>
>>> As Jeff points out, the blocks being modified would be locked until
>>> space is freed up. Which could make other users wait. The code
>>> requi
Andres Freund writes:
> On 2014-01-21 21:42:19 -0500, Tom Lane wrote:
>> Uh, what? The behavior I'm talking about is *exactly the same*
>> as what happens now. The only change is that the data sent to the
>> WAL file is laid out a bit differently, and the replay logic has
>> to work harder to re
On 2014-01-21 21:42:19 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-01-21 19:45:19 -0500, Tom Lane wrote:
> >> I don't think that's a comparable case. Incomplete actions are actions
> >> to be taken immediately, and which the replayer then has to complete
> >> somehow if it doesn't
Tom Lane wrote:
> Well, PANIC is certainly bad, but what I'm suggesting is that we
> just focus on getting that down to ERROR and not worry about
> trying to get out of the disk-shortage situation automatically.
> Nor do I believe that it's such a good idea to have the database
> freeze up until
On 22 January 2014 13:14, Heikki Linnakangas wrote:
> On 01/22/2014 02:10 PM, Simon Riggs wrote:
>>
>> As Jeff points out, the blocks being modified would be locked until
>> space is freed up. Which could make other users wait. The code
>> required to avoid that wait would be complex and not worth
On 01/22/2014 02:10 PM, Simon Riggs wrote:
As Jeff points out, the blocks being modified would be locked until
space is freed up. Which could make other users wait. The code
required to avoid that wait would be complex and not worth any
overhead.
Checkpoint also acquires the content lock of eve
On 22 January 2014 01:30, Tom Lane wrote:
> Andres Freund writes:
>> How are we supposed to wait while e.g. ProcArrayLock? Aborting
>> transactions doesn't work either, that writes abort records which can
>> get significantly large.
>
> Yeah, that's an interesting point ;-). We can't *either* com
On 22 January 2014 01:23, Tom Lane wrote:
> Andres Freund writes:
>> On 2014-01-21 18:59:13 -0500, Tom Lane wrote:
>>> Another thing to think about is whether we couldn't put a hard limit on
>>> WAL record size somehow. Multi-megabyte WAL records are an abuse of the
>>> design anyway, when you g
Andres Freund writes:
> On 2014-01-21 19:45:19 -0500, Tom Lane wrote:
>> I don't think that's a comparable case. Incomplete actions are actions
>> to be taken immediately, and which the replayer then has to complete
>> somehow if it doesn't find the rest of the action in the WAL sequence.
>> The
On 2014-01-21 19:45:19 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-01-21 19:23:57 -0500, Tom Lane wrote:
> >> I'm not suggesting that we stop providing that information! I'm just
> >> saying that we perhaps don't need to store it all in one WAL record,
> >> if instead we put the on
On 2014-01-21 16:34:45 -0800, Peter Geoghegan wrote:
> On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund wrote:
> > I personally think this isn't worth complicating the code for.
>
> You're probably right. However, I don't see why the bar has to be very
> high when we're considering the trade-off be
Andres Freund writes:
> On 2014-01-21 19:23:57 -0500, Tom Lane wrote:
>> I'm not suggesting that we stop providing that information! I'm just
>> saying that we perhaps don't need to store it all in one WAL record,
>> if instead we put the onus on WAL replay to be able to reconstruct what
>> it ne
On 2014-01-21 19:23:57 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-01-21 18:59:13 -0500, Tom Lane wrote:
> >> Another thing to think about is whether we couldn't put a hard limit on
> >> WAL record size somehow. Multi-megabyte WAL records are an abuse of the
> >> design anyway, whe
On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund wrote:
> I personally think this isn't worth complicating the code for.
You're probably right. However, I don't see why the bar has to be very
high when we're considering the trade-off between taking some
emergency precaution against having a PANIC s
Andres Freund writes:
> How are we supposed to wait while e.g. ProcArrayLock? Aborting
> transactions doesn't work either, that writes abort records which can
> get significantly large.
Yeah, that's an interesting point ;-). We can't *either* commit or abort
without emitting some WAL, possibly qu
Andres Freund writes:
> On 2014-01-21 18:59:13 -0500, Tom Lane wrote:
>> Another thing to think about is whether we couldn't put a hard limit on
>> WAL record size somehow. Multi-megabyte WAL records are an abuse of the
>> design anyway, when you get right down to it. So for example maybe we
>>
On 2014-01-22 01:18:36 +0100, Simon Riggs wrote:
> > My understanding is that if it runs out of buffer space while in an
> > XLogInsert, it will be holding one or more buffer content locks exclusively,
> > and unless it can complete the xlog (or scrounge up the info to return that
> > buffer to its
On 21 January 2014 23:01, Jeff Janes wrote:
> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane wrote:
>>
>> Simon Riggs writes:
>> > On 6 June 2013 16:00, Heikki Linnakangas wrote:
>> >> The current situation is that if you run out of disk space while writing
>> >> WAL, you get a PANIC, and
On 2014-01-21 18:59:13 -0500, Tom Lane wrote:
> Another thing to think about is whether we couldn't put a hard limit on
> WAL record size somehow. Multi-megabyte WAL records are an abuse of the
> design anyway, when you get right down to it. So for example maybe we
> could split up commit records
Andres Freund writes:
> On 2014-01-21 18:24:39 -0500, Tom Lane wrote:
>> Maybe we could get some mileage out of the fact that very approximate
>> techniques would be good enough. For instance, I doubt anyone would bleat
>> if the system insisted on having 10MB or even 100MB of future WAL space
>>
On 2014-01-21 18:24:39 -0500, Tom Lane wrote:
> Jeff Janes writes:
> > On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane wrote:
> >> My preference would be that we simply start failing writes with ERRORs
> >> rather than PANICs. I'm not real sure ATM why this has to be a PANIC
> >> condition. Probably
On Tue, Jan 21, 2014 at 3:24 PM, Tom Lane wrote:
> Maybe we could get some mileage out of the fact that very approximate
> techniques would be good enough. For instance, I doubt anyone would bleat
> if the system insisted on having 10MB or even 100MB of future WAL space
> always available. But I
Jeff Janes writes:
> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane wrote:
>> My preference would be that we simply start failing writes with ERRORs
>> rather than PANICs. I'm not real sure ATM why this has to be a PANIC
>> condition. Probably the cause is that it's being done inside a critical
>> s
On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane wrote:
> Simon Riggs writes:
> > On 6 June 2013 16:00, Heikki Linnakangas wrote:
> >> The current situation is that if you run out of disk space while writing
> >> WAL, you get a PANIC, and the server shuts down. That's awful.
>
> > I don't see we nee
Greg Stark writes:
> Fwiw I think "all transactions lock up until space appears" is *much*
> better than PANICing. Often disks fill up due to other transient
> storage or people may have options to manually increase the amount of
> space. It's much better if the database just continues to function
Fwiw I think "all transactions lock up until space appears" is *much*
better than PANICing. Often disks fill up due to other transient
storage or people may have options to manually increase the amount of
space. It's much better if the database just continues to function
after that rather than need
On 21 January 2014 18:35, Tom Lane wrote:
> Simon Riggs writes:
>> On 6 June 2013 16:00, Heikki Linnakangas wrote:
>>> The current situation is that if you run out of disk space while writing
>>> WAL, you get a PANIC, and the server shuts down. That's awful.
>
>> I don't see we need to prevent W
Simon Riggs writes:
> On 6 June 2013 16:00, Heikki Linnakangas wrote:
>> The current situation is that if you run out of disk space while writing
>> WAL, you get a PANIC, and the server shuts down. That's awful.
> I don't see we need to prevent WAL insertions when the disk fills. We
> still have
On 6 June 2013 16:00, Heikki Linnakangas wrote:
> In the "Redesigning checkpoint_segments" thread, many people opined that
> there should be a hard limit on the amount of disk space used for WAL:
> http://www.postgresql.org/message-id/CA+TgmoaOkgZb5YsmQeMg8ZVqWMtR=6s4-ppd+6jiy4oq78i...@mail.gmail.
On Mon, Jun 10, 2013 at 07:28:24AM +0800, Craig Ringer wrote:
> (I'm still learning the details of Pg's WAL, WAL replay and recovery, so
> the below's just my understanding):
>
> The problem is that WAL for all tablespaces is mixed together in the
> archives. If you lose your tablespace then you h
Peter Eisentraut writes:
> I suspect that there are actually only about 5 or 6 common ways to do
> archiving (say, local, NFS, scp, rsync, S3, ...). There's no reason why
> we can't fully specify and/or script what to do in each of these cases.
And provide either fully reliable contrib scripts o
On Wed, Jun 12, 2013 at 6:03 PM, Joshua D. Drake wrote:
>
>> Right now you have to be a rocket
>> scientist no matter what configuration you're running.
>
>
> This is quite a bit overblown, assuming your needs are simple. Archiving is,
> as it is now, a relatively simple process to set up, even wi
On 06/12/2013 08:49 AM, Robert Haas wrote:
Sure, remote archiving is great, and I'm glad you've been working on
it. In general, I think that's a cleaner approach, but there are
still enough people using archive_command that we can't throw them
under the bus.
Correct.
I guess archiving to
On Wed, Jun 12, 2013 at 12:07 PM, Peter Eisentraut wrote:
> On 6/12/13 10:55 AM, Robert Haas wrote:
>> But it's got to be pretty common to archive to a local
>> path that happens to be a remote mount, or to a local directory whose
>> contents are subsequently copied off by a batch job. Making tha
On 6/12/13 10:55 AM, Robert Haas wrote:
> But it's got to be pretty common to archive to a local
> path that happens to be a remote mount, or to a local directory whose
> contents are subsequently copied off by a batch job. Making that work
> nicely with near-zero configuration would be a signific
On Wed, Jun 12, 2013 at 11:32 AM, Magnus Hagander wrote:
> Wouldn't that encourage people to do local archiving, which is almost always
> a bad idea?
Maybe, but refusing to improve the UI because people might then use
the feature seems wrong-headed.
> I'd rather improve the experience with pg_re
On Sat, Jun 8, 2013 at 7:20 PM, Jeff Janes wrote:
> If archiving is on and failure is due to no space, could we just keep trying
> XLogFileInit again for a couple minutes to give archiving a chance to do its
> things? Doing that while holding onto locks and a critical section would be
> unfortuna
On Jun 12, 2013 4:56 PM, "Robert Haas" wrote:
>
> On Sat, Jun 8, 2013 at 10:36 AM, MauMau wrote:
> > Yes, I feel designing reliable archiving, even for the simplest case - copy
> > WAL to disk, is very difficult. I know there are the following three problems
> > if you just follow the PostgreSQL man
> On Sat, Jun 8, 2013 at 10:36 AM, MauMau wrote:
>> Yes, I feel designing reliable archiving, even for the simplest case - copy
>> WAL to disk, is very difficult. I know there are the following three problems
>> if you just follow the PostgreSQL manual. Average users won't notice them.
>> I guess ev
On Wed, Jun 12, 2013 at 11:55 AM, Robert Haas wrote:
>> I hope PostgreSQL will provide a reliable archiving facility that is ready
>> to use.
>
> +1. I think we should have a way to set an archive DIRECTORY, rather
> than an archive command. And if you set it, then PostgreSQL should
> just do al
On Sat, Jun 8, 2013 at 10:36 AM, MauMau wrote:
> Yes, I feel designing reliable archiving, even for the simplest case - copy
> WAL to disk, is very difficult. I know there are the following three problems
> if you just follow the PostgreSQL manual. Average users won't notice them.
> I guess even pro
> Not a bad idea. One that supports rsync and another that supports
> robocopy. That should cover every platform we support.
Example script:
=
#!/usr/bin/env bash
# Simple script to copy WAL archives from one server to another
# to be called as archive_command (call
On 06/10/2013 04:42 PM, Josh Berkus wrote:
Actually we describe what archive_command needs to fulfill, and tell them
to use something that accomplishes that. The example with cp is explicitly
given as an example, not a recommendation.
If we offer cp as an example, we *are* recommending it.
On Mon, Jun 10, 2013 at 4:42 PM, Josh Berkus wrote:
> Daniel, Jeff,
>
>> I don't doubt this, that's why I do have a no-op fallback for
>> emergencies. The discussion was about defaults. I still think that
>> drop-wal-from-archiving-whenever is not a good one.
>
> Yeah, we can argue defaults for
Daniel, Jeff,
> I don't doubt this, that's why I do have a no-op fallback for
> emergencies. The discussion was about defaults. I still think that
> drop-wal-from-archiving-whenever is not a good one.
Yeah, we can argue defaults for a long time. What would be better is
some way to actually det
On Mon, Jun 10, 2013 at 11:59 AM, Josh Berkus wrote:
> Anyway, what I'm pointing out is that this is a business decision, and
> there is no way that we can make a decision for the users what to do
> when we run out of WAL space. And that the "stop archiving" option
> needs to be there for users,
On Sat, Jun 8, 2013 at 11:07 AM, Joshua D. Drake wrote:
>
> On 06/08/2013 07:36 AM, MauMau wrote:
>
> 1. If the machine or postgres crashes while archive_command is copying a
>> WAL file, later archive recovery fails.
>> This is because cp leaves a file of less than 16MB in archive area, and
>> p
Josh, Daniel,
>> Right now, what we're telling users is "You can have continuous backup
>> with Postgres, but you'd better hire an expensive consultant to set it
>> up for you, or use this external tool of dubious provenance which
>> there's no packages for, or you might accidentally cause your d
From: "Craig Ringer"
The problem is that WAL for all tablespaces is mixed together in the
archives. If you lose your tablespace then you have to keep *all* WAL
around and replay *all* of it again when the tablespace comes back
online. This would be very inefficient, would require a lot of tricks
On 06/10/2013 06:39 AM, MauMau wrote:
> The problem is that the reliability of the database system decreases
> with more disks, because failure of any one of those disks would result
> in a database PANIC shutdown
More specifically, with more independent sets of disks / file systems.
>> I'd rath
From: "Craig Ringer"
On 06/09/2013 08:32 AM, MauMau wrote:
- Failure of a disk containing data directory or tablespace
If checkpoint can't write buffers to disk because of disk failure,
checkpoint cannot complete, thus WAL files accumulate in pg_xlog/.
This means that one disk failure will lea
On 2013-06-08 13:26:56 -0700, Joshua D. Drake wrote:
> >At the points where the XLogInsert()s happen we're in critical sections
> >out of which we *cannot* ERROR out because we already may have made
> >modifications that cannot be allowed to be performed
> >partially/unlogged. That's why we're thr
On 06/09/2013 03:02 AM, Jeff Janes wrote:
> It would be nice to have the ability to specify multiple log destinations
> with different log_min_messages for each one. I'm sure syslog already must
> implement some kind of method for doing that, but I've been happy enough
> with the text logs that I
On 06/09/2013 08:32 AM, MauMau wrote:
>
> - Failure of a disk containing data directory or tablespace
> If checkpoint can't write buffers to disk because of disk failure,
> checkpoint cannot complete, thus WAL files accumulate in pg_xlog/.
> This means that one disk failure will lead to postgres sh
On 06/08/2013 10:57 AM, Daniel Farina wrote:
>
>> At which point most sensible users say "no thanks, I'll use something else".
> [snip]
>
> I have a clear bias in experience here, but I can't relate to someone
> who sets up archives but is totally okay losing a segment unceremoniously,
> because it
On 06/06/2013 10:00 PM, Heikki Linnakangas wrote:
>
> I've seen a case, where it was even worse than a PANIC and shutdown.
> pg_xlog was on a separate partition that had nothing else on it. The
> partition filled up, and the system shut down with a PANIC. Because
> there was no space left, it could
From: "Josh Berkus"
There's actually three potential failure cases here:
- One Volume: WAL is on the same volume as PGDATA, and that volume is
completely out of space.
- XLog Partition: WAL is on its own partition/volume, and fills it up.
- Archiving: archiving is failing or too slow, causing
From: "Joshua D. Drake"
On 06/08/2013 11:27 AM, Andres Freund wrote:
You know, the PANIC isn't there just because we like to piss off
users. There's actual technical reasons that don't just go away by
judging the PANIC as stupid.
Yes I know we aren't trying to piss off users. What I am saying
From: "Joshua D. Drake"
On 06/08/2013 07:36 AM, MauMau wrote:
3. You cannot know the reason of archive_command failure (e.g. archive
area full) if you don't use PostgreSQL's server logging.
This is because archive_command failure is not logged in syslog/eventlog.
Wait, what? Is this true (som
On Sat, Jun 8, 2013 at 11:27 AM, Andres Freund wrote:
>
> You know, the PANIC isn't there just because we like to piss off
> users. There's actual technical reasons that don't just go away by
> judging the PANIC as stupid.
> At the points where the XLogInsert()s happen we're in critical sections
>
On 7 June 2013 10:02, Heikki Linnakangas wrote:
> On 07.06.2013 00:38, Andres Freund wrote:
>>
>> On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
>>>
>>> * Heikki Linnakangas wrote:
>>>
The current situation is that if you run out of disk space while writing
WAL, you get a PANIC,
On 06/08/2013 11:27 AM, Andres Freund wrote:
On 2013-06-08 11:15:40 -0700, Joshua D. Drake wrote:
To me, a more pragmatic approach makes sense. Obviously having some kind of
code that checks the space makes sense but I don't know that it needs to be
around any operation other than we are creat
On Sat, Jun 8, 2013 at 11:15 AM, Joshua D. Drake wrote:
>
> On 06/06/2013 07:52 AM, Heikki Linnakangas wrote:
>
>> I think it can be made fairly robust otherwise, and the performance
>> impact should be pretty easy to measure with e.g pgbench.
>>
>
> Once upon a time in a land far, far away, we ex
On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus wrote:
>
> >> The archive command can be made a shell script (or for that matter a
> >> compiled program) which can do anything it wants upon failure, including
> >> emailing people.
>
> You're talking about using external tools -- frequently hackish,
> wo
On 2013-06-07 12:02:57 +0300, Heikki Linnakangas wrote:
> On 07.06.2013 00:38, Andres Freund wrote:
> >On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
> >>* Heikki Linnakangas wrote:
> >>
> >>>The current situation is that if you run out of disk space while writing
> >>>WAL, you get a PANIC,
On 2013-06-08 11:15:40 -0700, Joshua D. Drake wrote:
> To me, a more pragmatic approach makes sense. Obviously having some kind of
> code that checks the space makes sense but I don't know that it needs to be
> around any operation other than we are creating a segment. What do we care
> why the seg
On 06/06/2013 07:52 AM, Heikki Linnakangas wrote:
I think it can be made fairly robust otherwise, and the performance
impact should be pretty easy to measure with e.g pgbench.
Once upon a time in a land far, far away, we expected users to manage
their own systems. We had things like soft and
On 06/08/2013 07:36 AM, MauMau wrote:
1. If the machine or postgres crashes while archive_command is copying a
WAL file, later archive recovery fails.
This is because cp leaves a file of less than 16MB in archive area, and
postgres refuses to start when it finds such a small archive WAL file.
T
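The partial-file problem described above (a crash in the middle of `cp` leaving a short segment in the archive) is commonly avoided by copying to a temporary name and renaming only once the copy is complete. A minimal sketch in Python, assuming the archive directory is a single local filesystem; the function name `archive_wal_segment` is hypothetical and this is not a script from the thread or from PostgreSQL:

```python
import os
import shutil
import tempfile


def archive_wal_segment(src_path: str, archive_dir: str) -> str:
    """Copy one WAL segment into archive_dir without exposing a partial file.

    Returns the final path. Raises FileExistsError if the segment is
    already archived, so an existing archive file is never overwritten.
    """
    final_path = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(final_path):
        raise FileExistsError(final_path)

    # Use a temporary name in the SAME directory so that the final rename
    # stays within one filesystem, where rename is atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(dir=archive_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as dst, open(src_path, "rb") as src:
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())  # make sure the data is on disk first
        os.rename(tmp_path, final_path)
    except BaseException:
        # On any failure, remove the temporary file instead of leaving a
        # truncated segment under the real name.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A crash during the copy leaves at most a stray `.tmp` file, never a short file under the segment's own name. The existence check and the rename are not atomic together, so this sketch assumes only one archiver writes to the directory.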
On 06/07/2013 12:14 PM, Josh Berkus wrote:
Right now, what we're telling users is "You can have continuous backup
with Postgres, but you'd better hire an expensive consultant to set it
up for you, or use this external tool of dubious provenance which
there's no packages for, or you might accid
From: "Daniel Farina"
On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus wrote:
Right now, what we're telling users is "You can have continuous backup
with Postgres, but you'd better hire an expensive consultant to set it
up for you, or use this external tool of dubious provenance which
there's no
On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus wrote:
> Right now, what we're telling users is "You can have continuous backup
> with Postgres, but you'd better hire an expensive consultant to set it
> up for you, or use this external tool of dubious provenance which
> there's no packages for, or y
>> I would oppose that as the solution, either an unconditional one, or
>> configurable with it as the default. Those segments are not
>> unneeded. I need them. That is why I set up archiving in the first
>> place. If you need to shut down the database rather than violate my
>> established
Heikki Linnakangas writes:
> On 07.06.2013 19:33, Tom Lane wrote:
>> Not only is that a horrible layering/modularity violation, but surely
>> LockBuffer can have no idea how much WAL space will be needed.
> It can be just a conservative guess, like, 32KB. That should be enough
> for almost all W
On 07.06.2013 19:33, Tom Lane wrote:
Heikki Linnakangas writes:
On 06.06.2013 17:00, Heikki Linnakangas wrote:
A more workable idea is to sprinkle checks in higher-level code, before
you hold any critical locks, to check that there is enough preallocated
WAL. Like, at the beginning of heap_ins
Heikki Linnakangas writes:
> On 06.06.2013 17:00, Heikki Linnakangas wrote:
>> A more workable idea is to sprinkle checks in higher-level code, before
>> you hold any critical locks, to check that there is enough preallocated
>> WAL. Like, at the beginning of heap_insert, heap_update, etc., and al
On 06.06.2013 17:00, Heikki Linnakangas wrote:
A more workable idea is to sprinkle checks in higher-level code, before
you hold any critical locks, to check that there is enough preallocated
WAL. Like, at the beginning of heap_insert, heap_update, etc., and all
similar indexam entry points.
Act
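The "check before you hold any critical locks" idea can be illustrated with a toy model. The sketch below is Python, not PostgreSQL code, and every name in it (`WalSpaceReservation`, `insert_tuple`, `CONSERVATIVE_GUESS`) is hypothetical; the 32KB figure is the conservative guess mentioned elsewhere in the thread. The point it shows is only the ordering: the reservation can fail with an ordinary error because nothing has been modified yet, while the section after it is assumed unable to run out of space.

```python
import threading


class WalSpaceReservation:
    """Toy model of preallocated WAL space handed out as reservations."""

    def __init__(self, preallocated_bytes: int) -> None:
        self._available = preallocated_bytes
        self._lock = threading.Lock()

    def available(self) -> int:
        with self._lock:
            return self._available

    def reserve(self, nbytes: int) -> None:
        """Call BEFORE taking any buffer locks; failure is an ordinary error."""
        with self._lock:
            if nbytes > self._available:
                raise RuntimeError("out of preallocated WAL space")
            self._available -= nbytes

    def release_unused(self, nbytes: int) -> None:
        """Return the part of a reservation that was not actually used."""
        with self._lock:
            self._available += nbytes


CONSERVATIVE_GUESS = 32 * 1024  # bytes reserved per operation (assumption)


def insert_tuple(wal: WalSpaceReservation, record: bytes, log: list) -> None:
    if len(record) > CONSERVATIVE_GUESS:
        raise ValueError("record larger than the per-operation reservation")
    wal.reserve(CONSERVATIVE_GUESS)  # may raise; nothing modified so far
    # Critical section: space is already reserved, so the append is
    # modelled as something that cannot fail for lack of space.
    log.append(record)
    wal.release_unused(CONSERVATIVE_GUESS - len(record))
```

The model sidesteps the hard part raised in the thread, namely that higher-level code may not know in advance how much WAL an operation will need.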
--On 6. Juni 2013 16:25:29 -0700 Josh Berkus wrote:
Archiving
-
In some ways, this is the simplest case. Really, we just need a way to
know when the available WAL space has become 90% full, and abort
archiving at that stage. Once we stop attempting to archive, we can
clean up the u
On 07.06.2013 00:38, Andres Freund wrote:
On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
* Heikki Linnakangas wrote:
The current situation is that if you run out of disk space while writing
WAL, you get a PANIC, and the server shuts down. That's awful. We can
So we need to somehow s
On Thu, Jun 6, 2013 at 9:30 PM, Jeff Janes wrote:
> I would oppose that as the solution, either an unconditional one, or
> configurable with it as the default. Those segments are not unneeded. I
> need them. That is why I set up archiving in the first place. If you need
> to shut down the d
On 06/06/2013 09:30 PM, Jeff Janes wrote:
Archiving
-
In some ways, this is the simplest case. Really, we just need a way to
know when the available WAL space has become 90% full, and abort
archiving at that stage. Once we stop attempting to archive, we can
cl
On Thursday, June 6, 2013, Josh Berkus wrote:
> Let's talk failure cases.
>
> There's actually three potential failure cases here:
>
> - One Volume: WAL is on the same volume as PGDATA, and that volume is
> completely out of space.
>
> - XLog Partition: WAL is on its own partition/volume, and fill
On Thu, Jun 6, 2013 at 4:28 PM, Christian Ullrich wrote:
> * Heikki Linnakangas wrote:
>
>> The current situation is that if you run out of disk space while writing
>> WAL, you get a PANIC, and the server shuts down. That's awful. We can
>
>
>> So we need to somehow stop new WAL insertions from ha
Let's talk failure cases.
There's actually three potential failure cases here:
- One Volume: WAL is on the same volume as PGDATA, and that volume is
completely out of space.
- XLog Partition: WAL is on its own partition/volume, and fills it up.
- Archiving: archiving is failing or too slow, cau
On Thu, Jun 6, 2013 at 10:38 PM, Andres Freund wrote:
> That's not a bad technique. I wonder how reliable it would be in
> postgres. Do all filesystems allow a rename() to succeed if there isn't
> actually any space left? E.g. on btrfs I wouldn't be sure. We need to
> rename because WAL files nee
On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
> * Heikki Linnakangas wrote:
>
> >The current situation is that if you run out of disk space while writing
> >WAL, you get a PANIC, and the server shuts down. That's awful. We can
>
> >So we need to somehow stop new WAL insertions from happe
* Heikki Linnakangas wrote:
The current situation is that if you run out of disk space while writing
WAL, you get a PANIC, and the server shuts down. That's awful. We can
So we need to somehow stop new WAL insertions from happening, before
it's too late.
A naive idea is to check if there's
On 06.06.2013 17:17, Andres Freund wrote:
On 2013-06-06 17:00:30 +0300, Heikki Linnakangas wrote:
A more workable idea is to sprinkle checks in higher-level code, before you
hold any critical locks, to check that there is enough preallocated WAL.
Like, at the beginning of heap_insert, heap_updat
On 2013-06-06 17:00:30 +0300, Heikki Linnakangas wrote:
> A more workable idea is to sprinkle checks in higher-level code, before you
> hold any critical locks, to check that there is enough preallocated WAL.
> Like, at the beginning of heap_insert, heap_update, etc., and all similar
> indexam entr
In the "Redesigning checkpoint_segments" thread, many people opined that
there should be a hard limit on the amount of disk space used for WAL:
http://www.postgresql.org/message-id/CA+TgmoaOkgZb5YsmQeMg8ZVqWMtR=6s4-ppd+6jiy4oq78i...@mail.gmail.com.
I'm starting a new thread on that, because that