On Tue, Oct 03, 2023 at 04:20:45PM +0900, Michael Paquier wrote:
> If there's no interest in this patch set after the next CF, I'm OK to
> drop it. The state of HEAD is at least correct in the OOM cases now.
I have been thinking about this patch for the last few days, and in
light of 6b18b3fe2c2f
On Tue, Sep 26, 2023 at 03:48:07PM +0900, Michael Paquier wrote:
> By the way, anything that I am proposing here cannot be backpatched
> because of the infrastructure changes required in walreader.c, so I am
> going to create a second thread with something that could be
> backpatched (yeah, likely
On Thu, Aug 10, 2023 at 02:59:07PM +0900, Michael Paquier wrote:
> My apologies if I sounded unclear here. It seems to me that we should
> wrap the patch on [1] first, and get it backpatched. At least that
> makes for less conflicts when 0001 gets merged for HEAD when we are
> able to set a prope
On Thu, Aug 10, 2023 at 02:47:51PM +0900, Kyotaro Horiguchi wrote:
> At Thu, 10 Aug 2023 13:33:48 +0900, Michael Paquier
> wrote in
>> On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
>> > At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
>> > wrote in
>> >> We hav
At Thu, 10 Aug 2023 13:33:48 +0900, Michael Paquier wrote
in
> On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
> > wrote in
> >> We have treated every kind of broken data as end-of-recovery, like
> >> incorrect
On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
> At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
> wrote in
>> At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier
>> wrote in
While it's a kind of bug in total, we encountered a case where an
excessively
At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
wrote in
> At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier
> wrote in
> > > While it's a kind of bug in total, we encountered a case where an
> > > excessively large xl_tot_len actually came from a corrupted
> > > record. [1]
> >
At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier wrote
in
> > While it's a kind of bug in total, we encountered a case where an
> > excessively large xl_tot_len actually came from a corrupted
> > record. [1]
>
> Right, I remember this one. I think that Thomas was pretty much right
> that thi
On Wed, Aug 09, 2023 at 05:00:49PM +0900, Kyotaro Horiguchi wrote:
> Looks fine.
Okay, I've updated the patch in consequence. I'll look at 0001 again
at the beginning of next week.
> While it's a kind of bug in total, we encountered a case where an
> excessively large xl_tot_len actually came fr
On Wed, Aug 09, 2023 at 04:13:53PM +0900, Kyotaro Horiguchi wrote:
> I'm not certain if message_deferred is a property of the error
> struct. Callers don't seem to need that information.
True enough, will remove.
> The name "XLOG_RADER_NONE" seems too generic. XLOG_READER_NOERROR will
> be cleare
At Wed, 9 Aug 2023 15:03:21 +0900, Michael Paquier wrote
in
> I have spent more time on 0001, polishing it and fixing a few bugs
> that I have found while reviewing the whole. Most of them were
> related to mistakes in resetting the error state when expected. I
> have also expanded DecodeXLogR
On Tue, Aug 08, 2023 at 05:44:03PM +0900, Kyotaro Horiguchi wrote:
> I like the overall direction. Though, I'm considering enclosing the
> errormsg and errorcode in a struct.
Yes, this suggestion makes sense as it simplifies all the WAL routines
that need to report back a complete error state, and
At Tue, 8 Aug 2023 16:29:49 +0900, Michael Paquier wrote
in
> On Wed, Aug 02, 2023 at 01:16:02PM +0900, Kyotaro Horiguchi wrote:
> > I believe this approach is sufficient to determine whether the error
> > is OOM or not. If total_len is currupted and has an excessively large
> > value, it's high
On Wed, Aug 02, 2023 at 01:16:02PM +0900, Kyotaro Horiguchi wrote:
> I believe this approach is sufficient to determine whether the error
> is OOM or not. If total_len is currupted and has an excessively large
> value, it's highly unlikely that all subsequent pages for that length
> will be consist
On Tue, Aug 01, 2023 at 04:39:54PM -0700, Jeff Davis wrote:
> On Tue, 2023-08-01 at 16:14 +0300, Aleksander Alekseev wrote:
> > Probably I'm missing something, but if memory allocation is required
> > during WAL replay and it fails, wouldn't it be a better solution to
> > log the error and terminat
At Tue, 01 Aug 2023 15:28:54 +0900 (JST), Kyotaro Horiguchi
wrote in
> I thoght that the failure on a stanby results in continuing to retry
> reading the next record. However, I found that there's a case where
> start process stops in response to OOM [1].
I've examined the calls to
MemoryContex
On Tue, 2023-08-01 at 16:14 +0300, Aleksander Alekseev wrote:
> Probably I'm missing something, but if memory allocation is required
> during WAL replay and it fails, wouldn't it be a better solution to
> log the error and terminate the DBMS immediately?
We need to differentiate between:
1. No va
Hi,
> As far as I can see, PerformWalRecovery() uses LOG as elevel
> [...]
> On top of my mind, any solution I can think of needs to add more
> information to XLogReaderState, where we'd either track the type of
> error that happened close to errormsg_buf which is where these errors
> are tracked,
At Tue, 01 Aug 2023 15:28:54 +0900 (JST), Kyotaro Horiguchi
wrote in
> While we will not agree, we could establish a defalut behavior where
> an OOM during recovery immediately triggers an ERROR. Then, we could
> introduce a *GUC* that causes recovery to regard OOM as an
> end-of-recovery error.
At Tue, 1 Aug 2023 14:03:36 +0900, Michael Paquier wrote
in
> On Tue, Aug 01, 2023 at 01:51:13PM +0900, Kyotaro Horiguchi wrote:
> > I believe a database server is not supposed to be executed under such
> > a memory-constrained environment.
>
> I don't really follow this argument. The backend
On Tue, Aug 01, 2023 at 01:51:13PM +0900, Kyotaro Horiguchi wrote:
> I believe a database server is not supposed to be executed under such
> a memory-constrained environment.
I don't really follow this argument. The backend and the frontends
are reliable on OOM, where we generate ERRORs or even F
At Tue, 1 Aug 2023 12:43:21 +0900, Michael Paquier wrote
in
> A colleague, Ethan Mertz (in CC), has discovered that we don't handle
> correctly WAL records that are failing because of an OOM when
> allocating their required space. In the case of Ethan, we have bumped
> on the failure after an a
Hi all,
A colleague, Ethan Mertz (in CC), has discovered that we don't handle
correctly WAL records that are failing because of an OOM when
allocating their required space. In the case of Ethan, we have bumped
on the failure after an allocation failure on XLogReadRecordAlloc():
"out of memory whi
23 matches
Mail list logo