On 13 May 2013 14:45, Jon Nelson wrote:
> I should not derail this thread any further. Perhaps, if interested
> parties would like to discuss the use of fallocate/posix_fallocate, a
> new thread might be more appropriate?
Sounds like a good idea. Always nice to see a fresh take on earlier ideas.
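
For readers following the fallocate side topic, here is a minimal sketch of the two operations being compared: reserving space for a segment file up front, and then doing small writes that are each followed by fdatasync(). The helper names and the 8 kB write size are illustrative assumptions, not PostgreSQL code. The 16 MB figure is the default WAL segment size. The calls are POSIX-only.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size (16 MB)


def preallocate_segment(path: str, size: int = SEGMENT_SIZE) -> int:
    """Create a file and reserve `size` bytes for it (hypothetical helper)."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    # Ask the filesystem to allocate the blocks now instead of at write time.
    os.posix_fallocate(fd, 0, size)
    os.fsync(fd)  # make the new file length durable
    return fd


def write_and_sync(fd: int, offset: int, data: bytes) -> None:
    """One small write followed by fdatasync(), the pattern on the critical path."""
    os.pwrite(fd, data, offset)
    os.fdatasync(fd)
```

Timing write_and_sync() on a file created this way against one that was filled with zeros first would be one way to probe the question raised in the thread. The sketch itself makes no claim about which is faster.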
On 2013-05-13 16:03:11 +0100, Greg Stark wrote:
> On Mon, May 13, 2013 at 2:49 PM, Andres Freund wrote:
> > Sure, the initial file creation will be faster. But are the actual
> > individual wal writes (small, frequently fdatasync()ed) still faster?
> > That's the critical path currently.
> > Wheth
On Mon, May 13, 2013 at 2:49 PM, Andres Freund wrote:
> Sure, the initial file creation will be faster. But are the actual
> individual wal writes (small, frequently fdatasync()ed) still faster?
> That's the critical path currently.
> Whether it is pretty much depends on how the filesystem manages
On 2013-05-13 08:45:41 -0500, Jon Nelson wrote:
> On Mon, May 13, 2013 at 8:32 AM, Andres Freund wrote:
> > On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
> >> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
> >> > On 5/10/13 1:06 PM, Jeff Janes wrote:
> >> >>
> >> >> Of course the paranoid DB
On Mon, May 13, 2013 at 8:32 AM, Andres Freund wrote:
> On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
>> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
>> > On 5/10/13 1:06 PM, Jeff Janes wrote:
>> >>
>> >> Of course the paranoid DBA could turn off restart_after_crash and do a
>> >> manual i
On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
> > On 5/10/13 1:06 PM, Jeff Janes wrote:
> >>
> >> Of course the paranoid DBA could turn off restart_after_crash and do a
> >> manual investigation on every crash, but in that case the database wou
On Mon, May 13, 2013 at 7:49 AM, k...@rice.edu wrote:
> On Sun, May 12, 2013 at 07:41:26PM -0500, Jon Nelson wrote:
>> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
>> > On 5/10/13 1:06 PM, Jeff Janes wrote:
>> >>
>> >> Of course the paranoid DBA could turn off restart_after_crash and do a
>>
On Sun, May 12, 2013 at 07:41:26PM -0500, Jon Nelson wrote:
> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
> > On 5/10/13 1:06 PM, Jeff Janes wrote:
> >>
> >> Of course the paranoid DBA could turn off restart_after_crash and do a
> >> manual investigation on every crash, but in that case the
On Sun, May 12, 2013 at 03:46:00PM -0500, Jim Nasby wrote:
> On 5/10/13 1:06 PM, Jeff Janes wrote:
> >Of course the paranoid DBA could turn off restart_after_crash and do a
> >manual investigation on every crash, but in that case the database would
> >refuse to restart even in the case where it p
On Sun, May 12, 2013 at 3:46 PM, Jim Nasby wrote:
> On 5/10/13 1:06 PM, Jeff Janes wrote:
>>
>> Of course the paranoid DBA could turn off restart_after_crash and do a
>> manual investigation on every crash, but in that case the database would
>> refuse to restart even in the case where it perfectl
On 5/10/13 1:06 PM, Jeff Janes wrote:
> Of course the paranoid DBA could turn off restart_after_crash and do a manual
> investigation on every crash, but in that case the database would refuse to
> restart even in the case where it is perfectly clear that all the following WAL
> belongs to the recycled f
On 5/9/13 5:18 PM, Jeff Davis wrote:
On Thu, 2013-05-09 at 14:28 -0500, Jim Nasby wrote:
What about moving some critical data from the beginning of the WAL
record to the end? That would make it easier to detect that we don't
have a complete record. It wouldn't necessarily replace the CRC
though,
On 10 May 2013 23:41, Jeff Davis wrote:
> On Fri, 2013-05-10 at 18:32 +0100, Simon Riggs wrote:
>> We don't write() WAL except with an immediate sync(), so the chances
>> of what you say happening are very low to impossible.
>
> Are you sure? An XLogwrtRqst contains a write and a flush pointer, so
On Friday, May 10, 2013 10:24 PM Greg Stark wrote:
On Fri, May 10, 2013 at 5:31 PM, Amit Kapila wrote:
>> In the case where one block is missing, how can it even reach to next record
>> to check "prev" pointer.
>> I think it can be possible when one of the record is corrupt and following
>> are ok
On 5/10/13 1:32 PM, Simon Riggs wrote:
The timing
window between the write and the sync is negligible and yet I/O would
need to occur in that window and also be out of order from the order
of the write, which is unlikely because an I/O elevator would either
not touch the order of writes at all, o
On Fri, 2013-05-10 at 18:32 +0100, Simon Riggs wrote:
> We don't write() WAL except with an immediate sync(), so the chances
> of what you say happening are very low to impossible.
Are you sure? An XLogwrtRqst contains a write and a flush pointer, so I
assume they can be different.
I agree that i
On Fri, May 10, 2013 at 2:06 PM, Jeff Janes wrote:
> But based on your description, perhaps refusing to automatically restart and
> forcing an explicit decision would happen a lot more often, during normal
> crashes with no corruption, than I was thinking it would.
I bet it would. But I think Gr
On Fri, May 10, 2013 at 9:54 AM, Greg Stark wrote:
> On Fri, May 10, 2013 at 5:31 PM, Amit Kapila wrote:
> > In the case where one block is missing, how can it even reach to next record
> > to check "prev" pointer.
> > I think it can be possible when one of the record is corrupt and follow
On 10 May 2013 18:23, Tom Lane wrote:
> Greg Stark writes:
>> A single WAL record can be over 24kB.
>
>
> Actually, WAL records can run to megabytes. Consider for example a
> commit record for a transaction that dropped thousands of tables ---
> there'll be info about each such table in the com
On 10 May 2013 13:39, Greg Stark wrote:
> On Fri, May 10, 2013 at 7:44 AM, Simon Riggs wrote:
>>> Having one corrupt record followed by a valid record is not an
>>> abnormal situation. It could easily be the correct end of WAL.
>>
>> I disagree, that *is* an abnormal situation and would not be th
Greg Stark writes:
> A single WAL record can be over 24kB.
Actually, WAL records can run to megabytes. Consider for example a
commit record for a transaction that dropped thousands of tables ---
there'll be info about each such table in the commit record, to cue
replay to remove those files.
On Fri, May 10, 2013 at 5:31 PM, Amit Kapila wrote:
> In the case where one block is missing, how can it even reach to next record
> to check "prev" pointer.
> I think it can be possible when one of the record is corrupt and following
> are okay which I think is the
> case in which it can proceed
On Friday, May 10, 2013 6:09 PM Greg Stark wrote:
> On Fri, May 10, 2013 at 7:44 AM, Simon Riggs wrote:
> >> Having one corrupt record followed by a valid record is not an
> >> abnormal situation. It could easily be the correct end of WAL.
> >
> > I disagree, that *is* an abnormal situation and
On Fri, May 10, 2013 at 7:44 AM, Simon Riggs wrote:
>> Having one corrupt record followed by a valid record is not an
>> abnormal situation. It could easily be the correct end of WAL.
>
> I disagree, that *is* an abnormal situation and would not be the
> "correct end-of-WAL".
>
> Each WAL record c
On 9 May 2013 23:13, Greg Stark wrote:
> On Thu, May 9, 2013 at 10:45 PM, Simon Riggs wrote:
>> On 9 May 2013 22:39, Tom Lane wrote:
>>> Simon Riggs writes:
> >>>> If the current WAL record is corrupt and the next WAL record is in
> >>>> every way valid, we can potentially continue.
>>>
>>> That se
On Thu, 2013-05-09 at 23:13 +0100, Greg Stark wrote:
> However it is possible to reduce the window...
Sounds reasonable.
It's fairly limited though -- the window is already a checkpoint
(typically 5-30 minutes), and we'd bring that down an order of magnitude
(10s). I speculate that, if it got cor
On Thu, 2013-05-09 at 14:28 -0500, Jim Nasby wrote:
> What about moving some critical data from the beginning of the WAL
> record to the end? That would make it easier to detect that we don't
> have a complete record. It wouldn't necessarily replace the CRC
> though, so maybe that's not good enough
On Thu, May 9, 2013 at 10:45 PM, Simon Riggs wrote:
> On 9 May 2013 22:39, Tom Lane wrote:
>> Simon Riggs writes:
>>> If the current WAL record is corrupt and the next WAL record is in
>>> every way valid, we can potentially continue.
>>
>> That seems like a seriously bad idea.
>
> I agree. But
On 9 May 2013 22:39, Tom Lane wrote:
> Simon Riggs writes:
>> If the current WAL record is corrupt and the next WAL record is in
>> every way valid, we can potentially continue.
>
> That seems like a seriously bad idea.
I agree. But if you knew that were true, is stopping a better idea?
Any cor
Simon Riggs writes:
> If the current WAL record is corrupt and the next WAL record is in
> every way valid, we can potentially continue.
That seems like a seriously bad idea.
regards, tom lane
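
The heuristic under debate (a record that fails its CRC but is followed by a valid record is more likely corruption than the true end of WAL) can be sketched on a toy log. The record layout here (total length, offset of the previous record, CRC-32 of the payload) is an assumption made up for illustration and is not PostgreSQL's XLogRecord format. All function names are hypothetical.

```python
import struct
import zlib

# Toy header: total_len, prev_offset, crc32(payload). Not the real WAL layout.
HEADER = struct.Struct("<III")


def make_record(payload: bytes, prev_offset: int) -> bytes:
    """Build one toy record that points back at the previous record."""
    total_len = HEADER.size + len(payload)
    return HEADER.pack(total_len, prev_offset, zlib.crc32(payload)) + payload


def record_is_valid(log: bytes, offset: int, expected_prev: int) -> bool:
    """Check bounds, the back-pointer, and the payload CRC of one record."""
    if offset + HEADER.size > len(log):
        return False
    total_len, prev_offset, crc = HEADER.unpack_from(log, offset)
    if total_len < HEADER.size or offset + total_len > len(log):
        return False
    payload = log[offset + HEADER.size:offset + total_len]
    return prev_offset == expected_prev and zlib.crc32(payload) == crc


def classify_bad_record(log: bytes, offset: int) -> str:
    """Given a record that already failed validation, peek at its successor.

    This trusts the bad record's own length field to find the next record,
    so it cannot help when the length itself is damaged.
    """
    if offset + HEADER.size > len(log):
        return "end-of-log"
    total_len, _, _ = HEADER.unpack_from(log, offset)
    next_offset = offset + total_len
    if total_len >= HEADER.size and record_is_valid(log, next_offset, offset):
        return "suspect-corruption"
    return "end-of-log"
```

The sketch only classifies the situation. Whether recovery should continue past such a record, stop, or refuse to start is exactly what the thread disagrees on.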
On 9 May 2013 20:28, Jim Nasby wrote:
>> Unfortunately, it seems that doing any kind of validation to determine
>> that we have a valid end-of-the-WAL inherently requires some kind of
>> separate durable write somewhere. It would be a tiny amount of data (an
>> LSN and maybe some extra crosscheck
On 5/8/13 7:34 PM, Jeff Davis wrote:
On Wed, 2013-05-08 at 17:56 -0500, Jim Nasby wrote:
Apologies if this is a stupid question, but is this mostly an issue
due to torn pages? IOW, if we had a way to ensure we never see torn
pages, would that mean an invalid CRC on a WAL page indicated there
rea
On Wed, 2013-05-08 at 17:56 -0500, Jim Nasby wrote:
> Apologies if this is a stupid question, but is this mostly an issue
> due to torn pages? IOW, if we had a way to ensure we never see torn
> pages, would that mean an invalid CRC on a WAL page indicated there
> really was corruption on that page?
On 4/5/13 6:39 PM, Jeff Davis wrote:
On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote:
Maybe we could scan forward to check whether a corrupted WAL record is
followed by one or more valid ones with sensible LSNs. If it is,
chances are high that we haven't actually hit the end of the WAL. I
On Tue, 2013-05-07 at 13:20 -0400, Robert Haas wrote:
> Hmm. Rereading your last email, I see your point: since we now have
> HEAP_XLOG_VISIBLE, this is much less of an issue than it would have
> been before. I'm still not convinced that simplifying that code is a
> good idea, but maybe it doesn'
On Mon, May 6, 2013 at 5:04 PM, Jeff Davis wrote:
> On Mon, 2013-05-06 at 15:31 -0400, Robert Haas wrote:
>> On Wed, May 1, 2013 at 3:04 PM, Jeff Davis wrote:
>> > Regardless, you have a reasonable claim that my patch had effects that
>> > were not necessary. I have attached a draft patch to reme
On Mon, 2013-05-06 at 15:31 -0400, Robert Haas wrote:
> On Wed, May 1, 2013 at 3:04 PM, Jeff Davis wrote:
> > Regardless, you have a reasonable claim that my patch had effects that
> > were not necessary. I have attached a draft patch to remedy that. Only
> > rudimentary testing was done.
>
> Thi
On Wed, May 1, 2013 at 3:04 PM, Jeff Davis wrote:
> Regardless, you have a reasonable claim that my patch had effects that
> were not necessary. I have attached a draft patch to remedy that. Only
> rudimentary testing was done.
This looks reasonable to me.
--
Robert Haas
EnterpriseDB: http://ww
On 3 May 2013 21:53, Jeff Davis wrote:
> At this point, I don't think more changes are required.
After detailed further analysis, I agree, no further changes are required.
I think the code in that area needs considerable refactoring to
improve things. I've looked for an easy way to avoid callin
On Fri, 2013-05-03 at 19:52 +0100, Simon Riggs wrote:
> On 1 May 2013 20:40, Jeff Davis wrote:
>
> >> Looks easy. There is no additional logic for checksums, so there's no
> >> third complexity.
> >>
> >> So we either have
> >> * cleanup info with vismap setting info
> >> * cleanup info only
> >>
On 1 May 2013 20:40, Jeff Davis wrote:
>> Looks easy. There is no additional logic for checksums, so there's no
>> third complexity.
>>
>> So we either have
>> * cleanup info with vismap setting info
>> * cleanup info only
>>
>> which is the same number of WAL records as we have now, just that we
On Wed, 2013-05-01 at 20:06 +0100, Simon Riggs wrote:
> >> Why aren't we writing just one WAL record for this action?
...
> >
> > I thought about that, too. It certainly seems like more than we want
> > to try to do for 9.3 at this point. The other complication is that
> > there's a lot of cond
On Wed, 2013-05-01 at 14:16 -0400, Robert Haas wrote:
> Now that I'm looking at this, I'm a bit confused by the new logic in
> visibilitymap_set(). When checksums are enabled, we set the page LSN,
> which is described like this: "we need to protect the heap page from
> being torn". But how does s
On 1 May 2013 19:16, Robert Haas wrote:
> On Wed, May 1, 2013 at 1:02 PM, Simon Riggs wrote:
>> I agree, but that was in the original coding wasn't it?
>
> I believe the problem was introduced by this commit:
>
> commit fdf9e21196a6f58c6021c967dc5776a16190f295
> Author: Heikki Linnakangas
> Date
On Wed, 2013-05-01 at 11:33 -0400, Robert Haas wrote:
> >> The only time the VM and the data page are out of sync during vacuum is
> >> after a crash, right? If that's the case, I didn't think it was a big
> >> deal to dirty one extra page (should be extremely rare). Am I missing
> >> something?
>
On Wed, May 1, 2013 at 1:02 PM, Simon Riggs wrote:
> I agree, but that was in the original coding wasn't it?
I believe the problem was introduced by this commit:
commit fdf9e21196a6f58c6021c967dc5776a16190f295
Author: Heikki Linnakangas
Date: Wed Feb 13 17:46:23 2013 +0200
Update visibil
On 1 May 2013 16:33, Robert Haas wrote:
> On Wed, May 1, 2013 at 11:29 AM, Robert Haas wrote:
>>> I was worried because SyncOneBuffer checks whether it needs writing
>>> without taking a content lock, so the exclusive lock doesn't help. That
>>> makes sense, because you don't want a checkpoint to
On Wed, May 1, 2013 at 11:29 AM, Robert Haas wrote:
>> I was worried because SyncOneBuffer checks whether it needs writing
>> without taking a content lock, so the exclusive lock doesn't help. That
>> makes sense, because you don't want a checkpoint to have to get a
>> content lock on every buffer
On Tue, Apr 30, 2013 at 5:54 PM, Jeff Davis wrote:
> On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote:
>> Uh, wait a minute. I think this is completely wrong. The buffer is
>> LOCKED for this entire sequence of operations. For a checkpoint to
>> "happen", it's got to write every buffer, whi
On 30 April 2013 22:54, Jeff Davis wrote:
> On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote:
>> Uh, wait a minute. I think this is completely wrong. The buffer is
>> LOCKED for this entire sequence of operations. For a checkpoint to
>> "happen", it's got to write every buffer, which it wil
On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote:
> Uh, wait a minute. I think this is completely wrong. The buffer is
> LOCKED for this entire sequence of operations. For a checkpoint to
> "happen", it's got to write every buffer, which it will not be able to
> do for so long as the buffer
On 30 April 2013 13:34, Robert Haas wrote:
> On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs wrote:
>> On 9 April 2013 08:36, Jeff Davis wrote:
>>
>>> 1. I believe that the issue I brought up at the end of this email:
>>>
>>> http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0
On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs wrote:
> On 9 April 2013 08:36, Jeff Davis wrote:
>
>> 1. I believe that the issue I brought up at the end of this email:
>>
>> http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025
>>
>> is a real issue. In lazy_vacuum_page(), t
On 9 April 2013 08:36, Jeff Davis wrote:
> 1. I believe that the issue I brought up at the end of this email:
>
> http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025
>
> is a real issue. In lazy_vacuum_page(), the following sequence can
> happen when checksums are on:
>
>
On 11 April 2013 00:37, Robert Haas wrote:
> On Sat, Apr 6, 2013 at 10:44 AM, Andres Freund wrote:
> > I feel pretty strongly that we shouldn't add any such complications to
> > XLogInsert() itself; it's complicated enough already and it should be
> > made simpler, not more complicated.
>
> +1,
On Sat, Apr 6, 2013 at 10:44 AM, Andres Freund wrote:
> I feel pretty strongly that we shouldn't add any such complications to
> XLogInsert() itself; it's complicated enough already and it should be
> made simpler, not more complicated.
+1, emphatically. XLogInsert is a really nasty scalability
b
On Mon, 2013-04-08 at 09:19 +0100, Simon Riggs wrote:
> Applied, with this as the only code change.
>
>
> Thanks everybody for good research and coding and fast testing.
>
>
> We're in good shape now.
Thank you.
I have attached two more patches:
1. I believe that the issue I brought up at t
On Sat, 2013-04-06 at 16:44 +0200, Andres Freund wrote:
> I think we can just make up the rule that changing full page writes also
> requires SpinLockAcquire(&xlogctl->info_lck);. Then it's easy enough. And
> it can hardly be a performance bottleneck given how infrequently it's
> modified.
That seem
On 6 April 2013 15:44, Andres Freund wrote:
> > * In xlog_redo, it seemed slightly awkward to call XLogRecGetData twice.
> > Merely a matter of preference but I thought I would mention it.
>
> You're absolutely right, memcpy should have gotten passed 'data', not
> XLogRecGetData().
Applied, wit
On Sat, Apr 6, 2013 at 1:36 PM, Jeff Janes wrote:
> On Fri, Apr 5, 2013 at 6:09 AM, Andres Freund wrote:
>
>>
>> How does the attached version look? I verified that it survives
>> recovery, but not more.
>>
>> Jeff, any chance you can run this for a round with your suite?
>
>
>
> I've run it fo
On Fri, Apr 5, 2013 at 6:09 AM, Andres Freund wrote:
> How does the attached version look? I verified that it survives
> recovery, but not more.
>
> Jeff, any chance you can run this for a round with your suite?
>
I've run it for a while now and have found no problems.
Thanks,
Jeff
On 2013-04-05 16:29:47 -0700, Jeff Davis wrote:
> On Fri, 2013-04-05 at 15:09 +0200, Andres Freund wrote:
> > How does the attached version look? I verified that it survives
> > recovery, but not more.
>
> Comments:
>
> * Regarding full page writes, we can:
> - always write full pages (as in yo
On Fri, Apr 5, 2013 at 7:39 PM, Jeff Davis wrote:
> On Fri, 2013-04-05 at 19:22 -0500, Jaime Casanova wrote:
>> On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund wrote:
>> >
>> > How does the attached version look? I verified that it survives
>> > recovery, but not more.
>> >
>>
>> I still got errors
On Fri, 2013-04-05 at 19:22 -0500, Jaime Casanova wrote:
> On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund wrote:
> >
> > How does the attached version look? I verified that it survives
> > recovery, but not more.
> >
>
> I still got errors when executing make installcheck in a just compiled
> 9.3d
On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund wrote:
>
> How does the attached version look? I verified that it survives
> recovery, but not more.
>
I still got errors when executing make installcheck in a just compiled
9.3devel + this_patch, this is when setting wal_level higher than
minimal.
At
On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote:
> Maybe we could scan forward to check whether a corrupted WAL record is
> followed by one or more valid ones with sensible LSNs. If it is,
> chances are high that we haven't actually hit the end of the WAL. In
> that case, we could either log
On Fri, 2013-04-05 at 15:09 +0200, Andres Freund wrote:
> How does the attached version look? I verified that it survives
> recovery, but not more.
Comments:
* Regarding full page writes, we can:
- always write full pages (as in your current patch), regardless of
the current settings
- take W
On 2013-04-04 17:39:16 -0700, Jeff Davis wrote:
> On Thu, 2013-04-04 at 22:39 +0200, Andres Freund wrote:
> > I don't think it's really slower. Earlier the code took WalInsertLock
> > every time, even if we ended up not logging anything. That's far more
> > expensive than a single spinlock. And the cop
On Apr4, 2013, at 23:21 , Jeff Janes wrote:
> This brings up a pretty frightening possibility to me, unrelated to data
> checksums. If a bit gets twiddled in the WAL file due to a hardware issue or
> a "cosmic ray", and then a crash happens, automatic recovery will stop early
> with the failed
On Thu, 2013-04-04 at 21:06 -0400, Tom Lane wrote:
> I can't escape the feeling that we'd just be reinventing software RAID.
> There's no reason to think that we can deal with this class of problems
> better than the storage system can.
The goal would be to reliably detect a situation where WAL th
Jeff Davis writes:
> On Thu, 2013-04-04 at 14:21 -0700, Jeff Janes wrote:
>> This brings up a pretty frightening possibility to me, unrelated to
>> data checksums. If a bit gets twiddled in the WAL file due to a
>> hardware issue or a "cosmic ray", and then a crash happens, automatic
>> recovery
On Thu, 2013-04-04 at 22:39 +0200, Andres Freund wrote:
> I don't think it's really slower. Earlier the code took WalInsertLock
> every time, even if we ended up not logging anything. That's far more
> expensive than a single spinlock. And the copy should also only be taken
> in the case we need to log
On Thu, 2013-04-04 at 14:21 -0700, Jeff Janes wrote:
> This brings up a pretty frightening possibility to me, unrelated to
> data checksums. If a bit gets twiddled in the WAL file due to a
> hardware issue or a "cosmic ray", and then a crash happens, automatic
> recovery will stop early with the
On Thu, Apr 4, 2013 at 5:30 AM, Simon Riggs wrote:
> On 4 April 2013 02:39, Andres Freund wrote:
>
> > If by now the first backend has proceeded to PageSetLSN() we are writing
> > different data to disk than the one we computed the checksum of
> > before. Boom.
>
> Right, so nothing else we were
On 2013-04-04 12:59:36 -0700, Jeff Davis wrote:
> Andres,
>
> Thank you for diagnosing this problem!
>
> On Thu, 2013-04-04 at 16:53 +0200, Andres Freund wrote:
> > I think the route you quickly sketched is more realistic. That would
> > remove all knowledge about XLOG_HINT from generic code which
Andres,
Thank you for diagnosing this problem!
On Thu, 2013-04-04 at 16:53 +0200, Andres Freund wrote:
> I think the route you quickly sketched is more realistic. That would
> remove all knowledge about XLOG_HINT from generic code which is a very
> good thing, I spent like 15 minutes yesterday wond
On 4 April 2013 15:53, Andres Freund wrote:
> Unfortunately I find that approach unacceptably ugly.
Yeh. If we can confirm it's a fix we can discuss a cleaner patch and
that is much better.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Trai
On 2013-04-04 13:30:40 +0100, Simon Riggs wrote:
> On 4 April 2013 02:39, Andres Freund wrote:
>
> > Ok, I think I see the bug. And I think it's been introduced in the
> > checkpoints patch.
>
> Well spotted. (I think you mean checksums patch).
Heh, yes. I was slightly tired at that point ;)
> >
On 4 April 2013 02:39, Andres Freund wrote:
> Ok, I think I see the bug. And I think it's been introduced in the
> checkpoints patch.
Well spotted. (I think you mean checksums patch).
> If by now the first backend has proceeded to PageSetLSN() we are writing
> different data to disk than the one
On 2013-04-04 02:58:43 +0200, Andres Freund wrote:
> On 2013-04-03 20:45:51 -0400, Tom Lane wrote:
> > and...@anarazel.de (Andres Freund) writes:
> > > Looking at the page lsn's with dd I noticed something peculiar:
> >
> > > page 0:
> > > 01 00 00 00 18 c2 00 31 => 1/3100C218
> > > page 1:
> > > 0
On 2013-04-03 20:45:51 -0400, Tom Lane wrote:
> and...@anarazel.de (Andres Freund) writes:
> > Looking at the page lsn's with dd I noticed something peculiar:
>
> > page 0:
> > 01 00 00 00 18 c2 00 31 => 1/3100C218
> > page 1:
> > 01 00 00 00 80 44 01 31 => 1/31014480
> > page 10:
> > 01 00 00 00
and...@anarazel.de (Andres Freund) writes:
> Looking at the page lsn's with dd I noticed something peculiar:
> page 0:
> 01 00 00 00 18 c2 00 31 => 1/3100C218
> page 1:
> 01 00 00 00 80 44 01 31 => 1/31014480
> page 10:
> 01 00 00 00 60 ce 05 31 => 1/3105ce60
> page 43:
> 01 00 00 00 58 7a 16 31 =
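
The mapping from the dumped bytes to the LSNs shown above can be reproduced with a short sketch. It assumes the first 8 bytes of the page hold the LSN as two 32-bit halves (high half first) and that the dump came from a little-endian machine. The function name is hypothetical.

```python
import struct


def decode_page_lsn(hex_bytes: str) -> str:
    """Turn the first 8 page-header bytes (as hex text) into 'hi/lo' LSN form."""
    raw = bytes.fromhex(hex_bytes)
    # Two unsigned 32-bit integers, little-endian (assumed host byte order).
    hi, lo = struct.unpack("<II", raw[:8])
    return f"{hi:X}/{lo:X}"


print(decode_page_lsn("01 00 00 00 18 c2 00 31"))  # prints 1/3100C218
```

On a big-endian host the format string would need to be ">II" instead.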
On 2013-04-04 02:28:32 +0200, Andres Freund wrote:
> On 2013-04-04 01:52:41 +0200, Andres Freund wrote:
> > On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
> > > I've changed the subject from "regression test failed when enabling
> > > checksum" because I now know they are totally unrelated.
> > >
On Wed, 2013-04-03 at 15:57 -0700, Jeff Janes wrote:
> You don't know that the cluster is in the bad state until after it
> goes through recovery because most crashes recover perfectly fine. So
> it would have to make a side-copy of the cluster after the crash, then
> recover the original and
On 2013-04-04 01:52:41 +0200, Andres Freund wrote:
> On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
> > I've changed the subject from "regression test failed when enabling
> > checksum" because I now know they are totally unrelated.
> >
> > My test case didn't need to depend on archiving being on
On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
> I've changed the subject from "regression test failed when enabling
> checksum" because I now know they are totally unrelated.
>
> My test case didn't need to depend on archiving being on, and so with a
> simple tweak I rendered the two issues orth
I've changed the subject from "regression test failed when enabling
checksum" because I now know they are totally unrelated.
My test case didn't need to depend on archiving being on, and so with a
simple tweak I rendered the two issues orthogonal.
On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis wro