Jonah H. Harris wrote:
Rather than potentially letting this slide past 8.4, I threw together
an extremely quick-hack patch at the smgr-layer for block-level
checksums.
One hard problem is how to deal with torn pages with non-WAL-logged
changes. Like heap hint bit updates, killing tuples in ind
On Thu, Oct 2, 2008 at 1:29 AM, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
> I ran the regressions and several concurrent benchmark tests which
> passed successfully, but I'm sure I'm missing quite a bit due to the
> fact that it's late, it's just a quick hack, and I haven't gone
> through the
Rather than potentially letting this slide past 8.4, I threw together
an extremely quick-hack patch at the smgr-layer for block-level
checksums.
There are some nasties in that the CRC is the first member of
PageHeaderData (in order to guarantee inclusion of the LSN), and that
it bumps the size of
"Hitoshi Harada" <[EMAIL PROTECTED]> writes:
> 2008/10/2 Tom Lane <[EMAIL PROTECTED]>:
>> Okay, there's a patch in CVS HEAD that works this way. Let me know if
>> it needs further tweaking for your purposes.
> Hmm, I've looked over the patch. Logically window functions can access
> arbitrary rows
Gurjeet Singh wrote:
I think the -hackers subscribers list is a subset of -general
subscribers; so this wasn't necessary.
Not so. I don't have time to read -general, and I am not subscribed.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make change
Now we have just a month until the final deadline.
I think we were able to sort out and clarify its conceptual issues
during CommitFest:Sep. So, I think it is a good time to
move on to the discussion about its implementation.
Anyway, I would welcome any suggestions on where I should focus my efforts
durin
I think the -hackers subscribers list is a subset of -general subscribers;
so this wasn't necessary.
Are there any hackers out there who are not on the general list?
On Thu, Oct 2, 2008 at 4:24 AM, Darren Weber
<[EMAIL PROTECTED]> wrote:
> I put the email below on the general list, without response. I hope
> i
2008/10/2 Tom Lane <[EMAIL PROTECTED]>:
> "Hitoshi Harada" <[EMAIL PROTECTED]> writes:
>>> I hadn't realized that this would be relevant to window functions.
>>> Now that I know that, I propose fixing tuplestore for multiple
>>> positions and committing it separately, before I go back to the CTE
>>
* Tom Lane <[EMAIL PROTECTED]> [081001 19:42]:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > a) You wouldn't have to keep the lock while doing the I/O.
>
> Hoo, yeah, so the period of holding the share-lock could well be
> *shorter* than it is now. Most especially so if the write() blocks
> ins
Josh Berkus wrote:
For the September commitfest, 29 patches were applied (one to pgFoundry)
and 18 patches were sent back for more work.
More importantly, six *new* reviewers completed reviews of various
patches: Abbas Butt, Alex Hunsaker, Markus Wanner, Ibrar Ahmed, Ryan
Bradetich and Gi
Gregory Stark <[EMAIL PROTECTED]> writes:
> a) You wouldn't have to keep the lock while doing the I/O.
Hoo, yeah, so the period of holding the share-lock could well be
*shorter* than it is now. Most especially so if the write() blocks
instead of just transferring the data to kernel space and retu
[EMAIL PROTECTED] writes:
> If you are going to double buffer, one presumes that for some non-zero
> period of time, the block must be locked during which it is copied. You
> wouldn't want it changing "mid-copy" would you? How is this any less of a
> hit than just calculating the checksum?
a) You
[EMAIL PROTECTED] writes:
>> That actually seems like a really good idea.
> I don't think it makes sense at all!!!
> If you are going to double buffer, one presumes that for some non-zero
> period of time, the block must be locked during which it is copied. You
> wouldn't want it changing "mid-cop
I put the email below on the general list, without response. I hope
it gets some response from the hackers list.
Thanks, Darren
-- Forwarded message --
From: Darren Weber <[EMAIL PROTECTED]>
Date: Tue, Sep 30, 2008 at 6:10 PM
Subject: Has anyone built pgbash-7.3 against postgreS
> Aidan Van Dyk <[EMAIL PROTECTED]> writes:
>> One possibility would be to "double-buffer" the write... i.e. as you
>> calculate your CRC, you're doing it on a local copy of the block, which
>> you hand to the OS to write... If you're touching the whole block of
>> memory to CRC it, it isn't *ridi
Just want to make sure that this wasn't lost in the shuffle somewhere…
Best,
David
On Sep 14, 2008, at 15:42, David E. Wheeler wrote:
On Sep 12, 2008, at 12:49, Alvaro Herrera wrote:
Looks like the IO conversions handle char and "char", so the attached
patch just updates the regression tes
Aidan Van Dyk <[EMAIL PROTECTED]> writes:
> One possibility would be to "double-buffer" the write... i.e. as you
> calculate your CRC, you're doing it on a local copy of the block, which
> you hand to the OS to write... If you're touching the whole block of
> memory to CRC it, it isn't *ridiculous
On Wed, Oct 1, 2008 at 4:06 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> Paul Schlie <[EMAIL PROTECTED]> writes:
>> - however regardless, if some form of error detection ends up being
>> implemented, it might be nice to actually log corrupted blocks of data
>> along with their previously computed chec
Paul Schlie <[EMAIL PROTECTED]> writes:
> - however regardless, if some form of error detection ends up being
> implemented, it might be nice to actually log corrupted blocks of data
> along with their previously computed checksums for subsequent analysis
> in an effort to ascertain if there's an o
"Hitoshi Harada" <[EMAIL PROTECTED]> writes:
>> I hadn't realized that this would be relevant to window functions.
>> Now that I know that, I propose fixing tuplestore for multiple
>> positions and committing it separately, before I go back to the CTE
>> patch. Then Hitoshi-san will have something
Kevin Grittner wrote:
Tom Lane <[EMAIL PROTECTED]> wrote:
>> Paul Schlie <[EMAIL PROTECTED]> writes:
>>> - yes, if you're willing to compute true CRC's as opposed to
>>> simpler checksums, which may be worth the price if in fact many/most
>>> data check failures are truly caused by single bit
>>> Tom Lane <[EMAIL PROTECTED]> wrote:
> Paul Schlie <[EMAIL PROTECTED]> writes:
>> - yes, if you're willing to compute true CRC's as opposed to simpler
>> checksums, which may be worth the price if in fact many/most data
>> check failures are truly caused by single bit errors somewhere in the
>>
Hackers,
As it is now October 1st, I have taken the remaining 5 patches still
under discussion and moved them to the top of the list for November,
though hopefully they will be applied sooner.
For the September commitfest, 29 patches were applied (one to pgFoundry)
and 18 patches were sent
On Wed, Oct 01, 2008 at 11:57:31AM -0400, Alvaro Herrera wrote:
> Tom Lane escribió:
> > "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>
> > > I probably wouldn't compare checksumming *every* WAL record to a
> > > single block-level checksum.
> >
> > No, not at all. Block-level checksums would b
On Wed, Oct 1, 2008 at 10:27 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> I don't think that the amount of time it would take to calculate and test
>> the sum is even important. It may be in older CPUs, but these days CPUs
>> are so fast in RAM and a block is very small. On x86 systems, depending on
Paul Schlie <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Paul Schlie writes:
>>> - yes, if you're willing to compute true CRC's as opposed to simpler
>>> checksums, which may be worth the price if in fact many/most data
>>> check failures are truly caused by single bit errors somewhere in the
Aidan Van Dyk <[EMAIL PROTECTED]> writes:
> * Gregory Stark <[EMAIL PROTECTED]> [081001 11:59]:
>
>> If setting a hint bit cleared a flag on the buffer header then the
>> checksumming process could set that flag, begin checksumming, and check that
>> the flag is still set when he's finished.
>>
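A minimal sketch of the flag idea quoted above, in C. All names here (DemoBuffer, demo_set_hint_bit, demo_checksum_begin, and so on) are hypothetical and are not PostgreSQL's buffer manager API, and the toy checksum is not a CRC. It only shows the control flow: the checksummer sets a flag, computes the sum, and then rechecks the flag, while any hint-bit update clears it. Note that the thread itself goes on to question whether this is sufficient, and this sketch does not address the race between a concurrent writer and the final recheck.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 8192            /* assumed page size for the sketch */

/* Hypothetical stand-in for a buffer header plus its page. */
typedef struct DemoBuffer {
    atomic_bool unchanged;             /* set by the checksummer, cleared by hint-bit setters */
    uint8_t     page[DEMO_PAGE_SIZE];
} DemoBuffer;

/* A hint-bit update modifies the page and clears the flag. */
static void demo_set_hint_bit(DemoBuffer *buf, size_t offset, uint8_t mask)
{
    buf->page[offset] |= mask;
    atomic_store(&buf->unchanged, false);
}

/* The checksummer sets the flag before it starts reading the page. */
static void demo_checksum_begin(DemoBuffer *buf)
{
    atomic_store(&buf->unchanged, true);
}

/* Toy position-dependent checksum over the whole page (not a CRC). */
static uint32_t demo_checksum_page(const DemoBuffer *buf)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
        sum = sum * 31u + buf->page[i];
    return sum;
}

/* After checksumming: true only if no hint bit was set in the meantime. */
static bool demo_checksum_still_valid(DemoBuffer *buf)
{
    return atomic_load(&buf->unchanged);
}
```

A caller would run demo_checksum_begin, then demo_checksum_page, then demo_checksum_still_valid, and retry (or fall back to some other scheme) when the last call returns false.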
Tom Lane wrote:
Paul Schlie <[EMAIL PROTECTED]> writes:
- yes, if you're willing to compute true CRC's as opposed to simpler
checksums, which may be worth the price if in fact many/most data
check failures are truly caused by single bit errors somewhere in the
chain,
FWIW, not one of t
Aidan Van Dyk wrote:
One possibility would be to "double-buffer" the write... i.e. as you
calculate your CRC, you're doing it on a local copy of the block, which
you hand to the OS to write... If you're touching the whole block of
memory to CRC it, it isn't *ridiculously* more expensive to copy
Tom Lane wrote:
> Paul Schlie writes:
>> - yes, if you're willing to compute true CRC's as opposed to simpler
>> checksums, which may be worth the price if in fact many/most data
>> check failures are truly caused by single bit errors somewhere in the
>> chain,
>
> FWIW, not one of the corrupted-d
* Gregory Stark <[EMAIL PROTECTED]> [081001 11:59]:
> If setting a hint bit cleared a flag on the buffer header then the
> checksumming process could set that flag, begin checksumming, and check that
> the flag is still set when he's finished.
>
> Actually I suppose that wouldn't actually be goo
* Tom Lane:
> No, not at all. Block-level checksums would be an order of magnitude
> more expensive: they're on bigger chunks of data and they'd be done more
> often.
For larger blocks, checksumming can be parallelized at the instruction
level, especially if the block size is statically known.
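A rough illustration of the instruction-level parallelism point, as a sketch only. It uses a plain 32-bit additive sum rather than a CRC, and DEMO_BLCKSZ and the demo_* function names are assumptions made up for this example. With four independent accumulators, the additions inside one loop iteration do not depend on each other, so a superscalar CPU can overlap them; because the block size is a compile-time constant, the compiler can also unroll the loop without a remainder case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ 8192                       /* assumed, statically known block size */
#define DEMO_NWORDS (DEMO_BLCKSZ / sizeof(uint32_t))

/* Load one 32-bit word without assuming alignment. */
static uint32_t demo_load32(const unsigned char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Reference version: one accumulator, each add waits for the previous one. */
static uint32_t demo_sum_serial(const unsigned char *block)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < DEMO_NWORDS; i++)
        sum += demo_load32(block + i * sizeof(uint32_t));
    return sum;
}

/* Four independent accumulators, combined at the end. DEMO_NWORDS is a
 * multiple of 4, so no tail loop is needed. Unsigned addition wraps modulo
 * 2^32, so the result equals the serial sum. */
static uint32_t demo_sum_ilp(const unsigned char *block)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (size_t i = 0; i < DEMO_NWORDS; i += 4) {
        s0 += demo_load32(block + (i + 0) * sizeof(uint32_t));
        s1 += demo_load32(block + (i + 1) * sizeof(uint32_t));
        s2 += demo_load32(block + (i + 2) * sizeof(uint32_t));
        s3 += demo_load32(block + (i + 3) * sizeof(uint32_t));
    }
    return s0 + s1 + s2 + s3;
}
```

Whether this actually runs faster depends on the compiler and CPU; a real CRC has a serial dependency through its state and needs different techniques to get the same effect.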
On Wed, 2008-10-01 at 16:57 +0100, Gregory Stark wrote:
> I wonder if we could do something clever here though. Only one process
> is busy calculating the checksum -- it just has to know if anyone
> fiddles the hint bits while it's busy.
What if the hint bits are added at the very end to the che
>> One other reason the tuplestore should know the position of all the
>> readers is that ideally it would want to be able to discard any tuples
>> older than the oldest read position. That also means it needs to know
>> when all the call sites have allocated their position and don't need
>> to res
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane escribió:
>> No, not at all. Block-level checksums would be an order of magnitude
>> more expensive: they're on bigger chunks of data and they'd be done more
>> often.
> More often? My intention is that they are checked when the buffer is
> r
* Heikki Linnakangas:
> Currently, hint bit updates are not WAL-logged, and thus no full page
> write is done when only hint bits are changed. Imagine what happens if
> hint bits are updated on a page, but there's no other changes, and we
> crash so that only one half of the new page version makes
> [EMAIL PROTECTED] writes:
>>> No, it's all about time penalties and loss of concurrency.
>
>> I don't think that the amount of time it would take to calculate and
>> test the sum is even important. It may be in older CPUs, but these
>> days CPUs
>> are so fast in RAM and a block is very small. On
> Jonah H. Harris wrote:
>> Tom Lane wrote:
>> "Harald Armin Massa" writes:
>>> WHAT should happen when corrupted data is detected?
>>
>> Same thing that happens now, ie, query fails with an error. This would
>> just be an extension of the existing validity checks done at page read
>> time.
>
> Ag
On Wed, Oct 1, 2008 at 11:57 AM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Tom Lane escribió:
>> No, not at all. Block-level checksums would be an order of magnitude
>> more expensive: they're on bigger chunks of data and they'd be done more
>> often.
>
> More often? My intention is that they a
Tom Lane <[EMAIL PROTECTED]> writes:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>> On Wed, Oct 1, 2008 at 10:27 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>>> Your optimism is showing ;-). XLogInsert routinely shows up as a major
>>> CPU hog in any update-intensive test, and AFAICT that's mostl
Tom Lane escribió:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> > I probably wouldn't compare checksumming *every* WAL record to a
> > single block-level checksum.
>
> No, not at all. Block-level checksums would be an order of magnitude
> more expensive: they're on bigger chunks of data and
Paul Schlie <[EMAIL PROTECTED]> writes:
> - yes, if you're willing to compute true CRC's as opposed to simpler
> checksums, which may be worth the price if in fact many/most data
> check failures are truly caused by single bit errors somewhere in the
> chain,
FWIW, not one of the corrupted-data pr
On Wed, Oct 1, 2008 at 11:36 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> I probably wouldn't compare checksumming *every* WAL record to a
>> single block-level checksum.
>
> No, not at all. Block-level checksums would be an order of magnitude
> more expensive: they're on bigger chunks of data and t
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On Wed, Oct 1, 2008 at 10:27 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> Your optimism is showing ;-). XLogInsert routinely shows up as a major
>> CPU hog in any update-intensive test, and AFAICT that's mostly from the
>> CRC calculation for WAL recor
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Unfortunately, it doesn't. See hint bits.
> Hmm, so it seems we need to keep held of the bufferhead's spinlock while
> calculating the checksum, just after resetting BM_JUST_DIRTIED. Yuck.
No, holding a spinlock that long is entire
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > A buffer's io_in_progress lock protects the buffer's CRC.
>
> Unfortunately, it doesn't. See hint bits.
Hmm, so it seems we need to keep held of the bufferhead's spinlock while
calculating the checksum, just after resetting BM_JUS
Brian Hurt wrote:
> Brian Hurt wrote:
>> Paul Schlie wrote:
>>>
>>> ... if that doesn't fix
>>> the problem, assume a single bit error, and iteratively flip
>>> single bits until the check sum matches ...
>> This can actually be done much faster, if you're doing a CRC checksum
>> (aka modulo over
Brian Hurt wrote:
> Paul Schlie wrote:
>>
>> ... if that doesn't fix
>> the problem, assume a single bit error, and iteratively flip
>> single bits until the check sum matches ...
> This can actually be done much faster, if you're doing a CRC checksum
> (aka modulo over GF(2^n)). Basically, an err
On Wed, Oct 1, 2008 at 2:54 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> So it seems like the appropriate generalization is to have an array of
> read positions inside the tuplestore and allow callers to say "read
> using position N", plus some API to allow positions to be allocated to
> different req
[EMAIL PROTECTED] writes:
>> No, it's all about time penalties and loss of concurrency.
> I don't think that the amount of time it would take to calculate and test
> the sum is even important. It may be in older CPUs, but these days CPUs
> are so fast in RAM and a block is very small. On x86 syste
"Greg Stark" <[EMAIL PROTECTED]> writes:
> On Wed, Oct 1, 2008 at 2:54 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> So it seems like the appropriate generalization is to have an array of
>> read positions inside the tuplestore and allow callers to say "read
>> using position N", plus some API to allo
Brian Hurt wrote:
Paul Schlie wrote:
... if that doesn't fix
the problem, assume a single bit error, and iteratively flip
single bits until the check sum matches ...
This can actually be done much faster, if you're doing a CRC checksum
(aka modulo over GF(2^n)). Basically, an error flipping bi
> Hannu Krosing <[EMAIL PROTECTED]> writes:
>> So I don't think that this is a space issue.
>
> No, it's all about time penalties and loss of concurrency.
I don't think that the amount of time it would take to calculate and test
the sum is even important. It may be in older CPUs, but these days CP
"Hitoshi Harada" <[EMAIL PROTECTED]> writes:
>> It seems to me to share some ideas with the MemoryContext concept: what
>> about a TupstoreContext associated with tuplestore, you get a common default
>> one if you don't register your own, and use
>> tuplestore_gettuple(MyTupstoreContext, ...);
> I
Paul Schlie wrote:
... if that doesn't fix
the problem, assume a single bit error, and iteratively flip
single bits until the check sum matches ...
This can actually be done much faster, if you're doing a CRC checksum
(aka modulo over GF(2^n)). Basically, an error flipping bit n will
always cr
> On Tue, 2008-09-30 at 17:13 -0400, [EMAIL PROTECTED] wrote:
>> >
>> > I believe the idea was to make this as non-invasive as possible. And
>> > it would be really nice if this could be enabled without a dump/
>> > reload (maybe the upgrade stuff would make this possible?)
>> > --
>>
>> It's all a
On Wed, Oct 1, 2008 at 9:25 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Harald Armin Massa" <[EMAIL PROTECTED]> writes:
>> WHAT should happen when corrupted data is detected?
>
> Same thing that happens now, ie, query fails with an error. This would
> just be an extension of the existing validity c
"Harald Armin Massa" <[EMAIL PROTECTED]> writes:
> WHAT should happen when corrupted data is detected?
Same thing that happens now, ie, query fails with an error. This would
just be an extension of the existing validity checks done at page read
time.
regards, tom lane
Hannu Krosing <[EMAIL PROTECTED]> writes:
> So I don't think that this is a space issue.
No, it's all about time penalties and loss of concurrency.
regards, tom lane
On Tue, 2008-09-30 at 17:13 -0400, [EMAIL PROTECTED] wrote:
> >
> > I believe the idea was to make this as non-invasive as possible. And
> > it would be really nice if this could be enabled without a dump/
> > reload (maybe the upgrade stuff would make this possible?)
> > --
>
> It's all about the
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Mon, 2008-09-22 at 16:46 +0100, Gregory Stark wrote:
>
>> Simon Riggs <[EMAIL PROTECTED]> writes:
>>
>> > I'd prefer to set this as a tablespace level storage parameter.
>>
>> Sounds like a good idea, except... what's a tablespace level storage
>>
CRC-checks will help to detect corrupt data.
My question:
WHAT should happen when corrupted data is detected?
a) PostgreSQL can end with some panic code
b) a log can be written, with some rather high level
a) has the benefit that it surely will be noticed. Which is a real
benefit, as I suppose
Alvaro Herrera napsal(a):
This code would be run-time or compile-time configurable. I'm not
absolutely sure which yet; the problem with run-time is what to do if
the user restarts the server with the setting flipped. It would have
almost no impact on users who don't enable it.
I prefer runti
Brief overview of plans over the next month:
There are two patches available now that are essential to Hot Standby.
* Infrastructure Changes for Recovery (recovery_infrastruc.v8.patch)
* Subtransaction Commits and Hot Standby (atomic_subxids.v4.patch)
Other patches will be coming soon, probably i
Tom Lane wrote:
I just managed to make a backend dump core while fooling with the CTE
patch, and found out that the system failed to recover, because the
ensuing startup process *also* dumped core. Here's the backtrace:
...
We should of course not be attempting XLogInsert during WAL replay.
Now
Jonah H. Harris wrote:
I'd like to submit this for 8.4, but I want to ensure that -hackers
at large approve of this feature before starting serious coding.
>>>
>>> IMHO, this is a functionality that should be enabled by default (as it
>>> is on most other RDBMS). It would've prevented se