On Wed, Mar 21, 2012 at 10:59 PM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> ... although none of the issues alone is a show-stopper, considering
>> all these things together, I'm starting to feel that this needs to be
>> pushed to 9.3. Thoughts?
>
> Agreed. In particular, I think you a
Heikki Linnakangas writes:
> ... although none of the issues alone is a show-stopper, considering
> all these things together, I'm starting to feel that this needs to be
> pushed to 9.3. Thoughts?
Agreed. In particular, I think you are right that it'd be prudent to
simplify the WAL-locatio
On Wed, Mar 21, 2012 at 7:52 AM, Heikki Linnakangas
wrote:
> So, although none of the issues alone is a show-stopper, considering all
> these things together, I'm starting to feel that this needs to be pushed to
> 9.3. Thoughts?
I think I agree. I like the refactoring ideas that you're proposing.
On 21.03.2012 13:14, Fujii Masao wrote:
PANIC: space reserved for WAL record does not match what was
written, CurrPos: C/0, EndPos: B/FF00
So I think the patch has a bug in its handling of the WAL boundary.
Thanks for the testing! These WAL boundary issues are really tricky.
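To show where such boundary bugs tend to hide, here is a minimal standalone
sketch - not the patch's code, and BLCKSZ and the header size are illustrative
assumptions - of advancing a WAL position by N usable bytes while skipping
per-page headers. The subtle case is a record that ends exactly on a boundary:

/* Standalone sketch, not PostgreSQL source. Advances a raw WAL byte
 * position by 'len' usable (payload) bytes, skipping the per-page
 * header. Constants are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define BLCKSZ   8192                  /* assumed WAL page size */
#define PAGE_HDR 24                    /* assumed page-header size */
#define USABLE   (BLCKSZ - PAGE_HDR)   /* payload bytes per page */

static uint64_t advance(uint64_t pos, uint64_t len)
{
    while (len > 0) {
        uint64_t offset = pos % BLCKSZ;
        if (offset < PAGE_HDR) {       /* sitting on a page header: skip it */
            pos += PAGE_HDR - offset;
            offset = PAGE_HDR;
        }
        uint64_t room = BLCKSZ - offset;   /* usable bytes left on this page */
        uint64_t n = (len < room) ? len : room;
        pos += n;
        len -= n;
    }
    return pos;
}

int main(void)
{
    /* A record that ends exactly at a page boundary: the end position is
     * the boundary itself, not the next page's header. Confusing the two
     * is exactly the kind of mismatch the PANIC above complains about. */
    uint64_t end = advance(PAGE_HDR, USABLE);
    printf("end=%llu on boundary=%d\n",
           (unsigned long long) end, end % BLCKSZ == 0);
    return 0;
}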
On Thu, Mar 15, 2012 at 5:52 AM, Heikki Linnakangas
wrote:
> When all those changes are put together, the patched version now beats or
> matches the current code in the RAM drive tests, except that the
> single-client case is still about 10% slower. I added the new test results
> at http://communi
On Tue, Mar 13, 2012 at 1:59 AM, Jeff Janes wrote:
> On Fri, Mar 9, 2012 at 2:45 AM, Heikki Linnakangas
> wrote:
>>
>>
>> Thanks!
>>
>> BTW, I haven't forgotten about the recovery bugs Jeff found earlier. I'm
> planning to do a longer run with his test script - I only ran it for about
> 1000 iterations - to see if I can reproduce the PANIC
On 09.03.2012 12:04, Heikki Linnakangas wrote:
I've been doing some performance testing with this, using a simple C
function that just inserts a dummy WAL record of given size. I'm not
totally satisfied. Although the patch helps with scalability at 3-4
concurrent backends doing WAL insertions, it
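For reference, a test function of that shape could look roughly like this
against the 9.2-era XLogInsert() API. This is a hypothetical reconstruction,
not Heikki's actual test code; XLOG_NOOP records are ignored at replay, so
only the insertion path is exercised:

#include "postgres.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "storage/buf.h"
#include "fmgr.h"

PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(insert_dummy_wal_record);

Datum
insert_dummy_wal_record(PG_FUNCTION_ARGS)
{
    int32        size = PG_GETARG_INT32(0);
    char        *payload = palloc0(size);    /* dummy payload of given size */
    XLogRecData  rdata;

    rdata.data = payload;
    rdata.len = size;
    rdata.buffer = InvalidBuffer;            /* no backup blocks */
    rdata.buffer_std = false;
    rdata.next = NULL;

    XLogInsert(RM_XLOG_ID, XLOG_NOOP, &rdata);

    pfree(payload);
    PG_RETURN_VOID();
}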
On Fri, Mar 9, 2012 at 2:45 AM, Heikki Linnakangas
wrote:
>
>
> Thanks!
>
> BTW, I haven't forgotten about the recovery bugs Jeff found earlier. I'm
> planning to do a longer run with his test script - I only ran it for about
> 1000 iterations - to see if I can reproduce the PANIC with both the ea
On Fri, Mar 9, 2012 at 7:04 PM, Heikki Linnakangas
wrote:
> Here's an updated patch. It now only loops once per segment that a record
> crosses. Plus a lot of other small cleanup.
Thanks! But you forgot to attach the patch.
> I've been doing some performance testing with this, using a simple C
>
On 07.03.2012 17:28, Tom Lane wrote:
Simon Riggs writes:
On Wed, Mar 7, 2012 at 3:04 PM, Tom Lane wrote:
Alvaro Herrera writes:
So they are undoubtedly rare. Not sure if as rare as Higgs bosons.
Even if they're rare, having a major performance hiccup when one happens
is not a side-effect
On Mon, Mar 5, 2012 at 8:50 AM, Heikki Linnakangas
wrote:
>
> That particular issue would be very hard to hit in practice, so I don't know
> if this could explain the recovery failures that Jeff saw. I got the test
> script running (thanks for that Jeff!), but unfortunately have not seen any
> fai
Simon Riggs writes:
> On Wed, Mar 7, 2012 at 3:04 PM, Tom Lane wrote:
>> Alvaro Herrera writes:
>>> So they are undoubtedly rare. Not sure if as rare as Higgs bosons.
>> Even if they're rare, having a major performance hiccup when one happens
>> is not a side-effect I want to see from a patch w
On Wed, Mar 7, 2012 at 3:04 PM, Tom Lane wrote:
> Alvaro Herrera writes:
>> Just to keep things in perspective -- For a commit record to reach one
>> megabyte, it would have to be a transaction that drops over 43k tables.
>> Or have 64k smgr inval messages (for example, a TRUNCATE might send half
>> a dozen of these messages).
Alvaro Herrera writes:
> Just to keep things in perspective -- For a commit record to reach one
> megabyte, it would have to be a transaction that drops over 43k tables.
> Or have 64k smgr inval messages (for example, a TRUNCATE might send half
> a dozen of these messages). Or have 262k subtransactions.
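The arithmetic behind those figures, for the record; the per-entry sizes here
are assumptions for illustration, but they land on the quoted numbers:

#include <stdio.h>

int main(void)
{
    const long MB = 1024L * 1024;       /* one megabyte of commit record */
    const int per_dropped_rel = 24;     /* assumed bytes per dropped relation */
    const int per_inval_msg   = 16;     /* assumed bytes per inval message */
    const int per_subxid      = 4;      /* 4-byte subtransaction XIDs */

    printf("dropped rels: %ld\n", MB / per_dropped_rel);  /* ~43k */
    printf("inval msgs:   %ld\n", MB / per_inval_msg);    /* 65536 = 64k */
    printf("subxacts:     %ld\n", MB / per_subxid);       /* 262144 = 262k */
    return 0;
}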
Excerpts from Simon Riggs's message of Wed Mar 07 05:35:44 -0300 2012:
> On Tue, Mar 6, 2012 at 8:32 PM, Tom Lane wrote:
> > Heikki Linnakangas writes:
> >> On 06.03.2012 17:12, Tom Lane wrote:
> >>> How long is the current locked code exactly --- does it contain a loop?
> >
> >> Perhaps best if you take a look for yourself; the function is called
> >> ReserveXLogInsertLocation()
On Tue, Mar 6, 2012 at 8:32 PM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 06.03.2012 17:12, Tom Lane wrote:
>>> How long is the current locked code exactly --- does it contain a loop?
>
>> Perhaps best if you take a look for yourself; the function is called
>> ReserveXLogInsertLocation()
On Wed, Mar 7, 2012 at 5:32 AM, Tom Lane wrote:
> What I suggest is that it should not be necessary to crawl forward one
> page at a time to figure out how many pages will be needed to store N
> bytes worth of WAL data. You're basically implementing a division
> problem as repeated subtraction.
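Tom's point, restated as a standalone sketch with illustrative constants (this
is not the patch's code): the page count for N bytes is a closed-form ceiling
division, not a page-at-a-time loop.

#include <stdint.h>
#include <stdio.h>

#define BLCKSZ   8192                  /* assumed WAL page size */
#define PAGE_HDR 24                    /* assumed page-header size */
#define USABLE   (BLCKSZ - PAGE_HDR)   /* payload bytes per page */

/* Division as repeated subtraction: crawling forward page by page. */
static uint64_t pages_loop(uint64_t nbytes)
{
    uint64_t pages = 0;
    while (nbytes > 0) {
        nbytes -= (nbytes < USABLE) ? nbytes : USABLE;
        pages++;
    }
    return pages;
}

/* The same answer in O(1), assuming the record starts on a fresh page. */
static uint64_t pages_div(uint64_t nbytes)
{
    return (nbytes + USABLE - 1) / USABLE;
}

int main(void)
{
    for (uint64_t n = 1; n < 1000000; n += 997)
        if (pages_loop(n) != pages_div(n))
            printf("mismatch at %llu\n", (unsigned long long) n);
    printf("checked\n");
    return 0;
}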
Heikki Linnakangas writes:
> On 06.03.2012 17:12, Tom Lane wrote:
>> How long is the current locked code exactly --- does it contain a loop?
> Perhaps best if you take a look for yourself; the function is called
> ReserveXLogInsertLocation() in the patch. It calls a helper function called
> Advan
On 06.03.2012 17:12, Tom Lane wrote:
Heikki Linnakangas writes:
On 06.03.2012 14:52, Fujii Masao wrote:
It also strikes me that the usage of the spinlock insertpos_lck might
not be OK in ReserveXLogInsertLocation(), because a few dozen instructions
can be performed while holding the spinlock.
Heikki Linnakangas writes:
> On 06.03.2012 14:52, Fujii Masao wrote:
>> It also strikes me that the usage of the spinlock insertpos_lck might
>> not be OK in ReserveXLogInsertLocation(), because a few dozen instructions
>> can be performed while holding the spinlock.
> I admit that block is longer than any of our existing spinlock blocks.
On Tue, Mar 6, 2012 at 10:07 AM, Heikki Linnakangas
wrote:
> I admit that block is longer than any of our existing spinlock blocks.
> However, it's important for performance. I tried using a lwlock earlier, and
> that negated the gains. So if that's a serious objection, then let's resolve
> that n
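The shape of the contested block, as a standalone sketch - a pthread spinlock
stands in for s_lock, and the names mirror the discussion rather than the
patch's actual code. Only the position bookkeeping happens under the spinlock:
a few arithmetic instructions, nothing that can block.

#include <pthread.h>
#include <stdint.h>

static pthread_spinlock_t insertpos_lck;
static uint64_t CurrBytePos;    /* next free logical byte of WAL */
static uint64_t PrevBytePos;    /* start of the previous record */

static void
reserve_insert_location(uint64_t size, uint64_t *start, uint64_t *prev)
{
    pthread_spin_lock(&insertpos_lck);
    *start = CurrBytePos;        /* where this record will go */
    *prev = PrevBytePos;         /* for the record's prev-link */
    PrevBytePos = CurrBytePos;
    CurrBytePos += size;
    pthread_spin_unlock(&insertpos_lck);
}

int main(void)   /* build with -lpthread */
{
    uint64_t start, prev;
    pthread_spin_init(&insertpos_lck, PTHREAD_PROCESS_PRIVATE);
    reserve_insert_location(100, &start, &prev);
    return 0;
}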
On 06.03.2012 14:52, Fujii Masao wrote:
On Tue, Mar 6, 2012 at 2:17 AM, Tom Lane wrote:
Heikki Linnakangas writes:
On 21.02.2012 13:19, Fujii Masao wrote:
In some places, the spinlock "insertpos_lck" is taken while another
spinlock "info_lck" is being held. Is this OK? What if, unfortunately,
the inner spinlock takes long to be taken?
On Tue, Mar 6, 2012 at 1:50 AM, Heikki Linnakangas
wrote:
>> + * An xlog-switch record consumes all the remaining space on the
>> + * WAL segment. We have already reserved it for us, but we still need
>> + * to make sure it's been allocated and zeroed in the WAL buffers
>>
On Tue, Mar 6, 2012 at 2:17 AM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 21.02.2012 13:19, Fujii Masao wrote:
>>> In some places, the spinlock "insertpos_lck" is taken while another
>>> spinlock "info_lck" is being held. Is this OK? What if, unfortunately,
>>> the inner spinlock takes long to be taken?
Heikki Linnakangas writes:
> On 21.02.2012 13:19, Fujii Masao wrote:
>> In some places, the spinlock "insertpos_lck" is taken while another
>> spinlock "info_lck" is being held. Is this OK? What if, unfortunately,
>> the inner spinlock takes long to be taken?
> Hmm, that's only done at a checkpoint (an
On 21.02.2012 13:19, Fujii Masao wrote:
On Sat, Feb 18, 2012 at 12:36 AM, Heikki Linnakangas
wrote:
Attached is a new version, fixing that, and the off-by-one bug you pointed
out in the slot wraparound handling. I also moved code around a bit; I think
this new division of labor between the XLogInsert subroutines is more
readable.
On 20.02.2012 08:00, Amit Kapila wrote:
I was trying to understand this patch and had a few doubts:
1. In PerformXLogInsert(), why is there a need to check freespace when
the space has already been reserved during ReserveXLogInsertLocation()?
Is it possible that the record size is more than actually calculated?
On Tue, Feb 21, 2012 at 5:34 PM, Fujii Masao wrote:
> On Tue, Feb 21, 2012 at 8:19 PM, Fujii Masao wrote:
>> On Sat, Feb 18, 2012 at 12:36 AM, Heikki Linnakangas
>> wrote:
>>> Attached is a new version, fixing that, and the off-by-one bug you pointed out
>>> in the slot wraparound handling. I also moved code around a bit; I think
>>> this new division of labor between the XLogInsert subroutines is more readable.
On Tue, Feb 21, 2012 at 8:19 PM, Fujii Masao wrote:
> On Sat, Feb 18, 2012 at 12:36 AM, Heikki Linnakangas
> wrote:
>> Attached is a new version, fixing that, and the off-by-one bug you pointed out
>> in the slot wraparound handling. I also moved code around a bit; I think
>> this new division of labor between the XLogInsert subroutines is more readable.
On Sat, Feb 18, 2012 at 12:36 AM, Heikki Linnakangas
wrote:
> Attached is a new version, fixing that, and the off-by-one bug you pointed out
> in the slot wraparound handling. I also moved code around a bit; I think
> this new division of labor between the XLogInsert subroutines is more
> readable.
T
On Sun, Feb 19, 2012 at 3:01 AM, Jeff Janes wrote:
> I've tested your v9 patch. I no longer see any inconsistencies or
> lost transactions in the recovered database. But occasionally I get
> databases that fail to recover at all.
> It has always been with the exact same failed assertion, at xlog
Subject: Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work
outside WALInsertLock)
On 17.02.2012 07:27, Fujii Masao wrote:
> Got another problem: when I ran pg_stop_backup to take an online
> backup, it got stuck until I had generated a new WAL record.
On Fri, Feb 17, 2012 at 7:36 AM, Heikki Linnakangas
wrote:
> On 17.02.2012 07:27, Fujii Masao wrote:
>>
>> Got another problem: when I ran pg_stop_backup to take an online backup,
>> it got stuck until I had generated a new WAL record. This happens because,
>> in the patch, when pg_stop_backup force
On 16.02.2012 13:31, Fujii Masao wrote:
On Thu, Feb 16, 2012 at 6:15 PM, Fujii Masao wrote:
BTW, when I ran the test on my Ubuntu machine, I could not reproduce the
problem. I could reproduce the problem only on MacOS.
+ nextslot = Insert->nextslot;
+ if (NextSlotNo(nextslot) == lastslot)
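For context, the circular-slot arithmetic being exercised there looks like
this as a sketch (the slot count and names are illustrative); the wraparound
step is where the off-by-one bug mentioned earlier lived:

#include <stdio.h>

#define NUM_XLOGINSERT_SLOTS 8   /* assumed ring size */

static int NextSlotNo(int slot)
{
    return (slot + 1) % NUM_XLOGINSERT_SLOTS;   /* wraps back to slot 0 */
}

/* The ring is full when advancing the head would run into the tail,
 * so one slot always stays unused. */
static int ring_full(int nextslot, int lastslot)
{
    return NextSlotNo(nextslot) == lastslot;
}

int main(void)
{
    printf("next after last slot: %d\n", NextSlotNo(NUM_XLOGINSERT_SLOTS - 1));
    printf("full at (6,7): %d\n", ring_full(6, 7));
    return 0;
}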
On Mon, Feb 13, 2012 at 8:37 PM, Heikki Linnakangas
wrote:
> On 13.02.2012 01:04, Jeff Janes wrote:
>>
>> Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
>> I have no idea if I did it correctly, in particular if calling
>> GetXLogBuffer(CurrPos) twice is OK or if GetXLogBuf
On Thu, Feb 16, 2012 at 6:15 PM, Fujii Masao wrote:
> On Thu, Feb 16, 2012 at 5:02 AM, Heikki Linnakangas
> wrote:
>> On 15.02.2012 18:52, Fujii Masao wrote:
>>>
>>> On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
>>> wrote:
Are you still seeing this failure with the latest patch I posted
(http://archives.postgresql.org/message-id/4f38f5e5.8050...@enterprisedb.com)?
On Thu, Feb 16, 2012 at 5:02 AM, Heikki Linnakangas
wrote:
> On 15.02.2012 18:52, Fujii Masao wrote:
>>
>> On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
>> wrote:
>>>
>>> Are you still seeing this failure with the latest patch I posted
>>>
>>> (http://archives.postgresql.org/message-id/4f38f5e5.8050...@enterprisedb.com)?
On 15.02.2012 18:52, Fujii Masao wrote:
On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
wrote:
Are you still seeing this failure with the latest patch I posted
(http://archives.postgresql.org/message-id/4f38f5e5.8050...@enterprisedb.com)?
Yes. Just to be safe, I again applied the latest
On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
wrote:
> On 13.02.2012 19:13, Fujii Masao wrote:
>>
>> On Mon, Feb 13, 2012 at 8:37 PM, Heikki Linnakangas
>> wrote:
>>>
>>> On 13.02.2012 01:04, Jeff Janes wrote:
Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
On 13.02.2012 19:13, Fujii Masao wrote:
On Mon, Feb 13, 2012 at 8:37 PM, Heikki Linnakangas
wrote:
On 13.02.2012 01:04, Jeff Janes wrote:
Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
I have no idea if I did it correctly, in particular if calling
GetXLogBuffer(Curr
On Mon, Feb 13, 2012 at 8:37 PM, Heikki Linnakangas
wrote:
> On 13.02.2012 01:04, Jeff Janes wrote:
>>
>> Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
>> I have no idea if I did it correctly, in particular if calling
>> GetXLogBuffer(CurrPos) twice is OK or if GetXLogBuf
On Thu, Feb 9, 2012 at 3:02 AM, Fujii Masao wrote:
> On Thu, Feb 9, 2012 at 7:25 PM, Fujii Masao wrote:
>> On Thu, Feb 9, 2012 at 3:32 AM, Jeff Janes wrote:
>>>
>>> After applying this patch and then forcing crashes, upon recovery the
>>> database is not correct.
>>>
>>> If I make a table with 1
On Thu, Feb 9, 2012 at 7:25 PM, Fujii Masao wrote:
> On Thu, Feb 9, 2012 at 3:32 AM, Jeff Janes wrote:
>> On Wed, Feb 1, 2012 at 11:46 PM, Heikki Linnakangas
>> wrote:
>>> On 31.01.2012 17:35, Fujii Masao wrote:
On Fri, Jan 20, 2012 at 11:11 PM, Heikki Linnakangas
wrote:
>
>
On Thu, Feb 9, 2012 at 3:32 AM, Jeff Janes wrote:
> On Wed, Feb 1, 2012 at 11:46 PM, Heikki Linnakangas
> wrote:
>> On 31.01.2012 17:35, Fujii Masao wrote:
>>>
>>> On Fri, Jan 20, 2012 at 11:11 PM, Heikki Linnakangas
>>> wrote:
On 20.01.2012 15:32, Robert Haas wrote:
>
>
>
On Wed, Feb 1, 2012 at 11:46 PM, Heikki Linnakangas
wrote:
> On 31.01.2012 17:35, Fujii Masao wrote:
>>
>> On Fri, Jan 20, 2012 at 11:11 PM, Heikki Linnakangas
>> wrote:
>>>
>>> On 20.01.2012 15:32, Robert Haas wrote:
On Sat, Jan 14, 2012 at 9:32 AM, Heikki Linnakangas
wrote:
On Fri, Jan 20, 2012 at 11:11 PM, Heikki Linnakangas
wrote:
> On 20.01.2012 15:32, Robert Haas wrote:
>>
>> On Sat, Jan 14, 2012 at 9:32 AM, Heikki Linnakangas
>> wrote:
>>>
>>> Here's another version of the patch to make XLogInsert less of a
>>> bottleneck on multi-CPU systems. The basic idea is the same as before,
>>> but several bugs have been fixed, and lots of misc. cleanup has been done.
On Fri, Jan 20, 2012 at 2:11 PM, Heikki Linnakangas
wrote:
> On 20.01.2012 15:32, Robert Haas wrote:
>>
>> On Sat, Jan 14, 2012 at 9:32 AM, Heikki Linnakangas
>> wrote:
>>>
>>> Here's another version of the patch to make XLogInsert less of a
>>> bottleneck on multi-CPU systems. The basic idea is the same as before,
>>> but several bugs have been fixed, and lots of misc. cleanup has been done.
On Sat, Jan 14, 2012 at 9:32 AM, Heikki Linnakangas
wrote:
> Here's another version of the patch to make XLogInsert less of a bottleneck
> on multi-CPU systems. The basic idea is the same as before, but several bugs
> have been fixed, and lots of misc. cleanup has been done.
This seems to need a
On Mon, Jan 9, 2012 at 2:29 PM, Heikki Linnakangas
wrote:
>> Can we also try aligning the actual insertions onto cache lines rather
>> than just MAXALIGNing them? The WAL header fills half a cache line as
>> it is, so many other records will fit nicely. I'd like to see what
>> that does to space
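The suggestion in concrete terms: a sketch modeled on PostgreSQL's
TYPEALIGN/MAXALIGN macros, with an assumed 64-byte cache line.

#include <stdint.h>
#include <stdio.h>

#define MAXIMUM_ALIGNOF 8
#define CACHE_LINE_SIZE 64    /* assumed */

#define TYPEALIGN(a, len) \
    (((uintptr_t)(len) + ((a) - 1)) & ~((uintptr_t)((a) - 1)))
#define MAXALIGN(len)       TYPEALIGN(MAXIMUM_ALIGNOF, len)
#define CACHELINEALIGN(len) TYPEALIGN(CACHE_LINE_SIZE, len)

int main(void)
{
    /* A 40-byte record: MAXALIGN leaves it at 40, cache-line alignment
     * pads it to 64 so the next insertion starts on a fresh line - less
     * false sharing between concurrent inserters, at the cost of some
     * WAL space. */
    printf("MAXALIGN(40)=%lu CACHELINEALIGN(40)=%lu\n",
           (unsigned long) MAXALIGN(40), (unsigned long) CACHELINEALIGN(40));
    return 0;
}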
On 09.01.2012 15:44, Simon Riggs wrote:
On Sat, Jan 7, 2012 at 9:31 AM, Heikki Linnakangas
wrote:
Anyway, here's a new version of the patch. It no longer busy-waits for
in-progress insertions to finish, and handles xlog-switches. This is now
feature-complete. It's a pretty complicated patch, so I would appreciate
more eyeballs.
On Sat, Jan 7, 2012 at 9:31 AM, Heikki Linnakangas
wrote:
> Anyway, here's a new version of the patch. It no longer busy-waits for
> in-progress insertions to finish, and handles xlog-switches. This is now
> feature-complete. It's a pretty complicated patch, so I would appreciate
> more eyeballs
On Sun, Dec 25, 2011 at 7:48 PM, Robert Haas wrote:
> m01 tps = 631.875547 (including connections establishing)
> x01 tps = 611.443724 (including connections establishing)
> m08 tps = 4573.701237 (including connections establishing)
> x08 tps = 4576.242333 (including connections establishing)
> m
On Fri, Dec 23, 2011 at 2:54 PM, Heikki Linnakangas
wrote:
> Sorry. Last minute changes, didn't retest properly... Here's another attempt.
I tried this one out on Nate Boley's system. Looks pretty good.
m = master, x = with xloginsert-scale-2 patch. shared_buffers = 8GB,
maintenance_work_mem =
On Fri, Dec 16, 2011 at 3:27 AM, Tom Lane wrote:
>> On its own that sounds dangerous, but it's not. When we need to confirm
>> the prev link we already know what we expect it to be, so CRC-ing it
>> is overkill. That isn't true of any other part of the WAL record, so
>> the prev link is the only th
On Sat, Dec 24, 2011 at 4:54 AM, Heikki Linnakangas
wrote:
> Sorry. Last minute changes, didn't retest properly... Here's another attempt.
When I tested the patch, initdb failed:
$ initdb -D data
initializing dependencies ... PANIC: could not locate a valid checkpoint record
Regards,
On Fri, Dec 23, 2011 at 3:15 AM, Heikki Linnakangas
wrote:
> On 23.12.2011 10:13, Heikki Linnakangas wrote:
>> So, here's a WIP patch of what I've been working on.
>
> And here's the patch I forgot to attach..
Fails regression tests for me. I found this in postmaster.log:
PANIC: could not find
On 16.12.2011 15:42, Heikki Linnakangas wrote:
On 16.12.2011 15:03, Simon Riggs wrote:
On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
wrote:
On 16.12.2011 14:37, Simon Riggs wrote:
I already proposed a design for that using page-level share locks. Any
reason not to go with that?
Sorry, I must've missed that. Got a link?
On 16.12.2011 15:03, Simon Riggs wrote:
On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
wrote:
On 16.12.2011 14:37, Simon Riggs wrote:
I already proposed a design for that using page-level share locks. Any
reason not to go with that?
Sorry, I must've missed that. Got a link?
From ne
On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
wrote:
> On 16.12.2011 14:37, Simon Riggs wrote:
>>
>> On Fri, Dec 16, 2011 at 12:07 PM, Heikki Linnakangas
>> wrote:
>>
>>> Anyway, I'm looking at ways to make the memcpy() of the payload happen
>>> without the lock, in parallel, and once you
On 16.12.2011 14:37, Simon Riggs wrote:
On Fri, Dec 16, 2011 at 12:07 PM, Heikki Linnakangas
wrote:
Anyway, I'm looking at ways to make the memcpy() of the payload happen
without the lock, in parallel, and once you do that the record header CRC
calculation can be done in parallel, too. That m
On Fri, Dec 16, 2011 at 12:07 PM, Heikki Linnakangas
wrote:
> Anyway, I'm looking at ways to make the memcpy() of the payload happen
> without the lock, in parallel, and once you do that the record header CRC
> calculation can be done in parallel, too. That makes it irrelevant from a
> performanc
On 16.12.2011 05:27, Tom Lane wrote:
* We write a WAL record that starts 8 bytes before a sector boundary,
so that the prev_link is in one sector and the rest of the record in
the next one(s).
prev-link is not the first field in the header. The CRC is.
* Time passes, and we recycle that WAL file
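For reference, an abbreviated sketch of that era's record header; the
CRC-first, prev-link-second ordering is what Heikki is pointing out, while
the trailing fields here are illustrative only:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef struct XLogRecordSketch {
    uint32_t xl_crc;       /* first field: CRC of the record */
    uint64_t xl_prev;      /* second: pointer to the previous record */
    uint32_t xl_xid;       /* remaining fields illustrative only */
    uint32_t xl_tot_len;
    uint8_t  xl_info;
    uint8_t  xl_rmid;
} XLogRecordSketch;

int main(void)
{
    printf("xl_crc at offset %zu, xl_prev at offset %zu\n",
           offsetof(XLogRecordSketch, xl_crc),
           offsetof(XLogRecordSketch, xl_prev));
    return 0;
}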
Simon Riggs writes:
> You missed your cue to discuss leaving the prev link out of the CRC
> altogether.
> On its own that sounds dangerous, but it's not. When we need to confirm
> the prev link we already know what we expect it to be, so CRC-ing it
> is overkill. That isn't true of any other part
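Simon's idea as a sketch, with simplified types (this is not committed code):
the reader arrived at this record by following a link from the previous one,
so it already knows what xl_prev must contain, and a direct comparison
protects that field as well as CRC coverage would.

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t xl_crc;    /* would now cover everything except xl_prev */
    uint64_t xl_prev;   /* start position of the previous record */
    /* ... remaining header fields and payload ... */
} WalRecordHdr;

static bool
prev_link_ok(const WalRecordHdr *rec, uint64_t expected_prev)
{
    /* expected_prev is the position of the record just read, so any
     * corruption of xl_prev is caught here instead of by the CRC. */
    return rec->xl_prev == expected_prev;
}

int main(void)
{
    WalRecordHdr rec = { 0, 1234 };
    return prev_link_ok(&rec, 1234) ? 0 : 1;
}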
On Thu, Dec 15, 2011 at 6:50 PM, Heikki Linnakangas
wrote:
>> unless you are proposing to remove
>> the prev_link from the scope of the CRC, which is not exactly a
>> penalty-free change.
>
>
> We could CRC the rest of the record header before getting the lock, though,
> and only include the prev
On Thu, Dec 15, 2011 at 7:06 PM, Heikki Linnakangas
wrote:
>> Please try again to explain what you're doing?
>
>
> Ok: I'm moving the creation of rdata entries for backup blocks outside the
> critical section, so that it's done before grabbing the lock. I'm also
> moving the CRC calculation so th
On 15.12.2011 17:34, Tom Lane wrote:
Heikki Linnakangas writes:
I've been experimenting with different approaches to do that, but one
thing is common among all of them: you need to know the total amount of
WAL space needed for the record, including backup blocks, before you
take the lock. So, h
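A sketch of that up-front computation, with illustrative names and sizes (not
the patch's code): walk the rdata chain once, adding full-page images for
backup blocks, before any lock is taken.

#include <stdint.h>
#include <stdio.h>

#define BLCKSZ 8192   /* assumed block size */

typedef struct RData {
    const void   *data;
    uint32_t      len;
    int           has_backup_block;   /* full-page image follows? */
    struct RData *next;
} RData;

static uint32_t
total_wal_size(const RData *rdata, uint32_t hdr_size)
{
    uint32_t size = hdr_size;          /* fixed record header */
    for (; rdata != NULL; rdata = rdata->next) {
        size += rdata->len;
        if (rdata->has_backup_block)
            size += BLCKSZ;            /* ignoring "hole" compression */
    }
    return size;
}

int main(void)
{
    RData blk = { "page", 4, 1, NULL };
    RData hdr = { "main", 4, 0, &blk };
    printf("%u\n", total_wal_size(&hdr, 32));   /* 32-byte header assumed */
    return 0;
}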
On 15.12.2011 18:48, Tom Lane wrote:
Jeff Janes writes:
On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane wrote:
This patch may or may not be useful, but this description of it is utter
nonsense, because we already do compute that before taking the lock.
Please try again to explain what you're doing?
Jeff Janes writes:
> On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane wrote:
>> This patch may or may not be useful, but this description of it is utter
>> nonsense, because we already do compute that before taking the lock.
>> Please try again to explain what you're doing?
> Currently the CRC of all t
On Thu, Dec 15, 2011 at 7:34 AM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> I've been experimenting with different approaches to do that, but one
>> thing is common among all of them: you need to know the total amount of
>> WAL space needed for the record, including backup blocks, before you
>> take the lock.
Heikki Linnakangas writes:
> I've been experimenting with different approaches to do that, but one
> thing is common among all of them: you need to know the total amount of
> WAL space needed for the record, including backup blocks, before you
> take the lock. So, here's a patch to move things
On Thursday, December 15, 2011 02:51:33 PM Heikki Linnakangas wrote:
> I've been looking at various ways to make WALInsertLock less of a
> bottleneck on multi-CPU servers. The key is going to be to separate the
> two things that are done while holding the WALInsertLock: a) allocating
> the required
I've been looking at various ways to make WALInsertLock less of a
bottleneck on multi-CPU servers. The key is going to be to separate the
two things that are done while holding the WALInsertLock: a) allocating
the required space in the WAL, and b) calculating the CRC of the record
header and co
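In outline, the proposed split looks like this - a toy standalone sketch with
illustrative names, ignoring buffer wraparound: reserving space is the only
serialized step, and the CRC and copy then proceed in parallel with other
backends working on their own reserved ranges.

#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define WAL_BUF_SIZE (1 << 20)

static pthread_mutex_t reserve_lck = PTHREAD_MUTEX_INITIALIZER;
static uint64_t insert_pos;              /* next free byte of WAL */
static char wal_buffer[WAL_BUF_SIZE];    /* stand-in for the WAL buffers */

static void
insert_record(const void *rec, uint32_t len)
{
    uint64_t start;

    /* (a) allocate space: the only step that needs serialization */
    pthread_mutex_lock(&reserve_lck);
    start = insert_pos;
    insert_pos += len;
    pthread_mutex_unlock(&reserve_lck);

    /* (b) CRC + copy into the reserved range with no lock held; other
     * backends can be copying into their own ranges concurrently */
    memcpy(wal_buffer + start % WAL_BUF_SIZE, rec, len);
}

int main(void)
{
    insert_record("hello", 5);
    return 0;
}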