On 7/29/13 2:04 AM, KONDO Mitsumasa wrote:
I think that it is almost the same as a small dirty_background_ratio or
dirty_background_bytes.
The main difference here is that all writes pushed out this way will be
to a single 1GB relation chunk. The odds are better that multiple
writes will combine,
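A minimal C sketch of that write-combining point, assuming the dirty block
numbers for one 1GB segment are collected and sorted before being pushed out
(illustrative only, not code from any patch in this thread):

    /*
     * Sketch: writing one segment's dirty 8K blocks in ascending block
     * order lets runs of adjacent pages reach the kernel as a sequential
     * stream it can merge, instead of being scattered across every file
     * the way a global dirty_background_* flush scatters them.
     */
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define BLCKSZ 8192

    static int
    cmp_blockno(const void *a, const void *b)
    {
        unsigned int x = *(const unsigned int *) a;
        unsigned int y = *(const unsigned int *) b;

        return (x > y) - (x < y);
    }

    /* Write the given dirty 8K blocks of one open segment file in block order. */
    static void
    flush_segment_blocks(int fd, unsigned int *blocks, int nblocks,
                         const char *page_image)
    {
        qsort(blocks, nblocks, sizeof(unsigned int), cmp_blockno);

        for (int i = 0; i < nblocks; i++)
        {
            off_t offset = (off_t) blocks[i] * BLCKSZ;

            /* Adjacent block numbers become one sequential run of writes. */
            if (pwrite(fd, page_image, BLCKSZ, offset) != (ssize_t) BLCKSZ)
                return;             /* error handling elided */
        }
    }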
On 7/26/13 7:32 AM, Tom Lane wrote:
Greg Smith writes:
On 7/26/13 5:59 AM, Hannu Krosing wrote:
Well, SSD disks do it in the way proposed by didier (AFAIK), by putting
"random" fs pages on one large disk page and having an extra index layer
for resolving random-to-sequential ordering.
If yo
(2013/07/24 1:13), Greg Smith wrote:
On 7/23/13 10:56 AM, Robert Haas wrote:
On Mon, Jul 22, 2013 at 11:48 PM, Greg Smith wrote:
We know that a 1GB relation segment can take a really long time to write
out. That could include up to 128K changed 8K pages, and we allow all of
them to get dirty b
Hi,
On Fri, Jul 26, 2013 at 3:41 PM, Greg Smith wrote:
> On 7/26/13 9:14 AM, didier wrote:
>
>> During recovery you have to load the log in cache first before applying
>> WAL.
>>
>
> Checkpoints exist to bound recovery time after a crash. That is their
> only purpose. What you'
On 7/26/13 8:32 AM, Tom Lane wrote:
What I'd point out is that that is exactly what WAL does for us, ie
convert a bunch of random writes into sequential writes. But sooner or
later you have to put the data where it belongs.
Hannu was observing that SSDs often don't do that at all. They can
On 7/26/13 9:14 AM, didier wrote:
During recovery you have to load the log in cache first before applying WAL.
Checkpoints exist to bound recovery time after a crash. That is their
only purpose. What you're suggesting moves a lot of work into the
recovery path, which will slow down how long
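A small worked example of why checkpoint spacing bounds recovery time; the
function name and the replay-rate numbers are invented for illustration, not
taken from PostgreSQL:

    /*
     * Crash recovery replays every WAL record written since the last
     * checkpoint's redo point, so the WAL allowed to accumulate between
     * checkpoints is a ceiling on replay work.  Anything that defers
     * work into the recovery path raises that ceiling.
     */
    #include <stdio.h>
    #include <stdint.h>

    static double
    estimated_replay_seconds(uint64_t wal_since_redo_bytes,
                             uint64_t replay_bytes_per_sec)
    {
        return (double) wal_since_redo_bytes / (double) replay_bytes_per_sec;
    }

    int
    main(void)
    {
        /* e.g. 4GB of WAL since the redo point, replayed at 80MB/s */
        uint64_t wal_bytes = 4ULL * 1024 * 1024 * 1024;
        uint64_t replay_rate = 80ULL * 1024 * 1024;

        printf("estimated recovery time: %.0f seconds\n",
               estimated_replay_seconds(wal_bytes, replay_rate));  /* ~51s */

        /*
         * Checkpointing more often shrinks wal_bytes and hence this number;
         * pushing more work into recovery grows it.
         */
        return 0;
    }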
Hi,
On Fri, Jul 26, 2013 at 11:42 AM, Greg Smith wrote:
> On 7/25/13 6:02 PM, didier wrote:
>
> It was surely already discussed, but why isn't postgresql writing
> its cache sequentially to a temporary file?
>>
>
> If you do that, reads of the data will have to traverse that temporary
> file t
Greg Smith writes:
> On 7/26/13 5:59 AM, Hannu Krosing wrote:
>> Well, SSD disks do it in the way proposed by didier (AFAIK), by putting
>> "random" fs pages on one large disk page and having an extra index layer
>> for resolving random-to-sequential ordering.
> If your solution to avoiding
On 7/26/13 5:59 AM, Hannu Krosing wrote:
Well, SSD disks do it in the way proposed by didier (AFAIK), by putting
"random" fs pages on one large disk page and having an extra index layer
for resolving random-to-sequential ordering.
If your solution to avoiding random writes now is to do sequenti
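A rough, self-contained C sketch of the FTL-style remapping Hannu describes;
the constants and layout here are invented, not any vendor's actual design:

    /*
     * "Random" logical page writes are appended sequentially into a large
     * open flash block, and an extra mapping table -- the index layer --
     * records where each logical page currently lives so reads can find it.
     */
    #include <stdint.h>

    #define PAGES_PER_FLASH_BLOCK 256
    #define N_LOGICAL_PAGES       4096

    typedef struct
    {
        int32_t block;          /* which flash block, -1 if never written */
        int32_t slot;           /* slot within that block */
    } PageLocation;

    static PageLocation map[N_LOGICAL_PAGES];   /* the extra index layer */
    static int32_t cur_block;
    static int32_t cur_slot;

    static void
    ftl_init(void)
    {
        for (int i = 0; i < N_LOGICAL_PAGES; i++)
            map[i].block = -1;
    }

    /* A random logical write becomes a sequential append plus a map update. */
    static void
    ftl_write(uint32_t logical_page)
    {
        if (cur_slot == PAGES_PER_FLASH_BLOCK)
        {
            cur_block++;        /* open a fresh, erased flash block */
            cur_slot = 0;
        }
        map[logical_page].block = cur_block;
        map[logical_page].slot = cur_slot++;
        /* the 8K payload would be appended into cur_block here */
    }

    /* A read must resolve the logical page through the map first. */
    static PageLocation
    ftl_read(uint32_t logical_page)
    {
        return map[logical_page];
    }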
On 07/26/2013 11:42 AM, Greg Smith wrote:
> On 7/25/13 6:02 PM, didier wrote:
>> It was surely already discussed, but why isn't postgresql writing
>> its cache sequentially to a temporary file?
>
> If you do that, reads of the data will have to traverse that temporary
> file to assemble their data.
On 7/25/13 6:02 PM, didier wrote:
It was surely already discussed, but why isn't postgresql writing
its cache sequentially to a temporary file?
If you do that, reads of the data will have to traverse that temporary
file to assemble their data. You'll make every later reader pay the
random I/O
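A sketch of the read path that objection implies, with hypothetical names
(temp_index_lookup, read_block); it shows the random I/O moving from the
background writer onto every later reader:

    #include <stdbool.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define BLCKSZ 8192

    /* Hypothetical index over the temp file: block number -> offset of the
     * newest copy.  A real one would be a hash or tree; stubbed out here. */
    static bool
    temp_index_lookup(unsigned int blockno, off_t *temp_offset)
    {
        (void) blockno;
        (void) temp_offset;
        return false;
    }

    static ssize_t
    read_block(int heap_fd, int temp_fd, unsigned int blockno, char *buf)
    {
        off_t temp_offset;

        if (temp_index_lookup(blockno, &temp_offset))
        {
            /* newest version lives in the temp file: a random read, per reader */
            return pread(temp_fd, buf, BLCKSZ, temp_offset);
        }

        /* otherwise read the block from its normal home in the heap segment */
        return pread(heap_fd, buf, BLCKSZ, (off_t) blockno * BLCKSZ);
    }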
Hi,
> Sure, that's what the WAL does. But you still have to checkpoint
> eventually.
>
> Sure, when you run pg_ctl stop.
Unlike the WAL, it only needs two files of shared_buffers size.
I did bogus tests by replacing mask |= BM_PERMANENT; with mask = -1 in
BufferSync() and simulating checkpoint w
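The effect of that mask = -1 substitution can be seen in a standalone test of
the flag check; the flag bit values below are illustrative, while the real
test (roughly "(flags & mask) == mask") lives in BufferSync() in
src/backend/storage/buffer/bufmgr.c:

    #include <stdio.h>

    #define BM_DIRTY      (1 << 0)   /* flag values are illustrative only */
    #define BM_PERMANENT  (1 << 1)

    int
    main(void)
    {
        /* a dirty, permanent buffer */
        unsigned int buffer_flags = BM_DIRTY | BM_PERMANENT;

        unsigned int normal_mask = BM_DIRTY | BM_PERMANENT;
        unsigned int bogus_mask = (unsigned int) -1;   /* the mask = -1 test */

        /* normal checkpoint: the buffer matches and would be written */
        printf("normal mask matches: %d\n",
               (buffer_flags & normal_mask) == normal_mask);   /* prints 1 */

        /* mask = -1 requires every bit set, so nothing ever matches and the
         * checkpoint scan writes no buffers at all */
        printf("mask = -1 matches: %d\n",
               (buffer_flags & bogus_mask) == bogus_mask);      /* prints 0 */
        return 0;
    }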
On Thu, Jul 25, 2013 at 6:02 PM, didier wrote:
> It was surely already discussed, but why isn't postgresql writing
> its cache sequentially to a temporary file? With storage random speed at
> least five to ten times slower, it could help a lot.
> Thanks
Sure, that's what the WAL does. But you still
Hi
On Tue, Jul 23, 2013 at 5:48 AM, Greg Smith wrote:
> Recently I've been dismissing a lot of suggested changes to checkpoint
> fsync timing without suggesting an alternative. I have a simple one in
> mind that captures the biggest problem I see: that the number of backend
> and checkpoint w
On Tue, Jul 23, 2013 at 12:13 PM, Greg Smith wrote:
> On 7/23/13 10:56 AM, Robert Haas wrote:
>> On Mon, Jul 22, 2013 at 11:48 PM, Greg Smith wrote:
>>>
>>> We know that a 1GB relation segment can take a really long time to write
>>> out. That could include up to 128K changed 8K pages, and we all
On 7/23/13 10:56 AM, Robert Haas wrote:
On Mon, Jul 22, 2013 at 11:48 PM, Greg Smith wrote:
We know that a 1GB relation segment can take a really long time to write
out. That could include up to 128K changed 8K pages, and we allow all of
them to get dirty before any are forced to disk with fsyn
On Mon, Jul 22, 2013 at 11:48 PM, Greg Smith wrote:
> Recently I've been dismissing a lot of suggested changes to checkpoint fsync
> timing without suggesting an alternative. I have a simple one in mind that
> captures the biggest problem I see: that the number of backend and
> checkpoint writes
On Mon, Jul 22, 2013 at 8:48 PM, Greg Smith wrote:
> And I can't get too excited about making this as my volunteer effort when I
> consider what the resulting credit will look like. Coding is by far the
> smallest part of work like this, first behind coming up with the design in
> the first place
Recently I've been dismissing a lot of suggested changes to checkpoint
fsync timing without suggesting an alternative. I have a simple one in
mind that captures the biggest problem I see: that the number of
backend and checkpoint writes to a file are not connected at all.
We know that a 1GB
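One possible way to connect those write counts to fsync timing, sketched in C
with invented names and thresholds (not necessarily the proposal this message
goes on to make):

    /*
     * Keep a per-segment counter of writes since the segment was last
     * fsync'd, and sync it early once the backlog crosses a threshold, so
     * the checkpoint-time fsync never has to flush an unbounded pile of
     * dirty data at once.
     */
    #include <unistd.h>

    #define SYNC_AFTER_N_WRITES 256     /* ~2MB of 8K pages, arbitrary */

    typedef struct
    {
        int fd;                 /* open file descriptor for the 1GB segment */
        int writes_since_sync;  /* backend + checkpointer writes, combined */
    } SegmentSyncState;

    /* Call after every write (backend or checkpointer) to this segment. */
    static void
    note_segment_write(SegmentSyncState *seg)
    {
        if (++seg->writes_since_sync >= SYNC_AFTER_N_WRITES)
        {
            fsync(seg->fd);             /* absorb the backlog incrementally */
            seg->writes_since_sync = 0;
        }
    }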