"test" version, but I am putting in the queue so we can track it there.
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
---
Simon, is this patch ready to be added to the patch queue? I assume not.
---
Simon Riggs wrote:
> On Mon, 2007-03-12 at 09:14 +0000, Simon Riggs wrote:
> > On Mon, 2007-03-12 at 16:21 +0900, ITAGAKI Takahiro wrote:
>
> > >
Simon,
On 3/13/07 2:37 AM, "Simon Riggs" <[EMAIL PROTECTED]> wrote:
>> We're planning a modification that I think you should consider: when there
>> is a sequential scan of a table larger than the size of shared_buffers, we
>> are allowing the scan to write through the shared_buffers cache.
>
>
On Tue, 2007-03-13 at 13:40 +0900, ITAGAKI Takahiro wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> wrote:
>
> > > > With the default
> > > > value of scan_recycle_buffers(=0), VACUUM seems to use all of buffers
> > > > in pool,
> > > > just like existing sequential scans. Is this intended?
> > >
> >
On Mon, 2007-03-12 at 22:16 -0700, Luke Lonergan wrote:
> You may know we've built something similar and have seen similar gains.
Cool
> We're planning a modification that I think you should consider: when there
> is a sequential scan of a table larger than the size of shared_buffers, we
> are allowing the scan to write through the shared_buffers cache.
Simon,
You may know we've built something similar and have seen similar gains.
We're planning a modification that I think you should consider: when there
is a sequential scan of a table larger than the size of shared_buffers, we
are allowing the scan to write through the shared_buffers cache.
The
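A minimal sketch of the trigger condition in this proposal: if the relation is larger than shared_buffers, don't let the scan flood the shared cache. The enum, function name, and policy shape are illustrative only, not code from an actual patch; the block counts in main() come from Mark Kirkwood's lineitem numbers later in the thread (1535724 pages vs 400MB, i.e. 51200 8kB blocks, of shared_buffers).

#include <stdio.h>

typedef unsigned int BlockNumber;

typedef enum
{
    SCAN_USE_SHARED_CACHE,   /* normal path: pages go through shared_buffers */
    SCAN_BYPASS_CACHE        /* large scan: use a small private window instead */
} ScanCachePolicy;

static ScanCachePolicy
choose_scan_policy(BlockNumber rel_blocks, BlockNumber shared_buffer_blocks)
{
    /* The trigger condition from the proposal: the relation is larger
     * than the entire shared buffer cache. */
    if (rel_blocks > shared_buffer_blocks)
        return SCAN_BYPASS_CACHE;
    return SCAN_USE_SHARED_CACHE;
}

int main(void)
{
    /* Mark's lineitem example: 1535724 pages vs 51200 blocks of shared_buffers. */
    ScanCachePolicy p = choose_scan_policy(1535724, 51200);

    printf("%s\n", p == SCAN_BYPASS_CACHE ? "bypass the shared cache"
                                          : "use the shared cache");
    return 0;
}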
"Simon Riggs" <[EMAIL PROTECTED]> wrote:
> > > With the default
> > > value of scan_recycle_buffers(=0), VACUUM seems to use all of buffers in
> > > pool,
> > > just like existing sequential scans. Is this intended?
> >
> New test version enclosed, where scan_recycle_buffers = 0 doesn't change
On Mon, 2007-03-12 at 10:30 -0400, Tom Lane wrote:
> ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> > I tested your patch with VACUUM FREEZE. The performance was improved when
> > I set scan_recycle_buffers > 32. I used VACUUM FREEZE to increase WAL
> > traffic,
> > but this patch should be useful
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> I tested your patch with VACUUM FREEZE. The performance was improved when
> I set scan_recycle_buffers > 32. I used VACUUM FREEZE to increase WAL traffic,
> but this patch should be useful for normal VACUUMs with background jobs!
Proving that you can s
On Mon, 2007-03-12 at 09:14 +0000, Simon Riggs wrote:
> On Mon, 2007-03-12 at 16:21 +0900, ITAGAKI Takahiro wrote:
> > With the default
> > value of scan_recycle_buffers(=0), VACUUM seems to use all of buffers in
> > pool,
> > just like existing sequential scans. Is this intended?
>
> Yes, but i
On Mon, 2007-03-12 at 16:21 +0900, ITAGAKI Takahiro wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> wrote:
>
> > I've implemented buffer recycling, as previously described, patch being
> > posted now to -patches as "scan_recycle_buffers".
> >
> > - for VACUUMs of any size, with the objective of reduci
"Simon Riggs" <[EMAIL PROTECTED]> wrote:
> I've implemented buffer recycling, as previously described, patch being
> posted now to -patches as "scan_recycle_buffers".
>
> - for VACUUMs of any size, with the objective of reducing WAL thrashing
> whilst keeping VACUUM's behaviour of not spoiling t
On Tue, 2007-03-06 at 22:32 -0500, Luke Lonergan wrote:
> Incidentally, we tried triggering NTA (L2 cache bypass)
> unconditionally and in various patterns and did not see the
> substantial gai
On Tue, 2007-03-06 at 22:32 -0500, Luke Lonergan wrote:
> Incidentally, we tried triggering NTA (L2 cache bypass)
> unconditionally and in various patterns and did not see the
> substantial gain as with reducing the working set size.
>
> My conclusion: Fixing the OS is not sufficient to alleviate
Hi Simon,
> and what you haven't said
>
> - all of this is orthogonal to the issue of buffer cache spoiling in
> PostgreSQL itself. That issue does still exist as a non-OS issue, but
> we've been discussing in detail the specific case of L2 cache effects
> with specific kernel calls. All of the t
On 3/7/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
Do any of you know about a way to READ PAGE ONLY IF IN CACHE in *nix
systems?
Supposedly you could mmap() a file and then do mincore() on the
area to see which pages are cached.
But you were talking about postgres cache before, there it shou
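A small self-contained sketch of the mmap()/mincore() approach mentioned above, assuming Linux semantics for mincore() (the residency bit is bit 0 of each vec entry); error handling is kept minimal.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    if (st.st_size == 0) { fprintf(stderr, "empty file\n"); return 1; }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    long pagesize = sysconf(_SC_PAGESIZE);
    size_t npages = (st.st_size + pagesize - 1) / pagesize;
    unsigned char *vec = malloc(npages);

    /* mincore() sets bit 0 of vec[i] iff page i of the mapping is resident. */
    if (mincore(map, st.st_size, vec) < 0) { perror("mincore"); return 1; }

    size_t resident = 0;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1;

    printf("%zu of %zu pages resident in the OS cache\n", resident, npages);

    free(vec);
    munmap(map, st.st_size);
    close(fd);
    return 0;
}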
On Tue, 2007-03-06 at 18:28, Jeff Davis wrote:
> On Tue, 2007-03-06 at 18:29 +0000, Heikki Linnakangas wrote:
> > Jeff Davis wrote:
> > > On Mon, 2007-03-05 at 21:02 -0700, Jim Nasby wrote:
> > >> On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
> > >>> Another approach I pr
Hi Simon,
> and what you haven't said
>
> - all of this is orthogonal to the issue of buffer cache spoiling in
> PostgreSQL itself. That issue does sti
On Tue, 2007-03-06 at 18:29 +0000, Heikki Linnakangas wrote:
> Jeff Davis wrote:
> > On Mon, 2007-03-05 at 21:02 -0700, Jim Nasby wrote:
> >> On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
> >>> Another approach I proposed back in December is to not have a
> >>> variable like that at all,
On Tue, 2007-03-06 at 17:43 -0700, Jim Nasby wrote:
> On Mar 6, 2007, at 10:56 AM, Jeff Davis wrote:
> >> We also don't need an exact count, either. Perhaps there's some way
> >> we could keep a counter or something...
> >
> > Exact count of what? The pages already in cache?
>
> Yes. The idea bein
On Mar 6, 2007, at 10:56 AM, Jeff Davis wrote:
We also don't need an exact count, either. Perhaps there's some way
we could keep a counter or something...
Exact count of what? The pages already in cache?
Yes. The idea being if you see there's 10k pages in cache, you can
likely start 9k page
On Mar 6, 2007, at 12:17 AM, Tom Lane wrote:
Jim Nasby <[EMAIL PROTECTED]> writes:
An idea I've been thinking about would be to have the bgwriter or
some other background process actually try and keep the free list
populated,
The bgwriter already tries to keep pages "just in front" of the clock sweep pointer clean.
On Tue, 2007-03-06 at 18:47 +0000, Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Jeff Davis <[EMAIL PROTECTED]> writes:
> >> If I were to implement this idea, I think Heikki's bitmap of pages
> >> already read is the way to go.
> >
> > I think that's a good way to guarantee that you'll not finis
On Mon, 2007-03-05 at 21:34 -0800, Sherry Moore wrote:
> - Based on a lot of the benchmarks and workloads I traced, the
> target buffer of read operations are typically accessed again
> shortly after the read, while writes are usually not. Therefore,
> the default operation
Tom Lane wrote:
Jeff Davis <[EMAIL PROTECTED]> writes:
If I were to implement this idea, I think Heikki's bitmap of pages
already read is the way to go.
I think that's a good way to guarantee that you'll not finish in time
for 8.3. Heikki's idea is just at the handwaving stage at this point,
Jeff Davis wrote:
On Mon, 2007-03-05 at 21:02 -0700, Jim Nasby wrote:
On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
Another approach I proposed back in December is to not have a
variable like that at all, but scan the buffer cache for pages
belonging to the table you're scanning to i
On Tue, 2007-03-06 at 12:59 -0500, Tom Lane wrote:
> Jeff Davis <[EMAIL PROTECTED]> writes:
> > If I were to implement this idea, I think Heikki's bitmap of pages
> > already read is the way to go.
>
> I think that's a good way to guarantee that you'll not finish in time
> for 8.3. Heikki's idea
Jeff Davis <[EMAIL PROTECTED]> writes:
> If I were to implement this idea, I think Heikki's bitmap of pages
> already read is the way to go.
I think that's a good way to guarantee that you'll not finish in time
for 8.3. Heikki's idea is just at the handwaving stage at this point,
and I'm not even
On Mon, 2007-03-05 at 21:02 -0700, Jim Nasby wrote:
> On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
> > Another approach I proposed back in December is to not have a
> > variable like that at all, but scan the buffer cache for pages
> > belonging to the table you're scanning to initiali
Hi Tom,
Sorry about the delay. I have been away from computers all day.
In the current Solaris release in development (Code name Nevada,
available for download at http://opensolaris.org), I have implemented
non-temporal access (NTA) which bypasses L2 for most writes, and reads
larger than copyou
On Tue, 2007-03-06 at 00:54 +0100, Florian G. Pflug wrote:
> Simon Riggs wrote:
> But it would break the idea of letting a second seqscan follow in the
> first's hot cache trail, no?
No, but it would make it somewhat harder to achieve without direct
synchronization between scans. It could still w
Jim Nasby <[EMAIL PROTECTED]> writes:
> An idea I've been thinking about would be to have the bgwriter or
> some other background process actually try and keep the free list
> populated,
The bgwriter already tries to keep pages "just in front" of the clock
sweep pointer clean.
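A toy illustration of the behaviour Tom describes: a background pass cleans dirty, unpinned, zero-usage buffers just ahead of the clock hand so the sweep's next victims need no write. The structures and the lookahead constant are invented for the sketch; this is not the real bgwriter code.

#include <stdio.h>

#define NBUFFERS           16
#define BGWRITER_LOOKAHEAD  4

typedef struct
{
    int usage_count;
    int dirty;
    int pinned;
} ToyBuffer;

static ToyBuffer buffers[NBUFFERS];
static int sweep_hand = 0;     /* where the next victim search will start */

/* Pretend to flush one buffer to disk. */
static void flush_buffer(int idx)
{
    buffers[idx].dirty = 0;
    printf("bgwriter: cleaned buffer %d\n", idx);
}

/* One bgwriter round: clean likely victims just ahead of the clock hand,
 * so that when the sweep recycles them no write is needed. */
static void bgwriter_round(void)
{
    for (int i = 0; i < BGWRITER_LOOKAHEAD; i++)
    {
        int idx = (sweep_hand + i) % NBUFFERS;
        ToyBuffer *buf = &buffers[idx];

        /* Only buffers the sweep could recycle soon are worth cleaning. */
        if (!buf->pinned && buf->usage_count == 0 && buf->dirty)
            flush_buffer(idx);
    }
}

int main(void)
{
    buffers[1].dirty = 1;                               /* will be cleaned   */
    buffers[2].dirty = 1; buffers[2].usage_count = 2;   /* skipped: still hot */

    bgwriter_round();
    return 0;
}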
On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
Another approach I proposed back in December is to not have a
variable like that at all, but scan the buffer cache for pages
belonging to the table you're scanning to initialize the scan.
Scanning all the BufferDescs is a fairly CPU and l
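A sketch of what that initialization pass could look like: walk every buffer header once and build a bitmap of which blocks of the target relation are already resident, so the scan can visit cached pages first. The ToyBufferDesc layout is invented for the example (PostgreSQL's real BufferDesc differs); the point is the single O(NBuffers) pass the thread is worried about.

#include <stdio.h>
#include <stdlib.h>

typedef unsigned int Oid;
typedef unsigned int BlockNumber;

/* Invented buffer-header layout for the example. */
typedef struct
{
    Oid         relid;      /* which relation this buffer holds */
    BlockNumber blocknum;   /* which block of that relation */
    int         valid;      /* does the buffer hold anything? */
} ToyBufferDesc;

/* Set bit `block` in the bitmap. */
static void bitmap_set(unsigned char *bitmap, BlockNumber block)
{
    bitmap[block / 8] |= (unsigned char) (1 << (block % 8));
}

/* Walk every buffer header once and record which blocks of target_rel are
 * already cached.  This is the O(NBuffers) cost being discussed. */
static unsigned char *
cached_blocks_bitmap(const ToyBufferDesc *bufs, int nbuffers,
                     Oid target_rel, BlockNumber rel_nblocks)
{
    unsigned char *bitmap = calloc((rel_nblocks + 7) / 8, 1);

    for (int i = 0; i < nbuffers; i++)
        if (bufs[i].valid && bufs[i].relid == target_rel &&
            bufs[i].blocknum < rel_nblocks)
            bitmap_set(bitmap, bufs[i].blocknum);

    return bitmap;
}

int main(void)
{
    ToyBufferDesc bufs[] = {
        {42, 7, 1}, {42, 9, 1}, {17, 3, 1}, {42, 7, 1}, {0, 0, 0},
    };
    unsigned char *bm = cached_blocks_bitmap(bufs, 5, 42, 16);

    for (BlockNumber b = 0; b < 16; b++)
        if (bm[b / 8] & (1 << (b % 8)))
            printf("block %u of relation 42 is already cached\n", b);

    free(bm);
    return 0;
}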
On Mar 5, 2007, at 11:46 AM, Josh Berkus wrote:
Tom,
I seem to recall that we've previously discussed the idea of
letting the
clock sweep decrement the usage_count before testing for 0, so that a
buffer could be reused on the first sweep after it was initially
used,
but that we rejected it
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> Here's the x86 assembler routine for Solaris:
> http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/ia32
> /ml/copy.s
> The actual uiomove routine is a simple wrapper that calls the assembler
> kcopy or xcopyout routines. There are
Tom,
On 3/5/07 7:58 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote:
> I looked a bit at the Linux code that's being used here, but it's all
> x86_64 assembler which is something I've never studied :-(.
Here's the C wrapper routine in Solaris:
http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> Good info - it's the same in Solaris, the routine is uiomove (Sherry
> wrote it).
Cool. Maybe Sherry can comment on the question whether it's possible
for a large-scale-memcpy to not take a hit on filling a cache line
that wasn't previously in cache?
Gregory Stark <[EMAIL PROTECTED]> writes:
> What happens if VACUUM comes across buffers that *are* already in the buffer
> cache. Does it throw those on the freelist too?
Not unless they have usage_count 0, in which case they'd be subject to
recycling by the next clock sweep anyway.
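For readers following the usage_count discussion, here is a minimal stand-alone model of a clock sweep: the hand decrements non-zero counts as it passes and only recycles a buffer whose count is already zero, so a seqscan that leaves every buffer at usage_count = 1 forces one full decrementing pass before anything is reused. This is a toy model, not PostgreSQL's actual StrategyGetBuffer.

#include <stdio.h>

#define NBUFFERS 8

typedef struct
{
    int usage_count;   /* bumped when a backend unpins the buffer */
    int pinned;        /* simplified: 0 or 1 */
} ToyBufDesc;

static ToyBufDesc buffers[NBUFFERS];
static int sweep_hand = 0;

/* Return the index of a victim buffer, advancing the clock hand.
 * Non-zero usage counts are decremented as the hand passes them. */
static int clock_sweep_victim(void)
{
    for (;;)
    {
        ToyBufDesc *buf = &buffers[sweep_hand];
        int idx = sweep_hand;

        sweep_hand = (sweep_hand + 1) % NBUFFERS;

        if (buf->pinned)
            continue;

        if (buf->usage_count == 0)
            return idx;          /* recyclable right now */

        buf->usage_count--;      /* give it one more trip around */
    }
}

int main(void)
{
    /* A seqscan has just touched every buffer once: usage_count = 1. */
    for (int i = 0; i < NBUFFERS; i++)
        buffers[i].usage_count = 1;

    /* The first victim appears only after a full decrementing pass:
     * the O(N)-once-every-N-reads cost Tom mentions. */
    printf("victim = buffer %d\n", clock_sweep_victim());
    return 0;
}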
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> But what I wanted to see was the curve of
>> elapsed time vs shared_buffers?
"Tom Lane" <[EMAIL PROTECTED]> writes:
> I don't see any good reason why overwriting a whole cache line oughtn't be
> the same speed either way.
I can think of a couple theories, but I don't know if they're reasonable. The
one the comes to mind is the inter-processor cache coherency protocol. Wh
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> But what I wanted to see was the curve of
>> elapsed time vs shared_buffers?
> ...
> Looks *very* similar.
Yup, thanks for checking.
I've been poking into this myself. I find that I can reproduce the
behavior to some extent even with
Tom Lane wrote:
But what I wanted to see was the curve of
elapsed time vs shared_buffers?
Of course! (let's just write that off to me being pre-coffee...).
With the patch applied:
Shared Buffers   Elapsed   vmstat IO rate
--------------   -------   --------------
400MB            101 s     122 MB/s
Simon Riggs wrote:
On Mon, 2007-03-05 at 14:41 -0500, Tom Lane wrote:
"Simon Riggs" <[EMAIL PROTECTED]> writes:
Itagaki-san and I were discussing in January the idea of cache-looping,
whereby a process begins to reuse its own buffers in a ring of ~32
buffers. When we cycle back round, if usage
On Mon, 2007-03-05 at 21:03 +0000, Heikki Linnakangas wrote:
> Another approach I proposed back in December is to not have a variable
> like that at all, but scan the buffer cache for pages belonging to the
> table you're scanning to initialize the scan. Scanning all the
> BufferDescs is a fairl
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> Elapsed time is exactly the same (101 s). Is is expected that HEAD would
> behave differently?
Offhand I don't think so. But what I wanted to see was the curve of
elapsed time vs shared_buffers?
regards, tom lane
Tom Lane wrote:
Hm, not really a smoking gun there. But just for grins, would you try
this patch and see if the numbers change?
Applied to 8.2.3 (don't have lineitem loaded in HEAD yet) - no change
that I can see:
procs ---memory-- ---swap-- -io --system--
cp
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Mark, can you detect "hiccups" in the read rate using
>> your setup?
> I think so, here's the vmstat output for 400MB of shared_buffers during
> the scan:
Hm, not really a smoking gun there. But just for grins, would you try
this pat
Tom Lane wrote:
So the
problem is not so much the clock sweep overhead as that it's paid in a
very nonuniform fashion: with N buffers you pay O(N) once every N reads
and O(1) the rest of the time. This is no doubt slowing things down
enough to delay that one read, instead of leaving it nicely I
Jeff Davis wrote:
On Mon, 2007-03-05 at 15:30 -0500, Tom Lane wrote:
Jeff Davis <[EMAIL PROTECTED]> writes:
Absolutely. I've got a parameter in my patch "sync_scan_offset" that
starts a seq scan N pages before the position of the last seq scan
running on that table (or a current seq scan if the
Jeff Davis <[EMAIL PROTECTED]> writes:
> On Mon, 2007-03-05 at 15:30 -0500, Tom Lane wrote:
>> Strikes me that expressing that parameter as a percentage of
>> shared_buffers might make it less in need of manual tuning ...
> The original patch was a percentage of effective_cache_size, because in
>
On Mon, 2007-03-05 at 15:30 -0500, Tom Lane wrote:
> Jeff Davis <[EMAIL PROTECTED]> writes:
> > Absolutely. I've got a parameter in my patch "sync_scan_offset" that
> > starts a seq scan N pages before the position of the last seq scan
> > running on that table (or a current seq scan if there's still a scan going).
Jeff Davis <[EMAIL PROTECTED]> writes:
> Absolutely. I've got a parameter in my patch "sync_scan_offset" that
> starts a seq scan N pages before the position of the last seq scan
> running on that table (or a current seq scan if there's still a scan
> going).
Strikes me that expressing that param
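The arithmetic behind a parameter like sync_scan_offset can be sketched as follows; this is an illustrative reconstruction of "start N pages before the last reported scan position", not Jeff's actual patch, and the function and variable names are invented.

#include <stdio.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

static BlockNumber
sync_scan_start(BlockNumber hint_block,        /* last reported scan position */
                BlockNumber nblocks,           /* relation size in pages */
                BlockNumber sync_scan_offset)  /* how far to trail behind */
{
    if (nblocks == 0)
        return 0;

    /* Never back off by more than the whole relation. */
    if (sync_scan_offset >= nblocks)
        sync_scan_offset = nblocks - 1;

    /* Start sync_scan_offset pages behind the hint, wrapping past block 0. */
    if (hint_block >= sync_scan_offset)
        return hint_block - sync_scan_offset;
    return nblocks - (sync_scan_offset - hint_block);
}

int main(void)
{
    /* Another scan reported block 100 of a 1000-page table; trail it by 16. */
    printf("start at block %u\n", (unsigned) sync_scan_start(100, 1000, 16));
    /* Wrap-around case: the hint is near the start of the table. */
    printf("start at block %u\n", (unsigned) sync_scan_start(5, 1000, 16));
    return 0;
}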
On Mon, 2007-03-05 at 09:09 +0000, Heikki Linnakangas wrote:
> In fact, the pages that are left in the cache after the seqscan finishes
> would be useful for the next seqscan of the same table if we were smart
> enough to read those pages first. That'd make a big difference for
> seqscanning a t
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Best way is to prove it though. Seems like not too much work to have a
> private ring data structure when the hint is enabled. The extra
> bookkeeping is easily going to be outweighed by the reduction in mem->L2
> cache fetches. I'll do it tomorrow, if no
On Mon, 2007-03-05 at 11:10 +0200, Hannu Krosing wrote:
> > My proposal for a fix: ensure that when relations larger (much larger?)
> > than buffer cache are scanned, they are mapped to a single page in the
> > shared buffer cache.
>
> How will this approach play together with synchronized scan pa
On Mon, 2007-03-05 at 03:51 -0500, Luke Lonergan wrote:
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".
>
> My proposal for a fix: ensure that when relations lar
On Mon, 2007-03-05 at 14:41 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > Itagaki-san and I were discussing in January the idea of cache-looping,
> > whereby a process begins to reuse its own buffers in a ring of ~32
> > buffers. When we cycle back round, if usage_count==1
; PGSQL Hackers; Doug Rady; Sherry Moore
Cc: pgsql-hackers@postgresql.org
Subject:Re: [HACKERS] Bug: Buffer cache is not scan resistant
On Mon, 2007-03-05 at 10:46 -0800, Josh Berkus wrote:
> Tom,
>
> > I seem to recall that we've previously discussed the idea of lett
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Itagaki-san and I were discussing in January the idea of cache-looping,
> whereby a process begins to reuse its own buffers in a ring of ~32
> buffers. When we cycle back round, if usage_count==1 then we assume that
> we can reuse that buffer. This avoid
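A stand-alone sketch of the cache-looping idea described here: keep a private ring of ~32 buffers and, on wrapping around, reuse a slot only if its usage_count is still 1. The helper functions are toy stand-ins, not the buffer manager's real API, and this is not the scan_recycle_buffers patch itself.

#include <stdio.h>

#define RING_SIZE 32

typedef int Buffer;   /* stand-in for a real buffer handle */

typedef struct
{
    Buffer slots[RING_SIZE];
    int    next;      /* next slot to (re)use */
    int    nfilled;   /* how many slots hold a buffer so far */
} ScanRing;

/* Toy stand-ins for the buffer manager (assumptions for this sketch). */
static Buffer AllocateFreshBuffer(void)
{
    static Buffer counter = 0;
    return counter++;
}

static int BufferUsageCount(Buffer buf)
{
    (void) buf;
    return 1;         /* pretend nobody else touched our pages */
}

/* Return a buffer for the scan's next page, recycling our own when safe. */
static Buffer ring_get_buffer(ScanRing *ring)
{
    Buffer buf;

    if (ring->nfilled < RING_SIZE)
    {
        /* Still filling the ring: take fresh buffers from the pool. */
        buf = AllocateFreshBuffer();
        ring->slots[ring->next] = buf;
        ring->nfilled++;
    }
    else
    {
        /* Wrapped around: reuse our own buffer only if usage_count is
         * still 1, i.e. no other backend showed interest in it. */
        buf = ring->slots[ring->next];
        if (BufferUsageCount(buf) != 1)
        {
            buf = AllocateFreshBuffer();   /* leave the hot page alone */
            ring->slots[ring->next] = buf;
        }
    }

    ring->next = (ring->next + 1) % RING_SIZE;
    return buf;
}

int main(void)
{
    ScanRing ring = {{0}, 0, 0};

    /* Simulate a 100-page scan: only 32 distinct buffers ever get used. */
    Buffer last = 0;
    for (int page = 0; page < 100; page++)
        last = ring_get_buffer(&ring);

    printf("after 100 pages the scan is reusing buffer %d\n", last);
    return 0;
}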
On Mon, 2007-03-05 at 10:46 -0800, Josh Berkus wrote:
> Tom,
>
> > I seem to recall that we've previously discussed the idea of letting the
> > clock sweep decrement the usage_count before testing for 0, so that a
> > buffer could be reused on the first sweep after it was initially used,
> > but t
"Tom Lane" <[EMAIL PROTECTED]> writes:
> I seem to recall that we've previously discussed the idea of letting the
> clock sweep decrement the usage_count before testing for 0, so that a
> buffer could be reused on the first sweep after it was initially used,
> but that we rejected it as being a b
"Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> I am wondering whether seqscan would set the usage_count to 1 or to a higher
> value. usage_count is incremented while unpinning the buffer. Even if
> we use
> page-at-a-time mode, won't the buffer itself would get pinned/unpinned
> every time seqsca
Tom Lane wrote:
Nope, Pavan's nailed it: the problem is that after using a buffer, the
seqscan leaves it with usage_count = 1, which means it has to be passed
over once by the clock sweep before it can be re-used. I was misled in
the 32-buffer case because catalog accesses during startup had le
Tom,
> I seem to recall that we've previously discussed the idea of letting the
> clock sweep decrement the usage_count before testing for 0, so that a
> buffer could be reused on the first sweep after it was initially used,
> but that we rejected it as being a bad idea. But at least with large
>
Here's four more points on the curve - I'd use a "dirac delta function" for
your curve fit ;-)
Shared_buffers   Select Count   Vacuum
(KB)             (s)            (s)
======================================
248              5.52           2.46
368              4.77           2.40
552
I wrote:
> "Pavan Deolasee" <[EMAIL PROTECTED]> writes:
>> Isn't the size of the shared buffer pool itself acting as a performance
>> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
>> passes over the buffers before the usage_count of any buffer is reduced
>> to zero and th
Tom,
On 3/5/07 8:53 AM, "Tom Lane" <[EMAIL PROTECTED]> wrote:
> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
> water. If it were a cache effect then there should be a performance
> cliff at the point where the cache size is exceeded. I see no such
> cliff, in fact the
Tom,
> Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
> do either, so I'm a bit at a loss what's going on --- but in any case,
> it doesn't look like we are cycling through the entire buffer space
> for each fetch.
I'd be happy to DTrace it, but I'm a little lost as to whe
"Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> Isn't the size of the shared buffer pool itself acting as a performance
> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
> passes over the buffers before the usage_count of any buffer is reduced
> to zero and the buffer is cho
Hi Tom,
On 3/5/07 8:53 AM, "Tom Lane" <[EMAIL PROTECTED]> wrote:
> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
> water. If it were a cache effect then there should be a performance
> cliff at the point where the cache size is exceeded. I see no such
> cliff, in fact t
Tom Lane wrote:
Mark Kirkwood <[EMAIL PROTECTED]> writes:
Shared Buffers   Elapsed   IO rate (from vmstat)
--------------   -------   ---------------------
400MB            101 s     122 MB/s
2MB              100 s
1MB              97 s
768KB            93 s
512KB            86 s
256KB            77 s
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> Shared Buffers   Elapsed   IO rate (from vmstat)
> --------------   -------   ---------------------
> 400MB            101 s     122 MB/s
> 2MB              100 s
> 1MB              97 s
> 768KB            93 s
> 512KB            86 s
> 256KB            77 s
> 1
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> The evidence seems to clearly indicate reduced memory writing due to an
> L2 related effect.
You might try using valgrind's cachegrind tool which I understand can actually
emulate various processors' cache to show how efficiently code uses it. I
hav
Hi Mark,
> lineitem has 1535724 pages (11997 MB)
>
> Shared Buffers   Elapsed   IO rate (from vmstat)
> --------------   -------   ---------------------
> 400MB            101 s     122 MB/s
>
> 2MB              100 s
> 1MB              97 s
> 768KB            93 s
> 512KB            86 s
> 256KB
Gavin Sherry wrote:
On Mon, 5 Mar 2007, Mark Kirkwood wrote:
To add a little to this - forgetting the scan resistant point for the
moment... cranking down shared_buffers to be smaller than the L2 cache
seems to help *any* sequential scan immensely, even on quite modest HW:
(snipped)
When I'v
> > The Postgres shared buffer cache algorithm appears to have a bug.
> > When there is a sequential scan the blocks are filling the entire
> > shared buffer cache. This should be "fixed".
>
> No, this is not a bug; it is operating as designed. The
> point of the current bufmgr algorithm
On Mon, 2007-03-05 at 04:15, Tom Lane wrote:
> "Luke Lonergan" <[EMAIL PROTECTED]> writes:
> > I think you're missing my/our point:
>
> > The Postgres shared buffer cache algorithm appears to have a bug. When
> > there is a sequential scan the blocks are filling the entire shar
* Tom Lane:
> That makes absolutely zero sense. The data coming from the disk was
> certainly not in processor cache to start with, and I hope you're not
> suggesting that it matters whether the *target* page of a memcpy was
> already in processor cache. If the latter, it is not our bug to fix.
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> I think you're missing my/our point:
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".
No, this is not a bug; it is
On Mon, 2007-03-05 at 03:51, Luke Lonergan wrote:
> Hi Tom,
>
> > Even granting that your conclusions are accurate, we are not
> > in the business of optimizing Postgres for a single CPU architecture.
>
> I think you're missing my/our point:
>
> The Postgres shared buffer ca
Luke Lonergan wrote:
The Postgres shared buffer cache algorithm appears to have a bug. When
there is a sequential scan the blocks are filling the entire shared
buffer cache. This should be "fixed".
My proposal for a fix: ensure that when relations larger (much larger?)
than buffer cache are sc
Hi Tom,
> Even granting that your conclusions are accurate, we are not
> in the business of optimizing Postgres for a single CPU architecture.
I think you're missing my/our point:
The Postgres shared buffer cache algorithm appears to have a bug. When
there is a sequential scan the blocks are
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
>> So either way, it isn't in processor cache after the read.
>> So how can there be any performance benefit?
> It's the copy from kernel IO cache to the buffer cache that is L2
> sensitive. When the shared buffer cache is polluted, it thrashes the L2
On Mar 5, 2007, at 2:36 AM, Tom Lane wrote:
n into account.
I'm also less than convinced that it'd be helpful for a big seqscan:
won't reading a new disk page into memory via DMA cause that memory to
get flushed from the processor cache anyway?
Nope. DMA is writing directly into main memory.
Hi Tom,
> Now this may only prove that the disk subsystem on this
> machine is too cheap to let the system show any CPU-related
> issues.
Try it with a warm IO cache. As I posted before, we see double the
performance of a VACUUM from a table in IO cache when the shared buffer
cache isn't bein
> So either way, it isn't in processor cache after the read.
> So how can there be any performance benefit?
It's the copy from kernel IO cache to the buffer cache that is L2
sensitive. When the shared buffer cache is polluted, it thrashes the L2
cache. When the number of pages being written to
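The claim here can be illustrated with a small micro-benchmark: copy the same total volume page-by-page either into a destination that fits in L2 (a small, reused working set) or into one far larger than L2 (a rotating working set). The sizes below assume a roughly 1MB L2 and POSIX clock_gettime(); adjust for your hardware.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define PAGE       8192                    /* copy unit, like a PostgreSQL page */
#define SMALL_DST  (512 * 1024)            /* destination that fits in a ~1MB L2 */
#define LARGE_DST  (64 * 1024 * 1024)      /* destination far larger than L2 */
#define TOTAL_COPY (1024UL * 1024 * 1024)  /* copy 1GB in both cases */

/* Copy TOTAL_COPY bytes, PAGE at a time, cycling through dst_size bytes
 * of destination.  Returns elapsed seconds. */
static double copy_loop(const char *src, char *dst, size_t dst_size)
{
    struct timespec t0, t1;
    size_t off = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long copied = 0; copied < TOTAL_COPY; copied += PAGE)
    {
        memcpy(dst + off, src, PAGE);
        off += PAGE;
        if (off + PAGE > dst_size)
            off = 0;    /* a small destination stays resident in L2 */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    char *src   = malloc(PAGE);
    char *small = malloc(SMALL_DST);
    char *large = malloc(LARGE_DST);

    if (!src || !small || !large)
        return 1;

    memset(src, 'x', PAGE);
    memset(small, 0, SMALL_DST);
    memset(large, 0, LARGE_DST);

    printf("small (L2-resident) destination:  %.2f s\n",
           copy_loop(src, small, SMALL_DST));
    printf("large (L2-thrashing) destination: %.2f s\n",
           copy_loop(src, large, LARGE_DST));

    free(src);
    free(small);
    free(large);
    return 0;
}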
Grzegorz Jaskiewicz <[EMAIL PROTECTED]> writes:
> On Mar 5, 2007, at 2:36 AM, Tom Lane wrote:
>> I'm also less than convinced that it'd be helpful for a big seqscan:
>> won't reading a new disk page into memory via DMA cause that memory to
>> get flushed from the processor cache anyway?
> Nope. DM
Gavin Sherry <[EMAIL PROTECTED]> writes:
> Could you demonstrate that point by showing us timings for shared_buffers
> sizes from 512K up to, say, 2 MB? The two numbers you give there might
> just have to do with managing a large buffer.
Using PG CVS HEAD on 64-bit Intel Xeon (1MB L2 cache), Fedor
Gavin, Mark,
> Could you demonstrate that point by showing us timings for
> shared_buffers sizes from 512K up to, say, 2 MB? The two
> numbers you give there might just have to do with managing a
> large buffer.
I suggest two experiments that we've already done:
1) increase shared buffers to d
On Mon, 5 Mar 2007, Mark Kirkwood wrote:
> To add a little to this - forgetting the scan resistant point for the
> moment... cranking down shared_buffers to be smaller than the L2 cache
> seems to help *any* sequential scan immensely, even on quite modest HW:
>
> e.g: PIII 1.26Ghz 512Kb L2 cache,
Tom Lane wrote:
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
The issue is summarized like this: the buffer cache in PGSQL is not "scan
resistant" as advertised.
Sure it is. As near as I can tell, your real complaint is that the
bufmgr doesn't attempt to limit its usage footprint to fit in L2 c
g is shrt cuz m on ma treo
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Sunday, March 04, 2007 08:36 PM Eastern Standard Time
To: Luke Lonergan
Cc: PGSQL Hackers; Doug Rady; Sherry Moore
Subject: Re: [HACKERS] Bug: Buffer cache is not scan resistant
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> The issue is summarized like this: the buffer cache in PGSQL is not "scan
> resistant" as advertised.
Sure it is. As near as I can tell, your real complaint is that the
bufmgr doesn't attempt to limit its usage footprint to fit in L2 cache;
which is h