Re: [Patch][WiP] Tweaked LRU for shared buffers

2019-02-16 Thread Benjamin Manes
Hi,

I was not involved with Andrey and his team's work, which looks like a very
promising first pass. I can try to clarify a few minor details.

> What is CAR? Did you mean ARC, perhaps?


CAR is the Clock variant of ARC: CAR: Clock with Adaptive Replacement
<https://www.usenix.org/legacy/publications/library/proceedings/fast04/tech/full_papers/bansal/bansal.pdf>


I believe the main interest in ARC is its claim of adaptability with high
hit rates. Unfortunately the reality is less impressive, as it fails to
handle many frequency-biased workloads (e.g. it has poor scan resistance),
and the adaptivity is poor due to the accesses being quite noisy. For
W-TinyLFU, we have recent improvements which show near perfect adaptivity
<https://user-images.githubusercontent.com/378614/52891655-2979e380-3141-11e9-91b3-2a3cc80b.png>
in our stress case, resulting in double the hit rate of ARC and less than
1% from optimal.
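
To make that concrete, here is a toy version of the TinyLFU admission idea
(in Java, since that is what our simulator is written in; the table sizes
and hash mixing below are made up for illustration and this is not
Caffeine's actual code). A CountMin sketch estimates access frequencies,
the counters are periodically halved so that history ages, and on eviction
the incoming candidate is admitted only if its estimated frequency beats
the victim's:

    /** A toy TinyLFU admission filter: a CountMin sketch with aging. */
    final class TinyLfu {
      private final int[][] counts = new int[4][1 << 16]; // 4 hash rows
      private final int[] seeds = {0x9E3779B9, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F};
      private int events;

      /** Records an access, halving all counters periodically ("aging"). */
      void record(long key) {
        for (int i = 0; i < 4; i++) {
          counts[i][index(key, i)]++;
        }
        if (++events == 10 * (1 << 16)) {
          age();
        }
      }

      /** Estimated access frequency: the minimum across the hash rows. */
      int frequency(long key) {
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < 4; i++) {
          min = Math.min(min, counts[i][index(key, i)]);
        }
        return min;
      }

      /** On a full cache, admit the candidate only if it beats the victim. */
      boolean admit(long candidate, long victim) {
        return frequency(candidate) > frequency(victim);
      }

      private void age() {
        events = 0;
        for (int[] row : counts) {
          for (int i = 0; i < row.length; i++) {
            row[i] >>>= 1;
          }
        }
      }

      private int index(long key, int row) {
        long h = (key ^ seeds[row]) * 0x9E3779B97F4A7C15L;
        return (int) (h >>> 48); // top 16 bits -> 0..65535
      }
    }

The "W" in W-TinyLFU is a small LRU window in front of this filter; the
recent adaptivity work adjusts that window's size dynamically.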

> Can you point us to the thread/email discussing those ideas? I have tried
> searching through archives, but I haven't found anything :-(


This thread
<https://www.postgresql.org/message-id/1526057854777-0.post%40n3.nabble.com>
doesn't explain Andrey's work, but includes my minor contribution. The
longer thread discusses the interest in CAR, et al.

> Are you suggesting to get rid of the buffer rings we use for sequential
> scans, for example? IMHO that's going to be tricky, e.g. because of
> synchronized sequential scans.


If you mean "synchronized" in terms of multi-threading and locks, then this
is a solved problem
<http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html>
in terms of caching. My limited understanding is that the buffer rings are
used to protect the cache from being polluted by scans, which would flush
the LRU-like algorithms. This allows those policies to capture the more
frequent items. It also avoids lock contention on the cache due to writes
caused by misses, where Clock allows lock-free reads but uses a global lock
on writes. A smarter cache eviction policy and concurrency model can handle
this without needing buffer rings to compensate.
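
As a strawman of the concurrency model I have in mind (a drastically
simplified take on the design in the blog post above; a real implementation
uses lossy striped ring buffers rather than a blocking queue):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.locks.ReentrantLock;

    /** Readers record their key into a bounded buffer and never wait for
     *  the policy; whoever wins a tryLock drains the buffer and replays
     *  the accesses against the eviction policy. */
    final class ReadBuffer {
      private final ArrayBlockingQueue<Long> buffer = new ArrayBlockingQueue<>(128);
      private final ReentrantLock evictionLock = new ReentrantLock();

      void recordRead(long key) {
        buffer.offer(key); // lossy: if the buffer is full, drop the event
        if (buffer.size() >= 64 && evictionLock.tryLock()) {
          try {
            drain();
          } finally {
            evictionLock.unlock();
          }
        }
      }

      private void drain() {
        for (Long key = buffer.poll(); key != null; key = buffer.poll()) {
          // apply the access to the policy, e.g. bump a frequency sketch
        }
      }
    }

Because a dropped read only loses a little history rather than corrupting
anything, the hot path never waits on policy maintenance.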

> Somebody should write a patch to make buffer eviction completely random,
> without aiming to get it committed. That isn't as bad of a strategy as it
> sounds, and it would help with assessing improvements in this area.
>

A related and helpful patch would be to capture the access log and provide
anonymized traces. We have a simulator
<https://github.com/ben-manes/caffeine/wiki/Simulator> with dozens of
policies to quickly provide a breakdown. That would let you know the hit
rates before deciding on the policy to adopt.
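
For example, if each trace line were as simple as "<relation-id>
<block-number>", a parser feeding the simulator could be (the format and
names here are hypothetical, just to show how little is needed):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.stream.LongStream;

    /** Hypothetical reader: packs "<relation-id> <block-number>" lines
     *  into the 64-bit key stream that the simulator consumes. */
    final class PgTraceReader {
      static LongStream keys(Path trace) throws IOException {
        return Files.lines(trace).mapToLong(line -> {
          String[] f = line.split(" ");
          return (Long.parseLong(f[0]) << 32) | Long.parseLong(f[1]);
        });
      }
    }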

Cheers.

On Fri, Feb 15, 2019 at 4:22 PM Tomas Vondra wrote:

>
> On 2/16/19 12:51 AM, Peter Geoghegan wrote:
> > On Fri, Feb 15, 2019 at 3:30 PM Tomas Vondra wrote:
> >> That TPS chart looks a bit ... wild. How come the master jumps so much
> >> up and down? That's a bit suspicious, IMHO.
> >
> > Somebody should write a patch to make buffer eviction completely
> > random, without aiming to get it committed. That isn't as bad of a
> > strategy as it sounds, and it would help with assessing improvements
> > in this area.
> >
> > We know that the cache replacement algorithm behaves randomly when
> > there is extreme contention, while also slowing everything down due
> > to maintaining the clock.
>
> Possibly, although I still find it strange that the throughput first
> grows, then at shared_buffers 1GB it drops, and then at 3GB it starts
> growing again. Considering this is on a 200GB data set, I doubt the
> pressure/contention is much different with 1GB and 3GB, but maybe it is.
>
> > An unambiguously better caching algorithm would at a minimum be able
> > to beat our "cheap random replacement" prototype as well as the
> > existing clocksweep algorithm in most or all cases. That seems like a
> > reasonably good starting point, at least.
> >
>
> Yes, comparison to cheap random replacement would be an interesting
> experiment.
>
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [Patch][WiP] Tweaked LRU for shared buffers

2019-02-16 Thread Benjamin Manes
>
> No, I "synchronized scans" are an optimization to reduce I/O when multiple
> processes do sequential scan on the same table.


Oh, very neat. Thanks!

> Interesting. I assume the trace is essentially a log of which blocks were
> requested? Is there some trace format specification somewhere?
>

Yes, whatever constitutes a cache key (block address, item hash, etc.). We
write a parser for each trace, so there isn't a fixed format requirement.
The parsers produce a stream of 64-bit integer keys, which is broadcast to
each policy via an actor framework. The trace files can be text or binary,
optionally compressed (xz preferred). The ARC traces are block I/O and this
is their format description
<https://www.dropbox.com/sh/9ii9sc7spcgzrth/j1CJ72HiWa/Papers/ARCTraces/README.txt>,
so perhaps something similar? That parser is only 5 lines of custom code
<https://github.com/ben-manes/caffeine/blob/b752c586f7bf143f774a51a0a10593ae3b77802b/simulator/src/main/java/com/github/benmanes/caffeine/cache/simulator/parser/arc/ArcTraceReader.java#L36-L42>
.
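
The gist of that reader, paraphrased rather than quoted verbatim, is that
each line names a starting block and a run length, and the run expands into
one key per block:

    import java.util.stream.LongStream;

    /** Paraphrase of the ARC trace reader: a line such as "414304 6 0 1"
     *  means six sequential blocks starting at 414304; the trailing
     *  fields are ignored. */
    final class ArcLineParser {
      static LongStream keys(String line) {
        String[] f = line.split(" ", 3); // start, count, rest (ignored)
        long start = Long.parseLong(f[0]);
        long count = Long.parseLong(f[1]);
        return LongStream.range(start, start + count);
      }
    }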

On Fri, Feb 15, 2019 at 5:51 PM Tomas Vondra wrote:

> On 2/16/19 1:48 AM, Benjamin Manes wrote:
> > Hi,
> >
> > I was not involved with Andrey and his team's work, which looks like a
> > very promising first pass. I can try to clarify a few minor details.
> >
> > What is CAR? Did you mean ARC, perhaps?
> >
> >
> > CAR is the Clock variant of ARC: CAR: Clock with Adaptive Replacement
> > <https://www.usenix.org/legacy/publications/library/proceedings/fast04/tech/full_papers/bansal/bansal.pdf>
> >
>
> Thanks, will check.
>
> > I believe the main interest in ARC is its claim of adaptability with
> > high hit rates. Unfortunately the reality is less impressive as it fails
> > to handle many frequency-biased workloads, e.g. poor scan resistance, and the
> > adaptivity is poor due to the accesses being quite noisy. For W-TinyLFU,
> > we have recent improvements which show near perfect adaptivity
> > <https://user-images.githubusercontent.com/378614/52891655-2979e380-3141-11e9-91b3-2a3cc80b.png> in
> > our stress case that results in double the hit rate of ARC and is less
> > than 1% from optimal.
> >
>
> Interesting.
>
> > Can you point us to the thread/email discussing those ideas? I have
> > tried searching through archives, but I haven't found anything :-(
> >
> >
> > This thread
> > <https://www.postgresql.org/message-id/1526057854777-0.post%40n3.nabble.com>
> > doesn't explain Andrey's work, but includes my minor contribution. The
> > longer thread discusses the interest in CAR, et al.
> >
>
> Thanks.
>
> > Are you suggesting to get rid of the buffer rings we use for
> > sequential scans, for example? IMHO that's going to be tricky, e.g.
> > because of synchronized sequential scans.
> >
> >
> > If you mean "synchronized" in terms of multi-threading and locks, then
> > this is a solved problem
> > <http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html>
> > in terms of caching.
>
> No, I "synchronized scans" are an optimization to reduce I/O when
> multiple processes do sequential scan on the same table. Say one process
> is half-way through the table, when another process starts another scan.
> Instead of scanning it from block #0 we start at the position of the
> first process (at which point they "synchronize") and then wrap around
> to read the beginning.
>
> I was under the impression that this somehow depends on the small
> circular buffers, but I've now checked the code and I see that's bogus.
>
>
> > My limited understanding is that the buffer rings are
> > used to protect the cache from being polluted by scans which flush the
> > LRU-like algorithms. This allows those policies to capture more frequent
> > items. It also avoids lock contention on the cache due to writes caused
> > by misses, where Clock allows lock-free reads but uses a global lock on
> > writes. A smarter cache eviction policy and concurrency model can handle
> > this without needing buffer rings to compensate.
> >
>
> Right - that's the purpose of circular buffers.
>
> > Somebody should write a patch to make buffer eviction completely
> > random, without aiming to get it committed. That isn't as bad of a
> > strategy as it sounds, and it would help with assessing improvements
> > in this area.
> >
> >
> > A related and helpful patch would be to capture the access log and
> > provide anonymized traces. We have a simulator
> > <https://github.com/ben-manes/caffeine/wiki/Simulator> with dozens of
> > policies to quickly provide a breakdown. That would let you know the hit
> > rates before deciding on the policy to adopt.
> >
>
> Interesting. I assume the trace is essentially a log of which blocks
> were requested? Is there some trace format specification somewhere?
>
> cheers
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [Patch][WiP] Tweaked LRU for shared buffers

2019-02-26 Thread Benjamin Manes
Hi Tomas,

If you are on a benchmarking binge and feel like generating some trace
files (as mentioned earlier), I'd be happy to help by running them through
simulations to show how different policies behave. We can add more policy
types to match this patch / Postgres' GClock as desired, too.
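
For instance, a minimal model of Postgres' GClock for the simulator might
look like the sketch below (usage counts capped at 5 as in Postgres;
everything else is simplified, so treat it as an assumption-laden toy
rather than a faithful port):

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    /** Toy GClock: a hit bumps the buffer's usage count (capped at 5);
     *  on a miss the clock hand decrements counts until it finds a
     *  buffer at zero, which becomes the victim. */
    final class GClock {
      private final long[] keys;
      private final int[] usage;
      private final Map<Long, Integer> index = new HashMap<>();
      private int hand;

      GClock(int capacity) {
        keys = new long[capacity];
        usage = new int[capacity];
        Arrays.fill(keys, Long.MIN_VALUE); // sentinel for "empty slot"
      }

      void access(long key) {
        Integer slot = index.get(key);
        if (slot != null) { // hit
          usage[slot] = Math.min(usage[slot] + 1, 5);
          return;
        }
        int victim = sweep(); // miss: evict and replace
        index.remove(keys[victim]);
        keys[victim] = key;
        usage[victim] = 1;
        index.put(key, victim);
      }

      private int sweep() {
        while (usage[hand] != 0) { // decrement until a zero is found
          usage[hand]--;
          hand = (hand + 1) % usage.length;
        }
        int victim = hand;
        hand = (hand + 1) % usage.length;
        return victim;
      }
    }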

On Tue, Feb 26, 2019 at 3:04 PM Tomas Vondra wrote:

> On 2/17/19 2:53 PM, Tomas Vondra wrote:
> > On 2/17/19 2:14 PM, Едигарьев, Иван Григорьевич wrote:
> >> Hi there. I was responsible for the benchmarks, and I would be glad to
> >> make clear that part for you.
> >>
> >> On Sat, 16 Feb 2019 at 02:30, Tomas Vondra
> >> <tomas.von...@2ndquadrant.com> wrote:
> >>> Interesting. Where do these numbers (5/8 and 1/8) come from?
> >>
> >> The first number came from MySQL's implementation of the LRU
> >> algorithm, and the second from simple tuning; we tried changing the 1/8
> >> a little, but it didn't change the metrics significantly.
> >>
> >>> That TPS chart looks a bit ... wild. How come the master jumps so much
> >>> up and down? That's a bit suspicious, IMHO.
> >>
> >> Yes, it is. It would be great if someone would try to reproduce those
> >> results.
> >>
> >
> > I'll try.
> >
>
> I've tried to reproduce this behavior, and I've done quite an extensive
> set of tests on two rather different machines, but so far I have not
> observed anything like that. The results are attached, along with the
> test scripts used.
>
> I wonder if this might be due to pg_ycsb using random_zipfian, which has
> somewhat annoying behavior for some parameters (as I've mentioned in a
> separate thread). But that should affect all the runs, not just some
> shared_buffers sizes.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>