Re: 5-STABLE Roadmap

2003-02-16 Thread Alex Rousskov
On Sun, 16 Feb 2003, Pawel Jakub Dawidek wrote:

> On Sun, Feb 16, 2003 at 02:08:35PM -0700, Scott Long wrote:
> +> Pawel Jakub Dawidek wrote:
> +>
> +> >On Thu, Feb 13, 2003 at 08:28:43PM -0800, Sam Leffler wrote:
> +> >+> This can quickly turn into a bikeshed, but suggest ones.  We're
> +> >+> looking for good benchmarks. [...]
> +> >
> +> >Look at:
> +> >
> +> >  http://www.web-polygraph.org
> +> >
> +> >It provides tests for www-cache/proxy stuff.
> +> >We can test many things with it:
> +> >
> +> >  - how fast we could generate a workload,
> +> >  - how heavy a load we could handle,
> +> >  - how fast squid runs on FreeBSD,
> +> >  - how fast squid runs when rewritten with libkse,
> +> >  - etc.
> +> >
> +> >And this is a good stability test.
> +> >It is really good and free stuff; I use it on 4.x.
> +> >
> +> Thanks for the pointer, this looks very interesting.  How hard
> +> is it to set up?  [...]
>
> Setting it up is quite simple, but it doesn't compile with gcc 3.x...

There were too many backwards-incompatible changes in 3.x, and the
compiler was not stable enough the last time we checked. We will
probably check and port again some time soon. GCC 2.9x should work
fine, though.

Polygraph is relatively easy to set up on FreeBSD for standard tests,
using two PCs. Testing with more PCs, with non-standard workloads,
and/or on a regular basis requires writing scripts and can get pretty
involved (which lets us sell a pre-configured appliance that does
Polygraph test management :).

How-Tos for standard tests on FreeBSD are available at:
http://www.measurement-factory.com/support.html
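
For a flavor of what a minimal two-box run looks like, the basic
invocation is roughly the following (a sketch only; binary names and
option spellings vary between Polygraph versions, and the workload
file and proxy address here are just examples):

  # on the server-side PC (simulated origin servers)
  polygraph-server --config workloads/polymix-4.pg

  # on the client-side PC (simulated robots), aimed at the proxy under test
  polygraph-client --config workloads/polymix-4.pg --proxy 10.0.0.1:3128

The two processes generate both sides of the traffic; the proxy under
test sits between them.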

> Yes, kernel patches for tuning are available on the website, but for
> new releases of 4.x this isn't necessary; everything can be configured
> with kernel options and sysctls (for 4.8):
>
>   options MAXFILES=16384
>   options HZ=1000
>   options NMBCLUSTERS=32678
>
>   kern.ipc.somaxconn=1024
>   net.inet.ip.portrange.last=4
>   net.inet.tcp.delayed_ack=0
>   net.inet.tcp.msl=3000
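
For anyone reproducing this setup: the three "options" lines belong in
the kernel configuration (rebuild required), and the dotted names are
sysctls; standard FreeBSD placement would be roughly (a sketch, values
as quoted above):

  # custom kernel config under /usr/src/sys/i386/conf/, then
  # build and install the kernel as usual
  options  MAXFILES=16384
  options  HZ=1000
  options  NMBCLUSTERS=32678

  # /etc/sysctl.conf, applied at boot (or set live with sysctl(8))
  kern.ipc.somaxconn=1024
  net.inet.tcp.delayed_ack=0
  # ...plus the remaining sysctls quoted above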

One of our kernel patches optimizes handling of 1000s of IP aliases
per FreeBSD box. The patch is required for older 4.x kernels to
perform at decent levels. IIRC, the patch does not work for recent
kernels, probably because of the SYN cache changes. I do not know
whether any alias-related optimizations are still needed for recent
kernels though. Perhaps the SYN cache solves the original scalability
problem.
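
For context, those aliases exist because Polygraph binds its simulated
robots and servers to hundreds or thousands of distinct addresses per
box. They can be created with plain ifconfig, along these lines (a
sketch; the interface name and the 10.0.x.y range are made up for
illustration):

  #!/bin/sh
  # add 1000 host aliases on fxp0
  i=1
  while [ $i -le 1000 ]; do
      ifconfig fxp0 inet 10.0.$((i / 250)).$((i % 250 + 1)) \
          netmask 255.255.255.255 alias
      i=$((i + 1))
  done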

> The rest is quite simple and well documented. In theory the tests could
> be run on one machine, so... And here are some nice-looking results
> generated by web-polygraph:
>
>   Without any proxy:
>   http://garage.freebsd.pl/pm3-15-11-2k2
>   With squid:
>   http://garage.freebsd.pl/pm3-05-11-2k2
>   http://garage.freebsd.pl/pm3-06-11-2k2
>   With external proxy:
>   http://garage.freebsd.pl/pm3-29-01-2k3

Please note that a couple of the results I looked at are invalid from a
PolyMix workload rules/design point of view. The first thing to notice
is the huge number of requests in the waiting queue compared to active
transactions (both shown on the same "xact_lvl" graph). Most likely,
you overloaded the device under test, and most requests ended up in
queues instead of on the wire.

I may be missing something though -- I am just looking at your
results without much knowledge of their history/purpose... See the
last cache-off results for valid examples:
http://www.measurement-factory.com/results/


If you have any Polygraph-specific questions, I would be happy to
answer them, especially if it can help FreeBSD folks in any way.

Good luck,

Alex.





Re: Polygraph Considered Evil 8^) (was: Re: 5-STABLE Roadmap)

2003-02-17 Thread Alex Rousskov
On Mon, 17 Feb 2003, Terry Lambert wrote:

> First, I just have a slight editorial comment, about cheating on
> Polygraph.

Terry,

This is not the place to start a long discussion about our
Polygraph testing methodology, but I have to say, with all due
respect, that many of your statements are either misleading or based
on misinformation about Web Polygraph and the way standard tests are
executed. I have to respond because I both love and understand cache
benchmarking. I apologize to the majority of the audience for what may
be considered an out-of-scope thread.

> One issue I have with Polygraph is that it intentionally works for a
> very long time to get worst case performance out of caches;
> basically, it cache-busts on purpose.  Then the test runs.

This is plain wrong. I assume that you are referring to PolyMix
workloads that have a filling-the-cache phase and measurement phases.
The cache-filling phase does not bust the cache. Its primary purpose
is to bring the cache's storage to a steady state (hopefully). If you
have tested many caches, including Squid, then you know that cache
performance "on an empty stomach" often differs from sustained
performance by 50%. Since we must start from scratch, we must pump
enough data through to approach steady state.

You might have been misinformed that all the fill objects are used
during the measurement phases; this is not true. Polygraph keeps the
size of the working set constant.  That size is usually much smaller
than the amount of traffic during the fill phase. Again, the fill
phase is there to reach a steady state after you start with an empty
disk.
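
To illustrate the difference with purely made-up numbers (these are
not PolyMix parameters): a fill phase running at a few hundred
requests per second for a few hours moves far more data than the
working set ever contains, e.g.

  400 req/s x ~11 KB/reply x 4 hours  ~=  63 GB offered during the fill,

while the recurring working set stays at its fixed, configured size.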

> This seems to be an editorial comment on end-to-end guarantees, much
> more than it seems a valid measurement of actual cache performance.

Not sure what end-to-end guarantees you are referring to here.

> If you change squid to force random page replacement, then you
> end up with a bounded worst case, which is a better number than you
> would be able to get with your best (in terms of the real-world
> performance) algorithm (e.g. LRU or whatever), because you make it
> arbitrarily hard to characterize what that would be.

Random page replacement should not yield better performance. Polygraph
simulates hot subsets (aka flash crowds), which you would not be able
to take advantage of if you replace randomly. Also, random replacement
loses the partial benefit of the temporal locality that Polygraph also
simulates (e.g., the same HTML containers embed the same images).

> NetApp has a tunable in their cache product which might as well be
> labelled "get a good Polygraph score"; all it does is turn on random
> page replacement, so that the Polygraph code is unable to
> characterize "what would constitute worst case performance on this
> cache?", and then intentionally exercise that code path, which is
> what it would do, otherwise (i.e. pick a working set slightly larger
> than the cache size so everything's a miss, etc.).

I am unaware of any tunables of that kind. Moreover, I suspect they
simply would not work (see above). Are you rich? If not, you may
want to sell a proof of the above to a NetApp competitor. I myself
would be very interested to hear it as well. Keep in mind that NetApp
and most other vendors use Polygraph for day-to-day regression tests,
so they are interested in making the tests realistic.

Also, the traffic Polygraph offers does not depend on cache
performance. Polygraph code does not "characterize" anything at run
time, at least not during PolyMix tests.

> Basically, most of the case numbers are 99.xx% miss rates.  With
> this modification, that number drops down to closer to 80%.

Actually, the measured miss ratio is usually about 50% (a hit ratio of
50+%), which is quite realistic. The offered hit ratio is about 55%;
the byte hit ratio is lower. Not sure where you got the 99% or 80%
numbers. See the cache-off results for true values.

> That's kind of evil; but at least it's a level playing field, and
> we can make a FreeBSD-specific patch for SQUID to get better numbers
> for FreeBSD.  8-) 8-).

I would not encourage you to cheat, even if there is a way. I would
recommend that you suggest ways to improve the benchmark instead.
Chances are, Polygraph can already do what you want.

> > >   options MAXFILES=16384
> > >   options NMBCLUSTERS=32678
>
> These I understand, though I think they are on the low end.

We have never run out of related resources with these settings during
a valid test. Keep in mind that we have to keep the number of open
concurrent HTTP connections below 5-8K to get robust performance given
PolyMix burstiness and other factors.
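
If anyone wants to check that headroom on their own runs, the stock
tools are enough (a sketch; commands as on 4.x, run while the test is
at peak load):

  netstat -m                            # mbuf/cluster usage vs NMBCLUSTERS
  sysctl kern.openfiles kern.maxfiles   # open files vs MAXFILES
  netstat -an | grep -c ESTABLISHED     # rough count of concurrent connections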

> > >   options HZ=1000
>
> This one, I don't understand at all.  The web page says it's for faster
> dummynet processing.  But maybe this is an artifact of using NETISR.

This setting is a must-have if you use dummynet. We did not invent it;
it was suggested by the dummynet author himself, and it did solve
performance problems we experienced with s
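
For readers who have not used dummynet: the WAN emulation in these
setups boils down to a pair of ipfw pipes, roughly like this (a
sketch; the delay, bandwidth, and rule numbers are illustrative, and
the kernel needs options IPFIREWALL and DUMMYNET):

  ipfw pipe 1 config delay 40ms bw 10Mbit/s   # outbound direction
  ipfw pipe 2 config delay 40ms bw 10Mbit/s   # inbound direction
  ipfw add 100 pipe 1 ip from any to any out
  ipfw add 200 pipe 2 ip from any to any in

HZ matters here because dummynet releases delayed packets on clock
ticks, so at the default HZ=100 the delay granularity is a coarse
10ms.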