Hi, everyone! Thank you for reading. I wanted to share a pet project I 
started a few weeks ago in my free time. It's a wrapper around sync.Pool 
that keeps a limited set of online statistics about the memory cost of 
items (the cost is arbitrarily defined by the user), for two reasons: 
(1) to decide whether an item should be kept in the pool in the first 
place; (2) to preallocate items with a memory cost that is less likely to 
need further allocation.

sync.Pool is amazing, but it's also important to know how to use it well, 
since keeping a few very large allocations in the pool can prove 
counterproductive. The fmt package has a canonical example of proper 
usage: small buffers are put back into a sync.Pool for (potential) reuse 
only if they are up to a certain length, and are otherwise dropped for the 
GC to collect.
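
For reference, the pattern looks roughly like this (a simplified sketch of 
the idea, not the actual fmt source; the 64 KiB cutoff is just an 
illustrative constant):

package bufpool

import (
	"bytes"
	"sync"
)

var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func getBuffer() *bytes.Buffer {
	return bufPool.Get().(*bytes.Buffer)
}

func putBuffer(buf *bytes.Buffer) {
	// Keeping very large buffers would pin a lot of memory in the pool,
	// so drop them and let the GC reclaim the space.
	if buf.Cap() > 64<<10 { // illustrative cutoff, not the value fmt uses
		return
	}
	buf.Reset()
	bufPool.Put(buf)
}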

When writing client/server code I found it wasn't easy to pick the right 
cutoff, and partitioning the problem by endpoint and a few other 
dimensions was a good start. But I still needed to know those dimensions 
beforehand, and sometimes I saw MiB-sized payloads on some endpoints and 
<1 KiB on others, so I had to probe all of them, hardcode a constant for 
each, and so on.

So I thought it would be useful to gather some statistics on the buffers 
and structures I was using, to see where the sizes actually landed, and it 
turned out that in most cases I got... a roughly Normal Distribution of 
sizes.

So I wrapped all that up, tried to improve the abstraction, and packaged 
it under a fancy name that is probably too much for it: AdaptivePool. See 
the code here: https://github.com/diegommm/adaptivepool

I found it too much fun to work on and didn't bother checking first 
whether something like this already existed, so please let me know if you 
know of something out there that is already proven (it was still worth the 
fun, though).

The principle is that when you Put an item into the pool, it feeds an 
online stats algorithm that computes the mean and standard deviation of 
item costs. The pool then checks whether the item's cost is within 
Mean +/- Threshold * StdDev and keeps it if so, otherwise it just drops it 
on the floor. And when you Get something from the pool and the pool is 
empty, it creates an item of cost Mean + Threshold * StdDev for you.
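
In code, the idea is roughly this (just a sketch using Welford's online 
algorithm, not the actual adaptivepool implementation; the threshold of 2 
standard deviations is arbitrary):

package main

import (
	"fmt"
	"math"
)

// stats keeps a running mean and variance using Welford's online algorithm.
type stats struct {
	n    float64
	mean float64
	m2   float64 // sum of squared deviations from the current mean
}

func (s *stats) observe(cost float64) {
	s.n++
	d := cost - s.mean
	s.mean += d / s.n
	s.m2 += d * (cost - s.mean)
}

func (s *stats) stdDev() float64 {
	if s.n < 2 {
		return 0
	}
	return math.Sqrt(s.m2 / s.n)
}

const threshold = 2.0 // arbitrary: accept items within 2 standard deviations

// accept reports whether an item of the given cost should go back into the pool.
func (s *stats) accept(cost float64) bool {
	dev := threshold * s.stdDev()
	return cost >= s.mean-dev && cost <= s.mean+dev
}

// newItemCost suggests the cost to preallocate when the pool is empty.
func (s *stats) newItemCost() float64 {
	return s.mean + threshold*s.stdDev()
}

func main() {
	var s stats
	for _, c := range []float64{900, 1000, 1100, 1000, 950, 1050} {
		s.observe(c)
	}
	fmt.Println(s.accept(4096))  // false: way outside mean +/- 2*stddev
	fmt.Println(s.newItemCost()) // roughly 1129
}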

I then added a lot of other things: I decoupled the Normal Distribution 
logic and put it behind an Estimator interface (so you don't depend on the 
Normal Distribution and can write your own). This interface decides 
whether an item should be accepted and suggests the size of new objects. I 
also generalized the concept of "byte size" into "cost", and created an 
ItemProvider interface that creates items of a given type, measures their 
cost, and clears them before reuse.
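
To give an idea of the shape of that split (a hypothetical sketch; the 
real interfaces in the repo may look different):

// Estimator decides whether an item of a given cost should go back into
// the pool, and suggests the cost of items created when the pool is empty.
type Estimator interface {
	Accept(cost float64) bool
	NewItemCost() float64
}

// ItemProvider creates items of a given type, measures their cost, and
// clears them before reuse.
type ItemProvider[T any] interface {
	New(cost float64) T
	Cost(item T) float64
	Reset(item T)
}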

Finally, I added something for my HTTP server to make it easier to buffer 
the bodies of some endpoints: a ReaderBufferer (which can also buffer 
ReadClosers and call their Close method). I would not use it for large or 
otherwise streaming payloads, of course.
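
For context, this is roughly the kind of hand-rolled code it is meant to 
replace (not the actual ReaderBufferer API, just an illustration of 
buffering a body into a pooled buffer):

package example

import (
	"bytes"
	"io"
	"net/http"
	"sync"
)

var bodyPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// bufferBody reads the whole request body into a pooled buffer and closes
// the body. The caller is expected to Reset and Put the buffer back when done.
func bufferBody(r *http.Request) (*bytes.Buffer, error) {
	buf := bodyPool.Get().(*bytes.Buffer)
	buf.Reset()
	if _, err := io.Copy(buf, r.Body); err != nil {
		bodyPool.Put(buf)
		return nil, err
	}
	return buf, r.Body.Close()
}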

So, that was the pet project. I would love some feedback; I know I could 
be making many mistakes and wrong assumptions, so I'm open to learning. 
Thank you!

P.S.: By the way, I added some benchmarks and put the results in the 
commit messages.
