Hello, 

while trying to implement a fast cache, I inadvertently went down a rabbit hole 
that led me to implement my own map. In the process I tried to make it 
faster than the Go map, just to see whether that is possible. I worked on it 
for weeks, trying out various architectures and methods. 

On my MacBook Air M2, I get the following benchmarks for a Get operation. 
The number in each benchmark name is the number of items inserted into the 
table. My keys are 8-byte strings. 
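
To give a rough idea of the shape of such a benchmark (the names below and 
the use of the built-in map are placeholders, not my exact code):

// Minimal sketch of a Get benchmark over 8-byte string keys.
// The built-in map stands in here; the real comparison also runs the custom map.
package fastmap_test

import (
	"fmt"
	"testing"
)

// makeKeys returns n distinct 8-byte string keys.
func makeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = fmt.Sprintf("%08d", i) // exactly 8 bytes per key
	}
	return keys
}

func BenchmarkCache2Hit(b *testing.B) {
	for _, n := range []int{1, 10, 100, 1000, 10000, 100000, 1000000, 10000000} {
		b.Run(fmt.Sprintf("%8d", n), func(b *testing.B) {
			keys := makeKeys(n)
			m := make(map[string]int, n) // swap in the custom map to compare
			for i, k := range keys {
				m[k] = i
			}
			b.ResetTimer()
			var sink int
			for i := 0; i < b.N; i++ {
				sink += m[keys[i%n]] // hit path: every looked-up key is present
			}
			_ = sink
		})
	}
}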

goos: darwin
goarch: arm64
pkg: fastCache/map
cpu: Apple M2
                     │ dirstr12/stats_arm64.txt │ puremapstr/stats_arm64.txt │
                     │          sec/op          │    sec/op      vs base     │
Cache2Hit/_______1-8               6.151n ±  6%    7.087n ±  1%   +15.22% (p=0.002 n=6)
Cache2Hit/______10-8               8.491n ±  0%    8.156n ± 29%         ~ (p=0.394 n=6)
Cache2Hit/_____100-8               8.141n ±  7%   14.185n ± 13%   +74.24% (p=0.002 n=6)
Cache2Hit/____1000-8               8.252n ±  3%   10.635n ± 39%   +28.89% (p=0.002 n=6)
Cache2Hit/___10000-8               10.45n ±  2%    20.99n ±  4%  +100.81% (p=0.002 n=6)
Cache2Hit/__100000-8               12.16n ±  1%    19.11n ± 10%   +57.05% (p=0.002 n=6)
Cache2Hit/_1000000-8               42.28n ±  2%    47.90n ±  2%   +13.29% (p=0.002 n=6)
Cache2Hit/10000000-8               56.38n ± 12%    61.82n ±  6%         ~ (p=0.065 n=6)
geomean                            13.44n          17.86n         +32.91%

On my amd64 machine (11th-gen Intel i5), I get the following benchmarks.

goos: linux
goarch: amd64
pkg: fastCache/map
cpu: 11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz
                      │ dirstr12/stats_amd64.txt │ puremapstr/stats_amd64.txt │
                      │          sec/op          │    sec/op      vs base     │
Cache2Hit/_______1-12               9.207n ±  1%    7.506n ±  3%  -18.48% (p=0.002 n=6)
Cache2Hit/______10-12               9.223n ±  0%    8.806n ±  6%        ~ (p=0.058 n=6)
Cache2Hit/_____100-12               9.279n ±  2%   10.175n ±  3%   +9.66% (p=0.002 n=6)
Cache2Hit/____1000-12               10.45n ±  2%    11.29n ±  3%   +8.04% (p=0.002 n=6)
Cache2Hit/___10000-12               16.00n ±  2%    17.21n ±  5%   +7.59% (p=0.002 n=6)
Cache2Hit/__100000-12               22.20n ± 17%    24.73n ± 22%  +11.42% (p=0.026 n=6)
Cache2Hit/_1000000-12               87.75n ±  2%    91.05n ±  5%   +3.76% (p=0.009 n=6)
Cache2Hit/10000000-12               104.2n ±  2%    105.6n ±  5%        ~ (p=0.558 n=6)
geomean                             20.11n          20.49n         +1.90%

On amd64 the performance is on par with the Go map, but the Go map uses 
inlined SIMD instructions, which I don't use because I don't have access to 
them in pure Go. I use xxh3 right out of the box for the hash function, so 
the differences are not due to the hash function.
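
To illustrate what "right out of the box" means: a pure-Go xxh3 package 
(github.com/zeebo/xxh3 is one option) hashes the 8-byte key directly, and 
the result can then be split into a slot index and a short tag. The 
slot/tag split below is just a generic open-addressing pattern shown for 
context, not necessarily my map's actual layout:

// Illustration only: hashing an 8-byte string key with xxh3 out of the box.
package main

import (
	"fmt"

	"github.com/zeebo/xxh3" // one pure-Go xxh3 implementation
)

func main() {
	const numSlots = 1 << 12 // assumed power-of-two table size

	key := "abcdefgh" // an 8-byte string key
	h := xxh3.HashString(key)

	slot := h & (numSlots - 1) // low bits pick the slot
	tag := uint8(h >> 56)      // high bits can serve as a short filter tag

	fmt.Printf("hash=%#016x slot=%d tag=%#02x\n", h, slot, tag)
}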

If there is interest in this, please let me know.


