I'm trying to show that a common ring buffer optimization is effective. 
The technique is to use a bitmask instead of modulo to calculate the 
wrap-around index. However, in my environment, modulo is slightly faster in 
a test where a single goroutine performs 1 billion enqueue and dequeue 
operations in total. What do you think could be the cause? 

Full code:
https://go.dev/play/p/H933oqrhPI-

Environment:
* go version go1.21.4 darwin/arm64
* Apple M1 Pro

RingBuffer with modulo:
```
// RingBuffer0 wraps its indices with the modulo operator.
type RingBuffer0 struct {
	writeIdx uint64 // total number of items ever enqueued
	readIdx  uint64 // total number of items ever dequeued
	buffers  []any
	size     uint64
}

func NewRingBuffer0(size uint64) *RingBuffer0 {
	rb := &RingBuffer0{}
	rb.init(size)
	return rb
}

func (rb *RingBuffer0) init(size uint64) {
	rb.buffers = make([]any, size)
	rb.size = size
}

// Enqueue stores item, or returns ErrBufferFull when size items are pending.
func (rb *RingBuffer0) Enqueue(item any) error {
	if rb.writeIdx-rb.readIdx == rb.size {
		return ErrBufferFull
	}
	rb.buffers[rb.writeIdx%rb.size] = item
	rb.writeIdx++
	return nil
}

// Dequeue returns the oldest item, or ErrBufferEmpty when nothing is pending.
func (rb *RingBuffer0) Dequeue() (any, error) {
	if rb.writeIdx == rb.readIdx {
		return nil, ErrBufferEmpty
	}
	item := rb.buffers[rb.readIdx%rb.size]
	rb.readIdx++
	return item, nil
}
```

RingBuffer with bitmask:
Change each modulo calculation to the code below (this is only valid when 
size is a power of two):
* rb.buffers[rb.writeIdx&(rb.size-1)] = item
* item := rb.buffers[rb.readIdx&(rb.size-1)]
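
For reference, a minimal sketch of why the two index calculations are 
interchangeable only for power-of-two sizes. The helper names isPowerOfTwo, 
wrapMod and wrapMask are made up for this illustration and are not part of 
the code above.

```go
package main

import "fmt"

// isPowerOfTwo reports whether n has exactly one bit set.
// Hypothetical helper, not part of the original ring buffer.
func isPowerOfTwo(n uint64) bool {
	return n != 0 && n&(n-1) == 0
}

// wrapMod and wrapMask are illustrative names for the two wrap-around
// calculations being compared.
func wrapMod(idx, size uint64) uint64 { return idx % size }

func wrapMask(idx, size uint64) uint64 { return idx & (size - 1) }

func main() {
	for _, size := range []uint64{8, 10} {
		agree := true
		for idx := uint64(0); idx < 4*size; idx++ {
			if wrapMod(idx, size) != wrapMask(idx, size) {
				agree = false
				break
			}
		}
		fmt.Printf("size=%d powerOfTwo=%v agree=%v\n", size, isPowerOfTwo(size), agree)
	}
}
```

With size 8 the two calculations agree for every index tried; with size 10 
they already differ at index 2, so a constructor for the bitmask variant 
would probably want to reject sizes that fail a check like isPowerOfTwo.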

Test:
func TestSingle(rb RingBuffer) {
	start := time.Now()
	total := 500000
	for i := 0; i < total; i++ {
		for j := 0; j < 1000; j++ {
			rb.Enqueue(j)
		}
		for j := 0; j < 1000; j++ {
			rb.Dequeue()
		}
	}
	end := time.Now()
	count := total * 2000
	duration := end.Sub(start).Milliseconds()
	fmt.Printf("%d ops in %d ms\n", count, duration)
}
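
As a cross-check on the single wall-clock measurement, here is a sketch of 
the same workload driven by testing.Benchmark from the standard library, 
which picks the iteration count itself and can report allocations. The 
names queue, core, modRing, maskRing and run, and the size of 1024, are 
assumptions made for this sketch; the real types are in the playground 
link. ReportAllocs is included because storing an int in an any slot may 
allocate, which could affect both variants equally.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

var (
	errFull  = errors.New("buffer full")
	errEmpty = errors.New("buffer empty")
)

// queue mirrors the two methods the test in the post calls.
type queue interface {
	Enqueue(item any) error
	Dequeue() (any, error)
}

// core is the state shared by both variants (same fields as RingBuffer0).
type core struct {
	writeIdx, readIdx uint64
	buffers           []any
	size              uint64
}

func newCore(size uint64) core {
	return core{buffers: make([]any, size), size: size}
}

// modRing wraps with %, maskRing with & (size must be a power of two).
type modRing struct{ core }
type maskRing struct{ core }

func (rb *modRing) Enqueue(item any) error {
	if rb.writeIdx-rb.readIdx == rb.size {
		return errFull
	}
	rb.buffers[rb.writeIdx%rb.size] = item
	rb.writeIdx++
	return nil
}

func (rb *modRing) Dequeue() (any, error) {
	if rb.writeIdx == rb.readIdx {
		return nil, errEmpty
	}
	item := rb.buffers[rb.readIdx%rb.size]
	rb.readIdx++
	return item, nil
}

func (rb *maskRing) Enqueue(item any) error {
	if rb.writeIdx-rb.readIdx == rb.size {
		return errFull
	}
	rb.buffers[rb.writeIdx&(rb.size-1)] = item
	rb.writeIdx++
	return nil
}

func (rb *maskRing) Dequeue() (any, error) {
	if rb.writeIdx == rb.readIdx {
		return nil, errEmpty
	}
	item := rb.buffers[rb.readIdx&(rb.size-1)]
	rb.readIdx++
	return item, nil
}

// run repeats the post's batch of 1000 enqueues and 1000 dequeues b.N times.
func run(b *testing.B, q queue) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		for j := 0; j < 1000; j++ {
			q.Enqueue(j)
		}
		for j := 0; j < 1000; j++ {
			q.Dequeue()
		}
	}
}

func main() {
	const size = 1024 // assumed; a power of two so the mask variant is valid
	mod := testing.Benchmark(func(b *testing.B) { run(b, &modRing{newCore(size)}) })
	mask := testing.Benchmark(func(b *testing.B) { run(b, &maskRing{newCore(size)}) })
	fmt.Println("modulo: ", mod, mod.MemString())
	fmt.Println("bitmask:", mask, mask.MemString())
}
```

Each reported ns/op figure covers one batch of 2000 operations. The same 
run function could also be placed in a _test.go file as two Benchmark 
functions and executed with go test -bench.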
