The two most likely limiters in performance will be your network pipe to
the cloud and the QPS quota offered by the service. If you are not reaching
those limits, increase the parallelism until you do. If your CPU
becomes saturated first, you probably need larger buffer sizes in the I/O

The article is very suspect. In the first section, "A simple implementation",
the code is badly broken. A reader must not be able to take the read lock while
a writer holds the write lock, and the code never checks for that:
func (l *ReaderCountRWLock) RLock() {
	l.m.Lock()
	l.readerCount++
	l.m.Unlock()
}
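For contrast, here is a minimal sketch of what checking for the writer could look like. This is not the article's code: the type name CheckedRWLock, the writing flag, the condition variable and the constructor are all assumptions made up for illustration. The only point is that RLock waits while a writer is active before it bumps the reader count.

```go
package main

import "sync"

// CheckedRWLock is a hypothetical reader-count lock (not from the article).
// Readers and writers coordinate through one mutex and a condition variable.
type CheckedRWLock struct {
	m           sync.Mutex
	c           *sync.Cond
	readerCount int  // number of readers currently holding the lock
	writing     bool // true while a writer holds the lock
}

// NewCheckedRWLock wires the condition variable to the internal mutex.
func NewCheckedRWLock() *CheckedRWLock {
	l := &CheckedRWLock{}
	l.c = sync.NewCond(&l.m)
	return l
}

// RLock blocks while a writer holds the lock, then registers a reader.
func (l *CheckedRWLock) RLock() {
	l.m.Lock()
	for l.writing {
		l.c.Wait() // releases l.m while waiting, reacquires on wakeup
	}
	l.readerCount++
	l.m.Unlock()
}

// RUnlock deregisters a reader and wakes waiters once no readers remain.
func (l *CheckedRWLock) RUnlock() {
	l.m.Lock()
	l.readerCount--
	if l.readerCount == 0 {
		l.c.Broadcast()
	}
	l.m.Unlock()
}

// Lock blocks until there is no active writer and no active reader.
func (l *CheckedRWLock) Lock() {
	l.m.Lock()
	for l.writing || l.readerCount > 0 {
		l.c.Wait()
	}
	l.writing = true
	l.m.Unlock()
}

// Unlock releases the write lock and wakes all waiting readers and writers.
func (l *CheckedRWLock) Unlock() {
	l.m.Lock()
	l.writing = false
	l.c.Broadcast()
	l.m.Unlock()
}
```

Note that this sketch lets a steady stream of readers starve a writer, so it is only meant to show the missing check. In real code the standard library's sync.RWMutex is the safer choice.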
A single writer RWM