On Mon, Jul 17, 2017 at 10:14 PM Zohaib Sibte Hassan <
zohaib.has...@gmail.com> wrote:

> So the scenario I am trying to implement is basically a chat scenario.
>

My first approach would be to measure.

* The full system will have a bottleneck, as per Amdahl's law. You need to
take a holistic view, since the bottleneck may be outside Go (i.e., in the
kernel, hardware, ...). Knowing the bottleneck will usually suggest an
angle of attack on the problem at hand.

* High memory usage suggests either an overload situation (you don't have
enough resources) or excessive copying of data. Try doing some napkin
math: a million active users with 1 KiB of data each take about one
gigabyte of memory, and so on.

* High CPU usage can indicate real work, a CPU that is constrained on
memory bandwidth, or lock contention. The kernel usually will not
discriminate between these states, so some investigation is necessary (a
profiling sketch follows).
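
A cheap way to start that investigation, assuming you can expose a debug
listener (net/http/pprof and the mutex profile are in the standard
library as of Go 1.8):

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers
    "runtime"
)

func main() {
    // Sample roughly one in five contended lock events (Go 1.8+).
    runtime.SetMutexProfileFraction(5)

    // CPU:        go tool pprof http://localhost:6060/debug/pprof/profile
    // Contention: go tool pprof http://localhost:6060/debug/pprof/mutex
    // Memory:     go tool pprof http://localhost:6060/debug/pprof/heap
    log.Println(http.ListenAndServe("localhost:6060", nil))
}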

* Keeping thousands of connections on small hardware is expensive. Each TCP
connection needs some kernel space in addition to userland space. It
quickly adds up.

* TCP sending is going to cost a lot of time. In laboratory tests the
network is fast, and inside a datacenter, such as one operated by Google,
network transmission is fast. The internet in general is slow,
latency-inducing, and brittle. This forces your system to keep data
lingering for longer, which puts more pressure on memory.

* Observation: in a noisy chatroom, you want to skip messages that flow
out of view on the client. You don't need to process every message in
that case, just the ones that are visible. This suggests a polling
construction like the LMAX Disruptor: only care about the K newest
messages in the fast path, and let the slow path do historical lookups.
Keep an "epoch" count of where you are in the message flow; when a socket
is ready for data, use the epoch count to figure out what happened in the
meantime (a sketch follows below).
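
A minimal sketch of that idea, which the later sketches build on.
Payloads are plain byte slices, and K = 1024 is an assumed constant; all
the names here are made up:

package chat

import "sync"

// Ring keeps only the K newest payloads in memory; K = 1024 here.
type Ring struct {
    mu    sync.Mutex
    buf   [1024][]byte
    epoch uint64 // total number of messages ever published
}

// Publish stores a payload and returns the new epoch.
func (r *Ring) Publish(p []byte) uint64 {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.buf[r.epoch%uint64(len(r.buf))] = p
    r.epoch++
    return r.epoch
}

// CatchUp returns everything between the subscriber's last seen epoch
// and now. A subscriber that fell more than K behind only gets the K
// newest; the rest belongs to the slow, historical path.
func (r *Ring) CatchUp(seen uint64) (msgs [][]byte, now uint64) {
    r.mu.Lock()
    defer r.mu.Unlock()
    now = r.epoch
    if now-seen > uint64(len(r.buf)) {
        seen = now - uint64(len(r.buf))
    }
    for e := seen; e < now; e++ {
        msgs = append(msgs, r.buf[e%uint64(len(r.buf))])
    }
    return msgs, now
}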

* If messages are immutable, they can be read concurrently with little
overhead. Edits can be handled by a patching construction in which a
later message overrides an earlier one (one possible shape follows).
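
One hypothetical shape for that, in the same package as the ring sketch:

// A Message never changes after publication. An edit is a new message
// that names the one it overrides; readers render the latest message
// in the chain.
type Message struct {
    ID       uint64
    Replaces uint64 // zero means this is an original, not an edit
    Body     []byte // written once, then read-only
}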

* The *publisher* should take on the effort of constructing as much of
the payload as possible and place it in a buffer that everyone writes
directly into their network socket. If every subscriber has to do that
work, things get expensive (a sketch follows).
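
For example, a naive version, still in the same hypothetical package.
encoding/json stands in for whatever wire format is actually in use, and
the sequential writes here still stall on a slow peer, which the later
points address:

import (
    "encoding/json"
    "net"
)

// broadcast pays the serialization cost once, in the publisher.
// Every subscriber then writes the same read-only byte slice.
func broadcast(m Message, conns []net.Conn) error {
    payload, err := json.Marshal(m)
    if err != nil {
        return err
    }
    for _, c := range conns {
        // A failed write should not stop the broadcast; a real system
        // would unsubscribe the dead peer here.
        _, _ = c.Write(payload)
    }
    return nil
}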

* Channels are likely to be fast enough, but resources are likely to run
out quickly, especially memory on a 512-megabyte machine.

* Channels should be used to send epochs around. The actual payload ought
to live somewhere else, ready and appropriately memory-barriered for
read-only consumption. Alternative: pass a reference to the data around.
This is simple to do and is likely to be fast (a subscriber sketch
follows).
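
A subscriber loop in that style, reusing the hypothetical Ring from the
earlier sketch, with error handling kept minimal:

import "net"

// subscriber receives only epochs on its channel; the payloads stay
// in the shared, effectively read-only Ring.
func subscriber(notify <-chan uint64, r *Ring, conn net.Conn) {
    var seen uint64
    for range notify {
        msgs, now := r.CatchUp(seen)
        seen = now
        for _, p := range msgs {
            if _, err := conn.Write(p); err != nil {
                return // peer is gone; drop the subscription
            }
        }
    }
}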

* Your goal is to get data to the socket so the kernel can do the work.
If you do this correctly, the socket is likely to be the bottleneck of
the system. Also consider the possibility that a goroutine blocks on data
transfer to the outside world while its channel buffer fills up; the
publisher should not block in that situation (a non-blocking sketch
follows).
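
A publisher that never blocks on a slow subscriber, continuing the same
hypothetical types:

// Hub fans epochs out to subscribers without ever blocking on them.
type Hub struct {
    ring *Ring
    subs []chan uint64 // one small buffered channel per subscriber
}

func (h *Hub) Publish(p []byte) {
    e := h.ring.Publish(p)
    for _, notify := range h.subs {
        select {
        case notify <- e: // fast path: the subscriber is keeping up
        default:
            // Slow subscriber: drop the nudge instead of blocking the
            // publisher. Its own epoch counter lets it recover the gap
            // from the ring (or the historical store) later.
        }
    }
}

Even a one-element buffer on each notify channel means a busy-but-alive
subscriber still gets woken, and CatchUp recovers whatever it missed.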

In general, systems engineering tends to trump local tuning. Effort is a
constrained resource, so it is usually best spent in the areas where the
cost/benefit analysis falls out nicely.
