Re: How to parallelization

Geoff Steckel Mon, 23 Dec 2024 13:23:47 -0800

On 12/23/24 1:43 PM, Christian Schulte wrote:

Not criticizing OpenBSD in any way. Let me try to explain a common use
case. There is a data source capable of providing X bytes per second at
max. The application needs to be set up in a way it can receive those X
bytes per second without spin locking or waiting for data. If it would
be "polling" too fast, it would slow down the whole system waiting for
data. If it would be "polling" too slow, it would not be able to process
those bytes fast enough. Those bytes need to be processed. So there is a
receiving process which needs to be able to consume exactly those X
bytes per second. That consumer also needs to be defined in a way it can
process those bytes in parallel as fast as possible. Sizing the consumer
too small, the producer will start spin locking or such and cannot keep
up with the data rate it needs to process, because the consumer does not
process the data fast enough. Sizing the consumer too big, the consumer
will start spin locking or such waiting for the producer to provide more
data. I am searching for an API to make the application adhere to those
situations automatically. Data rate on the receiving part decreases,
consumer part does not need to use Y processes in parallel all spin
locking waiting for more data. Data rate on the receiving part
increases, consumer needs to increase compute to not slow down the
receiver. Does this make things more clear?

Thank you for your explanation.

I'm assuming that there are multiple logical streams which are
extracted from the incoming packet streams and distributed
to multiple consumers.

Is it true that you are attempting to perfectly assign and utilize
all CPUs for packet & application processing?
If packet ordering is to be preserved it's not clear that
perfect allocation is possible.
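
To illustrate the ordering concern, here is a minimal sketch, assuming
each packet carries a stream identifier and that order only has to be
preserved within a stream. The names worker_for and dispatch are
hypothetical, not an OpenBSD API. Pinning each stream to one worker
keeps its packets in order, but a single heavy stream then lands
entirely on one worker, so the load cannot be balanced perfectly.

```python
import zlib


def worker_for(stream_id: str, n_workers: int) -> int:
    """Map a stream to a fixed worker index (hypothetical helper).

    zlib.crc32 is used because it is stable across runs, unlike the
    built-in hash() for strings.
    """
    return zlib.crc32(stream_id.encode()) % n_workers


def dispatch(packets, n_workers):
    """Append each (stream_id, seq) packet to its stream's worker queue.

    All packets of one stream go to the same queue in arrival order,
    so per-stream ordering is preserved.
    """
    queues = [[] for _ in range(n_workers)]
    for stream_id, seq in packets:
        queues[worker_for(stream_id, n_workers)].append((stream_id, seq))
    return queues


if __name__ == "__main__":
    # One heavy stream "a" and two light streams "b" and "c".
    packets = [("a", i) for i in range(10)] + [("b", 0), ("c", 0)]
    for index, queue in enumerate(dispatch(packets, 4)):
        print(index, len(queue))
```

Splitting stream "a" over several workers would even out the load, but
the workers would then finish its packets in an unpredictable order.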

I am searching for an API to make the application adhere to those
situations automatically


The only way I can see to achieve 100% perfect usage is this:
  buffer the incoming packet stream deeply
  inspect each packet to measure application resources needed
  summarize those results over some time period
  systemwide, measure resource capability and current utilization
  determine systemwide resource allocation using some algorithm
  adjust systemwide application resources
  forward packets to each application
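
The "summarize" and "determine allocation" steps above could be
sketched roughly as follows. This is only an illustration under
assumptions: each inspected packet yields a (stream, cost) sample,
cost is a non-negative estimate of the work needed, and workers are
split in proportion to summed cost using a largest-remainder rule.
The names summarize and allocate_workers are hypothetical, and a real
system would also have to measure current utilization and move the
workers, which is not shown.

```python
from collections import defaultdict


def summarize(samples):
    """Sum the estimated cost per stream over one measurement window."""
    demand = defaultdict(float)
    for stream, cost in samples:
        demand[stream] += cost
    return dict(demand)


def allocate_workers(demand, total_workers):
    """Split total_workers across streams in proportion to demand.

    Each stream first gets the integer part of its proportional share;
    leftover workers go to the streams with the largest fractional
    remainders. Assumes all demand values are non-negative.
    """
    total = sum(demand.values())
    if total <= 0 or total_workers <= 0:
        return {name: 0 for name in demand}
    shares = {name: d * total_workers / total for name, d in demand.items()}
    alloc = {name: int(share) for name, share in shares.items()}
    leftover = total_workers - sum(alloc.values())
    by_remainder = sorted(
        demand, key=lambda name: shares[name] - alloc[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    window = [("video", 6.0), ("audio", 1.0), ("video", 6.0), ("logs", 3.0)]
    print(allocate_workers(summarize(window), 4))
```

Rerunning this once per window gives the adjustment loop. Note that a
stream with small demand can be rounded down to zero workers, so a
real allocator would probably want a minimum of one per active stream.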

I worked on a network appliance which did complex resource
allocation while forwarding packets. It wasn't simple.

geoff steckel
