What is the rationale behind this statement:
"... - CPU: maximum SINGLE CORE "turbo" speed. Disable the other cores,
they're not helping you at all..."?

/wbr
Ariel Burbaickij

On Fri, Nov 9, 2012 at 3:47 PM, Ryan McBride <mcbr...@openbsd.org> wrote:
> My immediate reaction is "don't do it", but on the other hand I've never
> known people for whom 'money is not a problem' to shy away from
> something because of boring concerns like security. So...
>
>
> Software:
>
> Basically, to do this "correctly" you need to parse all the packets
> running in both directions between the two endpoints, tracking the acks
> and correctly emulating the behaviour of the TCP stacks on both sides to
> determine what is valid data to convert to UDP.
>
> Things to think about:
>
> - IP fragment reassembly
> - duplicate packets
> - out of order packets
> - lost packets
> - TCP resends
> - TCP checksums
> - IP checksums
> - TCP sequence number validation
> - etc, etc.
>
> Look at pf_normalise_state_tcp() in pf_norm.c and pf_test_state_tcp() in
> pf.c for a small taste of the scope of what you're considering if you
> want to write this in the kernel. Further examples for TCP reassembly
> could be found in the source code for ports/net/snort or
> ports/net/tcpflow.
>
> Of course you can take some shortcuts if you assume that the data you're
> getting is clean, and even more if you don't have to parse the TCP
> stream but can handle each individual TCP packet as an individual
> payload. Perhaps your current problematic implementation already does
> this? If so, it's also probably trivial to inject bogus data into the
> stream and have it accepted. Maybe that's a feature.
>
> Remember: Lots of attacks can be performed against this hacked up
> monstrosity unless everything is exactly perfect. Good luck with the
> frankenstein code, it's not supported.
>
>
> Hardware:
>
> - NIC: something that allows you to adjust the interrupt rate, e.g. em,
>   bnx. On the other hand if the packet rate is not too high a cheaper
>   network card without any bells and whistles might give you better
>   performance (less overhead in the interrupt handler). I'd say you'd be
>   best off buying a bunch and testing them.
>
> - CPU: maximum SINGLE CORE "turbo" speed. Disable the other cores,
>   they're not helping you at all; in theory you want the biggest,
>   fastest cache possible, but perhaps not necessarily depending on how
>   much software you're running.
>
> - Fast RAM might help, but you don't need much. Probably the minimum you
>   can get in a board with the above CPU.
>
> Also, remember to use the shortest patch cables possible, to reduce
> signal propagation latency.
>
>
>
> On Thu, Nov 08, 2012 at 08:08:05PM +0200, Dan Shechter wrote:
> > For unrelated reasons, I can't directly receive the TCP stream.
> >
> > I must copy the TCP data from a running stream to another server. I
> > can use tap or just port-mirroring on the switch. So I can't use any
> > network stack or leverage any offloading.
> >
> > I also need to modify the received data, and add a few application
> > headers before sending it as a multicast udp stream.
> >
> > Winsock is userland. What I want to do is in the kernel, even before
> > ip_input. I guess it should be faster.
> >
> >
> > On Thu, Nov 8, 2012 at 7:36 PM, Johan Beisser <j...@caustic.org> wrote:
> > > On Thu, Nov 8, 2012 at 4:12 AM, Dan Shechter <dans...@gmail.com> wrote:
> > >> Hi All,
> > >>
> > >> <current situation>
> > >> A windows 2008 server is receiving TCP traffic from a stock exchange
> > >> and sends it, almost as is, using UDP multicast to automated high
> > >> frequency traders.
> > >>
> > >> StockExchange --TCP---> windows2008 ---MCAST-UDP---->
> > >>
> > >> On average, the time it takes to do the TCP to UDP translation, using
> > >> winsock, is 240 microseconds. It can even be as high as 60,000
> > >> microseconds.
> > >> </current situation>
> > >>
> > >> <my idea>
> > >> 1. Use port mirroring to get the TCP data sent to a dedicated OpenBSD
> > >> box with two NICs. One for the TCP, the other for the multicast UDP.
> > >
> > > You'll incur an extra penalty offloading to the kernel. Winsock is
> > > already doing that, though.
> > >
> > >> 2. Put the TCP port in a promiscuous mode.
> > >
> > > Why? You can just set up the right bits to listen to on the network,
> > > and pull raw frames to be processed. Or, just let the network stack
> > > behave as it should.
> > >
> > >> 3. Write my TCP->UDP logic directly into ether_input.c
> > >
> > > Any reason to not use pf for this translation?
> > >
> > >> </my idea>
> > >>
> > >> Now for the questions:
> > >> 1. Am I on the right track? or in other words how crazy is my idea?
> > >
> > > Pretty crazy. You may want to see if there's hardware accelerated or
> > > on NIC TCP off-load options instead.
> > >
> > >> 2. What would be the latency? Can I achieve 50 microseconds between
> > >> getting the interrupt and until sending the new packet through the
> > >> NIC?
> > >
> > > See above. You'll end up having to do some tuning.
> > >
> > >> 3. Which NIC/CPU/Memory should I use? Money is not a problem.
> > >
> > > Custom order a few NICs, hire a developer to write a driver to offload
> > > TCP/UDP on the NIC, and enable as little kernel interference as
> > > possible.
> > >
> > > Money's not a problem, right?
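
For anyone wondering what a few items on Ryan's "things to think about" list (duplicate packets, out of order packets, TCP resends, sequence numbers) look like in practice, here is a rough userland sketch in Python. It is not kernel code and not anything taken from pf. The names PassiveStream, feed and seq_diff are made up for illustration. It follows one direction of a mirrored stream and deliberately ignores checksums, window validation, lost segments that never get resent, and any bound on buffered data, which is exactly the kind of shortcut that makes bogus data injection easy.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_diff(a, b):
    """Signed distance a - b in 32-bit sequence space."""
    return ((a - b + (MOD >> 1)) % MOD) - (MOD >> 1)


class PassiveStream:
    """Hypothetical one-direction reassembler for mirrored TCP segments."""

    def __init__(self, isn):
        self.next_seq = (isn + 1) % MOD  # first data byte follows the SYN
        self.pending = {}                # buffered segments keyed by seq

    def feed(self, seq, payload):
        """Take one segment; return the bytes that are now in order."""
        if payload:
            old = self.pending.get(seq, b"")
            if len(payload) > len(old):
                self.pending[seq] = payload  # keep the longest copy seen
        out = bytearray()
        progress = True
        while progress:
            progress = False
            for s in list(self.pending):
                offset = seq_diff(self.next_seq, s)
                if offset < 0:
                    continue  # gap: an earlier segment is still missing
                data = self.pending.pop(s)
                if offset < len(data):
                    # New bytes, possibly after an already-delivered prefix.
                    out += data[offset:]
                    self.next_seq = (self.next_seq + len(data) - offset) % MOD
                    progress = True
                # Otherwise it is a pure duplicate/resend and is dropped.
        return bytes(out)


stream = PassiveStream(isn=999)
print(stream.feed(1000, b"abc"))  # in order, delivered at once
print(stream.feed(1006, b"ghi"))  # out of order, held back
print(stream.feed(1003, b"def"))  # fills the gap, releases both
```

Only the bytes returned by feed() would be candidates for the UDP payload. A far-future or forged sequence number simply sits in pending forever here, so a real implementation would need the window and ack tracking Ryan describes.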