Hi Antoine,

I think the 5 GB/s figure here was measured on localhost. Since localhost does not depend on network speed, and I've checked that the CPU is not the bottleneck when running the benchmark, I think Flight should be able to reach a higher throughput.

Thanks,
Jiajia

-----Original Message-----
From: Antoine Pitrou <anto...@python.org>
Sent: Friday, April 24, 2020 5:47 PM
To: dev@arrow.apache.org
Subject: Re: Question regarding Arrow Flight Throughput

The problem with gRPC is that it was designed with relatively small requests and payloads in mind. We're using it for a large data application which it wasn't optimized for. Also, its threading model is inscrutable (yielding those weird benchmark results).

However, 5 GB/s is indeed very good if between different machines.

Regards

Antoine.


On 24/04/2020 at 05:15, Wes McKinney wrote:
> On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> hi Jiajia,
>>
>> See my TODO here
>>
>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>>
>> My guess is that if you want to get faster throughput with multiple
>> cores, you need to run more than one server and serve on different
>> ports rather than having all threads go to the same server through
>> the same port. I don't think we've made any manycore scalability
>> claims, though.
>>
>> I tried to run this myself but I can't get the benchmark executable
>> to run on my machine right now -- this seems to be a regression.
>>
>> https://issues.apache.org/jira/browse/ARROW-8578
>
> This turned out to be a false alarm and went away after a reboot.
>
> On my laptop a single thread is faster than multiple threads making
> requests to a sole server, so this supports the hypothesis that
> concurrent requests on the same port do not increase throughput.
>
> $ ./release/arrow-flight-benchmark -num_threads 1
> Speed: 5131.73 MB/s
>
> $ ./release/arrow-flight-benchmark -num_threads 16
> Speed: 4258.58 MB/s
>
> I'd suggest improving the benchmark executable to spawn multiple
> servers as the next step to study multicore throughput. That said, with
> the above being ~40 Gbps already, it's unclear how much higher throughput
> can go realistically.
>
>>
>> - Wes
>>
>> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com> wrote:
>>>
>>> Hi all,
>>>
>>> I have some questions about Arrow Flight throughput. This article
>>> (https://www.dremio.com/understanding-apache-arrow-flight/) says
>>> "High efficiency. Flight is designed to work without any serialization
>>> or deserialization of records, and with zero memory copies, achieving over
>>> 20 Gbps per core." And the other article
>>> (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/)
>>> says "As far as absolute speed, in our C++ data throughput benchmarks, we
>>> are seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost
>>> without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of
>>> data in about 4 seconds:"
>>>
>>> Here 20 Gbps / 8 = 2.5 GB/s. Does that mean that if we run the benchmark
>>> on a server with two cores, the throughput will be 5 GB/s? I have run
>>> arrow-flight-benchmark on my server with 40 cores, but the result is
>>> "Speed: 2420.82 MB/s".
>>>
>>> So what should I do to increase the throughput? Please correct me if I am
>>> wrong. Thank you in advance!
>>>
>>> Thanks,
>>> Jiajia
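
For reference, the unit conversions used in this thread (20 Gbps / 8 = 2.5 GB/s, and 5131.73 MB/s being roughly 40 Gbps) can be checked with a small sketch like the one below. It assumes decimal units (1 GB = 1000 MB, 1 byte = 8 bits); whether the benchmark's "MB/s" is decimal or binary is not stated in the thread, and the function names are hypothetical helpers, not part of Arrow.

```python
# Minimal sketch of the throughput arithmetic discussed in the thread.
# Assumption: decimal units (1 GB = 1000 MB); the benchmark's exact
# definition of "MB" is not confirmed here.

def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000


def gbps_to_gb_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8


# Benchmark readings quoted in the thread, in MB/s.
readings = {
    "1 thread (Wes's laptop)": 5131.73,
    "16 threads (Wes's laptop)": 4258.58,
    "40-core server (Jiajia)": 2420.82,
}

for label, mb_per_s in readings.items():
    print(f"{label}: {mb_per_s:.2f} MB/s = {mb_per_s_to_gbps(mb_per_s):.2f} Gbps")

# The "20 Gbps per core" claim expressed in GB/s.
print(f"20 Gbps = {gbps_to_gb_per_s(20):.1f} GB/s")
```

Under these assumptions, the single-thread result already exceeds the quoted "20 Gbps per core" figure, while the 40-core server result sits just below it.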