Hi Wes,

Thanks for your reply!

Thanks,
Jiajia

-----Original Message-----
From: Wes McKinney <wesmck...@gmail.com>
Sent: Friday, April 24, 2020 11:15 AM
To: dev <dev@arrow.apache.org>
Subject: Re: Question regarding Arrow Flight Throughput

On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Jiajia,
>
> See my TODO here
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>
> My guess is that if you want to get faster throughput with multiple
> cores, you need to run more than one server and serve on different
> ports rather than having all threads go to the same server through the
> same port. I don't think we've made any manycore scalability claims,
> though.
>
> I tried to run this myself but I can't get the benchmark executable to
> run on my machine right now -- this seems to be a regression.
>
> https://issues.apache.org/jira/browse/ARROW-8578

This turned out to be a false alarm and went away after a reboot.

On my laptop a single thread is faster than multiple threads making
requests to a sole server, so this supports the hypothesis that
concurrent requests on the same port do not increase throughput.

$ ./release/arrow-flight-benchmark -num_threads 1
Speed: 5131.73 MB/s

$ ./release/arrow-flight-benchmark -num_threads 16
Speed: 4258.58 MB/s

I'd suggest improving the benchmark executable to spawn multiple
servers as the next step to study multicore throughput. That said,
with the above being ~40 Gbps already, it's unclear how much higher
throughput can realistically go.

>
> - Wes
>
> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com> wrote:
> >
> > Hi all,
> >
> > I have some doubts about Arrow Flight throughput. This article
> > (https://www.dremio.com/understanding-apache-arrow-flight/) says
> > "High efficiency. Flight is designed to work without any serialization
> > or deserialization of records, and with zero memory copies, achieving over
> > 20 Gbps per core."
> >
> > And this other article
> > (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) says
> > "As far as absolute speed, in our C++ data throughput benchmarks, we
> > are seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost
> > without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of
> > data in about 4 seconds:"
> >
> > Here 20 Gbps / 8 = 2.5 GB/s. Does that mean that if we run the benchmark
> > on a server with two cores, the throughput will be 5 GB/s? I ran
> > arrow-flight-benchmark on my server with 40 cores, but the result is
> > "Speed: 2420.82 MB/s".
> >
> > So what should I do to increase the throughput? Please correct me if I am
> > wrong. Thank you in advance!
> >
> > Thanks,
> > Jiajia
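
For reference, the unit arithmetic used in this thread (20 Gbps / 8 = 2.5 GB/s, and ~5 GB/s being roughly 40 Gbps) can be checked with a small sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 Gbps = 10^9 bits/s); whether the benchmark reports decimal MB or MiB is not stated in the thread, so treat the results as approximate. The function names are hypothetical helpers, not part of Arrow.

```python
# Assumption: decimal units (1 MB = 10**6 bytes, 1 Gbps = 10**9 bits/s).
# If the benchmark reports MiB (2**20 bytes), the Gbps values are ~5% higher.


def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000


def gbps_to_gb_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8


# Throughput figures quoted in this thread, in MB/s.
measurements = {
    "1 thread (laptop)": 5131.73,
    "16 threads (laptop)": 4258.58,
    "40-core server": 2420.82,
}

for label, speed in measurements.items():
    print(f"{label}: {speed:.2f} MB/s = {mb_per_s_to_gbps(speed):.2f} Gbps")

print(f"20 Gbps = {gbps_to_gb_per_s(20):.2f} GB/s")
```

Under these assumptions, the 2420.82 MB/s measured on the 40-core server works out to about 19.4 Gbps, which is close to the quoted "20 Gbps per core" figure, and the single-threaded laptop run is about 41 Gbps.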