Hi Antoine,

> The question, though, is: do you *need* those higher speeds on localhost? In
> which context are you considering Flight?

We want to send large data (held in cache) to a data analytics application
running on the same machine.

Thanks,
Jiajia

-----Original Message-----
From: Antoine Pitrou <anto...@python.org>
Sent: Saturday, April 25, 2020 1:01 AM
To: dev@arrow.apache.org
Subject: Re: Question regarding Arrow Flight Throughput

Hi Jiajia,

It's true one should be able to reach higher speeds. For example, I can
reach more than 7 GB/s on a simple TCP connection, in pure Python, using
only two threads:
https://gist.github.com/pitrou/6cdf7bf6ce7a35f4073a7820a891f78e

The question, though, is: do you *need* those higher speeds on localhost? In
which context are you considering Flight?

Regards

Antoine.


On 24/04/2020 at 18:52, Li, Jiajia wrote:
> Hi Antoine,
>
> I think the 5 GB/s here is on localhost. Since localhost does not depend on
> network speed, and I've checked that the CPU is not the bottleneck when
> running the benchmark, I think Flight can reach a higher throughput.
>
> Thanks,
> Jiajia
>
> -----Original Message-----
> From: Antoine Pitrou <anto...@python.org>
> Sent: Friday, April 24, 2020 5:47 PM
> To: dev@arrow.apache.org
> Subject: Re: Question regarding Arrow Flight Throughput
>
>
> The problem with gRPC is that it was designed with relatively small requests
> and payloads in mind. We're using it for a large data application which it
> wasn't optimized for. Also, its threading model is inscrutable (yielding
> those weird benchmark results).
>
> However, 5 GB/s is indeed very good if between different machines.
>
> Regards
>
> Antoine.
>
>
> On 24/04/2020 at 05:15, Wes McKinney wrote:
>> On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>
>>> hi Jiajia,
>>>
>>> See my TODO here
>>>
>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>>>
>>> My guess is that if you want to get faster throughput with multiple
>>> cores, you need to run more than one server and serve on different
>>> ports rather than having all threads go to the same server through
>>> the same port. I don't think we've made any manycore scalability
>>> claims, though.
>>>
>>> I tried to run this myself but I can't get the benchmark executable
>>> to run on my machine right now -- this seems to be a regression.
>>>
>>> https://issues.apache.org/jira/browse/ARROW-8578
>>
>> This turned out to be a false alarm and went away after a reboot.
>>
>> On my laptop a single thread is faster than multiple threads making
>> requests to a sole server, so this supports the hypothesis that
>> concurrent requests on the same port do not increase throughput.
>>
>> $ ./release/arrow-flight-benchmark -num_threads 1
>> Speed: 5131.73 MB/s
>>
>> $ ./release/arrow-flight-benchmark -num_threads 16
>> Speed: 4258.58 MB/s
>>
>> I'd suggest improving the benchmark executable to spawn multiple
>> servers as the next step to study multicore throughput. That said,
>> with the above being ~40 Gbps already, it's unclear how much higher
>> throughput can realistically go.
>>
>>
>>>
>>> - Wes
>>>
>>> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have some questions about Arrow Flight throughput. This article
>>>> (https://www.dremio.com/understanding-apache-arrow-flight/) says
>>>> "High efficiency. Flight is designed to work without any
>>>> serialization or deserialization of records, and with zero memory copies,
>>>> achieving over 20 Gbps per core."
>>>> And the other article
>>>> (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/)
>>>> says "As far as absolute speed, in our C++ data throughput benchmarks, we
>>>> are seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost
>>>> without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of
>>>> data in about 4 seconds:"
>>>>
>>>> Here 20 Gbps / 8 = 2.5 GB/s. Does that mean that if we run the benchmark
>>>> on a server with two cores, the throughput will be 5 GB/s? I have run
>>>> arrow-flight-benchmark on my server with 40 cores, but the result is
>>>> "Speed: 2420.82 MB/s".
>>>>
>>>> So what should I do to increase the throughput? Please correct me if I am
>>>> wrong. Thank you in advance!
>>>>
>>>> Thanks,
>>>> Jiajia
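
The unit conversions used in this thread (20 Gbps / 8 = 2.5 GB/s, and
roughly 5.1 GB/s being "~40gbps") can be checked with a few lines of Python.
This is only a minimal sketch: the helper names are hypothetical, and it
assumes decimal units (1 GB = 1000 MB, 8 bits per byte). The thread does not
say whether the benchmark's "MB/s" is decimal or binary, so the figures may
be off by a few percent.

```python
# Hypothetical helpers for illustration only.
# Assumption: decimal units (1 GB = 1000 MB) and 8 bits per byte.

def gbps_to_gb_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8.0


def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8.0 / 1000.0


# The per-core claim quoted from the Dremio article.
print(f"20 Gbps = {gbps_to_gb_per_s(20):.2f} GB/s")

# Benchmark results reported in the thread, in MB/s.
results = {
    "laptop, 1 thread": 5131.73,
    "laptop, 16 threads": 4258.58,
    "40-core server": 2420.82,
}
for label, speed in results.items():
    print(f"{label}: {speed} MB/s = {mb_per_s_to_gbps(speed):.1f} Gbps")
```

Under these assumptions the single-thread laptop result comes to about
41 Gbps, in line with the "~40gbps" estimate, and the 40-core server result
of 2420.82 MB/s comes to about 19.4 Gbps, close to the quoted 20 Gbps
per-core figure.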