Hi Jiajia,

I see.  I think there are two possible avenues to try and improve this:

* make better use of gRPC in the hope of achieving higher performance.  This
doesn't seem to be easy, though.  I've already tried changing some of
the parameters listed here, but didn't get any benefit:
https://grpc.github.io/grpc/cpp/group__grpc__arg__keys.html

(perhaps there are other, lower-level APIs that we should use? I don't know)

* take the time to design and start implementing another I/O backend for
Flight.  gRPC is just one possible backend, but the Flight remote API is
simple enough that we could envision other backends (for example an HTTP
REST-like API).  If you opt for this, I would strongly suggest starting the
discussion on the mailing list in order to coordinate with other developers.

Best regards

Antoine.
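
For reference, the raw-TCP baseline mentioned in the quoted reply below can be
approximated with a minimal sketch like the following.  It is not the code from
the linked gist: the function name `measure_tcp_throughput`, the payload size
and the chunk size are all assumptions chosen for illustration, and the figure
it reports will depend heavily on the machine.

```python
import socket
import threading
import time


def measure_tcp_throughput(total_bytes=256 * 1024 * 1024, chunk_size=1024 * 1024):
    """Send total_bytes over a localhost TCP connection.

    Returns (bytes_received, MB/s) as seen by the receiving side.
    Hypothetical helper for illustration, not part of Arrow or Flight.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def sender():
        # One reusable zero-filled buffer; memoryview slicing avoids copies.
        payload = memoryview(bytes(chunk_size))
        with socket.create_connection(("127.0.0.1", port)) as sock:
            sent = 0
            while sent < total_bytes:
                n = min(chunk_size, total_bytes - sent)
                sock.sendall(payload[:n])
                sent += n

    thread = threading.Thread(target=sender)
    thread.start()

    conn, _ = listener.accept()
    buf = bytearray(chunk_size)
    received = 0
    start = time.perf_counter()
    with conn:
        while received < total_bytes:
            n = conn.recv_into(buf)  # receive into a preallocated buffer
            if n == 0:  # peer closed the connection early
                break
            received += n
    elapsed = time.perf_counter() - start

    thread.join()
    listener.close()
    return received, received / elapsed / 1e6


if __name__ == "__main__":
    nbytes, mb_per_s = measure_tcp_throughput()
    print(f"Received {nbytes} bytes at {mb_per_s:.1f} MB/s")
```

A number obtained this way is only a rough upper bound to compare against the
Flight benchmark on the same machine, since it does no framing, serialization
or RPC handling at all.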


On 24/04/2020 at 19:16, Li, Jiajia wrote:
> Hi Antoine,
> 
>> The question, though, is: do you *need* those higher speeds on localhost?  
>> In which context are you considering Flight?
> 
> We want to send large data (in cache) to a data analytics application running 
> locally.
> 
> Thanks,
> Jiajia
> 
> -----Original Message-----
> From: Antoine Pitrou <anto...@python.org> 
> Sent: Saturday, April 25, 2020 1:01 AM
> To: dev@arrow.apache.org
> Subject: Re: Question regarding Arrow Flight Throughput
> 
> 
> Hi Jiajia,
> 
> It's true one should be able to reach higher speeds.  For example, I can 
> reach more than 7 GB/s on a simple TCP connection, in pure Python, using only 
> two threads:
> https://gist.github.com/pitrou/6cdf7bf6ce7a35f4073a7820a891f78e
> 
> The question, though, is: do you *need* those higher speeds on localhost?  In 
> which context are you considering Flight?
> 
> Regards
> 
> Antoine.
> 
> 
> On 24/04/2020 at 18:52, Li, Jiajia wrote:
>> Hi Antoine,
>>
>> I think the 5 GB/s here is on localhost. Since localhost does not depend on 
>> network speed and I've checked that the CPU is not the bottleneck when running 
>> the benchmark, I think Flight should be able to reach a higher throughput.
>>
>> Thanks,
>> Jiajia
>>
>> -----Original Message-----
>> From: Antoine Pitrou <anto...@python.org>
>> Sent: Friday, April 24, 2020 5:47 PM
>> To: dev@arrow.apache.org
>> Subject: Re: Question regarding Arrow Flight Throughput
>>
>>
>> The problem with gRPC is that it was designed with relatively small requests 
>> and payloads in mind.  We're using it for a large data application which it 
>> wasn't optimized for.  Also, its threading model is inscrutable (yielding 
>> those weird benchmark results).
>>
>> However, 5 GB/s is indeed very good if between different machines.
>>
>> Regards
>>
>> Antoine.
>>
>>
>> On 24/04/2020 at 05:15, Wes McKinney wrote:
>>> On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>
>>>> hi Jiajia,
>>>>
>>>> See my TODO here
>>>>
>>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>>>>
>>>> My guess is that if you want to get faster throughput with multiple 
>>>> cores, you need to run more than one server and serve on different 
>>>> ports rather than having all threads go to the same server through 
>>>> the same port. I don't think we've made any manycore scalability 
>>>> claims, though.
>>>>
>>>> I tried to run this myself but I can't get the benchmark executable 
>>>> to run on my machine right now -- this seems to be a regression.
>>>>
>>>> https://issues.apache.org/jira/browse/ARROW-8578
>>>
>>> This turned out to be a false alarm and went away after a reboot.
>>>
>>> On my laptop a single thread is faster than multiple threads making 
>>> requests to a sole server, so this supports the hypothesis that 
>>> concurrent requests on the same port do not increase throughput.
>>>
>>> $ ./release/arrow-flight-benchmark -num_threads 1
>>> Speed: 5131.73 MB/s
>>>
>>> $ ./release/arrow-flight-benchmark -num_threads 16
>>> Speed: 4258.58 MB/s
>>>
>>> I'd suggest improving the benchmark executable to spawn multiple 
>>> servers as the next step to study multicore throughput. That said, 
>>> with the above being ~40 Gbps already, it's unclear how much higher 
>>> throughput can realistically go.
>>>
>>>
>>>>
>>>> - Wes
>>>>
>>>> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have some questions about Arrow Flight throughput. This article 
>>>>> (https://www.dremio.com/understanding-apache-arrow-flight/) says 
>>>>> "High efficiency. Flight is designed to work without any 
>>>>> serialization or deserialization of records, and with zero memory copies, 
>>>>> achieving over 20 Gbps per core."  And this other article 
>>>>> (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) 
>>>>> says "As far as absolute speed, in our C++ data throughput benchmarks, we 
>>>>> are seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost 
>>>>> without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of 
>>>>> data in about 4 seconds:"
>>>>>
>>>>> Here 20 Gbps / 8 = 2.5 GB/s. Does that mean that if we run the benchmark 
>>>>> on a server with two cores, the throughput will be 5 GB/s?  I ran 
>>>>> arrow-flight-benchmark on my server, which has 40 cores, but the result 
>>>>> is "Speed: 2420.82 MB/s".
>>>>>
>>>>> So what should I do to increase the throughput? Please correct me if I am 
>>>>> wrong. Thank you in advance!
>>>>>
>>>>> Thanks,
>>>>> Jiajia
>>>>>
>>>>>
>>>>>
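
The unit conversions used in this thread can be checked with a few lines of
Python.  This is only a sketch: the helper names are hypothetical, and it
assumes decimal units (1 GB = 1000 MB, 8 bits per byte), which may not match
how the benchmark executable computes its MB/s figure.

```python
def gbps_to_gbytes_per_s(gbps):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8


def mbytes_per_s_to_gbps(mb_per_s):
    """Convert megabytes per second to gigabits per second (decimal units assumed)."""
    return mb_per_s * 8 / 1000


if __name__ == "__main__":
    # The "20 Gbps per core" claim expressed in bytes.
    print(gbps_to_gbytes_per_s(20))                   # 2.5
    # The single-thread laptop result quoted in the thread.
    print(round(mbytes_per_s_to_gbps(5131.73), 1))    # 41.1
    # The result reported on the 40-core server.
    print(round(mbytes_per_s_to_gbps(2420.82), 1))    # 19.4
```

Under these assumptions, the 2420.82 MB/s figure is already close to the
"20 Gbps" quoted in the first article, and the 5131.73 MB/s figure corresponds
to the "~40 Gbps" mentioned above.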
