I'm not sure a new transport for gRPC would change anything.  gRPC
currently uses HTTP/2 as its transport, and there's no reason to think
HTTP/2 itself is the culprit here.

Regards

Antoine.


On 24/04/2020 at 20:48, Micah Kornfield wrote:
> A couple of questions:
> 1.  For same-node transport, would doing something with Plasma be a
> reasonable approach?
> 2.  What are the advantages/disadvantages of creating a new transport for
> gRPC [1] vs building an entirely new backend for Flight?
> 
> Thanks,
> Micah
> 
> [1] https://github.com/grpc/grpc/issues/7931
> 
> On Fri, Apr 24, 2020 at 11:37 AM David Li <li.david...@gmail.com> wrote:
> 
>> Having alternative backends for Flight has been a goal from the start,
>> which is why gRPC is wrapped and generally not exposed to the user. I
>> would be interested in collaborating on an HTTP/1 backend that is
>> accessible from the browser (or via an alternative transport meeting
>> the same requirements, e.g. WebSockets).
>>
>> In terms of tuning gRPC, taking a performance profile would be useful.
>> I remember there are some TODOs on the C++ side about copies that
>> sometimes occur due to gRPC that we don't quite understand yet. I
>> spent quite a bit of time a while ago trying to tune gRPC, but like
>> Antoine, couldn't find any easy wins.
>>
>> Best,
>> David
>>
>> On 4/24/20, Antoine Pitrou <anto...@python.org> wrote:
>>>
>>> Hi Jiajia,
>>>
>>> I see.  I think there are two possible avenues to try and improve this:
>>>
>>> * make better use of gRPC in the hope of achieving higher performance.
>>> This doesn't seem to be easy, though.  I've already tried changing some
>>> of the parameters listed here, but didn't see any benefit:
>>> https://grpc.github.io/grpc/cpp/group__grpc__arg__keys.html
>>>
>>> (perhaps there are other, lower-level APIs that we should use? I don't
>>> know)
>>>
>>> * take the time to design and start implementing another I/O backend for
>>> Flight.  gRPC is just one possible backend, but the Flight remote API is
>>> simple enough that we could envision other backends (for example an HTTP
>>> REST-like API).  If you opt for this, I would strongly suggest starting
>>> the discussion on the mailing list in order to coordinate with other
>>> developers.
>>>
>>> Best regards
>>>
>>> Antoine.
>>>
>>>
>>> On 24/04/2020 at 19:16, Li, Jiajia wrote:
>>>> Hi Antoine,
>>>>
>>>>> The question, though, is: do you *need* those higher speeds on
>>>>> localhost?  In which context are you considering Flight?
>>>>
>>>> We want to send large data (held in a cache) to a data analytics
>>>> application running locally.
>>>>
>>>> Thanks,
>>>> Jiajia
>>>>
>>>> -----Original Message-----
>>>> From: Antoine Pitrou <anto...@python.org>
>>>> Sent: Saturday, April 25, 2020 1:01 AM
>>>> To: dev@arrow.apache.org
>>>> Subject: Re: Question regarding Arrow Flight Throughput
>>>>
>>>>
>>>> Hi Jiajia,
>>>>
>>>> It's true one should be able to reach higher speeds.  For example, I can
>>>> reach more than 7 GB/s on a simple TCP connection, in pure Python, using
>>>> only two threads:
>>>> https://gist.github.com/pitrou/6cdf7bf6ce7a35f4073a7820a891f78e
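Antoine's gist (linked above) is the reference; a minimal self-contained sketch of the same kind of measurement in pure Python — plain sockets, one sink thread, one sender — looks roughly like this (all names here are illustrative, not taken from the gist):

```python
import socket
import threading
import time

def measure_tcp_throughput(total_bytes=1 << 28, chunk=1 << 20):
    """Send total_bytes over a localhost TCP connection and return
    (bytes_received, bytes_per_second)."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))   # let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def sink():
        # Accept one connection and read until EOF, discarding the data.
        conn, _ = server.accept()
        n = 0
        while True:
            data = conn.recv(chunk)
            if not data:
                break
            n += len(data)
        conn.close()
        received.append(n)

    t = threading.Thread(target=sink)
    t.start()

    payload = b"x" * chunk
    client = socket.create_connection(("127.0.0.1", port))
    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        client.sendall(payload)
        sent += len(payload)
    client.close()                  # EOF lets the sink thread finish
    t.join()
    elapsed = time.perf_counter() - start
    server.close()
    return received[0], received[0] / elapsed
```

On typical hardware `measure_tcp_throughput(1 << 30)` reports multiple GB/s, though the exact number depends heavily on the machine; the point is that raw localhost TCP leaves plenty of headroom above the Flight numbers discussed in this thread.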
>>>>
>>>> The question, though, is: do you *need* those higher speeds on
>>>> localhost?  In which context are you considering Flight?
>>>>
>>>> Regards
>>>>
>>>> Antoine.
>>>>
>>>>
>>>> On 24/04/2020 at 18:52, Li, Jiajia wrote:
>>>>> Hi Antoine,
>>>>>
>>>>> I think the 5 GB/s here is on localhost. Since localhost does not
>>>>> depend on network speed, and I've checked that the CPU is not the
>>>>> bottleneck when running the benchmark, I think Flight should be able
>>>>> to reach a higher throughput.
>>>>>
>>>>> Thanks,
>>>>> Jiajia
>>>>>
>>>>> -----Original Message-----
>>>>> From: Antoine Pitrou <anto...@python.org>
>>>>> Sent: Friday, April 24, 2020 5:47 PM
>>>>> To: dev@arrow.apache.org
>>>>> Subject: Re: Question regarding Arrow Flight Throughput
>>>>>
>>>>>
>>>>> The problem with gRPC is that it was designed with relatively small
>>>>> requests and payloads in mind.  We're using it for a large data
>>>>> application which it wasn't optimized for.  Also, its threading model
>>>>> is inscrutable (yielding those weird benchmark results).
>>>>>
>>>>> However, 5 GB/s is indeed very good if between different machines.
>>>>>
>>>>> Regards
>>>>>
>>>>> Antoine.
>>>>>
>>>>>
>>>>> On 24/04/2020 at 05:15, Wes McKinney wrote:
>>>>>> On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> hi Jiajia,
>>>>>>>
>>>>>>> See my TODO here
>>>>>>>
>>>>>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>>>>>>>
>>>>>>> My guess is that if you want to get faster throughput with multiple
>>>>>>> cores, you need to run more than one server and serve on different
>>>>>>> ports rather than having all threads go to the same server through
>>>>>>> the same port. I don't think we've made any manycore scalability
>>>>>>> claims, though.
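The multiple-servers hypothesis is easy to sketch outside Flight with plain sockets: run one sink server per port, push to all of them from concurrent sender threads, and compare the aggregate rate against a single connection. This is an illustrative stand-in under those assumptions, not the Flight benchmark itself:

```python
import socket
import threading
import time

def sink(server_sock, counts, i, chunk=1 << 16):
    # Accept one connection and count bytes received until EOF.
    conn, _ = server_sock.accept()
    n = 0
    while True:
        data = conn.recv(chunk)
        if not data:
            break
        n += len(data)
    conn.close()
    counts[i] = n

def aggregate_throughput(n_servers=4, per_conn_bytes=1 << 24, chunk=1 << 16):
    """One sink server per port, one sender thread per server.
    Returns (total_bytes, aggregate_bytes_per_second)."""
    servers = []
    counts = [0] * n_servers
    for _ in range(n_servers):
        s = socket.socket()
        s.bind(("127.0.0.1", 0))    # each server gets its own port
        s.listen(1)
        servers.append(s)
    sinks = [threading.Thread(target=sink, args=(s, counts, i, chunk))
             for i, s in enumerate(servers)]
    for t in sinks:
        t.start()

    payload = b"x" * chunk

    def push(port):
        conn = socket.create_connection(("127.0.0.1", port))
        sent = 0
        while sent < per_conn_bytes:
            conn.sendall(payload)
            sent += len(payload)
        conn.close()

    start = time.perf_counter()
    pushers = [threading.Thread(target=push, args=(s.getsockname()[1],))
               for s in servers]
    for t in pushers:
        t.start()
    for t in pushers:
        t.join()
    for t in sinks:
        t.join()
    elapsed = time.perf_counter() - start
    for s in servers:
        s.close()
    total = sum(counts)
    return total, total / elapsed
```

If the aggregate rate grows with `n_servers` while the per-port rate in the single-server Flight benchmark does not, that is consistent with the server (not the wire) being the serialization point.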
>>>>>>>
>>>>>>> I tried to run this myself but I can't get the benchmark executable
>>>>>>> to run on my machine right now -- this seems to be a regression.
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/ARROW-8578
>>>>>>
>>>>>> This turned out to be a false alarm and went away after a reboot.
>>>>>>
>>>>>> On my laptop a single thread is faster than multiple threads making
>>>>>> requests to a sole server, so this supports the hypothesis that
>>>>>> concurrent requests on the same port do not increase throughput.
>>>>>>
>>>>>> $ ./release/arrow-flight-benchmark -num_threads 1
>>>>>> Speed: 5131.73 MB/s
>>>>>>
>>>>>> $ ./release/arrow-flight-benchmark -num_threads 16
>>>>>> Speed: 4258.58 MB/s
>>>>>>
>>>>>> I'd suggest improving the benchmark executable to spawn multiple
>>>>>> servers as the next step to study multicore throughput. That said,
>>>>>> with the above already at ~40 Gbps, it's unclear how much higher
>>>>>> throughput can realistically go.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> - Wes
>>>>>>>
>>>>>>> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I have some doubts about arrow flight throughput. In this
>>>>>>>> article(https://www.dremio.com/understanding-apache-arrow-flight/),
>>>>>>>> it said "High efficiency. Flight is designed to work without any
>>>>>>>> serialization or deserialization of records, and with zero memory
>>>>>>>> copies, achieving over 20 Gbps per core."  And in the other article
>>>>>>>> (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/),
>>>>>>>> it said "As far as absolute speed, in our C++ data throughput
>>>>>>>> benchmarks, we are seeing end-to-end TCP throughput in excess of
>>>>>>>> 2-3GB/s on localhost without TLS enabled. This benchmark shows a
>>>>>>>> transfer of ~12 gigabytes of data in about 4 seconds:"
>>>>>>>>
>>>>>>>> Here 20 Gbps / 8 = 2.5 GB/s; does that mean that if we run the
>>>>>>>> benchmark on a server with two cores, the throughput will be
>>>>>>>> 5 GB/s?  But I have run the arrow-flight-benchmark on my server
>>>>>>>> with 40 cores, and the result is " Speed: 2420.82 MB/s" .
>>>>>>>>
>>>>>>>> So what should I do to increase the throughput? Please correct me
>>>>>>>> if I am wrong. Thank you in advance!
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiajia
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>
>>
> 
