Hi Wes,

Thanks for your reply! 

Thanks,
Jiajia

-----Original Message-----
From: Wes McKinney <wesmck...@gmail.com> 
Sent: Friday, April 24, 2020 11:15 AM
To: dev <dev@arrow.apache.org>
Subject: Re: Question regarding Arrow Flight Throughput

On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Jiajia,
>
> See my TODO here
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>
> My guess is that if you want to get faster throughput with multiple 
> cores, you need to run more than one server and serve on different 
> ports rather than having all threads go to the same server through the 
> same port. I don't think we've made any manycore scalability claims, 
> though.
>
> I tried to run this myself but I can't get the benchmark executable to 
> run on my machine right now -- this seems to be a regression.
>
> https://issues.apache.org/jira/browse/ARROW-8578

This turned out to be a false alarm and went away after a reboot.

On my laptop, a single thread is faster than multiple threads making requests to 
a sole server, so this supports the hypothesis that concurrent requests on the 
same port do not increase throughput.

$ ./release/arrow-flight-benchmark -num_threads 1
Speed: 5131.73 MB/s

$ ./release/arrow-flight-benchmark -num_threads 16
Speed: 4258.58 MB/s

I'd suggest improving the benchmark executable to spawn multiple servers as the 
next step to study multicore throughput. That said, with the above being ~40 Gbps 
already, it's unclear how much higher throughput can realistically go.
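
As a quick check of the unit arithmetic in this thread, here is a minimal sketch that converts the benchmark's MB/s figures to Gbps. It assumes the benchmark reports decimal megabytes (10^6 bytes); if it actually reports MiB (2^20 bytes), the Gbps values would be roughly 5% higher. The helper name `mb_per_s_to_gbps` is made up for illustration and is not part of Arrow.

```python
def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second.

    Assumes decimal units: 1 MB = 10^6 bytes, 1 Gbit = 10^9 bits.
    """
    return mb_per_s * 8 / 1000


# Figures reported by arrow-flight-benchmark in this message.
single_thread = 5131.73   # MB/s with -num_threads 1
sixteen_threads = 4258.58  # MB/s with -num_threads 16

print(f"1 thread:   {mb_per_s_to_gbps(single_thread):.1f} Gbps")
print(f"16 threads: {mb_per_s_to_gbps(sixteen_threads):.1f} Gbps")
print(f"16-thread / 1-thread ratio: {sixteen_threads / single_thread:.2f}")
```

Under that assumption, the single-thread run comes out a little above 40 Gbps, and the 16-thread run is somewhat slower rather than faster.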


>
> - Wes
>
> On Thu, Apr 23, 2020 at 8:17 PM Li, Jiajia <jiajia...@intel.com> wrote:
> >
> > Hi all,
> >
> > I have some questions about Arrow Flight throughput. This 
> > article (https://www.dremio.com/understanding-apache-arrow-flight/) 
> > says "High efficiency. Flight is designed to work without any serialization 
> > or deserialization of records, and with zero memory copies, achieving over 
> > 20 Gbps per core."  And this other article 
> > (https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) 
> > says "As far as absolute speed, in our C++ data throughput benchmarks, we 
> > are seeing end-to-end TCP throughput in excess of 2-3GB/s on localhost 
> > without TLS enabled. This benchmark shows a transfer of ~12 gigabytes of 
> > data in about 4 seconds:"
> >
> > Here 20 Gbps / 8 = 2.5 GB/s. Does that mean that if we run the benchmark on a 
> > server with two cores, the throughput will be 5 GB/s? I ran 
> > arrow-flight-benchmark on a server with 40 cores, but the result was "Speed: 
> > 2420.82 MB/s".
> >
> > So what should I do to increase the throughput? Please correct me if I am 
> > wrong. Thank you in advance!
> >
> > Thanks,
> > Jiajia
> >
> >
> >
