Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Wes McKinney
). If you opt for this, I would strongly suggest starting the
> >>> discussion on the mailing-list in order to coordinate with other
> >>> developers.
> >>>
> >>> Best regards
> >>>
> >>> Antoine.
> >>>

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Antoine Pitrou
rit :
>>>> Hi Antoine,
>>>>
>>>>> The question, though, is: do you *need* those higher speeds on localhost?
>>>>> In which context are you considering Flight?
>>>>
>>>> We want to send large data (in cache) to th

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Micah Kornfield
> In which context are you considering Flight?
> >>
> >> We want to send large data (in cache) to the data analytic application (in
> >> local).
> >>
> >> Thanks,
> >> Jiajia
> >>
> >> -Original Message-

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread David Li
st?
>>> In which context are you considering Flight?
>>
>> We want to send large data (in cache) to the data analytic application (in
>> local).
>>
>> Thanks,
>> Jiajia
>>
>> -Original Message-
>> From: Antoine Pitrou

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Antoine Pitrou
Message-
> From: Antoine Pitrou
> Sent: Saturday, April 25, 2020 1:01 AM
> To: dev@arrow.apache.org
> Subject: Re: Question regarding Arrow Flight Throughput
>
>
> Hi Jiajia,
>
> It's true one should be able to reach higher speeds. For example, I can

RE: Question regarding Arrow Flight Throughput

2020-04-24 Thread Li, Jiajia
Sent: Saturday, April 25, 2020 1:01 AM
To: dev@arrow.apache.org
Subject: Re: Question regarding Arrow Flight Throughput
Hi Jiajia,
It's true one should be able to reach higher speeds. For example, I can reach more than 7 GB/s on a simple TCP connection, in pure Python, using only two thr
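For readers who want to try a comparable measurement, here is a minimal sketch of a localhost TCP throughput test in pure Python with two threads (one sender, one receiver). This is not the script used for the 7 GB/s figure quoted above; the function name `measure_loopback_throughput`, the payload size, and the chunk size are all assumptions chosen for illustration, and results will vary widely by machine and OS.

```python
import socket
import threading
import time


def measure_loopback_throughput(total_bytes=64 * 1024 * 1024, chunk_size=1024 * 1024):
    """Send total_bytes over a localhost TCP connection.

    Returns (bytes_received, gigabytes_per_second). Hypothetical helper,
    not part of Arrow or of any benchmark mentioned in the thread.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def send_all():
        # memoryview lets us slice the payload without copying it
        payload = memoryview(bytes(chunk_size))
        with socket.create_connection(("127.0.0.1", port)) as conn:
            sent = 0
            while sent < total_bytes:
                n = min(chunk_size, total_bytes - sent)
                conn.sendall(payload[:n])
                sent += n

    sender = threading.Thread(target=send_all)
    start = time.perf_counter()  # timing includes connection setup
    sender.start()

    conn, _ = listener.accept()
    buf = bytearray(chunk_size)
    received = 0
    with conn:
        while True:
            n = conn.recv_into(buf)  # reuse one buffer to avoid allocations
            if n == 0:  # sender closed the connection
                break
            received += n
    elapsed = time.perf_counter() - start
    sender.join()
    listener.close()
    return received, received / elapsed / 1e9


if __name__ == "__main__":
    received, gb_per_s = measure_loopback_throughput()
    print(f"received {received} bytes at {gb_per_s:.2f} GB/s")
```

Larger payloads and chunk sizes usually give steadier numbers, since connection setup is included in the timing.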

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Antoine Pitrou
ottleneck when running benchmark,
> I think flight can get a higher throughput.
>
> Thanks,
> Jiajia
>
> -Original Message-
> From: Antoine Pitrou
> Sent: Friday, April 24, 2020 5:47 PM
> To: dev@arrow.apache.org
> Subject: Re: Question regarding Arrow Flight Thr

RE: Question regarding Arrow Flight Throughput

2020-04-24 Thread Li, Jiajia
riday, April 24, 2020 5:47 PM
To: dev@arrow.apache.org
Subject: Re: Question regarding Arrow Flight Throughput
The problem with gRPC is that it was designed with relatively small requests and payloads in mind. We're using it for a large data application which it wasn't optimized for. Also

RE: Question regarding Arrow Flight Throughput

2020-04-24 Thread Li, Jiajia
Hi Wes,
Thanks for your reply!
Thanks,
Jiajia
-Original Message-
From: Wes McKinney
Sent: Friday, April 24, 2020 11:15 AM
To: dev
Subject: Re: Question regarding Arrow Flight Throughput
On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney wrote:
>
> hi Jiajia,
>
> See

Re: Question regarding Arrow Flight Throughput

2020-04-24 Thread Antoine Pitrou
The problem with gRPC is that it was designed with relatively small requests and payloads in mind. We're using it for a large data application which it wasn't optimized for. Also, its threading model is inscrutable (yielding those weird benchmark results). However, 5 GB/s is indeed very good i

Re: Question regarding Arrow Flight Throughput

2020-04-23 Thread Wes McKinney
On Thu, Apr 23, 2020 at 10:02 PM Wes McKinney wrote:
>
> hi Jiajia,
>
> See my TODO here
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
>
> My guess is that if you want to get faster throughput with multiple
> cores, you need to run more than one serv

Re: Question regarding Arrow Flight Throughput

2020-04-23 Thread Wes McKinney
hi Jiajia,
See my TODO here
https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L182
My guess is that if you want to get faster throughput with multiple cores, you need to run more than one server and serve on different ports rather than having all threads go to
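The layout suggested above (several servers on different ports, one client per server) can be sketched with plain TCP sockets from the Python standard library. This only illustrates the wiring; it is not Flight or gRPC code, and Python threads will not show the multi-core gains the suggestion is about. `StreamHandler`, `start_servers`, `fetch_from_all`, and `PAYLOAD_BYTES` are hypothetical names introduced for this example.

```python
import socket
import socketserver
import threading

PAYLOAD_BYTES = 4 * 1024 * 1024  # bytes streamed per connection (arbitrary choice)


class StreamHandler(socketserver.BaseRequestHandler):
    """Streams a fixed payload to each client; the server then closes the socket."""

    def handle(self):
        self.request.sendall(bytes(PAYLOAD_BYTES))


def start_servers(count):
    """Start `count` independent servers, each on its own OS-assigned port."""
    servers = []
    for _ in range(count):
        server = socketserver.TCPServer(("127.0.0.1", 0), StreamHandler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        servers.append(server)
    return servers


def fetch(port, results, index):
    """Read from one server until it closes the connection; record the byte count."""
    received = 0
    with socket.create_connection(("127.0.0.1", port)) as conn:
        while True:
            chunk = conn.recv(1 << 20)
            if not chunk:
                break
            received += len(chunk)
    results[index] = received


def fetch_from_all(servers):
    """Run one client thread per server, so no two clients share a port."""
    results = [0] * len(servers)
    threads = [
        threading.Thread(target=fetch, args=(server.server_address[1], results, i))
        for i, server in enumerate(servers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    servers = start_servers(4)
    print(fetch_from_all(servers))
    for server in servers:
        server.shutdown()
        server.server_close()
```

In a real test each server would more likely be a separate process, so that the servers do not compete for one interpreter lock or one shared thread pool.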

Question regarding Arrow Flight Throughput

2020-04-23 Thread Li, Jiajia
Hi all,
I have some questions about Arrow Flight throughput. This article (https://www.dremio.com/understanding-apache-arrow-flight/) says: "High efficiency. Flight is designed to work without any serialization or deserialization of records, and with zero memory copies, achieving over 20 Gbp
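The thread mixes two units: the article's figure is in gigabits per second, while the replies quote gigabytes per second. Assuming the truncated figure above reads 20 Gbps, a small conversion sketch puts the numbers on the same scale (the function names are illustrative only):

```python
def gbps_to_gigabytes_per_second(gigabits_per_second):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gigabits_per_second / 8


def gigabytes_per_second_to_gbps(gigabytes_per_second):
    """Convert gigabytes per second to gigabits per second."""
    return gigabytes_per_second * 8


print(gbps_to_gigabytes_per_second(20))  # 2.5
print(gigabytes_per_second_to_gbps(5))   # 40
print(gigabytes_per_second_to_gbps(7))   # 56
```

Under that assumption, 20 Gbps corresponds to 2.5 GB/s, so the 5 GB/s and 7 GB/s figures mentioned in the replies are 40 Gbps and 56 Gbps respectively.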