Re: What do people think about a one day get together?

2018-04-11 Thread Sourav Mazumder
+1 I would love to attend too. I will be at Spark Summit and am presenting there as well. Regards, Sourav Mazumder Data Science Center of Competency IBM Analytics On Mon, Apr 9, 2018 at 10:23 AM, Julian Hyde wrote: > +1 The Arrow community would benefit greatly from a > conf

Re: Meetup in SF, Additional Speakers?

2018-03-27 Thread Sourav Mazumder
make this custom REST Data Source (https://github.com/sourav-mazumder/Data-Science-Extensions/tree/master/spark-datasource-rest) store the data using Arrow and get the performance advantage. This REST Data Source can be used for any REST-based data service, not just the Watson service. 2. NetCDF is

Re: Meetup in SF, Additional Speakers?

2018-03-27 Thread Sourav Mazumder
Hi Jacques, I can talk on either of these 2 topics: 1. Using Arrow with IBM Watson Studio for vectorized query processing on large volumes of data 2. Using Arrow with the NetCDF data format to support scientific data processing Regards, Sourav Mazumder Data Science Center of Competency

Re: Comparing with Parquet

2016-02-26 Thread Sourav Mazumder
> > deserialized from Parquet for use in Python and R. > > > > > > - Wes > > > > > > On Thu, Feb 25, 2016 at 8:20 AM, Henry Robinson > > wrote: > > >> Think of Parquet as a format well-suited to writing very large > > >> dataset

Comparing with Parquet

2016-02-25 Thread Sourav Mazumder
Hi All, I am new to this and still trying to figure out where exactly Arrow fits in the ecosystem of various Big Data technologies. In that respect, the first thing that came to my mind is how Arrow compares with Parquet. In my understanding, Parquet also supports a very efficient columnar format (wi