To add, schema evolution is better supported in Parquet than in ORC (at the
cost of somewhat slower reads), as ORC is truly index based;
this is especially useful in case you would want to delete some column later.
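One related knob worth knowing (an illustrative config, not something from the original thread): Spark can merge differing Parquet file schemas at read time, which is what makes adding or dropping columns across part-files workable.

```
# spark-defaults.conf, or pass with --conf on spark-submit
# Merges the schemas of all Parquet part-files at read time;
# off by default because it makes the read more expensive.
spark.sql.parquet.mergeSchema  true
```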
Regards,
Sushrut Ikhar
about.me/sushrutikhar <https://about.me/sushrutikhar?promo=email_sig>
https://github.com/airbnb/airbnb-spark-thrift
Regards,
Sushrut Ikhar
On Thu, Mar 1, 2018 at 6:05 AM, Nikhil Goyal wrote:
> Hi guys,
>
> I have a RDD of thrift struct. I want to convert it i
Hi,
Is there any config to change the storage memory fraction for the driver? I'm
not caching anything in the driver, yet by default it picks up
spark.memory.fraction (0.9)
spark.memory.storageFraction (0.6),
whose values I've set as per my executor usage.
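For reference, these knobs are set like any other Spark conf; a sketch with illustrative values (as far as I know there is no driver-only storage fraction — the unified-memory settings apply to the driver JVM as well as the executors):

```
spark-submit \
  --conf spark.memory.fraction=0.6 \
  --conf spark.memory.storageFraction=0.5 \
  ...
```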
Regards,
Sushrut Ikhar
Can you add more details: are you using RDDs/Datasets/SQL? Are you doing
group-bys/joins? Is your input splittable?
BTW, you can pass the config the same way you are passing memoryOverhead,
e.g.
--conf spark.default.parallelism=1000, or through the SparkContext in code.
Regards,
Sushrut Ikhar
Well, the issue was that I was using some non-thread-safe functions for
generating the key.
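As an illustration only (a hypothetical key generator, not the actual code from this thread), one common fix for this class of bug is to give each worker thread its own instance of the unsafe object via thread-local storage; a minimal Python sketch:

```python
import threading

class KeyGen:
    """A key generator with mutable per-call state: unsafe if one
    instance is shared across threads, since the buffer interleaves."""
    def __init__(self):
        self.buf = []

    def key(self, record):
        self.buf.clear()               # mutable shared state lives here
        self.buf.append(str(record))
        return "-".join(self.buf)

_local = threading.local()

def thread_safe_key(record):
    # One KeyGen per thread, created lazily on first use.
    gen = getattr(_local, "gen", None)
    if gen is None:
        gen = _local.gen = KeyGen()
    return gen.key(record)

print(thread_safe_key(42))  # 42
```

The same idea applies on Spark executors: construct the non-thread-safe helper per task (or per partition) rather than capturing one shared instance in the closure.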
Regards,
Sushrut Ikhar
On Tue, Dec 15, 2015 at 2:27 PM, Paweł Szulc wrote:
> Hard to imagine. Can yo
-1.4.1.
Thanks in advance.
Regards,
Sushrut Ikhar
Hi,
I have myself used union in a similar case, and applied reduceByKey on it.
Union + reduceByKey can substitute for a join, but you will have to first use
map so that all values are of the same datatype.
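The pattern can be sketched with plain Python (no Spark needed; all names here are mine): map each side onto a common tagged value type first, then union and reduce by key, which yields the same grouping a join would.

```python
from collections import defaultdict

# Two pair collections with different value types, as in the question.
ages   = [("alice", 30), ("bob", 25)]
cities = [("alice", "NYC"), ("bob", "SF")]

# Step 1: map both sides onto one common value type -- a pair of lists,
# tagged by which side the value came from -- so the union is homogeneous.
left  = [(k, ([v], [])) for k, v in ages]
right = [(k, ([], [v])) for k, v in cities]

# Step 2: union the two collections, then do a reduceByKey-style merge.
def merge(a, b):
    return (a[0] + b[0], a[1] + b[1])

joined = defaultdict(lambda: ([], []))
for k, v in left + right:
    joined[k] = merge(joined[k], v)

print(dict(joined))
# {'alice': ([30], ['NYC']), 'bob': ([25], ['SF'])}
```

In Spark the same two steps would be a `map` on each RDD followed by `union` and `reduceByKey`, with `merge` as the reduce function.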
Regards,
Sushrut Ikhar
This presentation may clarify many of your doubts.
https://www.youtube.com/watch?v=7ooZ4S7Ay6Y
Regards,
Sushrut Ikhar
On Mon, Nov 2, 2015 at 7:15 PM, Denny Lee wrote:
> In addition, you may want
shows that no RDD partitions are actually being cached.
How do I split them without shuffling thrice?
Regards,
Sushrut Ikhar
Hey Jean,
Thanks for the quick response. I am using Spark 1.4.1 pre-built with Hadoop
2.6.
Yes, the YARN cluster has multiple running worker nodes.
It would be a great help if you could tell me how to find the executor logs.
Regards,
Sushrut Ikhar
is now gated
for [5000] ms. Reason is: [Disassociated].
I believe that executors are starting but are unable to connect back to the
driver.
How do I resolve this?
Also, I need help in locating the driver and executor node logs.
Thanks.
Regards,
Sushrut Ikhar