Hello Aditya,
Once the intermediate action has run, you might want to call
rdd.unpersist() to let Spark know that the RDD is no longer required.
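For example, the cache/unpersist lifecycle might look like this (a minimal local sketch; the data and names are made up, and parallelize stands in for a real input file so it runs standalone):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Minimal local sketch of caching an RDD across actions and then
// releasing it. The data is made up; a real job would read a file.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("unpersist-demo")
  .getOrCreate()
val sc = spark.sparkContext

val lines = sc.parallelize(Seq("a", "ERROR b", "c"))
lines.persist(StorageLevel.MEMORY_ONLY)                 // cache for reuse

val total  = lines.count()                              // first action populates the cache
val errors = lines.filter(_.contains("ERROR")).count()  // second action reuses the cache

lines.unpersist()   // tell Spark the cached blocks can be freed
spark.stop()
```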
Thanks,
-Hanu
On Thu, Sep 22, 2016 at 7:54 AM, Aditya
wrote:
> Hi,
>
> Suppose I have two RDDs
> val textFile = sc.textFile("/user/emp.txt")
Hello All,
I am trying to test an application on standalone cluster. Here is my
scenario.
I started a Spark master on node A and also one worker on the same node.
I am trying to run the application from node B (which I think means node B
acts as the driver).
I have added jars to the SparkConf using s
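The last sentence is cut off, but if the intent is adding jars through the SparkConf, a minimal sketch might look like this (the master URL and jar path are hypothetical placeholders):

```scala
import org.apache.spark.SparkConf

// Hypothetical sketch: the master URL and jar path are placeholders,
// not values from the original message.
val conf = new SparkConf()
  .setMaster("spark://nodeA:7077")       // standalone master on node A
  .setAppName("my-app")
  .setJars(Seq("/path/to/app.jar"))      // jars shipped to the executors
```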
Hello All,
I am working on creating a new PrunedFilteredScan operator which has the
ability to execute the predicates pushed to this operator.
However, what I observed is that if a column deep in the hierarchy is
used, the predicate is not getting pushed down.
SELECT tom._id, tom.address.city from to
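To illustrate where pushed predicates arrive: a PrunedFilteredScan relation receives filters on top-level columns in buildScan, and in my experience a predicate on a nested field such as address.city is typically evaluated by Spark after the scan rather than handed to the source. The class below is only a logging placeholder to inspect what gets pushed, not a working data source:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.StructType

// Hypothetical relation that just logs which filters Spark hands it.
// The schema and scan logic are placeholders.
class LoggingRelation(override val sqlContext: SQLContext,
                      override val schema: StructType)
  extends BaseRelation with PrunedFilteredScan {

  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    // Only predicates on top-level columns typically show up here;
    // a predicate on a nested field (e.g. address.city) is usually
    // applied by Spark after the scan instead of being pushed down.
    filters.foreach(f => println(s"pushed filter: $f"))
    sqlContext.sparkContext.emptyRDD[Row]
  }
}
```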
Hello Rahul,
Please try to use df.filter(df("id").isin(1,2))
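A minimal local sketch of that suggestion (the id/data values mirror the small example in the quoted message; a real job would read the Parquet file from S3):

```scala
import org.apache.spark.sql.SparkSession

// Minimal local sketch of Column.isin; the data is made up.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("isin-demo")
  .getOrCreate()
import spark.implicits._

val df = Seq((1, "abc"), (2, "cdf"), (3, "fas")).toDF("id", "data")
val wanted = df.filter(df("id").isin(1, 2))   // keep only the listed ids

val ids = wanted.select("id").collect().map(_.getInt(0)).sorted.toSeq
spark.stop()
```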
Thanks,
On Thu, Mar 30, 2017 at 10:45 PM, Rahul Nandi
wrote:
> Hi,
> I have around 2 million records as a Parquet file in S3. The file
> structure is somewhat like
> id data
> 1 abc
> 2 cdf
> 3 fas
> Now I want to filter and take the reco