-----Original Message-----
From: Reynold Xin [mailto:r...@databricks.com]
Sent: Tuesday, July 29, 2014 11:44 AM
To: dev@spark.apache.org
Subject: Re: pre-filtered hadoop RDD use case
I am not sure if I agree that it lacks the mechanism to do pushdowns.
Hadoop InputFormat itself provides some basic mechanism to push
> [...] implementation seems to be in place, and more optimization is desired
> beyond just record-oriented execution pipelining.
>
>
>
> -----Original Message-----
> From: Reynold Xin [mailto:r...@databricks.com]
> Sent: Tuesday, July 29, 2014 12:55 AM
> To: dev@spark.apache.org
> Subject: Re: pre-filtered hadoop RDD use case
From: Reynold Xin [mailto:r...@databricks.com]
Sent: Tuesday, July 29, 2014 12:55 AM
To: dev@spark.apache.org
Subject: Re: pre-filtered hadoop RDD use case
Would something like this help?
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala
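
For readers following the link, here is a minimal sketch of how PartitionPruningRDD might be applied. It assumes a local SparkContext and that the caller already knows which partition indices can contain matching records; the object name `PruneSketch`, the sample data, and the `wanted` set are hypothetical and only serve the illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.PartitionPruningRDD

object PruneSketch {
  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("prune-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical input: 800 integers spread over 8 partitions.
      val rdd = sc.parallelize(0 until 800, 8)

      // Hypothetical set of partition indices known to hold relevant data.
      val wanted = Set(2, 5)

      // Keep only the partitions whose index passes the filter;
      // tasks are never launched for the pruned partitions.
      val pruned = PartitionPruningRDD.create(rdd, (idx: Int) => wanted.contains(idx))

      println(s"partitions kept: ${pruned.partitions.length}")
      println(s"records scanned: ${pruned.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

The filter works on partition indices only, so it helps when the mapping from a query to partitions is known up front; filtering individual records within a partition would still need a regular `filter` afterwards.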
On Thu, Jul 24, 2014 at 8:40 AM, Eugene Cheipesh wrote:
> Hello,
>
> I have an interesting use case for a pre-filtered RDD. I have two solutions
> th