Suggestion for SPARK-1825

2014-07-22 Thread innowireless TaeYun Kim
(I'm resending this mail since it seems that it was not sent. Sorry if this was already sent.) Hi, A couple of months ago, I made a pull request to fix https://issues.apache.org/jira/browse/SPARK-1825. My pull request is here: https://github.com/apache/spark/pull/899 But that pull request

Suggestion for SPARK-1825

2014-07-21 Thread innowireless TaeYun Kim
Hi, A couple of months ago, I made a pull request to fix https://issues.apache.org/jira/browse/SPARK-1825. My pull request is here: https://github.com/apache/spark/pull/899 But that pull request has problems: - It is Hadoop 2.4.0+ only. It won't compile on earlier versions. - The

Suggestion: rdd.compute()

2014-06-10 Thread innowireless TaeYun Kim
Hi, Regarding the following scenario, would it be nice to have an action method named something like 'compute()' that does nothing but compute/materialize all partitions of an RDD? It could also be useful for profiling. -----Original Message----- From: innowireless TaeYun Kim [mai

Suggestion or question: Adding rdd.cancelCache() method

2014-05-29 Thread innowireless TaeYun Kim
What I understand is that rdd.cache() is really rdd.cache_this_rdd_when_it_actually_materializes(). So a somewhat esoteric problem may occur. The example is as follows: void method1() { JavaRDD<...> rdd = sc.textFile(...) .map(...); rdd.cache(); // since the follo

RE: Suggestion: RDD cache depth

2014-05-29 Thread innowireless TaeYun Kim
14, at 11:46 PM, innowireless TaeYun Kim wrote: > It would be nice if the RDD cache() method incorporated depth information. > > That is, > > > > void test() > { > > JavaRDD<...> rdd = ...; > > > > rdd.cache(); // to depth 1. actual cachin

Suggestion: RDD cache depth

2014-05-28 Thread innowireless TaeYun Kim
It would be nice if the RDD cache() method incorporated depth information. That is, void test() { JavaRDD<...> rdd = ...; rdd.cache(); // to depth 1. Actual caching happens. rdd.cache(); // to depth 2. No-op as long as the storage level is the same; otherwise, an exception. ... rdd.uncache(); // t
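The depth idea in the snippet above can be sketched without Spark as a plain reference counter. This is only an illustration under stated assumptions: CacheDepthCounter and its methods are hypothetical names, not part of Spark, and storage levels are modeled as strings. The snippet is cut off before it says what uncache() does, so the decrement-and-release-at-zero behavior below is an assumption.

```java
import java.util.Objects;

// Hypothetical helper, not a Spark class: models the proposed "cache depth" as a counter.
final class CacheDepthCounter {
    private int depth = 0;
    private String storageLevel = null;

    // Returns true only when this call moves the depth from 0 to 1,
    // i.e. the point where actual caching would happen.
    boolean cache(String level) {
        Objects.requireNonNull(level, "level");
        if (depth > 0 && !storageLevel.equals(level)) {
            // A nested cache() with a different storage level is rejected.
            throw new IllegalStateException(
                "already cached with storage level " + storageLevel);
        }
        storageLevel = level;
        depth++;
        return depth == 1;
    }

    // ASSUMPTION: uncache() decrements the depth and returns true only when
    // the depth reaches 0, i.e. the point where the data would be released.
    boolean uncache() {
        if (depth == 0) {
            throw new IllegalStateException("uncache() without a matching cache()");
        }
        depth--;
        if (depth == 0) {
            storageLevel = null;
            return true;
        }
        return false;
    }

    int depth() {
        return depth;
    }
}

// Small demo of nested cache()/uncache() calls.
class CacheDepthDemo {
    public static void main(String[] args) {
        CacheDepthCounter counter = new CacheDepthCounter();
        System.out.println("outer cache materializes: " + counter.cache("MEMORY_ONLY"));
        System.out.println("inner cache materializes: " + counter.cache("MEMORY_ONLY"));
        System.out.println("inner uncache releases: " + counter.uncache());
        System.out.println("outer uncache releases: " + counter.uncache());
    }
}
```

In a wrapper around a real RDD, the points where cache() and uncache() return true are where calls such as persist(...) and unpersist() would presumably go; that wiring is left out here.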

RE: About JIRA SPARK-1825

2014-05-27 Thread innowireless TaeYun Kim
to fork Spark on github, and then push your changes to it, and then follow: https://help.github.com/articles/using-pull-requests On Tue, May 27, 2014 at 6:10 PM, innowireless TaeYun Kim < taeyun@innowireless.co.kr> wrote: > I'm afraid I don't know how to send a 'p

RE: About JIRA SPARK-1825

2014-05-27 Thread innowireless TaeYun Kim
ockers for 1.0.0 but we can do it for 1.0.1 or 1.1, depending on how big the patch is. Matei On May 27, 2014, at 5:25 PM, innowireless TaeYun Kim wrote: > Could somebody please review and fix > https://issues.apache.org/jira/browse/SPARK-1825 ? > > It's a cross-platform issue. >

About JIRA SPARK-1825

2014-05-27 Thread innowireless TaeYun Kim
Could somebody please review and fix https://issues.apache.org/jira/browse/SPARK-1825 ? It's a cross-platform issue. I've fixed the Spark source code based on rc5 and it's working for me now, but I'm not at all sure whether I've done it correctly, since I'm almost new to Spark and don't know much about

Is this supported? : Spark on Windows, Hadoop YARN on Linux.

2014-05-13 Thread innowireless TaeYun Kim
I'm trying to run spark-shell on Windows that uses Hadoop YARN on Linux. Specifically, the environment is as follows: - Client - OS: Windows 7 - Spark version: 1.0.0-SNAPSHOT (git cloned 2014.5.8) - Server - Platform: Hortonworks Sandbox 2.1 I had to modify the Spark source code to apply ht