[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37764448 Unless you are a Spark developer, including at Yahoo, the person building the assembly jar is not the same as the person using Spark, so depending on assembled jar

[GitHub] spark pull request: Bugfixes/improvements to scheduler

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/159#issuecomment-37761392 jenkins test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: Bugfixes/improvements to scheduler

2014-03-16 Thread mridulm
GitHub user mridulm opened a pull request: https://github.com/apache/spark/pull/159 Bugfixes/improvements to scheduler. Move PR#517 of apache-incubator-spark to apache-spark. You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] spark pull request: Update CommandUtils.scala

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/157#issuecomment-37761089 That is weird - you can see the use of SPARK_JAVA_OPTS just a few lines above in the patch you submitted.

[GitHub] spark pull request: Update CommandUtils.scala

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/157#issuecomment-37758977 This can be done with SPARK_JAVA_OPTS set to Java debug options. That goes to master and executors. Practically, particularly in multi-tenant deployments, this

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37755416 There is a user-exposed option to configure log4j when running on YARN - which is shipped as part of the job if specified. On Sun, Mar 16, 2014 at 2:25 AM

[GitHub] spark pull request: remove staging dir when app quiting for yarn-c...

2014-03-16 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/154#issuecomment-37751154 jenkins test this please

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-16 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10638435 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10638011 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10637951 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637947 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37748627 To clarify, I am not saying we should not be configuring what is in container-log4j.properties - but we should be trying to do that while preserving the ability to

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37748592 But that would be to debug YARN/Hadoop APIs primarily - and no easy way to inject Spark-specific logging levels. I am curious why this was required act

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37748284 @pwendell I was referring not to the actual implementation, but expectation when using the exposed API.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637181 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636424 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636411 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636404 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -736,7 +736,7 @@ class JavaPairDStream[K, V](val dstream

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37736435 To take a step back, given how niche this seems to be and how it violates the "usual" expectations of how our users use Spark (lazy execution, etc as menti

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37736391 I am not sure what the intent of this PR is. Log config for workers should pretty much mirror what is in master. Also, the hardcoding of the config file, root

[GitHub] spark pull request: MetadataCleaner - fine control cleanup documen...

2014-03-13 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/89#discussion_r10559136 --- Diff: docs/configuration.md --- @@ -487,6 +477,88 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request: MetadataCleaner - fine control cleanup documen...

2014-03-13 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/89#discussion_r10559142 --- Diff: docs/configuration.md --- @@ -487,6 +477,88 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request: MetadataCleaner - fine control cleanup documen...

2014-03-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/89#issuecomment-37518471 Looks good to me, sorry for not updating the documentation when I added this! /CC @pwendell

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-08 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37093759 Thanks for this fix - excellent catch! On Sat, Mar 8, 2014 at 1:53 PM, asfgit wrote: > Closed #96 <https://github.com/apache/spark/pull/9

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10386811 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10386330 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10374821 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373471 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373335 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373228 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373180 --- Diff: bin/spark-submit --- @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36964810 I would have preferred a different identifier (though I don't have good alternatives yet), but that seems moot now since the PR was closed before I could get to it

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10372142 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332786 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332762 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332708 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkAppArguments.scala --- @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332688 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkAppArguments.scala --- @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332670 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10332560 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10242223 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: Removed accidentally checked in comment

2014-03-02 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/61#issuecomment-36480674 Yeah, this was an internal review comment :-) Thanks!

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-03-02 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10200502 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager, diskManager

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-03-01 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10192379 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager, diskManager

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-28 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10189405 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager, diskManager

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-28 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10180027 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager, diskManager

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-27 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10156359 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager, diskManager

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-27 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10156343 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala --- @@ -146,6 +146,12 @@ object BlockFetcherIterator

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-27 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10156309 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala --- @@ -146,6 +146,12 @@ object BlockFetcherIterator

[GitHub] spark pull request: SPARK-1145: Memory mapping with many small blo...

2014-02-27 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/43#discussion_r10156300 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala --- @@ -146,6 +146,12 @@ object BlockFetcherIterator