[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/148 SPARK-1252. On YARN, use container-log4j.properties for executors container-log4j.properties is a file that YARN provides so that containers can have log4j.properties distinct from that of the NodeMana

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/149 SPARK-1255: Allow user to pass Serializer object instead of class name for shuffle. This is more general than simply passing a string name and leaves more room for performance optimizations. N

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720234 @marmbrus this is for you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720722 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37720723 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37720725 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37720724 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-15 Thread ScrapCodes
Github user ScrapCodes closed the pull request at: https://github.com/apache/spark/pull/140 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is e

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37722638 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13193/ --- If your p

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37722637 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13194/ --- If your project

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37722634 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37722635 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: Fix SPARK-1256: Master web UI and Worker web U...

2014-03-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/150 Fix SPARK-1256: Master web UI and Worker web UI returns a 404 error You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1256 Alterna

[GitHub] spark pull request: Fix SPARK-1256: Master web UI and Worker web U...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/150#issuecomment-37723748 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your proj

Re: [re-cont] map and flatMap

2014-03-15 Thread andy petrella
Yep. Regarding flatMap, an implicit parameter might work, like in Scala's Future for instance: https://github.com/scala/scala/blob/master/src/library/scala/concurrent/Future.scala#L246 Dunno, still waiting for some insights from the team ^^ andy On Wed, Mar 12, 2014 at 3:23 PM, Pascal Voitot D

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10635398 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635413 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635444 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635447 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [re-cont] map and flatMap

2014-03-15 Thread Koert Kuipers
MappedRDD does: firstParent[T].iterator(split, context).map(f) and FlatMappedRDD: firstParent[T].iterator(split, context).flatMap(f) so yeah, seems like it's a map or flatMap over the iterator inside, not the RDD itself, sort of... On Sat, Mar 15, 2014 at 9:08 AM, andy petrella wrote: > Yep, > R
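
A minimal sketch of the point in the message above, using plain Scala iterators instead of Spark. The object and method names (IteratorMapSketch, computeMapped, computeFlatMapped) are hypothetical and only mirror the shape of the quoted compute bodies: the function is applied to the elements of one partition's iterator, not to the RDD as a whole.

```scala
// Illustration only: no Spark dependency, the iterator stands in for
// firstParent[T].iterator(split, context) on a single partition.
object IteratorMapSketch {
  // Same shape as the quoted MappedRDD body: map over the parent's iterator.
  def computeMapped[T, U](parent: Iterator[T])(f: T => U): Iterator[U] =
    parent.map(f)

  // Same shape as the quoted FlatMappedRDD body: flatMap over the parent's iterator.
  def computeFlatMapped[T, U](parent: Iterator[T])(f: T => Iterator[U]): Iterator[U] =
    parent.flatMap(f)

  def main(args: Array[String]): Unit = {
    val partition = List(1, 2, 3) // pretend this is the data of one partition
    val mapped = computeMapped(partition.iterator)(_ * 2).toList
    val flat = computeFlatMapped(partition.iterator)(x => Iterator(x, x * 10)).toList
    println(mapped) // List(2, 4, 6)
    println(flat)   // List(1, 10, 2, 20, 3, 30)
  }
}
```

Because f returns a local collection per element here, the result is flattened inside the partition; nothing in this sketch flattens nested RDDs.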

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635557 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37732233 I don't think we typically run jobs inside of getPartitions - so this changes some semantics of calling that function. For instance a lot of the other RDD constructors im

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37732586 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37732587 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635644 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/136#discussion_r10635646 --- Diff: core/src/main/scala/org/apache/spark/rdd/SlidedRDD.scala --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37732906 @pwendell , the limit case is not a practical example. In that case, we need re-partition for most operations to be efficient. Also, this is really for small window sizes l

Re: [re-cont] map and flatMap

2014-03-15 Thread Koert Kuipers
just going head first without any thinking, I changed flatMap to flatMapData and added a flatMap. for FlatMappedRDD my compute is: firstParent[T].iterator(split, context).flatMap(f andThen (_.compute(split, context))) scala> val x = sc.parallelize(1 to 100) scala> x.flatMap _ res0: (Int => org.

Re: [re-cont] map and flatMap

2014-03-15 Thread andy petrella
[Thanks a *lot* for your answers!] That's CoOl, a possible example would be to simply write a for-comprehension that would do this: > > val allEvents = for { > deviceId <- rddFromHdfsOfDeviceId > deviceEvent <- rddFromHdfsOfDeviceEvent(deviceId) > } deviceEvent > val hist = computeHistOf(
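
For readers unfamiliar with how Scala rewrites a for-comprehension, here is a small sketch with plain Lists standing in for the RDDs in the message above. The names deviceIds and eventsFor are hypothetical stand-ins. Stock Spark RDD.flatMap expects a function returning a local collection, so the RDD version discussed in this thread would need the modified flatMap proposed here; the sketch only shows the desugaring itself.

```scala
object ForComprehensionSketch {
  // Hypothetical stand-ins: a List of ids and a function producing events per id.
  val deviceIds: List[String] = List("a", "b")

  def eventsFor(deviceId: String): List[String] =
    List(s"$deviceId-on", s"$deviceId-off")

  // The for-comprehension form.
  val viaFor: List[String] =
    for {
      deviceId <- deviceIds
      event <- eventsFor(deviceId)
    } yield event

  // The equivalent explicit form: every generator but the last becomes a
  // flatMap, and the last one becomes a map.
  val viaFlatMap: List[String] =
    deviceIds.flatMap(deviceId => eventsFor(deviceId).map(event => event))

  def main(args: Array[String]): Unit = {
    println(viaFor)               // List(a-on, a-off, b-on, b-off)
    println(viaFor == viaFlatMap) // true
  }
}
```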

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37733755 Seems reasonable to me. You still working on this or is it good to go? --- If your project is set up for it, you can reply to this email and have your reply appear on Git

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37733845 Ah I see - so this isn't going to be externally a user-visible class (I didn't notice it was `private[spark]`)? Would it make sense to throw an assertion error if the sli

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37733908 Even if it's private we can end up with cases where users have, e.g., a 10,000-partition RDD with only a few items in each partition. Do we know a priori when calling this

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635962 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +123,47 @@ class MapOutputTrackerSuite extends FunSuite with Local

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635964 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +142,30 @@ class MapOutputTrackerSuite extends FunSuite with LocalSp

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635967 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10635968 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +142,30 @@ class MapOutputTrackerSuite extends FunSuite with Local

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734195 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13195/ --- If your project

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734193 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37734238 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37734239 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734242 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734241 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37734634 It is hard to say what threshold to use. I couldn't think of a use case that requires a large window size, but I cannot say there is none. Another possible approach

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
GitHub user gzm55 opened a pull request: https://github.com/apache/spark/pull/151 fix compile error for hadoop CDH 4.4+ Fix the compilation error when SPARK_HADOOP_VERSION is set to 2.0.0-cdh4.4.0. That is, the yarn-alpha project should work with hadoop CDH 4.4.0 and later. You can me

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37735835 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13197/ --- If your project

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37735834 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37735832 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13196/ --- If your project

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37735830 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/151#issuecomment-37735857 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your proj

[GitHub] spark pull request: SPARK-1144 Added license and RAT to check lice...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/125#issuecomment-37736072 @ScrapCodes this is a good start but right now it doesn't actually fail the build if RAT doesn't succeed. Also, RAT reports a bunch of failures for python files that I th

[GitHub] spark pull request: SPARK-1144 Added license and RAT to check lice...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/125#discussion_r10636342 --- Diff: dev/rat.bash --- @@ -0,0 +1,49 @@ +#!/usr/bin/env bash --- End diff -- could you remove the `.bash` extension here? --- If your pr

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636356 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd) {

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37736391 I am not sure what the intent of this PR is. log config for workers should pretty much mirror what is in master. Also, the hardcoding of the config file, root l

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37736435 To take a step back, given how niche this seems to be and how it violates the "usual" expectations of how our users use spark (lazy execution, etc as mentioned above) - d

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636404 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -736,7 +736,7 @@ class JavaPairDStream[K, V](val dstream:

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10636411 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Softwa

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636424 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd)

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10636463 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -121,4 +121,9 @@ private[spark] object AkkaUtils extends Logging { def lookup

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10636474 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -121,4 +121,9 @@ private[spark] object AkkaUtils extends Logging { def lookup

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37736922 LGTM pending a minor comment about unifying the code path with the Executor thing that reads the frame size. --- If your project is set up for it, you can reply to this

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37737398 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37737399 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10636602 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd) {

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37740255 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37738873 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37740256 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have th

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/149#issuecomment-37738874 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13198/ --- If your project

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637181 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd)

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37742137 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13199/ --- If your project i

[GitHub] spark pull request: Spark 615 map partitions with index callable f...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/16#issuecomment-37742136 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37742365 Thanks I've merged this. One small change I added is to use `Resolver.mavenLocal` that sbt provides for you instead of hard coding it. --- If your project is set up for

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37742370 https://github.com/sbt/sbt/blob/0.13/ivy/src/main/scala/sbt/Resolver.scala?source=c#L289 --- If your project is set up for it, you can reply to this email and have your
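
As a rough illustration of the change described in the two messages above, an sbt 0.13-style build fragment might look like the following. This is a sketch, not the actual Spark build code: the commented-out hard-coded resolver is an assumed example of what such a definition commonly looks like, while Resolver.mavenLocal is the sbt-provided value the message refers to.

```scala
// build.sbt fragment (sbt 0.13 style); illustrative only.

// Hard-coded form (assumed example of what is being replaced):
// resolvers += "Local Maven Repository" at "file://" + Path.userHome.absolutePath + "/.m2/repository"

// Form using the resolver that sbt already provides:
resolvers += Resolver.mavenLocal
```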

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37742731 @mridulm I think the RDD definition is actually `private[spark]` and it's just intended to be used internally for higher level algorithms. --- If your project is set up

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37742890 @mridulm I think in YARN environments cluster operators can set a logging file on all of the machines to be shared across applications (e.g. Spark, MapReduce, etc). So th

[GitHub] spark pull request: Akka frame

2014-03-15 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/152 Akka frame This is a very small change on top of @andrewor14's patch in #147. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pwendell/spark akka

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10637306 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: I

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37743968 This should be ready to merge unless other people have more to add. --- If your project is set up for it, you can reply to this email and have your reply appear on GitH

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37744049 Hey @andrewor14 I submitted some small changes on top of this while you were working on it over at #152. --- If your project is set up for it, you can reply to this emai

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/145 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabl

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744154 Jenkins, test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 closed the pull request at: https://github.com/apache/spark/pull/147 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is e

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-15 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37744167 Continued at #152. Closing. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744244 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37744245 Merged build started.

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637484 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637519 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/152#discussion_r10637523 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

Code documentation

2014-03-15 Thread David Thomas
Is there any documentation available that explains the code architecture that can help a new Spark framework developer?

Re: Code documentation

2014-03-15 Thread Reynold Xin
Take a look at https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals

On Sat, Mar 15, 2014 at 6:19 PM, David Thomas wrote:
> Is there any documentation available that explains the code architecture
> that can help a new Spark framework developer?

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637576 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd) {

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37745299 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13200/

[GitHub] spark pull request: SPARK-1244: Throw exception if map output stat...

2014-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/152#issuecomment-37745298 Merged build finished.

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637809 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10637844 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -736,7 +736,7 @@ class JavaPairDStream[K, V](val dstream: DS

[GitHub] spark pull request: [SPARK-1241] Add sliding to RDD

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/136#issuecomment-37748284 @pwendell I was referring not to the actual implementation, but to the expectation when using the exposed API.

[GitHub] spark pull request: fix compile error for hadoop CDH 4.4+

2014-03-15 Thread gzm55
Github user gzm55 commented on a diff in the pull request: https://github.com/apache/spark/pull/151#discussion_r10637918 --- Diff: core/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandlerMacro.scala --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37748592 But that would primarily be for debugging YARN/Hadoop APIs, with no easy way to inject Spark-specific logging levels. I am curious why this was actually required. Cur

[GitHub] spark pull request: SPARK-1252. On YARN, use container-log4j.prope...

2014-03-15 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/148#issuecomment-37748627 To clarify, I am not saying we should not be configuring what is in container-log4j.properties - but we should be trying to do that while preserving the ability to configu

[GitHub] spark pull request: SPARK-1255: Allow user to pass Serializer obje...

2014-03-15 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/149#discussion_r10637947 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -43,12 +44,13 @@ abstract class NarrowDependency[T](rdd: RDD[T]) extends Dependency(rdd)

[GitHub] spark pull request: fix compile error of streaming project

2014-03-15 Thread gzm55
GitHub user gzm55 opened a pull request: https://github.com/apache/spark/pull/153 fix compile error of streaming project explicit return type for implicit function You can merge this pull request into a Git repository by running: $ git pull https://github.com/gzm55/spark work/s
