is it possible to apply AQE rules only on some of nodes?

2020-08-31 Thread CodingCat
Hi, Spark devs, I am wondering whether it is possible to apply AQE to only part of the physical plan. E.g., I only want to apply coalesce partitions on a particular ShuffleQueryStageExec. I didn't find a very straightforward way to achieve this, but is there a way to work around the current limitation? Thank

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704961 @mateiz, done~ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37680371 sure, will do that this evening~

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-13 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37603588 Ah, good, thank you very much for the comments @rxin @mengxr

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-13 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/135#discussion_r10580163 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -310,6 +310,9 @@ abstract class RDD[T: ClassTag]( * Return a sampled subset of

[GitHub] spark pull request: SPARK-1240: handle the case with empty RDD whe...

2014-03-13 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/135 SPARK-1240: handle the case with empty RDD when take sample https://spark-project.atlassian.net/browse/SPARK-1240 It seems that the current implementation does not handle the empty RDD

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10576332 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,19 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37558032 Hi, @aarondav , thank you very much for the comments, I think it's ready for the further review

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37498220 Hi, @kayousterhout and @aarondav , Thank you for your comments, I addressed them One potential issue is that, to call the function in HadoopRDD, I moved

[GitHub] spark pull request: hot fix for PR105 - change to Java annotation

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/133#issuecomment-37494644 Hi, @pwendell and @aarondav, is it good?

[GitHub] spark pull request: hot fix for PR105 - change to Java annotation

2014-03-12 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/133 hot fix for PR105 - change to Java annotation You can merge this pull request into a Git repository by running: $ git pull https://github.com/CodingCat/spark SPARK-1160-2 Alternatively you

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10548392 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37485195 @aarondav Thank you for your comments, I will address them

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10548369 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10548362 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10548255 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute

[GitHub] spark pull request: SPARK-1104: kill Process in workerThread of Ex...

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/35#issuecomment-37452003 ping

[GitHub] spark pull request: SPARK-1160: Deprecate toArray in RDD

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/105#issuecomment-37451918 ping

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37451940 ping

[GitHub] spark pull request: SPARK-1104: kill Process in workerThread of Ex...

2014-03-11 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/35#issuecomment-37289205 Hi, @pwendell, do you have time to take a look at this?

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-10 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37231038 No problem, thanks

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-09 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37150695 anyone would like to review this?

[GitHub] spark pull request: SPARK-1104: kill Process in workerThread of Ex...

2014-03-09 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/35#issuecomment-37150704 ping

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-09 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37150712 ping

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-08 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37116374 @mateiz I have rebased the code, any further comments?

[GitHub] spark pull request: SPARK-1160: Deprecate toArray in RDD

2014-03-08 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/105#issuecomment-37115130 Hi, @pwendell , thank you for the comments I just fixed that

[GitHub] spark pull request: SPARK-1160: Deprecate toArray in RDD

2014-03-08 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/105 SPARK-1160: Deprecate toArray in RDD https://spark-project.atlassian.net/browse/SPARK-1160 reported by @mateiz: "It's redundant with collect() and the name doesn't mak

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-07 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/101 SPARK-1128: set hadoop task properties when constructing HadoopRDD The task properties are set when constructing HadoopRDD in current implementation, this may limit the implementation based on

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-07 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37057689 ping

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10388297 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/98#discussion_r10387879 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -278,6 +278,10 @@ private[spark] object Utils extends Logging { uc = new

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10387705 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10387464 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10367306 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/85#discussion_r10355799 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -108,6 +108,6 @@ private[spark] object AkkaUtils { /** Returns the

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36825482 any further comments?

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36825497 any further comments?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36818916 what? how to understand the output? no failure, but there are some exceptions in the console output Accumulator cannot be accessed inside task?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36814645 @pwendell , I removed some redundant parameters, but I'm thinking that which option is more convenient for the user, different pages contain different set of param

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36813427 @pwendell , yep, I asked in the mail list, but didn't get response, so I decided to put the things here first and revise it (e.g. remove those unnecessary) based o

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36812678 I said it is WIP because of 2 things: 1. The document surely needs to be revised; I'm not sure if I understand all details correctly, though I spent n

[GitHub] spark pull request: SPARK-1192: The document for most of the param...

2014-03-05 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/85 SPARK-1192: The document for most of the parameters used in core component I grepped the code in the core component and found that around 30 parameters in the implementation are actually used but

[GitHub] spark pull request: SPARK-1171: when executor is removed, we shoul...

2014-03-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/63#issuecomment-36800228 @kayousterhout Thank you very much!

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-04 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36702525 This is ready to merge?

[GitHub] spark pull request: SPARK-1171: when executor is removed, we shoul...

2014-03-04 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/63#issuecomment-36702509 How about this?

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-04 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10283513 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: SPARK-1104: kill Process in workerThread of Ex...

2014-03-04 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/35#issuecomment-36698403 personally, I felt that, https://spark-project.atlassian.net/browse/SPARK-1175 is also related to this issue.

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-04 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36698085 fixed that line as well as others with the same issue

[GitHub] spark pull request: [Proposal] SPARK-1171: simplify the implementa...

2014-03-04 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/63#discussion_r10263675 --- Diff: core/src/main/scala/org/apache/spark/scheduler/WorkerOffer.scala --- @@ -21,4 +21,4 @@ package org.apache.spark.scheduler * Represents free

[GitHub] spark pull request: spark-1178: missing document of spark.schedule...

2014-03-04 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/74 spark-1178: missing document of spark.scheduler.revive.interval https://spark-project.atlassian.net/browse/SPARK-1178 The configuration on spark.scheduler.revive.interval is undocumented

[GitHub] spark pull request: [Proposal] SPARK-1171: simplify the implementa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/63#issuecomment-36593562 @kayousterhout @markhamstra @andrewor14 Thank you for your comments, I updated the code, how about this?

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10243399 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,7 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10243179 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,7 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10242829 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,7 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10242393 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10242213 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36590068 I updated the code and tested the functionalities, everything goes well

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10241868 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-02 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36481890 I think the better way to fix this is to not allow users to start a non-slave cluster, but to allow them to log in to an all-slaves-lost cluster?

[GitHub] spark pull request: simplify the implementation of CoarseGrainedSc...

2014-03-02 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/63 simplify the implementation of CoarseGrainedSchedulerBackend There are 5 main data structures in the class, after reading the source code, I found that some of them are actually not used, some of

[GitHub] spark pull request: SPARK-1166: clean vpc_id if the group was just...

2014-03-02 Thread CodingCat
Github user CodingCat closed the pull request at: https://github.com/apache/spark/pull/59

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-02 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36478687 with only a master, en...no service is actually working (in distributed fashion) but this patch is just to allow user to login to a master-only cluster

[GitHub] spark pull request: SPARK-1166: clean vpc_id if the group was just...

2014-03-02 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/59 SPARK-1166: clean vpc_id if the group was just now created Reported in https://spark-project.atlassian.net/browse/SPARK-1166 In some very weird situation (when new created group

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-02 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/58#issuecomment-36477517 oh, fixed,

[GitHub] spark pull request: SPARK-1156: allow user to login into a cluster...

2014-03-02 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/58 SPARK-1156: allow user to login into a cluster without slaves Reported in https://spark-project.atlassian.net/browse/SPARK-1159 The current spark-ec2 script doesn't allow user to login

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-02 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36471736 exceed with 5 chars, sorry. fixed

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-01 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10194239 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-01 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10194233 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-01 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36445655 I rebased the code after https://github.com/apache/spark/pull/11 was merged, and tested in my local side, I think it is ready for further ready/testing

[GitHub] spark pull request: [SPARK-1150] fix repo location in create scrip...

2014-03-01 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/52 [SPARK-1150] fix repo location in create script (re-open) reopen for https://spark-project.atlassian.net/browse/SPARK-1150 You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [SPARK-1150] fix repo location in create scrip...

2014-03-01 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/48#issuecomment-36442789 sure, Just reopened, https://github.com/apache/spark/pull/52

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-03-01 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36440931 @pwendell done

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-03-01 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36437295 @pwendell Thank you again! Just updated the code

[GitHub] spark pull request: [SPARK-979] a LRU scheduler for load balancing...

2014-03-01 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/7#issuecomment-36433922 close it

[GitHub] spark pull request: [SPARK-979] a LRU scheduler for load balancing...

2014-03-01 Thread CodingCat
Github user CodingCat closed the pull request at: https://github.com/apache/spark/pull/7

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36414832 @pwendell , I just updated the code In the latest update, I make the checking only applicable to FileOutputFormat, the difference with your suggestion is that I

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-28 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/11#discussion_r10188629 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -712,6 +713,10 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36406325 I changed the code and tested in local side, mind reviewing it again? @pwendell

[GitHub] spark pull request: fix repo location in create script

2014-02-28 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/48 fix repo location in create script fix the repo location in create_release script You can merge this pull request into a Git repository by running: $ git pull https://github.com/CodingCat

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-02-28 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10166948 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: [SPARK-1102] Create a saveAsNewAPIHadoopDatase...

2014-02-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36353844 I changed back the parameter type of the new method to Configuration for keeping consistent with other APIs, and whether Job should be parameter type is still under

[GitHub] spark pull request: fix #SPARK-1149 Bad partitioners can cause Spa...

2014-02-28 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10164732 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int

[GitHub] spark pull request: [SPARK-1102] Create a saveAsNewAPIHadoopDatase...

2014-02-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-36321849 this is a re-opened PR, in the old PR, https://github.com/apache/incubator-spark/pull/636, all test cases have passed Can anyone verify that and make further

[GitHub] spark pull request: [SPARK-979] Randomize order of offers.

2014-02-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/27#issuecomment-36275312 @shivaram I understand your cautiousness and I agree with Kay on that we would be careful when adding the complexity to the already-complex code base. So, I don't

[GitHub] spark pull request: [SPARK-1104] kill Process in workerThread

2014-02-27 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/35 [SPARK-1104] kill Process in workerThread As reported in https://spark-project.atlassian.net/browse/SPARK-1104 By @pwendell: "Sometimes due to large shuffles executors will take a

[GitHub] spark pull request: [SPARK-979] Randomize order of offers.

2014-02-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/27#issuecomment-36238278 en...it's much simpler...but randomization can just mitigate the issue with some probability?

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-27 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/11#discussion_r10121533 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -618,10 +619,6 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self