[ https://issues.apache.org/jira/browse/SPARK-17533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15489971#comment-15489971 ]
Sean Owen commented on SPARK-17533:
-----------------------------------
You can always repartition() the result. I'm not sure this API makes sense,
because the point of a union is to make one RDD whose partitions are the
partitions of all the underlying RDDs. Of course, you can subsequently do
anything you want with the result including repartition it, but this API
doesn't have an intrinsic need to expose that.
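The workaround Sean describes can be sketched as follows. This is a minimal, hedged example (app name, element ranges, and partition counts are illustrative, not from the issue): union() simply concatenates the input RDDs' partitions, and repartition() then reshapes the result to whatever count the caller wants.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnionThenRepartition {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("union-repartition").setMaster("local[*]"))

    val rdd1 = sc.parallelize(1 to 100, numSlices = 4)
    val rdd2 = sc.parallelize(101 to 200, numSlices = 4)

    // union() concatenates partitions: 4 + 4 = 8 partitions here
    val unioned = sc.union(Seq(rdd1, rdd2))

    // repartition() shuffles the result into the desired partition count
    val reshaped = unioned.repartition(3)

    println(unioned.getNumPartitions)  // sum of the inputs' partition counts
    println(reshaped.getNumPartitions) // the requested count
    sc.stop()
  }
}
```

This keeps the two concerns separate, which is Sean's point: union defines the data, and repartitioning is an explicit, optional follow-up step.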
> I think it's necessary to have an overridden method of union in SparkContext
> ----------------------------------------------------------------------------
>
> Key: SPARK-17533
> URL: https://issues.apache.org/jira/browse/SPARK-17533
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 2.0.0
> Reporter: WangJianfei
> Priority: Minor
> Labels: features
>
> I think it's necessary to have an overloaded method of union in SparkContext
> so that the user can designate the number of partitions and the
> Partitioner.
> A func like this
> A func like this
> ```
> def union[T: ClassTag](rdds: Seq[RDD[T]], numPartitions: Int,
>     partitioner: Partitioner): RDD[T] = withScope {
>   // ...
> }
> ```
> we can discuss here.
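For discussion, the proposed overload could be approximated today with public RDD APIs rather than a new SparkContext method. A hedged sketch (the helper name `unionWith` is hypothetical): a Partitioner only applies to key-value RDDs, so the elements are keyed by themselves, partitioned, and then unwrapped. Note that the partitioner already carries its own partition count, which makes the separate `numPartitions` argument redundant; that overlap is one argument against the proposed signature.

```scala
import scala.reflect.ClassTag
import org.apache.spark.{Partitioner, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper approximating the proposed overload.
// The partitioner governs the final layout; partitioner.numPartitions
// determines the partition count, so no extra numPartitions parameter
// is needed.
def unionWith[T: ClassTag](sc: SparkContext, rdds: Seq[RDD[T]],
    partitioner: Partitioner): RDD[T] =
  sc.union(rdds)           // concatenate the inputs' partitions
    .map(x => (x, null))   // key each element by itself for partitionBy
    .partitionBy(partitioner)
    .keys                  // drop the dummy values again
```

The map/partitionBy/keys round-trip forces a shuffle, which is the same cost a repartition() on the union result would incur.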
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)