Re: Force Partitioner to use entire entry of PairRDD as key

Jay Luan Mon, 22 Feb 2016 18:09:07 -0800

Thank you, that helps a lot.

On Mon, Feb 22, 2016 at 6:01 PM, Takeshi Yamamuro <linguin....@gmail.com>
wrote:


> You're correct, reduceByKey is just an example.
>
> On Tue, Feb 23, 2016 at 10:57 AM, Jay Luan <jaylu...@gmail.com> wrote:
>
>> Could you elaborate on how this would work?
>>
>> So from what I can tell, this maps each entry to a tuple that always has 0
>> as the second element. The hash then varies much more widely, because the
>> partitioner now hashes the whole entry, e.g. (1,4) in ((1,4), 0) and (1,3)
>> in ((1,3), 0). So this mapping should produce more even partitions. Why
>> call reduceByKey afterwards? Is that just an example of an operation that
>> can be done, or does it add real value to the operation?
>>
>>
>>
>> On Mon, Feb 22, 2016 at 5:48 PM, Takeshi Yamamuro <linguin....@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> How about adding dummy values?
>>> values.map(d => (d, 0)).reduceByKey(_ + _)
>>>
>>> On Tue, Feb 23, 2016 at 10:15 AM, jluan <jaylu...@gmail.com> wrote:
>>>
>>>> I was wondering, is there a way to force something like the hash
>>>> partitioner to hash the entire entry of a PairRDD rather than just
>>>> the key?
>>>>
>>>> For example, if we have an RDD with values PairRDD = [(1,4), (1,3),
>>>> (2,3), (2,5), (2,10)], can we force the partitioner to hash the entire
>>>> tuple, such as (1,4), rather than just the keys 1 and 2?
>>>>
>>>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>
>>
>
>
> --
> ---
> Takeshi Yamamuro
>
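
To make the effect of the dummy-value trick discussed in this thread concrete, here is a minimal plain-Scala sketch that needs no Spark. It assumes the usual hash-partitioning rule (the key's hashCode modulo the number of partitions, kept non-negative). The object name, the partitionOf helper and the choice of 4 partitions are hypothetical, picked only for illustration. In Spark itself the same idea would presumably look like values.map(d => (d, 0)).partitionBy(new HashPartitioner(n)); reduceByKey(_ + _) also works, but it additionally merges duplicate entries into a single record.

```scala
// Plain-Scala sketch (no Spark required) of how hash partitioning picks a partition.
object WholeEntryPartitioning {
  // Hypothetical helper mirroring the hash-partitioning rule:
  // hashCode modulo numPartitions, shifted so the result is never negative.
  def partitionOf(key: Any, numPartitions: Int): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw
  }

  // The entries from the original question.
  val entries: Seq[(Int, Int)] = Seq((1, 4), (1, 3), (2, 3), (2, 5), (2, 10))
  val numPartitions = 4 // assumed value, for illustration only

  // Default behaviour: only the first element (the key) is hashed,
  // so every entry with key 1 shares a partition, and likewise for key 2.
  val byKey: Map[Int, Seq[(Int, Int)]] =
    entries.groupBy { case (k, _) => partitionOf(k, numPartitions) }

  // The trick from the thread: wrap each entry as (entry, 0),
  // so the whole original tuple becomes the key that gets hashed.
  val wrapped: Seq[((Int, Int), Int)] = entries.map(e => (e, 0))
  val byEntry: Map[Int, Seq[((Int, Int), Int)]] =
    wrapped.groupBy { case (k, _) => partitionOf(k, numPartitions) }

  def main(args: Array[String]): Unit = {
    println(s"hash on key only:    $byKey")
    println(s"hash on whole entry: $byEntry")
  }
}
```

With key-only hashing the five entries can occupy at most two partitions, one per distinct key. With the whole entry as the key, each distinct tuple is hashed on its own, so the entries are free to spread across more partitions; how evenly they spread depends on the tuple hash codes and the partition count.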
