Create an array out of the columns, convert to a DataFrame,
explode, distinct, write.
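
To make the intent of those steps concrete, here is a minimal plain-Python sketch of what they compute. It is not Spark code: the sample rows and the helper names `distinct_values` and `distinct_rows` are made up for illustration. In Spark the same steps would roughly map to the `array` and `explode` functions in `org.apache.spark.sql.functions` followed by `distinct` on the DataFrame. The write step is left out.

```python
from typing import Iterable, Sequence, Set, Tuple


def distinct_values(rows: Iterable[Sequence[str]]) -> Set[str]:
    """Distinct values across every column of every row."""
    seen: Set[str] = set()
    for row in rows:
        # "array out of columns": treat the whole row as one array of values.
        # "explode": visit each element of that array as its own record.
        for value in row:
            # "distinct": a set keeps only one copy of each value.
            seen.add(value)
    return seen


def distinct_rows(rows: Iterable[Sequence[str]]) -> Set[Tuple[str, ...]]:
    """Row-wise distinct, for comparison: whole rows are deduplicated."""
    return {tuple(row) for row in rows}


# Hypothetical sample data with 3 columns instead of 14.
rows = [
    ("a", "b", "c"),
    ("a", "b", "c"),  # exact duplicate of the first row
    ("c", "d", "a"),
]

print(sorted(distinct_values(rows)))  # ['a', 'b', 'c', 'd']
print(len(distinct_rows(rows)))       # 2
```

The comparison shows why a row-wise distinct alone is not enough here: it removes the duplicate row but still leaves "a" and "c" appearing in more than one place, while flattening first yields each value once.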
On 19 Sep 2016 19:11, "Saurav Sinha" <sauravsinh...@gmail.com> wrote:

> You can use distinct over your data frame or rdd
>
> rdd.distinct
>
> It will give you the distinct rows.
>
> On Mon, Sep 19, 2016 at 2:35 PM, Abhishek Anand <abhis.anan...@gmail.com>
> wrote:
>
>> I have an rdd which contains 14 different columns. I need to find the
>> distinct values across all the columns of the rdd and write them to hdfs.
>>
>> How can I achieve this?
>>
>> Is there any distributed data structure that I can use and keep on
>> updating it as I traverse the new rows ?
>>
>> Regards,
>> Abhi
>>
>
>
>
> --
> Thanks and Regards,
>
> Saurav Sinha
>
> Contact: 9742879062
>
