Thanks Mike for this.

This is Scala. As expected, it adds the id column to the end of the column
list, starting from 0.

scala> val df = ll_18740868.withColumn("id", monotonically_increasing_id()).show(2)
+---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+
|transactiondate|transactiontype| sortcode|accountnumber|transactiondescription|debitamount|creditamount|balance| id|
+---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+
|     2009-12-31|            CPT|'30-64-72|     18740868|  LTSB STH KENSINGT...|       90.0|        null|  400.0|  0|
|     2009-12-31|            CPT|'30-64-72|     18740868|  LTSB CHELSEA (309...|       10.0|        null|  490.0|  1|
+---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+

Can one provide a starting value, say 1?
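
As far as I can tell, monotonically_increasing_id() takes no start argument,
but it returns a Column, so one possible workaround is to add the offset to
the expression. A minimal sketch, assuming Spark 2.x and a made-up two-column
DataFrame standing in for ll_18740868 (the sample data and the names `sample`
and `withId` are illustrative only):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.monotonically_increasing_id

// Local session for illustration only; in spark-shell the existing `spark` can be used.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("id-offset-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the ll_18740868 table.
val sample = Seq(("CPT", 90.0), ("CPT", 10.0)).toDF("transactiontype", "debitamount")

// Adding 1 to the Column shifts every generated id, so the smallest possible id is 1.
val withId = sample.withColumn("id", monotonically_increasing_id() + 1)
withId.show()
```

The ids stay unique but are still not guaranteed to be consecutive when the
data spans more than one partition, so this only moves the starting point.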

Cheers


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 5 August 2016 at 16:45, Mike Metzger <m...@flexiblecreations.com> wrote:

> You can use the monotonically_increasing_id method to generate guaranteed
> unique (but not necessarily consecutive) IDs.  Calling something like:
>
> df.withColumn("id", monotonically_increasing_id())
>
> You don't mention which language you're using but you'll need to pull in
> the sql.functions library.
>
> Mike
>
> On Aug 5, 2016, at 9:11 AM, Tony Lane <tonylane....@gmail.com> wrote:
>
> Ayan - basically I have a dataset with the following structure, where the
> bid values are unique strings
>
> bid: String
> val : integer
>
> I need unique int values for these string bids to do some processing in
> the dataset
>
> like
>
> id:int   (unique integer id for each bid)
> bid:String
> val:integer
>
>
>
> -Tony
>
> On Fri, Aug 5, 2016 at 6:35 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> Hi
>>
>> Can you explain a little further?
>>
>> best
>> Ayan
>>
>> On Fri, Aug 5, 2016 at 10:14 PM, Tony Lane <tonylane....@gmail.com>
>> wrote:
>>
>>> I have a row with structure like
>>>
>>> identifier: String
>>> value: int
>>>
>>> All identifiers are unique and I want to generate a unique long id for
>>> the data and get a row object back for further processing.
>>>
>>> I understand using the zipWithUniqueId function on RDD, but that would
>>> mean first converting to RDD and then joining back the RDD and dataset
>>>
>>> What is the best way to do this?
>>>
>>> -Tony
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Ayan Guha
>>
>
>
