pandas dataframe, find duplicates and add suffix

zljubisic Tue, 28 Mar 2017 14:16:20 -0700

Given the following DataFrame:

import pandas as pd


data = {
    'model': ['first', 'first', 'second', 'second', 'second', 'third', 'third'],
    'dtime': ['2017-01-01_112233', '2017-01-01_112234', '2017-01-01_112234',
              '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112235',
              '2017-01-01_112235'],
}
df = pd.DataFrame(data,
                  index=['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg', 'f.jpg', 'g.jpg'],
                  columns=['model', 'dtime'])

print(df.head(10))

        model              dtime
a.jpg   first  2017-01-01_112233
b.jpg   first  2017-01-01_112234
c.jpg  second  2017-01-01_112234
d.jpg  second  2017-01-01_112234
e.jpg  second  2017-01-01_112234
f.jpg   third  2017-01-01_112235
g.jpg   third  2017-01-01_112235

Within each model, there are duplicate dtime values.
For example, rows d and e are duplicates of row c,
and row g is a duplicate of row f.

For each duplicate (within a model) I would like to add a suffix (starting
from 1) to the dtime value. Something like this:

        model              dtime
a.jpg   first  2017-01-01_112233
b.jpg   first  2017-01-01_112234
c.jpg  second  2017-01-01_112234
d.jpg  second  2017-01-01_112234-1
e.jpg  second  2017-01-01_112234-2
f.jpg   third  2017-01-01_112235
g.jpg   third  2017-01-01_112235-1

How can I do that?
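
For reference, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes that a "duplicate" means an identical (model, dtime) pair, that the first occurrence keeps its original value, and that row order decides the numbering. The names dup_no and is_dup are only illustrative. groupby(...).cumcount() numbers the rows inside each group starting from 0, so any row with a count above 0 is a repeat:

```python
import pandas as pd

data = {
    'model': ['first', 'first', 'second', 'second', 'second', 'third', 'third'],
    'dtime': ['2017-01-01_112233', '2017-01-01_112234', '2017-01-01_112234',
              '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112235',
              '2017-01-01_112235'],
}
df = pd.DataFrame(data,
                  index=['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg', 'f.jpg', 'g.jpg'],
                  columns=['model', 'dtime'])

# Position of each row within its (model, dtime) group:
# 0 for the first occurrence, then 1, 2, ... for the repeats.
dup_no = df.groupby(['model', 'dtime']).cumcount()

# Only repeats (position > 0) get a "-<n>" suffix.
is_dup = dup_no > 0
df.loc[is_dup, 'dtime'] = df.loc[is_dup, 'dtime'] + '-' + dup_no[is_dup].astype(str)

print(df)
```

Because the suffix is computed from the original values in one step, the suffixed strings are never fed back into the grouping.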
-- 
https://mail.python.org/mailman/listinfo/python-list
