extracting duplicates from CSV file by specific fields
Hi,

I have a CSV file:

    'aaa.111', 'T100', 'pn123', 'sn111'
    'aaa.111', 'T200', 'pn123', 'sn222'
    'bbb.333', 'T300', 'pn123', 'sn333'
    'ccc.444', 'T400', 'pn123', 'sn444'
    'ddd', 'T500', 'pn123', 'sn555'
    'eee.666', 'T600', 'pn123', 'sn444'
    'fff.777', 'T700', 'pn123', 'sn777'

How can I extract the duplicates, checking each row by field1 and field4? A row should count as a duplicate if its field1 or its field4 has already appeared in an earlier row.

I should get something like this:

    'aaa.111', 'T100', 'pn123', 'sn111'
    'bbb.333', 'T300', 'pn123', 'sn333'
    'ccc.444', 'T400', 'pn123', 'sn444'
    'ddd', 'T500', 'pn123', 'sn555'
    'fff.777', 'T700', 'pn123', 'sn777'

and

    'aaa.111', 'T200', 'pn123', 'sn222'
    'eee.666', 'T600', 'pn123', 'sn444'

Any help will be greatly appreciated.

--
http://mail.python.org/mailman/listinfo/python-list
Re: extracting duplicates from CSV file by specific fields
Thanks guys! Tested, and it seems to work.

CSV file:

    "a.a","sn-01"
    "b.b","sn-02"
    "c.c","sn-03"
    "d.d","sn-04"
    "e.e","sn-05"
    "f.f","sn-06"
    "g.g","sn-07"
    "h.h","sn-08"
    "i.i","sn-09"
    "a.a","sn-10"
    "k.k","sn-02"
    "i.i","sn-09"

Source:

    #!/usr/bin/env python
    import csv

    unqs = []
    dups = []
    seen_in_field0 = set()
    seen_in_field1 = set()

    reader = csv.reader(open("myfile.csv", "rb"))

    print "\nOriginals:\n"
    for row in reader:
        print row
        if row[0] in seen_in_field0 or row[1] in seen_in_field1:
            dups.append(row)
        else:
            seen_in_field0.add(row[0])
            seen_in_field1.add(row[1])
            unqs.append(row)

    print "\nUniques:\n"
    for row in unqs:
        print row

    print "\nDuplicates:\n"
    for row in dups:
        print row

    print "\n"

Result:

    Originals:

    ['a.a', 'sn-01']
    ['b.b', 'sn-02']
    ['c.c', 'sn-03']
    ['d.d', 'sn-04']
    ['e.e', 'sn-05']
    ['f.f', 'sn-06']
    ['g.g', 'sn-07']
    ['h.h', 'sn-08']
    ['i.i', 'sn-09']
    ['a.a', 'sn-10']
    ['k.k', 'sn-02']
    ['i.i', 'sn-09']

    Uniques:

    ['a.a', 'sn-01']
    ['b.b', 'sn-02']
    ['c.c', 'sn-03']
    ['d.d', 'sn-04']
    ['e.e', 'sn-05']
    ['f.f', 'sn-06']
    ['g.g', 'sn-07']
    ['h.h', 'sn-08']
    ['i.i', 'sn-09']

    Duplicates:

    ['a.a', 'sn-10']
    ['k.k', 'sn-02']
    ['i.i', 'sn-09']
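
For anyone reading this later on Python 3: the script above is Python 2 (print statements, file opened in "rb" mode). Below is a rough sketch of the same seen-set idea, generalized to any set of key columns and run against the four-column sample from the original question. The names split_duplicates and key_columns are made up for this example, and the quotechar and skipinitialspace settings are assumptions chosen to match the single-quoted, space-padded sample data.

```python
import csv
import io


def split_duplicates(rows, key_columns):
    """Split rows into (uniques, duplicates).

    A row counts as a duplicate when its value in ANY key column has
    already been seen in that same column of an earlier unique row.
    (Hypothetical helper, not part of the csv module.)
    """
    seen = {col: set() for col in key_columns}
    uniques, duplicates = [], []
    for row in rows:
        if any(row[col] in seen[col] for col in key_columns):
            duplicates.append(row)
        else:
            # Only rows kept as unique contribute to the seen sets,
            # mirroring the behaviour of the script in the post.
            for col in key_columns:
                seen[col].add(row[col])
            uniques.append(row)
    return uniques, duplicates


# Sample data from the original question (single quotes, space after comma).
sample = """\
'aaa.111', 'T100', 'pn123', 'sn111'
'aaa.111', 'T200', 'pn123', 'sn222'
'bbb.333', 'T300', 'pn123', 'sn333'
'ccc.444', 'T400', 'pn123', 'sn444'
'ddd', 'T500', 'pn123', 'sn555'
'eee.666', 'T600', 'pn123', 'sn444'
'fff.777', 'T700', 'pn123', 'sn777'
"""

reader = csv.reader(io.StringIO(sample), quotechar="'", skipinitialspace=True)
# Columns 0 and 3 correspond to field1 and field4 in the question.
uniques, duplicates = split_duplicates(reader, key_columns=(0, 3))

print("Uniques:")
for row in uniques:
    print(row)

print("Duplicates:")
for row in duplicates:
    print(row)
```

For a real file, something like open("myfile.csv", newline="") in a with block would replace the io.StringIO object, with quotechar adjusted to whatever the file actually uses.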
Please advise me for a right solution
Hi all,

Please advise me on the right solution, based on your experience.

I need to create a web-based inventory tool with specific requirements such as:

* More than one group is going to use it.
* An authentication and authorization system based on user and group privileges. For example, depending on its privileges, a group can add/edit/delete its own items and has read-only access to other groups' items, etc.

Which solution is better for that?

* A CGI implementation from scratch. It seems like too much work, and I am not sure that this is the right way.
* WSGI-based frameworks such as Werkzeug, Pylons, or repoze.bfg for HTTP requests and responses, plus separate components like AuthKit, repoze.who and repoze.what, and SQLAlchemy or raw SQL.

I tried to get this done with Django, but found that I kept having to extend the Django admin interface, extend the user profile, and so on. I am not saying that Django is not good for this; these are just my personal feelings, and maybe I am wrong.

So, what would you recommend?
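
Whichever framework is chosen, the authorization rule described above (full access to your own group's items, read-only access to everyone else's) can be kept as a small framework-independent function. The following is only a sketch of that rule: User, Item, WRITE_ACTIONS and is_allowed are hypothetical names invented for this example, not part of Django, repoze.what, AuthKit or any other library mentioned in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical model: each user belongs to exactly one group.
    name: str
    group: str


@dataclass(frozen=True)
class Item:
    # Hypothetical model: each inventory item is owned by one group.
    label: str
    owner_group: str


# Actions that modify data; assumed from the add/edit/delete wording in the post.
WRITE_ACTIONS = frozenset({"add", "edit", "delete"})


def is_allowed(user, action, item):
    """Return True if `user` may perform `action` on `item`.

    Rule sketched from the post: every group may read everything,
    but only the owning group may add, edit or delete.
    """
    if action == "read":
        return True
    if action in WRITE_ACTIONS:
        return user.group == item.owner_group
    # Unknown actions are denied by default.
    return False


alice = User(name="alice", group="network")
router = Item(label="router-01", owner_group="network")
server = Item(label="server-07", owner_group="storage")

print(is_allowed(alice, "edit", router))    # own group's item
print(is_allowed(alice, "edit", server))    # another group's item
print(is_allowed(alice, "read", server))    # read-only access
```

Keeping the check in one place like this means a view, a WSGI middleware or a CGI handler would all call the same function, so the rule does not depend on the framework decision.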