Hello,
Say I want to compare two CSV files: file A and file B. Both have the same
layout - the first column holds product IDs (one product per row) and the
remaining columns hold stats about the products, such as sales in units (#)
and dollars ($).
I want to see which product IDs appear in the first column of file A but not
in B, and which appear in B but not in A.
Finally, it would be great if the results could be written to two new CSV
files - one product ID per row in the first column (no other columns are
needed).
This is the script I tried:
==
import csv
# open CSVs and read first column with product IDs into variables pointing to lists
A = [line.split(',')[0] for line in open('Afile.csv')]
B = [line.split(',')[0] for line in open('Bfile.csv')]
# create variables pointing to lists with unique product IDs in A and B respectively
inAnotB = list(set(A)-set(B))
inBnotA = list(set(B)-set(A))
print inAnotB
print inBnotA
c = csv.writer(open("inAnotB.csv", "wb"))
c.writerow([inAnotB])
d = csv.writer(open("inBnotA.csv", "wb"))
d.writerow([inBnotA])
print "done!"
==
But it doesn't produce the required results.
It prints the IDs with a trailing \n, in this format:
247158132\n
and writes nothing to the CSV files.
You could probably tell I'm a newbie.
Could you help me out?
Here's some dummy data:
https://docs.google.com/file/d/0BwziqsHUZOWRYU15aEFuWm9fajA/edit?usp=sharing
https://docs.google.com/file/d/0BwziqsHUZOWRQVlTelVveEhsMm8/edit?usp=sharing
Thanks a bunch in advance! :)
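For reference, here is a minimal sketch of one way the comparison could be done. It is written for Python 3 (the script above uses Python 2 print statements) and rests on some assumptions: the product ID is always the first field of a row, neither file has a header row, and the helper names read_ids, write_ids and compare are made up for illustration. It also addresses two likely causes of the symptoms described above: line.split(',')[0] keeps the trailing "\n" when a row has only one field, and writerow([inAnotB]) puts the whole list into a single cell of a single row, while the output files are never closed, so buffered data may not reach the disk.

```python
import csv


def read_ids(path):
    """Return the set of non-empty first-column values in a CSV file."""
    ids = set()
    with open(path, newline="") as f:
        # csv.reader strips the line terminator, so no "\n" ends up in the IDs.
        for row in csv.reader(f):
            # Skip blank lines and rows whose first field is empty.
            if row and row[0].strip():
                ids.add(row[0].strip())
    return ids


def write_ids(path, ids):
    """Write one ID per row; IDs are sorted as strings for a stable order."""
    # The with block closes the file, which flushes the data to disk.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # writerows expects a sequence of rows; each row is a one-item list.
        writer.writerows([product_id] for product_id in sorted(ids))


def compare(path_a, path_b, out_a_not_b, out_b_not_a):
    """Write the IDs unique to each file and return both sets."""
    a = read_ids(path_a)
    b = read_ids(path_b)
    in_a_not_b = a - b  # IDs in A that are missing from B
    in_b_not_a = b - a  # IDs in B that are missing from A
    write_ids(out_a_not_b, in_a_not_b)
    write_ids(out_b_not_a, in_b_not_a)
    return in_a_not_b, in_b_not_a
```

With the file names from the script above, the call would be compare("Afile.csv", "Bfile.csv", "inAnotB.csv", "inBnotA.csv"). If the real files do have a header row, the header text would be treated as an ID and would need to be skipped first.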