Pareshkumar Panchal wrote: > I am trying to merge two csv files with following condition. > > filea.csv column( a1,a2,a3) > fileb.csv column( b1,b2,b3) > > i wanted to merge if a1 & b1 reference (as the title is different but > contents are valid for comparison).
If the column title is the problem you can rename it. > also the result i need is > a1,a2,a3,b1,b2 (not b3). Then delete that column either before or after the merge. > i am using openpyxl,csv modules at this moment. > > any help appreciated. Here's a demo using pandas: >>> import pandas as pd Read the original files: >>> left = pd.read_csv("left.csv") >>> left alpha beta gamma 0 foo one two 1 bar three four 2 baz five six 3 ham seven eight [4 rows x 3 columns] >>> right = pd.read_csv("right.csv") >>> right delta epsilon zeta 0 foo the quick 1 bar brown fox 2 baz jumps over 3 spam the lazy [4 rows x 3 columns] Merge by alpha/delta column: >>> left.merge(right, left_on="alpha", right_on="delta", how="outer") alpha beta gamma delta epsilon zeta 0 foo one two foo the quick 1 bar three four bar brown fox 2 baz five six baz jumps over 3 ham seven eight NaN NaN NaN 4 NaN NaN NaN spam the lazy [5 rows x 6 columns] >>> both = _ >>> import sys Write to file (or stream as I did for demonstration purposes): >>> both.to_csv(sys.stdout) ,alpha,beta,gamma,delta,epsilon,zeta 0,foo,one,two,foo,the,quick 1,bar,three,four,bar,brown,fox 2,baz,five,six,baz,jumps,over 3,ham,seven,eight,,, 4,,,,spam,the,lazy Delete a column you don't need (could have done that earlier): >>> del both["zeta"] >>> both.to_csv(sys.stdout) ,alpha,beta,gamma,delta,epsilon 0,foo,one,two,foo,the 1,bar,three,four,bar,brown 2,baz,five,six,baz,jumps 3,ham,seven,eight,, 4,,,,spam,the _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor