Hi,

I have to compare two pandas dataframes and find the differences in each row. For example, in the code below, rows with index 0 and 3 are intentionally different. Row 0 is different in field A, and row 3 is different in field B.
After merging the dataframes, I can concentrate on the rows of dfm where dfm['_merge'] != 'both'. How do I find which field is different?

The end result should be:

    row 0, A
    row 3, B

Regards.

    import pandas as pd
    import numpy as np

    idx = pd.date_range('01.01.2017', periods=7, freq='D')

    # Not used further below
    A = [False, True, True, False, True, False, True]
    B = np.random.randn(7)
    C = np.random.randn(7)

    data1 = {
        'A': [False, True, True, False, True, False, True],
        'B': list(range(len(idx))),
        'C': list(range(len(idx), len(idx) + len(idx))),
    }
    dfl = pd.DataFrame(data1, index=idx)

    # Shallow copy: data2 shares its lists with data1, but dfl has
    # already been built, so dfl keeps the original values.
    data2 = data1.copy()
    data2['A'][0] = True   # row 0 differs in field A
    data2['B'][3] = 30     # row 3 differs in field B
    dfr = pd.DataFrame(data2, index=idx)

    # dfm = dfl.merge(dfr, indicator=True, how='outer', left_on=dfl.index, right_on = dfr.index)
    # With no `on` given, merge joins on all common columns (A, B, C),
    # so the date index is not carried into the result.
    dfm = dfl.merge(dfr, indicator=True, how='outer')

    print(dfl)
    print(dfr)
    print(dfm)
    print(dfl.iloc[[0, 3]])
    print(dfr.iloc[[0, 3]])

Output:

                    A  B   C
    2017-01-01  False  0   7
    2017-01-02   True  1   8
    2017-01-03   True  2   9
    2017-01-04  False  3  10
    2017-01-05   True  4  11
    2017-01-06  False  5  12
    2017-01-07   True  6  13

                    A   B   C
    2017-01-01   True   0   7
    2017-01-02   True   1   8
    2017-01-03   True   2   9
    2017-01-04  False  30  10
    2017-01-05   True   4  11
    2017-01-06  False   5  12
    2017-01-07   True   6  13

           A   B   C      _merge
    0  False   0   7   left_only
    1   True   1   8        both
    2   True   2   9        both
    3  False   3  10   left_only
    4   True   4  11        both
    5  False   5  12        both
    6   True   6  13        both
    7   True   0   7  right_only
    8  False  30  10  right_only

                    A  B   C
    2017-01-01  False  0   7
    2017-01-04  False  3  10

                    A   B   C
    2017-01-01   True   0   7
    2017-01-04  False  30  10

-- 
https://mail.python.org/mailman/listinfo/python-list
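
One possible way to get the "row, field" result asked for above is to skip the merge and compare the two frames cell by cell. The following is only a minimal sketch, and it assumes that both frames have exactly the same index and columns and contain no NaN values (NaN never compares equal to NaN, so it would be reported as a difference). The helper name `differing_cells` is made up for illustration and is not a pandas function.

```python
import numpy as np
import pandas as pd


def differing_cells(left, right):
    """Return (row_label, column_name) pairs where the two frames differ.

    Assumption: `left` and `right` share the same index and columns
    and contain no NaN values.
    """
    mask = left.ne(right)                   # True where a cell differs
    rows, cols = np.where(mask.to_numpy())  # positions, in row-major order
    return [(left.index[r], left.columns[c]) for r, c in zip(rows, cols)]


# Rebuild frames equivalent to dfl and dfr from the post
idx = pd.date_range("2017-01-01", periods=7, freq="D")
dfl = pd.DataFrame(
    {
        "A": [False, True, True, False, True, False, True],
        "B": list(range(7)),
        "C": list(range(7, 14)),
    },
    index=idx,
)
dfr = dfl.copy()
dfr.loc[idx[0], "A"] = True   # row 0 differs in field A
dfr.loc[idx[3], "B"] = 30     # row 3 differs in field B

diffs = differing_cells(dfl, dfr)
for row_label, column in diffs:
    print(row_label.date(), column)
# 2017-01-01 A
# 2017-01-04 B
```

The labels come back as index values (dates here) rather than positions 0 and 3. Newer pandas versions (1.1 and later) also offer `DataFrame.compare`, which shows the differing cells of two identically labelled frames side by side and may be enough on its own.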