Smith wrote:

> On 22/07/2017 22:21, Albert-Jan Roskam wrote:
>> df1['difference'] = (df1 == df2).all(axis=1)
>
> below here there is the mistake :
>
> In [17]: diff = df1['difference'] = (df1 == df2).all(axis=1)
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> <ipython-input-17-195a2c4caf00> in <module>()
> ----> 1 diff = df1['difference'] = (df1 == df2).all(axis=1)
>
> /usr/local/lib/python3.5/dist-packages/pandas/core/ops.py in f(self, other)
>    1295     def f(self, other):
>    1296         if isinstance(other, pd.DataFrame):  # Another DataFrame
> -> 1297             return self._compare_frame(other, func, str_rep)
>    1298         elif isinstance(other, ABCSeries):
>    1299             return self._combine_series_infer(other, func)
>
> /usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in _compare_frame(self, other, func, str_rep)
>    3570     def _compare_frame(self, other, func, str_rep):
>    3571         if not self._indexed_same(other):
> -> 3572             raise ValueError('Can only compare identically-labeled '
>    3573                              'DataFrame objects')
>    3574         return self._compare_frame_evaluate(other, func, str_rep)
>
> ValueError: Can only compare identically-labeled DataFrame objects
Both dataframes must be identically labeled, i.e. have the same columns
and the same index. Compare:

>>> import pandas as pd
>>> a = pd.DataFrame([[1,2],[3,4]], columns=["a", "b"])
>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "c"])

With different column names:

>>> a != b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/core/ops.py", line 875, in f
    return self._compare_frame(other, func, str_rep)
  File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2860, in _compare_frame
    raise ValueError('Can only compare identically-labeled '
ValueError: Can only compare identically-labeled DataFrame objects

Again, with identical column names:

>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "b"])
>>> a != b
       a      b
0  False  False
1  False   True
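
If the two frames are not guaranteed to carry the same labels, one option is to compare only the labels they share. Here is a minimal sketch of that idea. The name compare_rows is a hypothetical helper, not part of pandas, and whether dropping the unshared rows and columns is appropriate depends on your data:

```python
import pandas as pd


def compare_rows(df1, df2):
    """Hypothetical helper: for each shared row, True if all shared columns match."""
    # Restrict both frames to the labels they have in common, so that the
    # == comparison sees identically-labeled operands and does not raise.
    cols = df1.columns.intersection(df2.columns)
    rows = df1.index.intersection(df2.index)
    return (df1.loc[rows, cols] == df2.loc[rows, cols]).all(axis=1)


a = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
b = pd.DataFrame([[1, 2], [3, 5]], columns=["a", "c"])
c = pd.DataFrame([[1, 2], [3, 5]], columns=["a", "b"])

print(compare_rows(a, b))  # only column "a" is shared
print(compare_rows(a, c))  # row 1 differs in column "b"
```

The result is kept in a separate Series rather than assigned back into one of the frames. Storing it as df1['difference'], as in the quoted line, gives df1 a column that df2 lacks, so running the same comparison a second time would raise the same ValueError. That may be worth checking in the original session.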