I have a list of pandas DataFrames that I am attempting to combine using the concatenation function, pd.concat:

    dataframe_lists = [df1, df2, df3]
    result = pd.concat(dataframe_lists, keys = ['one', 'two','three'], ignore_index=True)

The full traceback that I receive when I execute this function is:

    ---------------------------------------------------------------------------
    AssertionError                            Traceback (most recent call last)
    <ipython-input-198-a30c57d465d0> in <module>()
    ----> 1 result = pd.concat(dataframe_lists, keys = ['one', 'two','three'], ignore_index=True)
          2 check(dataframe_lists)

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\tools\merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
        753                        verify_integrity=verify_integrity,
        754                        copy=copy)
    --> 755     return op.get_result()
        756
        757

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\tools\merge.py in get_result(self)
        924
        925             new_data = concatenate_block_managers(
    --> 926                 mgrs_indexers, self.new_axes, concat_axis=self.axis, copy=self.copy)
        927             if not self.copy:
        928                 new_data._consolidate_inplace()

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\core\internals.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
       4061                                                 copy=copy),
       4062                          placement=placement)
    -> 4063               for placement, join_units in concat_plan]
       4064
       4065     return BlockManager(blocks, axes)

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\core\internals.py in <listcomp>(.0)
       4061                                                 copy=copy),
       4062                          placement=placement)
    -> 4063               for placement, join_units in concat_plan]
       4064
       4065     return BlockManager(blocks, axes)

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\core\internals.py in concatenate_join_units(join_units, concat_axis, copy)
       4150         raise AssertionError("Concatenating join units along axis0")
       4151
    -> 4152     empty_dtype, upcasted_na = get_empty_dtype_and_na(join_units)
       4153
       4154     to_concat = [ju.get_reindexed_values(empty_dtype=empty_dtype,

    C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\pandas\core\internals.py in get_empty_dtype_and_na(join_units)
       4139         return np.dtype('m8[ns]'), tslib.iNaT
       4140     else:  # pragma
    -> 4141         raise AssertionError("invalid dtype determination in get_concat_dtype")
       4142
       4143

    AssertionError: invalid dtype determination in get_concat_dtype

I believe that the error arises because one of the data frames is empty. As a temporary workaround for this rather perplexing error, I wrote a simple function, check, that finds the empty dataframes and returns just their headers:

    def check(list_of_df):
        """Return the column headers of every empty dataframe in list_of_df."""
        headers = []
        for df in list_of_df:  # iterate over the argument, not the global list
            if df.empty:
                headers.append(df.columns)
        return headers

I am wondering whether it is possible to use this function so that, in the case of an empty dataframe, just that dataframe's headers are returned and appended to the concatenated dataframe. The output would contain a single row of headers and, in the case of a repeated column name, just a single instance of that header (as the concatenation function already does).

I have two sample data sources, one and two, which are non-empty data sets.

df1: https://gist.github.com/ahlusar1989/42708e6a3ca0aed9b79b
df2: https://gist.github.com/ahlusar1989/26eb4ce1578e0844eb82

Here is an empty dataframe.

df3 (empty dataframe): https://gist.github.com/ahlusar1989/0721bd8b71416b54eccd

I would like the resulting concatenation to have the column headers (with their values) that reflect df1 and df2:

    'AT', 'AccountNum', 'AcctType', 'Amount', 'City', 'Comment', 'Country',
    'DuplicateAddressFlag', 'FromAccount', 'FromAccountNum', 'FromAccountT',
    'PN', 'PriorCity', 'PriorCountry', 'PriorState', 'PriorStreetAddress',
    'PriorStreetAddress2', 'PriorZip', 'RTID', 'State', 'Street1', 'Street2',
    'Timestamp', 'ToAccount', 'ToAccountNum', 'ToAccountT', 'TransferAmount',
    'TransferMade', 'TransferTimestamp', 'Ttype', 'WA', 'WC', 'Zip'

as follows:

    'A', 'AT', 'AccountNum', 'AcctType', 'Amount', 'B', 'C', 'City', 'Comment',
    'Country', 'D', 'DuplicateAddressFlag', 'E', 'F', 'FromAccount',
    'FromAccountNum', 'FromAccountT', 'G', 'PN', 'PriorCity', 'PriorCountry',
    'PriorState', 'PriorStreetAddress', 'PriorStreetAddress2', 'PriorZip',
    'RTID', 'State', 'Street1', 'Street2', 'Timestamp', 'ToAccount',
    'ToAccountNum', 'ToAccountT', 'TransferAmount', 'TransferMade',
    'TransferTimestamp', 'Ttype', 'WA', 'WC', 'Zip'

I welcome any feedback on how best to do this.

Thank you.

-- 
https://mail.python.org/mailman/listinfo/python-list
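
One possible way to get the result described above, offered only as a minimal sketch: concatenate the non-empty dataframes alone, then reindex the columns against the sorted union of every dataframe's headers, so that headers found only in an empty dataframe still appear (filled with NaN). The helper name concat_keep_headers and the three small sample frames are hypothetical stand-ins, not the data from the gists, and the sketch assumes a reasonably recent pandas and column names that are all strings. Note also that ignore_index=True discards the labels passed through keys, so keys is left out here.

```python
import pandas as pd


def concat_keep_headers(frames):
    """Concatenate the non-empty frames, keeping headers of the empty ones.

    Hypothetical helper: not part of pandas.
    """
    # Sorted union of all column names; each name appears exactly once.
    all_columns = sorted(set().union(*(df.columns for df in frames)))

    non_empty = [df for df in frames if not df.empty]
    if not non_empty:
        # Nothing to concatenate: return a header-only frame.
        return pd.DataFrame(columns=all_columns)

    combined = pd.concat(non_empty, ignore_index=True)
    # reindex adds the columns that exist only in empty frames (as NaN).
    return combined.reindex(columns=all_columns)


# Hypothetical stand-ins for df1, df2 and the empty df3.
df1 = pd.DataFrame({"AT": [1, 2], "City": ["Oslo", "Rome"]})
df2 = pd.DataFrame({"AT": [3], "Zip": ["10001"]})
df3 = pd.DataFrame(columns=["A", "B"])

result = concat_keep_headers([df1, df2, df3])
print(list(result.columns))
```

Filtering out the empty frames before calling pd.concat sidesteps the dtype determination for the empty frame, which is where the traceback above ends; whether that is the actual cause in this pandas version is an assumption.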