Hi all!

I'm having the following problem. Consider this code (the commented-out loop and the uncommented lines should, I think, do the same thing):

#for col in missing_cols:
#    df[col] = np.nan

df=df.copy()
df[missing_cols]=np.nan

df has about 20000 columns, and len(missing_cols) is about 18000.

I'm getting many copies (one per missing column?) of the following message from ipykernel:

"PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df[missing_cols]=np.nan"
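
For reference, the warning's suggestion of joining all columns at once with pd.concat(axis=1) could be sketched roughly as follows. This is only a minimal sketch: the tiny df and the names in missing_cols are made-up stand-ins for the real data, and it assumes the missing columns are not already present in df.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real data (the real df has ~20000 columns).
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
missing_cols = ["c", "d", "e"]

# Build all the missing columns as one all-NaN frame that shares df's index.
new_cols = pd.DataFrame(np.nan, index=df.index, columns=missing_cols)

# Join them in a single concat instead of inserting the columns one by one.
df = pd.concat([df, new_cols], axis=1)

print(df.columns.tolist())
```

The idea is that the new columns are created as one block and attached in one step, rather than growing df column by column.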


At first I didn't have df=df.copy(). I added it later, but the problem remains.

This slows the code down a lot, perhaps because Jupyter is spending too much time emitting these messages!
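
One way to check whether the messages themselves are the bottleneck might be to silence that warning category around the assignment and time the cell again. The sketch below assumes the warning is pandas' pd.errors.PerformanceWarning and uses a small made-up df and missing_cols. It only hides the message and does nothing about the fragmentation itself.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real data.
df = pd.DataFrame({"a": [1.0, 2.0]})
missing_cols = [f"m{i}" for i in range(200)]

# Suppress only pandas' PerformanceWarning, and only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
    for col in missing_cols:
        df[col] = np.nan
```

If the cell is still slow with the warning suppressed, the time is presumably going into the column insertions rather than into printing the messages.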

Thanks for any comments.