Hi all!
I'm having the following problem. Consider this code (the commented-out
loop and the uncommented version, which I think do the same thing):
# for col in missing_cols:
#     df[col] = np.nan
df=df.copy()
df[missing_cols]=np.nan
df has about 20000 columns and len(missing_cols) is about 18000.
I'm getting lots (one per missing column?) of the following message from
ipykernel:
"PerformanceWarning: DataFrame is highly fragmented. This is usually
the result of calling `frame.insert` many times, which has poor
performance. Consider joining all columns at once using
pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe =
frame.copy()`
df[missing_cols]=np.nan"
At first I didn't have df=df.copy(). I added it later, but the problem is
the same.
This slows down the code a lot, perhaps because Jupyter is taking too
much time issuing these messages!
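For reference, here is a minimal sketch of what I understand the warning
to be suggesting: build all the missing columns as one NaN-filled frame
and attach it with a single pd.concat(axis=1). The tiny frame, the column
names and the name "filler" are made up for illustration; my real df is
far larger, and I have not confirmed that this removes the warning there.

```python
import numpy as np
import pandas as pd

# Hypothetical small stand-in for the real frame (about 20000 columns).
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
missing_cols = ["c", "d", "e"]  # made-up column names

# Build every missing column at once, sharing df's index.
filler = pd.DataFrame(np.nan, index=df.index, columns=missing_cols)

# Attach them in a single concat instead of inserting one by one.
df = pd.concat([df, filler], axis=1)
```

The idea is that the new columns arrive in one operation rather than in
about 18000 separate insertions.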
Thanks for any comments.