Hi everybody,

There seems to be an issue with pyarrow (version 4.0.0) when writing a Parquet 
file from pandas with engine='pyarrow'. I cannot write a pandas DataFrame to 
Parquet when it includes non-str columns (datetime64, float64, int64...).

Example:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']})
df.to_parquet('example.parquet', engine='pyarrow')  # Not working, raises:
# ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for
# column InternalId with type float64')

# Workaround: cast the non-str column to str before writing
df['A'] = df['A'].astype(str)
df.to_parquet('example.parquet', engine='pyarrow')  # Working
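
An error like this may depend on which versions of pandas, numpy and pyarrow 
are installed together, so here is a minimal sketch for collecting them when 
reproducing the problem. The helper name installed_versions is hypothetical 
(it is not part of pandas or pyarrow), and it assumes Python 3.8+ for 
importlib.metadata. It only reports versions; it does not diagnose the cause.

```python
from importlib import metadata


def installed_versions(packages=("pandas", "numpy", "pyarrow")):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the traceback should make it easier for others to 
check whether they see the same behaviour with the same combination.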

Best!

Jorge Alarcon
Senior Data Analytics Specialist

Mail: jorge.alar...@maccresi.com
Tel: +34 683541389
28020 Madrid

