Jordan Samuels created ARROW-1957:
-------------------------------------

Summary: Handle nanosecond timestamps in parquet serialization
Key: ARROW-1957
URL: https://issues.apache.org/jira/browse/ARROW-1957
Project: Apache Arrow
Issue Type: Improvement
Affects Versions: 0.8.0
Environment: Python 3.6.4, Mac OSX
Reporter: Jordan Samuels
Priority: Minor

The following code

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

n=3
df = pd.DataFrame({'x': range(n)}, index=pd.DatetimeIndex(start='2017-01-01', freq='1n', periods=n))
pq.write_table(pa.Table.from_pandas(df), '/tmp/t.parquet')
{code}

results in:

{{ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data: 1483228800000000001}}

The desired effect is that we can save nanosecond resolution without losing precision (e.g. conversion to ms). Note that if {{freq='1u'}} is used, the code runs properly.
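For reference, the arithmetic behind the error can be sketched in plain Python. This is only an illustration, assuming the cast is rejected whenever a nanosecond value is not an exact multiple of 1000; the helper name ns_to_us_is_lossless is hypothetical, and Arrow's actual check may be implemented differently.

```python
NS_PER_US = 1_000  # nanoseconds per microsecond


def ns_to_us_is_lossless(ns: int) -> bool:
    """Hypothetical helper: True if ns converts to microseconds exactly."""
    return ns % NS_PER_US == 0


# 2017-01-01T00:00:00 UTC expressed as nanoseconds since the Unix epoch.
start_ns = 1_483_228_800_000_000_000

# Index values for freq='1n', periods=3: consecutive nanoseconds.
index_1n = [start_ns + i for i in range(3)]
lossless_1n = [ns_to_us_is_lossless(ns) for ns in index_1n]

# Index values for freq='1u', periods=3: steps of 1000 ns.
index_1u = [start_ns + i * NS_PER_US for i in range(3)]
lossless_1u = [ns_to_us_is_lossless(ns) for ns in index_1u]

# First value that cannot be represented in microseconds.
first_lossy = next(ns for ns in index_1n if not ns_to_us_is_lossless(ns))
```

Under that assumption, first_lossy is the same value reported in the error message, and every value in the {{freq='1u'}} index converts exactly, which is consistent with that case running properly.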