This is an awesome sentiment. Thank you, release orchestrators and
contributors!
Cheers,
Lucas
On Thu, Aug 27, 2020 at 1:26 PM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
> Hi,
>
>
>
> I am writing to just thank all those involved in the release process.
>
> Sometimes the work of rele
So it seems that in `pyarrow==0.15.0`, `Table.columns` now returns ChunkedArray
instead of Column. This has broken `Table.cast()`, which just calls
`Table.itercolumns()` and expects the yielded values to have a `.cast()`
method, which ChunkedArray doesn't.
Was `Table.cast()` missed in cleaning up after
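In the meantime, a possible workaround sketch (assuming a pyarrow where table
columns are ChunkedArrays without `.cast()` but `Array.cast()` is available):
cast each column chunk-by-chunk and rebuild the table against the target
schema.

    import pyarrow as pa

    def cast_table(table, target_schema):
        # ChunkedArray.cast() is unavailable here, so cast each chunk of
        # every column individually, then rebuild against the new schema.
        columns = []
        for column, field in zip(table.itercolumns(), target_schema):
            chunks = [chunk.cast(field.type) for chunk in column.chunks]
            columns.append(pa.chunked_array(chunks, type=field.type))
        return pa.Table.from_arrays(columns, schema=target_schema)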
Lucas Pickup created ARROW-1436:
---
Summary: PyArrow Timestamps written to Parquet as INT96 appear in
Spark as 'bigint'
Key: ARROW-1436
URL: https://issues.apache.org/jira/browse/ARROW-1436
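For context, pyarrow's Parquet writer exposes a flag for emitting INT96
timestamps; a sketch of requesting it (the flag exists in
pyarrow.parquet.write_table; the file name is illustrative, and whether Spark
then reads the column correctly is exactly what this issue is about):

    from datetime import datetime
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Request INT96 (rather than INT64) physical storage for timestamps.
    table = pa.table({'ts': [datetime(2017, 8, 30, 10, 47)]})
    pq.write_table(table, 'ts_int96.parquet',
                   use_deprecated_int96_timestamps=True)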
Lucas Pickup created ARROW-1435:
---
Summary: PyArrow not propagating timezone information from Parquet
to Python
Key: ARROW-1435
URL: https://issues.apache.org/jira/browse/ARROW-1435
Project: Apache Arrow
Please reply to: lucas.pic...@microsoft.com
Outlook isn't playing nice.
Apologies, Lucas Pickup
-----Original Message-----
From: Lucas Pickup [mailto:lucas.pic...@microsoft.com.INVALID]
Sent: Wednesday, August 30, 2017 10:47 AM
To: dev@arrow.apache.org
Subject: PyArrow not retaining Pa
__index_level_0__: int64
-- metadata --
pandas: {"pandas_version": "0.20.3",
         "columns": [{"name": "DateNaive", "pandas_type": "datetime",
                      "numpy_type": "datetime64[ns]", "metadata": null},
                     {"name": "DateAware", "pandas_type": "datetimetz",
                      "numpy_type": "datetime64[ns, UTC]",
                      "metadata": {"timezone": "UTC"}}],
         "index_columns": ["__index_level_0__"]}
>>>
>>> pyarrowDF = pyarrowTable.to_pandas()
>>> pyarrowDF
DateNaive DateAware
0 2015-07-05 23:50:00 2015-07-05 23:50:00
>>>
This was on PyArrow 0.6.0.
Cheers, Lucas Pickup
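A minimal pyarrow-only sketch of the round trip described above (this is not
the original pyspark script; the file name is illustrative):

    import pandas as pd
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Round-trip one naive and one UTC-aware timestamp through Parquet,
    # then check whether the timezone survives into the restored frame.
    df = pd.DataFrame({
        'DateNaive': pd.to_datetime(['2015-07-05 23:50:00']),
        'DateAware': pd.to_datetime(['2015-07-05 23:50:00']).tz_localize('UTC'),
    })
    pq.write_table(pa.Table.from_pandas(df), 'dates.parquet')
    print(pq.read_table('dates.parquet').to_pandas().dtypes)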
Here is the pyspark script I used to see this difference.
On Mon, 28 Aug 2017 at 09:20 Lucas Pickup wrote:
> Hi all,
>
> Very sorry if people already responded to this at
> lucas.pic...@microsoft.com. There was an INVALID identifier attached to
> the end of the reply address.
# Rebuild the column with a GMT-aware timestamp type (pyarrow 0.6-era Column API).
newArray = pa.array([val.as_py() for val in chunkedToArray(table[i].data)],
                    pa.timestamp('ns', tz='GMT'))
newColumn = pa.Column.from_array(newField, newArray)
table = table.remove_column(i)
table = table.add_column(i, newColumn)
return table
Cheers, Lucas Pickup
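For readers on current pyarrow, where Column is gone and table columns are
ChunkedArrays, a sketch of the same re-typing (it assumes `Array.cast()` and
`Table.set_column()`, both present in current releases; the chunkedToArray
helper is no longer needed):

    import pyarrow as pa

    def tag_timestamps_as_gmt(table):
        # Re-type naive ns timestamps as GMT-aware by casting each chunk
        # and swapping the column in place.
        for i, field in enumerate(table.schema):
            if field.type == pa.timestamp('ns'):
                target = pa.timestamp('ns', tz='GMT')
                chunks = [c.cast(target) for c in table.column(i).chunks]
                table = table.set_column(i, pa.field(field.name, target),
                                         pa.chunked_array(chunks, type=target))
        return table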
From: Lucas Pick
Date
0 2015-07-06 06:50:00
1 2015-07-06 06:30:00
I would've expected to end up with the same datetime from both readers, since
there was no timezone attached at any point; it's just a date and time value.
Am I missing anything here? Or is this a bug?
Cheers, Lucas Pickup
The corresponding Spark code is here:
https://github.com/apache/spark/blob/cba826d00173a945b0c9a7629c66e36fa73b723e/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L565
I was wondering whether there is a reason why the implementations differ so
significantly when it comes to schema generation?
Cheers, Lucas Pickup