Re: [DISCUSS][C++] Raw pointer string views

2023-10-06 Thread Mark Raasveldt
For the index vs pointer question - DuckDB went with pointers as they are more flexible, and DuckDB was designed to consume data (and strings) from a wide variety of formats in a wide variety of languages. Pointers allow us to easily zero-copy from e.g. Python strings, R strings, Arrow strings,

[Java] Unable to populate a Field with a List of Structs?

2023-03-28 Thread Mark Schreiber
Hi - I have a data set which is mostly a 2D table, however one column (called Attributes) contains a List of Structs in each cell. Each Struct has three fields: Attribute Tag, Attribute Type and Attribute Value. The definition of the Attributes Field is: /** * Attribute Tag - Two character tag.

Request to start CI workflows

2022-12-29 Thread Mark Schreiber
Hi, I recently started my first PR for Arrow (Java) but I need someone to approve the check workflows before I can proceed. PR is https://github.com/apache/arrow/pull/15106 Thanks, Mark

Re: [Rust] Blog post for 2.0.0

2020-10-15 Thread Mark Farnan
I would agree with this. I’ve been working with the Go Arrow library for the last few weeks, and it took a while to get my head around it all and how to use it. Even then I'm not sure I’ve got it right. Usage examples would be great. Regards Mark > On Oct 14, 2020, at 4:08 PM, Fernando Herrera >

Re: Arrow as a streaming format

2020-09-09 Thread Mark Farnan
Unclear if this is needed or not. It would be ideal if even the incoming streaming format was based on Arrow concepts / datatypes / organization etc. More thinking required. Regards Mark. On 9/10/20, 5:25 AM, "Fan Liya" wrote: +1 for introducing Arrow in streaming pro

RE: Compression in Arrow - Question

2020-08-30 Thread mark
method for appending inbound realtime sensor data into the in-memory model. Still thinking about that one. Regards Mark. [1] Large is obviously relative: In this case, a single plot may have 20-50 separate time series, each with between 20k and 10 million points. [2] The data

Compression in Arrow - Question

2020-08-29 Thread mark
d, blocks could come out of a database/source, through the data service, across the wire (Flight) and land in the consuming application's memory without ever being decompressed or processed until final use. Crazy thought? Regards Mark. [1]: https://www.vldb.org/pvldb/vol8/p1816-teller.pdf

RE: Arrow Flight + Go, Arrow for Realtime

2020-08-15 Thread mark
taset and visualization choices. So far Arrow seems a good choice rather than any 'roll your own', and it will be nice to use the same format on the Client side as well as in the Server system. My use case is primarily 'Get', consuming large datasets for visualization. I doubt I

RE: Arrow Flight + Go, Arrow for Realtime

2020-08-14 Thread mark
Thanks Wes, I'll likely work on that once I get my head around Arrow in general and confirm I will use it for the project. How to account for the streaming append problem on an otherwise immutable dataset is the current concern. Still thinking through that. Regards

RE: Arrow Flight + Go, Arrow for Realtime

2020-08-14 Thread mark
reat if it can. Regards Mark. -Original Message- From: Sebastien Binet Sent: Wednesday, August 12, 2020 1:53 PM To: dev@arrow.apache.org Subject: Re: Arrow Flight + Go, Arrow for Realtime Mark, AFAIK, nobody's actively working on Arrow-Flight for Go (I think somebody started that w

Arrow Flight + Go, Arrow for Realtime

2020-08-12 Thread mark
ds to 'grow' as new data arrives, often at high speed). Not language specific, just trying to understand the right pattern for using Arrow for this, and couldn't find much in the docs. Regards Mark.

[jira] [Created] (ARROW-8967) [Python] [Parquet] Table.to_pandas() fails to convert valid TIMESTAMP_MILLIS fails to convert to pandas timestamp

2020-05-27 Thread Mark Waddle (Jira)
Mark Waddle created ARROW-8967: -- Summary: [Python] [Parquet] Table.to_pandas() fails to convert valid TIMESTAMP_MILLIS fails to convert to pandas timestamp Key: ARROW-8967 URL: https://issues.apache.org/jira/browse

[jira] [Created] (ARROW-8648) [Rust] Optimize Rust CI Build Times

2020-04-30 Thread Mark Hildreth (Jira)
Mark Hildreth created ARROW-8648: Summary: [Rust] Optimize Rust CI Build Times Key: ARROW-8648 URL: https://issues.apache.org/jira/browse/ARROW-8648 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-8637) Resolve Issues with `prettytable-rs` dependency

2020-04-29 Thread Mark Hildreth (Jira)
Mark Hildreth created ARROW-8637: Summary: Resolve Issues with `prettytable-rs` dependency Key: ARROW-8637 URL: https://issues.apache.org/jira/browse/ARROW-8637 Project: Apache Arrow Issue

[jira] [Created] (ARROW-8608) Update vendored mpark/variant.h to latest to fix NVCC compilation issues

2020-04-27 Thread Mark Harris (Jira)
Mark Harris created ARROW-8608: -- Summary: Update vendored mpark/variant.h to latest to fix NVCC compilation issues Key: ARROW-8608 URL: https://issues.apache.org/jira/browse/ARROW-8608 Project: Apache

[jira] [Created] (ARROW-8590) [Rust] Use Arrow pretty print utility in DataFusion

2020-04-24 Thread Mark Hildreth (Jira)
Mark Hildreth created ARROW-8590: Summary: [Rust] Use Arrow pretty print utility in DataFusion Key: ARROW-8590 URL: https://issues.apache.org/jira/browse/ARROW-8590 Project: Apache Arrow

[jira] [Created] (ARROW-8015) Releasing pyarrow 0.16.0 for Windows Python 3.5

2020-03-05 Thread Mark Keller (Jira)
Mark Keller created ARROW-8015: -- Summary: Releasing pyarrow 0.16.0 for Windows Python 3.5 Key: ARROW-8015 URL: https://issues.apache.org/jira/browse/ARROW-8015 Project: Apache Arrow Issue Type

Missing Windows 35 wheel for 0.16.0

2020-03-04 Thread Mark Keller
-5679 but this seems to be open for the time being. Could you please let me know if you are planning on releasing this, or if it’s gone for good? -- Mark Keller Software Engineer mobile +1 650 484 6154 email mark.kel...@snowflake.com Snowflake Inc. 450 Concar Drive San

[jira] [Created] (ARROW-6815) Timestamps saved via Pandas and PyArrow unreadable in Hive and Presto

2019-10-08 Thread Mark Litwintschik (Jira)
Mark Litwintschik created ARROW-6815: Summary: Timestamps saved via Pandas and PyArrow unreadable in Hive and Presto Key: ARROW-6815 URL: https://issues.apache.org/jira/browse/ARROW-6815 Project

[jira] [Created] (ARROW-6205) ARROW_DEPRECATED warning when including io/interfaces.h from CUDA (.cu) source

2019-08-11 Thread Mark Harris (JIRA)
Mark Harris created ARROW-6205: -- Summary: ARROW_DEPRECATED warning when including io/interfaces.h from CUDA (.cu) source Key: ARROW-6205 URL: https://issues.apache.org/jira/browse/ARROW-6205 Project

Re: Weld

2016-12-15 Thread Mark Hamstra
I already made sure that Matei is aware of this thread. He seemed interested in talking with key Arrow developers. On Thu, Dec 15, 2016 at 10:49 AM, Julian Hyde wrote: > I think someone should reach out to Matei and Shoumik, and see if they > would like to collaborate. Wes, would you like to do