Re: [DISCUSS] How to provide forward compatibility with MetadataVersion

2020-07-14 Thread Antoine Pitrou
I think Micah is right. Also, it seems (from checking the source) that the Flatbuffers verifier doesn't check that enums are in range, so we may possibly allow out-of-range values and interpret them as "highest supported version". Regards Antoine. Le 14/07/2020 à 00:53, Micah Kornfield a écr

[NIGHTLY] Arrow Build Report for Job nightly-2020-07-14-0

2020-07-14 Thread Crossbow
Arrow Build Report for Job nightly-2020-07-14-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-14-0 Failed Tasks: - gandiva-jar-xenial: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-14-0-travis-gandiva-jar-xenial - test-co

Re: [DISCUSS] How to provide forward compatibility with MetadataVersion

2020-07-14 Thread Micah Kornfield
I think allowing out-of-range values would be a mistake. Doesn't a change in the metadata version indicate a non-backwards-compatible change? On Tuesday, July 14, 2020, Antoine Pitrou wrote: > > I think Micah is right. Also, it seems (from checking the source) that > the Flatbuffers verifier

Re: [DISCUSS] How to provide forward compatibility with MetadataVersion

2020-07-14 Thread Antoine Pitrou
On 14/07/2020 at 15:46, Micah Kornfield wrote: > I think allowing out of range values would be a mistake. Doesn't a change > in the metadata version indicates a non backwards compatible change? I have no idea. Perhaps it will indeed, especially now that we have the features enum. Regards A

Re: [DISCUSS] How to provide forward compatibility with MetadataVersion

2020-07-14 Thread Wes McKinney
I'll post a patch this morning. If a library sees a version "from the future" it should error; pretty simple. On Tue, Jul 14, 2020 at 8:47 AM Antoine Pitrou wrote: > > > On 14/07/2020 at 15:46, Micah Kornfield wrote: > > I think allowing out of range values would be a mistake. Doesn't a change
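A minimal sketch of the "error on a version from the future" rule described above. The enum members, their numeric values, and the function name `check_metadata_version` are hypothetical stand-ins for illustration only; this is not the patch referred to in the message, nor the API of any Arrow library.

```python
from enum import IntEnum


class MetadataVersion(IntEnum):
    """Hypothetical stand-in for a metadata version enum (values assumed)."""
    V1 = 0
    V2 = 1
    V3 = 2
    V4 = 3
    V5 = 4


# Highest version this (hypothetical) reader understands.
MAX_SUPPORTED = max(MetadataVersion)


def check_metadata_version(raw_value: int) -> MetadataVersion:
    """Return the matching enum member, or raise for an unknown value.

    An out-of-range value is rejected outright instead of being
    interpreted as "highest supported version".
    """
    try:
        return MetadataVersion(raw_value)
    except ValueError:
        raise ValueError(
            f"Unsupported metadata version {raw_value}; "
            f"highest supported is {MAX_SUPPORTED.name}"
        ) from None
```

The check is done explicitly by the reader because, as noted earlier in the thread, the Flatbuffers verifier does not appear to reject out-of-range enum values on its own.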

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Brian Hulette
That sounds great! I'd like to have some support for using the Rust and/or C++ libraries in the browser via wasm as well. As long as the community is ok with your overall approach "to add compiler conditionals around any I/O features and libc dependent features of these two libraries," I think it m

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Micah Kornfield
FWIW, I believe at least the core C++ library can already be compiled to wasm. I think Perspective does this [1]. I'm curious: what are you hoping to achieve with embedded wasm in Spark? Thanks, Micah [1] https://perspective.finos.org/ On Tuesday, July 14, 2020, Brian Hulette wrote: > That

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Andy Grove
I'm also curious about the use case and have put questions in the JIRA. Thanks, Andy. On Tue, Jul 14, 2020 at 8:27 AM Micah Kornfield wrote: > Fwiw, I believe at least the core c++ library already can be compiled to > wasm. I think perspective does this [1] > > > I'm curious What are you hopi

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Adam Lippai
This sounds really interesting. How about adding the wasm build (C++) to the releases? I've done a lot of asm.js work (different from wasm) in the past, but my assumption would be that using Rust instead of C++ as source for wasm should result in smaller wasm binaries. Rust Arrow doesn't really use

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Micah Kornfield
Hi Adam, > This sounds really interesting, how about adding the wasm build (C++) to > the releases? I think this just needs someone to volunteer to do it and maintain it (at a minimum if it doesn't already exist we need CI for it). We would also need to figure out details of publishing and integ

Re: [Discuss] [Rust] Looking to add Wasm32 compile target for rust library

2020-07-14 Thread Adam Lippai
"I don't know much about either, but I'm curious why you would expect this to be the case?" Looks like this is not true, it was just my perception reading the different articles. They are practically the same for a "hello world" if compiled carefully. So this is really up to a real world comparison

[Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Ben Kietzman
When casting (for example) date32 -> string, should the result be the digits of the underlying integer value or a timestamp? For timestamp -> string the format should probably be ISO8601 since that is the format used when casting string -> timestamp (if a different format is used then string -> ti

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Antoine Pitrou
How is the other side (cast string -> date32) implemented? I would say, ideally a timestamp is accepted. On 14/07/2020 at 19:07, Ben Kietzman wrote: > When casting (for example) date32 -> string, should the result be the > digits of the underlying integer value or a timestamp? > > For times

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Ben Kietzman
string -> date32 is not implemented, AFAICT. I agree that a timestamp seems ideal; that way string -> date32 should produce the same result as string -> timestamp -> date32. A related question: what format would be expected for time32 <-> string? On Tue, Jul 14, 2020 at 1:19 PM Antoine Pitrou w

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Antoine Pitrou
On 14/07/2020 at 19:40, Ben Kietzman wrote: > string -> date32 is not implemented, AFAICT. > > I agree that a timestamp seems ideal, that way string -> date32 should > produce the same result as string -> timestamp -> date32. > > A related question: what format would be expected for time32 <-

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Wes McKinney
I agree with using ISO 8601 or the constituent components (date or time) thereof. On Tue, Jul 14, 2020 at 12:48 PM Antoine Pitrou wrote: > > > On 14/07/2020 at 19:40, Ben Kietzman wrote: > > string -> date32 is not implemented, AFAICT. > > > > I agree that a timestamp seems ideal, that way strin
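To make "ISO 8601 or the constituent components" concrete, here is a rough sketch using only the Python standard library. It assumes date32 means days since 1970-01-01, time32 means seconds since midnight, and the timestamp is in seconds since the epoch with no time zone; the function names are hypothetical and do not correspond to any Arrow kernel.

```python
from datetime import date, datetime, time, timedelta

EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1)


def date32_to_iso(days: int) -> str:
    """Days since the epoch -> ISO 8601 date component (YYYY-MM-DD)."""
    return (EPOCH_DATE + timedelta(days=days)).isoformat()


def time32_seconds_to_iso(seconds: int) -> str:
    """Seconds since midnight -> ISO 8601 time component (HH:MM:SS)."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return time(hours, minutes, secs).isoformat()


def timestamp_seconds_to_iso(seconds: int) -> str:
    """Seconds since the epoch -> full ISO 8601 timestamp."""
    return (EPOCH_DATETIME + timedelta(seconds=seconds)).isoformat()


print(date32_to_iso(18457))                             # 2020-07-14
print(time32_seconds_to_iso(68820))                     # 19:07:00
print(timestamp_seconds_to_iso(18457 * 86400 + 68820))  # 2020-07-14T19:07:00
```

With this choice the date and time strings are exactly the two halves of the timestamp string, which is what keeps string -> date32 consistent with string -> timestamp -> date32.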

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Neal Richardson
Are we sure that "casting" should be supported? date/timestamp -> string sounds to me like "format" with parameters (like strftime tokens, which may default to ISO-8601), not "cast". Likewise, string to timestamp sounds like "parse" (also with parameters), not "cast". Neal On Tue, Jul 14, 2020 a
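The "format"/"parse" framing above could look roughly like the following sketch: a pair of functions that take strftime-style tokens and default to an ISO 8601 layout. The names `format_timestamp` and `parse_timestamp` and the default pattern are assumptions made for illustration, not existing Arrow functions.

```python
from datetime import datetime

# Assumed default: ISO 8601 with second resolution and no time zone.
ISO_8601 = "%Y-%m-%dT%H:%M:%S"


def format_timestamp(value: datetime, fmt: str = ISO_8601) -> str:
    """Render a timestamp as a string using strftime tokens."""
    return value.strftime(fmt)


def parse_timestamp(text: str, fmt: str = ISO_8601) -> datetime:
    """Parse a string into a timestamp using the same tokens."""
    return datetime.strptime(text, fmt)


ts = datetime(2020, 7, 14, 19, 7, 0)
print(format_timestamp(ts))              # 2020-07-14T19:07:00
print(format_timestamp(ts, "%d/%m/%Y"))  # 14/07/2020
assert parse_timestamp(format_timestamp(ts)) == ts
```

Making the pattern a parameter with a default covers the common case without extra arguments while still allowing other representations.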

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Antoine Pitrou
I suppose "cast" to string is just another way of saying "represent as string". We may want a specific name for T -> string and string -> T. Feel free to discuss :-) As for mandating a format, though, that's a bit annoying in the common case. It's ok to be able to customize the representation

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Wes McKinney
I would actually be OK with disallowing temporal -> string inside Cast. SQL systems only provide this through functions like TO_CHAR https://www.postgresql.org/docs/12/functions-formatting.html On Tue, Jul 14, 2020 at 1:18 PM Antoine Pitrou wrote: > > > I suppose "cast" to string is just another

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Antoine Pitrou
But is there a point in forcing a different function name (except annoying the poor developer)? On 14/07/2020 at 20:22, Wes McKinney wrote: > I would actually be OK with disallowing temporal -> string inside > Cast. SQL systems only provide this through functions like TO_CHAR > > https://www

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Wes McKinney
Well, the spirit of "cast" is to provide conversions where there is only "one way to do it" (subject to certain nuisances like decimal point / comma distinction based on locale). Since there are many possible ways to display a temporal type as a string, I'm not sure this qualifies; instead it seems

Re: [Discuss] Format to use when casting temporal arrays to string

2020-07-14 Thread Micah Kornfield
In a SQL context, "cast" to/from date-times converts the value if the string matches the literal format. E.g. BQ standard SQL [1]. But naming is hard :) [1] https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_rules#casting_date_types On Tue, Jul 14, 2020 at 1:09 PM Wes Mc

Failing "Ursabot" builds

2020-07-14 Thread Wes McKinney
A patch I merged [1] has caused failures in 4 externally-configured Ursabot builds that are missing the ARROW_TEST_DATA environment variable, so if you see these failures show up in PRs or on master, you can ignore them until the env variable issue is fixed. Thanks Wes [1]: https://github.com/ap

Re: Timeline for next major Arrow release (1.0.0)

2020-07-14 Thread Wes McKinney
I think we are basically code complete for the release at this point except for two outstanding issues: * ARROW-9424 (disabling writing LZ4-compressed Parquet files in C++) -- awaiting CI * ARROW-9139 (switching default "engine" for pyarrow.parquet.read_table) -- CI green but awaiting a decision