Hey all,
I've put together a short proposal for how we might augment DataFusion with
full support for catalog/schema-based namespacing of tables, in line with the
SQL standard.
The doc lives here and should be open for comments from everyone:
https://docs.google.com/document/d/1_bCP_tjVRLJyOrMBOezSFNpF
For anyone interested in this, I've also put together a draft PR with a
supporting implementation: https://github.com/apache/arrow/pull/9762
> -Original Message-
> From: Ruan Pearce-Authers
> Sent: 20 March 2021 11:43
> To: dev@arrow.apache.org
> Subject: [Rust] [
Awesome stuff Daniël, well-deserved :D
> -Original Message-
> From: Daniël Heres
> Sent: 28 April 2021 17:26
> To: dev@arrow.apache.org
> Subject: Re: [ANNOUNCE] New Arrow committer: Daniël Heres
>
> Thank you all!
>
> It has been an amazing experience working with you! Looking forward
Hey all,
I'm currently running some UX testing for a prototype DB engine that integrates
DataFusion, and one recurring pain point is that specifying literal timestamps,
e.g. as gt/lt predicates in a WHERE clause, is a bit awkward right now. Most of
the testing is borrowing existing queries
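For a sense of what a timestamp literal ultimately has to be lowered into, here's a std-only Rust sketch that converts a 'YYYY-MM-DD HH:MM:SS' string to Unix epoch seconds using the well-known days-from-civil calendar arithmetic. The names and structure are illustrative only, not DataFusion's actual parsing code:

```rust
// Convert a proleptic Gregorian civil date to days since 1970-01-01
// (the classic days-from-civil algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y - era * 400;                  // year of era: [0, 399]
    let mp = (m + 9) % 12;                    // month, March-based: [0, 11]
    let doy = (153 * mp + 2) / 5 + d - 1;     // day of year: [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

// Parse "YYYY-MM-DD HH:MM:SS" (assumed UTC) into epoch seconds.
// Hypothetical helper for illustration; no validation of ranges.
fn parse_timestamp(s: &str) -> Option<i64> {
    let parts: Vec<&str> = s.split(|c| c == '-' || c == ' ' || c == ':').collect();
    if parts.len() != 6 {
        return None;
    }
    let y: i64 = parts[0].parse().ok()?;
    let m: i64 = parts[1].parse().ok()?;
    let d: i64 = parts[2].parse().ok()?;
    let h: i64 = parts[3].parse().ok()?;
    let mi: i64 = parts[4].parse().ok()?;
    let sec: i64 = parts[5].parse().ok()?;
    Some(days_from_civil(y, m, d) * 86_400 + h * 3_600 + mi * 60 + sec)
}

fn main() {
    println!("{:?}", parse_timestamp("2021-03-20 11:43:00"));
}
```

The conversion itself is mechanical; the UX question is purely about what literal syntax the SQL layer accepts before this step.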
I'd be interested in helping spec this out; it's especially tricky at the moment
to track down issues when integrating DataFusion into the same binary as other
medium/large dependencies.
Recently hit a really specific issue where DataFusion depends on Parquet, which
supports various compression algs, inc
Hey all,
Whilst working on some UDAFs, I noticed I essentially had to reimplement
GroupByScalar to use scalars as HashMap keys inside accumulator struct state,
as ScalarValue (correctly!) doesn't implement Eq/Hash.
A simple fix to ease this process would be to remove the crate-only access
qual
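For context, the reason ScalarValue can't simply derive Eq/Hash is its floating-point variants, so a UDAF author ends up hand-writing a wrapper that hashes floats by bit pattern. A minimal sketch of that kind of key type (hypothetical KeyScalar, not DataFusion's actual GroupByScalar):

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Illustrative stand-in for the scalar types a UDAF might group on;
// DataFusion's ScalarValue has many more variants.
#[derive(Debug, Clone)]
enum KeyScalar {
    Int64(i64),
    Utf8(String),
    Float64(f64), // the variant that blocks deriving Eq/Hash
}

// Compare floats by bit pattern so Eq is sound (NaN == NaN here,
// unlike IEEE semantics -- the usual trade-off for hashable keys).
impl PartialEq for KeyScalar {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (KeyScalar::Int64(a), KeyScalar::Int64(b)) => a == b,
            (KeyScalar::Utf8(a), KeyScalar::Utf8(b)) => a == b,
            (KeyScalar::Float64(a), KeyScalar::Float64(b)) => a.to_bits() == b.to_bits(),
            _ => false,
        }
    }
}
impl Eq for KeyScalar {}

impl Hash for KeyScalar {
    fn hash<H: Hasher>(&self, state: &mut H) {
        match self {
            KeyScalar::Int64(v) => v.hash(state),
            KeyScalar::Utf8(v) => v.hash(state),
            KeyScalar::Float64(v) => v.to_bits().hash(state),
        }
    }
}

// Example accumulator-style use: count occurrences per scalar key.
fn count_keys(keys: &[KeyScalar]) -> HashMap<KeyScalar, usize> {
    let mut counts = HashMap::new();
    for k in keys {
        *counts.entry(k.clone()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let keys = vec![
        KeyScalar::Float64(1.5),
        KeyScalar::Float64(1.5),
        KeyScalar::Utf8("a".to_string()),
    ];
    println!("{:?}", count_keys(&keys));
}
```

Exposing something like this from the crate would save each UDAF author from rebuilding it.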
run OOM there
> - for ~0.5GB of input, with 32GB of RAM). Much more efficient would be to
> store the accumulated data in (typed) arrays, keep offsets to values in those
> arrays and get rid of using per-row scalar values in those cases.
>
> Best,
>
> Daniël
>
> Op di 2
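A minimal sketch of the offset-based layout Daniël describes (illustrative names, not DataFusion internals): all values live in one contiguous typed buffer, each group keeps only a (start, len) range into it, and per-group results are computed over slices with no per-row scalar allocation:

```rust
// Offset-based accumulation: one densely packed typed buffer plus
// per-group (start, len) ranges, instead of boxing every row as a
// scalar value. Hypothetical type for illustration.
struct TypedAccumulator {
    values: Vec<i64>,             // all accumulated values, back to back
    offsets: Vec<(usize, usize)>, // (start, len) into `values`, one per group
}

impl TypedAccumulator {
    fn new() -> Self {
        Self { values: Vec::new(), offsets: Vec::new() }
    }

    // Append one group's batch of values and record where it lives.
    fn push_group(&mut self, group_values: &[i64]) {
        let start = self.values.len();
        self.values.extend_from_slice(group_values);
        self.offsets.push((start, group_values.len()));
    }

    // Per-group sums computed over slices of the shared buffer.
    fn group_sums(&self) -> Vec<i64> {
        self.offsets
            .iter()
            .map(|&(start, len)| self.values[start..start + len].iter().sum())
            .collect()
    }
}

fn main() {
    let mut acc = TypedAccumulator::new();
    acc.push_group(&[1, 2, 3]);
    acc.push_group(&[10, 20]);
    println!("{:?}", acc.group_sums());
}
```

The memory win comes from replacing N small heap allocations (one per row) with a single growable buffer, which is why the per-row-scalar approach blows up on large inputs.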