I got this working by reorganizing the vectors into batches of 1 million rows each.
My Snowflake bulk insert now takes 3 minutes vs. 3 hours. I'll open a ticket in
ADBC to improve the interface.
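For illustration, a rough sketch of that kind of rebatching, assuming pyarrow
and the Python ADBC Snowflake driver; the file name, connection URI, and table
name below are placeholders, not from this thread:

    import pyarrow as pa
    import pyarrow.csv as pv
    import adbc_driver_snowflake.dbapi as snowflake

    # The multithreaded CSV reader tends to produce many small chunks.
    table = pv.read_csv("data.csv")

    # Merge the chunks, then re-slice into ~1 million row batches.
    table = table.combine_chunks()
    batches = table.to_batches(max_chunksize=1_000_000)
    reader = pa.RecordBatchReader.from_batches(table.schema, batches)

    conn = snowflake.connect("snowflake://...")  # placeholder URI
    cur = conn.cursor()
    cur.adbc_ingest("MY_TABLE", reader, mode="create")
    conn.commit()
    cur.close()
    conn.close()

Whether this reduces the number of driver invocations depends on how the
Snowflake driver maps batches to upload calls.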
ADBC's adbc_ingest() function needs something similar to
https://arrow.apache.org/docs/python/generated/pyarrow.da
As far as I understand, that bundles the Arrays into a ChunkedArray, which only
Table interacts with. It doesn't make a longer Array, and depending on what the
ADBC Snowflake driver is doing, that may or may not help with the number of
invocations that are happening.
Also, it's not portable across
Re 4: you create a ChunkedArray from Arrays.
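A tiny pyarrow sketch of that point, with made-up values:

    import pyarrow as pa

    a = pa.array([1, 2, 3])
    b = pa.array([4, 5, 6])

    # A ChunkedArray just wraps the existing Arrays; no longer Array is built.
    chunked = pa.chunked_array([a, b])
    table = pa.Table.from_pydict({"x": chunked})
    print(table.column("x").num_chunks)  # 2

    # Concatenating, by contrast, does allocate a single longer Array.
    longer = pa.concat_arrays([a, b])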
BR
J
Wed, 22 Nov 2023 at 20:48 Aldrin wrote:
Assuming the C++ implementation, Jacek's suggestion (#3 below) is probably
best. Here is some extra context:
1. You can slice larger RecordBatches [1]
2. You can make a larger RecordBatch [2] from columns of smaller RecordBatches
[3] probably using the correct type of Builder [4] and with a bit
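Aldrin's references point at the C++ API; a rough Python analogue of options 1
and 2 above (using pa.concat_arrays in place of the Builders, with made-up
data) might look like:

    import pyarrow as pa

    schema = pa.schema([("x", pa.int64())])
    small = [
        pa.record_batch([pa.array(range(i, i + 4))], schema=schema)
        for i in range(0, 12, 4)
    ]

    # 2. Build one larger RecordBatch from the columns of the smaller batches.
    big = pa.record_batch(
        [pa.concat_arrays([b.column(0) for b in small])], schema=schema
    )

    # 1. Slice the larger RecordBatch into whatever row counts you want
    #    (slices are zero-copy views).
    first = big.slice(0, 6)
    rest = big.slice(6)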
Hi!
I think some code is needed for clarity. You can concatenate tables (and
combine_chunks afterwards) or arrays, then pass the concatenated result.
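Something along these lines, as a minimal sketch with made-up data (assuming
pyarrow):

    import pyarrow as pa

    t1 = pa.table({"x": [1, 2, 3]})
    t2 = pa.table({"x": [4, 5, 6]})

    # Concatenate the tables, then collapse each column back to one chunk.
    combined = pa.concat_tables([t1, t2]).combine_chunks()

    # Or concatenate the underlying Arrays of a column directly.
    x = pa.concat_arrays([t1.column("x").chunk(0), t2.column("x").chunk(0)])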
Regards,
Jacek
Wed, 22 Nov 2023 at 19:54 Lee, David (PAG) wrote:
I've got 36 million rows of data which ends up as 3,000 record batches ranging
from 12k to 300k rows each. I'm assuming these batches are created by the
multithreaded CSV file reader.
Is there any way to reorganize the data into something like 36 batches
consisting of 1 million rows each?
Hi,
During the last couple of weekly community calls there has been a
discussion raised around the necessity of creating a patch release for
14.0.2. There have been several issues tagged with the
"backport-candidate" label [1].
From my understanding the issues are mainly fixing some possible
seg
Having spent a while doing patch and update management for critical
infrastructure, I think exposing it directly in the Change Log is the best
possible solution. I'll make sure to let the team know that we can look out
for future issues with some creative use of GitHub search but more
visibility is
Hi Shara,
The example Dockerfile installs the base requirements for Ubuntu, but
then we use build_venv.sh (or build_conda.sh) to build the Arrow
CPP library and then pyarrow [1].
From the error it seems you did not build Arrow CPP, as libarrow.so
can't be found. Can you try following the recip
Congratulations James.
With Regards,
Vibhatha Abeykoon, PhD
On Sat, Nov 18, 2023 at 2:46 AM Ian Cook wrote:
> Congratulations James!
>
> On Thu, Nov 16, 2023 at 3:45 AM Sutou Kouhei wrote:
>
> > On behalf of the Arrow PMC, I'm happy to announce that James Duong
> > has accepted an invitation
Hi Chris,
As Bryce pointed out, the current process is managed with the manual
addition of the `Breaking change` label in GitHub. In general, after
the release there is a review process to tag some of those that were
missing.
Currently you could use the GitHub issue search. For example, for
13.0.0 a