In my mind there were two/three formats:
* 2 related: IPC stream/file: native storage, everything memory mappable,
slight overhead from having to read in metadata due to chunking.
* feather: like IPC, but with possible compression/codecs, so non-memory
mappable (at least no practical use), to pro
Nice Joris, congratulations!
On Thu, Nov 18, 2021 at 9:34 AM Nic wrote:
> Congratulations, great news!
>
> On Thu, 18 Nov 2021 at 07:27, Joris Van den Bossche <
> jorisvandenboss...@gmail.com> wrote:
>
> > Thanks all!
> >
> > On Thu, 18 Nov 2021 at 08:10, Jorge Cardoso Leitão <
> > jorgecarl
Great work, nice to see this formalized.
On Thu, Jun 24, 2021 at 9:17 AM Antoine Pitrou wrote:
>
> Can we document them in the format docs and/or in the FAQ?
>
>
> On Thu, 24 Jun 2021 10:47:34 +0900 (JST)
> Sutou Kouhei wrote:
> > Hi,
> >
> > The official media types (MIME types) for Apache
Hi Ying,
If you manage to get the debugger to work nicely with VS Code, could you
share the instructions on how to set this up? I usually just use gdb,
but that can be a bit crude; I would love to use a visual debugger sometimes.
Regards,
Maarten Breddels
Software engineer / consultant / data
that never got off the ground.
So yes, if you can do what Wes suggests, that would be great.
cheers,
Maarten Breddels
Software engineer / consultant / data scientist
Python / C++ / Javascript / Jupyter
www.maartenbreddels.com / vaex.io
maartenbredd...@gmail.com +31 6 2464 0838
Another approach I took is:
https://github.com/vaexio/vaex-arrow-ext
But it uses pybind11, not cython.
(from mobile phone)
On Fri, 6 Nov 2020, 17:19 Vibhatha Abeykoon, wrote:
> One more question about packaging, here when the API requires both Cython
> and C++ APIs,
> Pyarrow dependency must a
I'm using VS Code with the Clang-format plugin, and configured it with:
"editor.formatOnSave": true,
"clang-format.executable": "clang-format-7",
I installed clang-format-7 with apt-get, I think.
It will auto-format on save.
On Thu, 16 Jul 2020 at 22:19, Micah Kornfield wrote:
> If you ha
our on Windows: You don't actually link
> against numpy but you statically link a set of functions that are resolved
> to NumPy's function when you import numpy. Quick googling leads to
> https://github.com/yugr/Implib.so which could provide something similar
> for Linux.
Ok, thanks!
I'm setting up a repo with an example here, using pybind11:
https://github.com/vaexio/vaex-arrow-ext
and I'll just try all possible combinations and report back.
cheers,
Maarten Breddels
Software engineer / consultant / data scientist
Python / C++ / Javascript
where someone installed a pyarrow 2014
wheel, or built from source, or installed from conda-forge?
cheers,
Maarten Breddels
Software engineer / consultant / data scientist
Python / C++ / Javascript / Jupyter
www.maartenbreddels.com / vaex.io
maartenbredd...@gmail.com +31 6 2464 0838
sts for 32bit offset strings arrays)
Overall, we're quite positive, and as you see, the pain points are not
fundamental issues but annoyances that might be easy to fix, and fixing
them would make adoption smoother/faster.
cheers,
Maarten Breddels
Software engineer / consultant / data scientist
Python / C
I think that if __eq__ does not return True/False exclusively, __bool__
should raise an exception to avoid these unexpected truthy results. Python
users are already used to that from NumPy.
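To make the suggestion concrete, here is a minimal sketch in plain Python. The names `Column` and `ElementwiseResult` are hypothetical and are not part of pyarrow or NumPy; they only illustrate an `__eq__` that returns a per-element result whose `__bool__` refuses to collapse into a single True/False.

```python
class ElementwiseResult:
    """Hypothetical holder for the per-element outcome of a comparison."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Refuse to reduce many booleans to one implicitly; the caller
        # has to choose any() or all() explicitly.
        raise ValueError(
            "the truth value of an elementwise comparison is ambiguous; "
            "use any() or all()"
        )

    def any(self):
        return any(self.values)

    def all(self):
        return all(self.values)


class Column:
    """Hypothetical array-like whose __eq__ compares element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        if not isinstance(other, Column) or len(other.values) != len(self.values):
            return NotImplemented
        pairs = zip(self.values, other.values)
        return ElementwiseResult(a == b for a, b in pairs)


result = Column([1, 2, 3]) == Column([1, 5, 3])

try:
    if result:  # implicit bool() call
        pass
    raised = False
except ValueError:
    raised = True
```

With this design, `if a == b:` fails loudly instead of silently taking the truthy branch, while `(a == b).all()` and `(a == b).any()` remain available for the cases where a single boolean is really wanted.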
On Wed, 1 Jul 2020 at 15:40, Joris Van den Bossche <
jorisvandenboss...@gmail.com> wrote:
> On Wed, 1 Jul 2020 at 09:4
ooking at the loaded
> shared libraries.
>
> François
>
> On Mon, Jun 15, 2020 at 10:38 AM Antoine Pitrou
> wrote:
> >
> >
> > Hi Maarten,
> >
> > You should build in debug mode, i.e. pass -DCMAKE_BUILD_TYPE=Debug
> >
> > Regards
o tab completion, because
I assume this is not exported (although the symbol is visible using nm
path/to/libarrow.so.100).
Is there an easy (or hard?) way to get a breakpoint there, and what might
be the reason I cannot put a breakpoint at TransformAsciiUpper?
cheers,
Maarten Breddels
Maarten Breddels created ARROW-9100:
---
Summary: Add ascii_lower kernel
Key: ARROW-9100
URL: https://issues.apache.org/jira/browse/ARROW-9100
Project: Apache Arrow
Issue Type: Task
Hi Antoine,
Adding xsimd to the list of options:
* https://github.com/xtensor-stack/xsimd
Not sure how it compares to the rest though.
cheers,
Maarten
Maarten Breddels created ARROW-8865:
---
Summary: windows distribution for 0.17.1 seems broken (conda only?
Key: ARROW-8865
URL: https://issues.apache.org/jira/browse/ARROW-8865
Project: Apache Arrow
On Wed, 27 Nov 2019 at 19:37, Wes McKinney wrote:
> On Tue, Nov 26, 2019 at 9:40 AM Maarten Breddels
> wrote:
> >
> > On Tue, 26 Nov 2019 at 15:02, Wes McKinney wrote:
> >
> > > hi Maarten
> > >
> > > I opened https://issues.apache.org/j
nary type.
> >
> >
> > Another solution is to create a `FixedBuilder` class where
> > - the number of elements is known
> > - the data type is of fixed width
> > - Nullability is known (whether you need an extra buffer).
> >
> > I think sooner or later we'll ne
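As a rough illustration of the three conditions listed above, here is a sketch in plain Python. It is not Arrow's C++ builder API: the class layout, the `struct` format codes, and the method names are assumptions made for this example only. The point is that the data buffer can be allocated once, and the validity bitmap is allocated only when nulls are possible.

```python
import struct


class FixedBuilder:
    """Hypothetical builder: length, value width and nullability known up front."""

    def __init__(self, length, fmt, nullable):
        self.length = length
        self.fmt = "<" + fmt  # little-endian, e.g. "i" (int32) or "d" (float64)
        self.width = struct.calcsize(self.fmt)
        # Allocated exactly once; appending never resizes.
        self.data = bytearray(length * self.width)
        # One validity bit per element, only if nulls can occur.
        self.validity = bytearray((length + 7) // 8) if nullable else None
        self.count = 0

    def append(self, value):
        if self.count >= self.length:
            raise IndexError("builder is full")
        if value is None:
            if self.validity is None:
                raise ValueError("builder was declared non-nullable")
            # Null: leave the validity bit at 0 and the slot zeroed.
        else:
            offset = self.count * self.width
            struct.pack_into(self.fmt, self.data, offset, value)
            if self.validity is not None:
                # Least-significant-bit-first numbering, 1 means valid.
                self.validity[self.count // 8] |= 1 << (self.count % 8)
        self.count += 1

    def finish(self):
        if self.count != self.length:
            raise ValueError(
                "expected %d values, got %d" % (self.length, self.count)
            )
        return self.data, self.validity


builder = FixedBuilder(3, "i", nullable=True)
for item in (7, None, 9):
    builder.append(item)
data, validity = builder.finish()
```

Because everything is sized in the constructor, `append` is just a write at a computed offset, which is the saving such a builder would offer over a general-purpose growing one.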
Hi Uwe,
Having it in a separate package/module/namespace makes it easier to make it an
optional install in the future, should that happen. Also, it would be more tab
completion friendly.
Cheers,
Maarten
> On 17 Dec 2019, at 10:24, Uwe L. Korn wrote:
>
> Hello all,
>
> we have developed quit
In vaex I always write the data to hdf5 as 1 large chunk (per column).
The reason is that it allows the mmapped columns to be exposed as a
single numpy array (talking numerical data only for now), which many
people are quite comfortable with.
The strategy for vaex to write unchunked data is to fi
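The benefit of one contiguous chunk per column can be sketched with the standard library alone. This is not how vaex or hdf5 do it: a raw binary file stands in for the hdf5 dataset and a typed `memoryview` stands in for the numpy array, both assumptions made so the example has no dependencies.

```python
import mmap
import os
import tempfile
from array import array

# One column of float64 values, written as a single contiguous block.
values = array("d", [0.5, 1.5, 2.5, 3.5])
path = os.path.join(tempfile.mkdtemp(), "column.bin")
with open(path, "wb") as f:
    values.tofile(f)

# Map the file read-only; nothing is copied into Python objects yet.
with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Because the column is one chunk, a single typed view covers all of it.
raw = memoryview(mapped)
column = raw.cast("d")

n = len(column)
first, last = column[0], column[-1]
total = sum(column)

# Release the views before closing the mapping.
column.release()
raw.release()
mapped.close()
```

If the same column were stored in several chunks, each chunk would need its own view and the caller would have to stitch them together, which is the inconvenience the single-chunk layout avoids.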
. Also in vaex, all the
processing happens in chunks, and no chunk will ever be that large (for the
near future...).
In vaex, when exporting to hdf5, I always write in 1 chunk, and that's
where most of my issues show up.
cheers,
Maarten
>
> - Wes
>
> [1]:
> https:/
uld play better with
pa.ChunkedArray.
Regards,
Maarten Breddels
Maarten Breddels created ARROW-3686:
---
Summary: Support for masked arrays in to/from numpy
Key: ARROW-3686
URL: https://issues.apache.org/jira/browse/ARROW-3686
Project: Apache Arrow
Issue
Maarten Breddels created ARROW-3685:
---
Summary: Better roundtrip between numpy and arrow binary array
Key: ARROW-3685
URL: https://issues.apache.org/jira/browse/ARROW-3685
Project: Apache Arrow
Maarten Breddels created ARROW-3669:
---
Summary: pyarrow swallows big endian arrow without converting or
error msg
Key: ARROW-3669
URL: https://issues.apache.org/jira/browse/ARROW-3669
Project
26 matches