+1
On Mon, Mar 22, 2021 at 8:33 AM Andrew Lamb wrote:
>
> +1
>
> On Sun, Mar 21, 2021 at 7:08 PM paddy horan wrote:
>
> > +1 (non-binding)
> >
> >
> >
> > From: Sutou Kouhei
> > Sent: Sunday, March 21, 2021 4:34:43 PM
> > To: dev@arrow.apache.org
> > Subject: R
You can either use the provided server facility found in flight [1],
or use stream directly via ipc [2]. You can look at the tests on how
to use both facilities.
François
[1] https://github.com/apache/arrow/tree/master/go/arrow/flight
[2] https://github.com/apache/arrow/tree/master/go/arrow/ipc
They're mapped with the StructType/StructArray, which is also a columnar
representation, e.g. one buffer per field in the sub-object. If you
have varying/incompatible types, a field will be promoted to a
UnionType.
François
On Thu, Apr 2, 2020 at 12:54 AM Micah Kornfield wrote:
>
> Hi Hasara,
> Th
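To illustrate the "one buffer per field" point in the reply above, here is a minimal sketch in plain C++. It does not use the Arrow API; `Point` and `PointColumns` are hypothetical names chosen for the illustration, and real StructArrays also carry validity bitmaps that are left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-oriented view of a sub-object with two fields.
struct Point {
  int32_t x;
  double y;
};

// Columnar view of the same data: one contiguous buffer per field.
// Hypothetical type for illustration only, not Arrow's StructArray.
struct PointColumns {
  std::vector<int32_t> x;
  std::vector<double> y;

  // Split a row into its fields and append each to its own buffer.
  void Append(const Point& p) {
    x.push_back(p.x);
    y.push_back(p.y);
  }

  // Reassemble row i from the per-field buffers.
  Point At(std::size_t i) const {
    assert(i < x.size());
    return Point{x[i], y[i]};
  }

  std::size_t length() const { return x.size(); }
};
```

Reading a single field for all rows then touches only that field's buffer, which is the property the columnar layout is after.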
It does make sense, I would go a little further and make this
field/property a single value of the same type as the array. This
would allow using any arbitrary sentinel value for unknown values (0
in your suggested case). The end result is zero-copy for R bindings
(if stars are aligned). I create
+1 (binding)
Verified all sources locally on Ubuntu 18.04 (including Javascript).
Verified the binaries, wheels verification matches the one found in
https://github.com/apache/arrow/pull/6961
François
On Fri, Apr 17, 2020 at 8:12 AM Antoine Pitrou wrote:
>
>
> Hi,
>
> I tested the sources on Ub
+1 (binding)
On Fri, Apr 24, 2020 at 5:41 AM Krisztián Szűcs
wrote:
>
> +1 (binding)
>
> On 2020. Apr 24., Fri at 1:51, Micah Kornfield
> wrote:
>
> > +1 (binding)
> >
> > On Thu, Apr 23, 2020 at 2:35 PM Sutou Kouhei wrote:
> >
> > > +1 (binding)
> > >
> > > In
> > > "[VOTE] Add "trivial" Re
Hello David,
I think that what you ask is achievable with the dataset API without
much effort. You'd have to insert the pre-buffering at
ParquetFileFormat::ScanFile [1]. The top-level Scanner::Scan method is
essentially a generator that looks like
flatmap(Iterator>). It consumes the
fragment in-or
stems since on linux it can call
`readahead` and/or `madvise`.
François
On Thu, Apr 30, 2020 at 8:56 AM Francois Saint-Jacques
wrote:
>
> Hello David,
>
> I think that what you ask is achievable with the dataset API without
> much effort. You'd have to insert the pre-bufferi
I'll add https://issues.apache.org/jira/browse/ARROW-8726 to the list.
On Tue, May 5, 2020 at 6:52 PM Wes McKinney wrote:
>
> Sorry I haven't had enough coffee today.
>
> The patches that still need to be resolved AFAICT are ARROW-8684 and
> ARROW-8706 (AKA PARQUET-1857), so it will take a little
+1 binding, verified sources and binaries locally (no exclusions).
On Fri, May 15, 2020 at 10:38 AM Neal Richardson
wrote:
>
> +1 (binding)
>
> Verification here: https://github.com/apache/arrow/pull/7170
>
> Still haven't worked out the Windows source verification job, but
> everything else look
I documented [1] the behaviors by experimentation or by reading the
documentation. My experiments were mostly about checking INT64_MAX +
1. My preference would be to use the platform defined behavior by
default and provide a safety option that errors.
Feel free to add more databases/systems.
Fra
As Antoine said, debug mode is probably the most important
configuration. You can also try the `relwithdebinfo` if you're trying
to debug the optimized code. I'd also add the following:
1. Building out of conda provides a much better integration with gdb
and the system's libstdc++ due to the prett
Hello,
We use this extensively in unit tests, see [1]
François
[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/testing/random.h
On Mon, Jun 22, 2020 at 9:51 AM Kirill Lykov wrote:
>
> Hi,
>
> I wonder if there is existing C++ code which allows to generate a
> random arrow table by
We should aim to improve the performance of the most widely used
*default* packages, which are python pip, python conda and R (all
platforms). AFAIK, both pip (manywheel) and conda use gcc on Linux by
default. R uses gcc on Linux and mingw (gcc) on Windows. I suppose
(haven't checked) that clang is
> something like RandomTableGenerator before implementing myself one
> > > using RandomArrayGenerator.
> > >
> > > On Mon, Jun 22, 2020 at 4:49 PM Francois Saint-Jacques
> > > wrote:
> > > >
> > > > Hello,
> > > >
> > > > We
If you configured CMake to build tests (-DARROW_BUILD_TESTS=ON) and
install locally, there should be a `libarrow_testing.so` that you need
to link against. What I meant is that this library is _not_ part of
pip/conda/dpkg/rpm.
François
Hello Yue,
FeatherV2 is just a facade for the Arrow IPC file format. You can find
the implementation here [1]. I will try to answer your question with
inline comments. On a high level, the file format writes a schema and
then multiple "chunks" called RecordBatch. Your lowest level of
granularity
AM Francois Saint-Jacques
wrote:
>
> Hello Yue,
>
> FeatherV2 is just a facade for the Arrow IPC file format. You can find
> the implementation here [1]. I will try to answer your question with
> inline comments. On a high level, the file format writes a schema and
> then mul
+1 (binding)
OTOH,
how do we handle NullType -> UnionType cast conversion? Do we
require some convention like the first children ArrayData null bitmap
to be set and all tags set to 0?
François
On Wed, Jun 24, 2020 at 1:09 PM Antoine Pitrou wrote:
>
>
> Le 24/06/2020 à 18:34, Wes McKinney a écrit :
> > On We
+1 (binding)
On Tue, Jun 30, 2020 at 10:55 AM Neal Richardson
wrote:
>
> +1 (binding)
>
> On Tue, Jun 30, 2020 at 2:52 AM Antoine Pitrou wrote:
>
> >
> > +1 (binding)
> >
> > Le 29/06/2020 à 23:59, Wes McKinney a écrit :
> > > Hi,
> > >
> > > As discussed on the mailing list [1], it has been pro
Hello Kazuaki!
I recommend you read and take a look at the benchmark sub-library [1]
of archery and how it's glued [2]. You will need to implement:
- A runner for the framework you intend to use [3] and [4], it also
implies capturing the output into a class that implements the
"Benchmark" interfa
I would say at first sight that it's due to your usage of char[]:
builder.Append(d) implicitly does a strlen.
François
On Wed, Nov 18, 2020 at 2:00 PM Ying Zhou wrote:
>
> Sure!
>
> BinaryBuilder builder;
> char d[] = "\x00\x01\xbf\x5b";
> (void)(builder.Append(d));
> std::shared_ptr array;
>
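The strlen effect described in the reply can be reproduced without Arrow. The sketch below uses only the standard library; the helper names are hypothetical. It shows that a C-string length computed on the quoted bytes is zero because the first byte is NUL, while the array itself holds four payload bytes. With a builder, the usual remedy would be to pass the length explicitly through whichever length-taking overload your Arrow version provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Bytes from the quoted snippet: the very first byte is NUL.
static const char kData[] = "\x00\x01\xbf\x5b";

// Length as seen by anything that treats the array as a C string:
// strlen stops at the first NUL byte.
inline std::size_t LengthAsCString(const char* s) { return std::strlen(s); }

// Actual payload length: array size minus the implicit terminator.
template <std::size_t N>
constexpr std::size_t PayloadLength(const char (&)[N]) {
  return N - 1;
}

// Copy with an explicit length so that embedded NUL bytes survive.
template <std::size_t N>
std::string CopyWithExplicitLength(const char (&s)[N]) {
  return std::string(s, N - 1);
}
```

Here `LengthAsCString(kData)` yields 0 while `PayloadLength(kData)` yields 4, which would explain an unexpectedly empty value being appended.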
Sounds reasonable to me.
On Sat, Jul 20, 2019 at 5:55 AM Sutou Kouhei wrote:
>
> Hi,
>
> No more opinions?
>
> If there are no more opinions, we'll use the current
> SO versioning schema committed by
> https://github.com/apache/arrow/pull/4801 for 1.0.0. The
> current versioning schema is the fol
Do we bump the library version on changes from _any_ language
implementation, or just the C++/Java version?
François
On Fri, Jul 26, 2019 at 3:34 PM Wes McKinney wrote:
>
> hello,
>
> As discussed on the mailing list thread [1], Micah Kornfield has
> proposed a version scheme for the project to
Hello,
if each record has a different size, then I suggest to just use a
Struct> where Dim is a struct (or expand in the outer
struct). You can probably add your own logic with the recently
introduced ExtensionType [1].
François
[1]
https://github.com/apache/arrow/blob/f77c3427ca801597b572fb197b
My vote would go with underscore to minimize changes and minimize
exceptions to the Google style guide reference. I also suggest that
we add this to the linters somehow, if it's not too much trouble.
François
On Tue, Aug 6, 2019 at 9:35 PM Sutou Kouhei wrote:
>
> Hi,
>
> I like hyphens.
>
> Bec
Congrats!
Well deserved.
On Fri, Aug 9, 2019 at 11:12 AM Wes McKinney wrote:
>
> The Project Management Committee (PMC) for Apache Arrow has invited
> Micah Kornfield to become a PMC member and we are pleased to announce
> that Micah has accepted.
>
> Congratulations and welcome!
Indeed, I'd expect the `type()` method to not be called in the hot path.
François
On Mon, Aug 19, 2019 at 10:17 AM Wes McKinney wrote:
>
> hi Ben,
>
> On this possibility
>
> - Make ArrayBuilder::type() virtual. This will be much more expensive for
> nested builders and for applications which ne
I created the ticket https://issues.apache.org/jira/browse/ARROW-6396,
I think we can offer both.
François
On Thu, Aug 29, 2019 at 5:10 PM Ben Kietzman wrote:
>
> Indeed it's not about sanitizing nulls; it's about how nulls should
> interact with boolean (and other) expressions.
>
> For purpose
Hello Ivan,
There's a tool called `bloaty` [1] that can tell you the size of a
binary object per symbol.
Thank you,
François
[1] https://github.com/google/bloaty
On Wed, Sep 4, 2019 at 12:00 PM Ivan Popivanov wrote:
>
> Have been trying to figure out the binary size of a basic arrow static
Congrats to everyone!
François
On Fri, Sep 6, 2019 at 4:34 AM Kenta Murata wrote:
>
> Thank you very much everyone!
> I'm very happy to join this community.
>
> 2019年9月6日(金) 12:39 Micah Kornfield :
>
> >
> > Congrats everyone.
> >
> > On Thu, Sep 5, 2019 at 7:06 PM Ji Liu wrote:
> >
> > > Congr
Hello,
I suggest we tackle https://jira.apache.org/jira/browse/ARROW-5801.
For Rust, that would be
https://jira.apache.org/jira/browse/ARROW-5809. Once ported to
docker/docker-compose, it's trivial to activate github action for the
same test (see https://github.com/apache/arrow/pull/5530). As I'm
+1 (non binding)
Source release verified. ARROW_FLIGHT=OFF due to system protobuf.
Binary release verified.
Ubuntu 18.04
François
On Wed, Oct 2, 2019 at 1:18 AM Micah Kornfield wrote:
>
> +1 (binding)
>
> On Debian Stretch I ran: dev/release/verify-release-candidate.sh binaries
> 0.15.0 2 and i
There's always the route of vendoring some library and not exposing
external CMake options. This would achieve the goal of
compile-out-of-the-box and enable important features in the basic
build. We also simplify dependency requirements (benefits CI or
developers). The downside is following securit
As mentioned, Result is an improvement for functions which return a
single value, e.g. Make/Factory-like. My vote goes to Result for such
cases. For multiple return types, we have std::tuple like Antoine
proposed.
François
On Fri, Oct 18, 2019 at 9:19 PM Antoine Pitrou wrote:
>
>
> Le 18/10/2019 à 2
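As a rough illustration of the "single value or error" pattern discussed above, here is a minimal sketch. `SketchResult` and `ParseDigits` are hypothetical names invented for this example; this is not Arrow's Result class, and for simplicity it requires the value type to be default-constructible.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical value-or-error holder, for illustration only.
template <typename T>
class SketchResult {
 public:
  static SketchResult Ok(T value) {
    SketchResult r;
    r.ok_ = true;
    r.value_ = std::move(value);
    return r;
  }

  static SketchResult Error(std::string message) {
    SketchResult r;
    r.ok_ = false;
    r.error_ = std::move(message);
    return r;
  }

  bool ok() const { return ok_; }

  // Only meaningful when ok() is true.
  const T& value() const {
    assert(ok_);
    return value_;
  }

  // Empty when ok() is true.
  const std::string& error() const { return error_; }

 private:
  SketchResult() = default;
  bool ok_ = false;
  T value_{};
  std::string error_;
};

// Factory-like function: returns the parsed number or an error,
// instead of a status code plus an out-parameter.
SketchResult<int> ParseDigits(const std::string& text) {
  if (text.empty()) return SketchResult<int>::Error("empty input");
  int v = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return SketchResult<int>::Error("not a digit");
    v = v * 10 + (c - '0');
  }
  return SketchResult<int>::Ok(v);
}
```

The caller checks `ok()` once and then reads `value()`, which keeps factory call sites to a single expression.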
+1 (non-binding)
Ubuntu 18.04
- Source release verified
- Binary release verified
François
On Fri, Oct 25, 2019 at 2:43 PM Krisztián Szűcs
wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow version 0.15.1. This is a patch release consisting of 36
nges in
> >> > https://github.com/apache/arrow/pull/5600
> >> > + pip3 install -e dev/archery
> >> > Obtaining file:///tmp/arrow-0.15.1.7OxLD/apache-arrow-0.15.1/dev/archery
> >> > Complete output from command python setup.py
Lint and Rust failures fixed
(https://github.com/apache/arrow/commit/aa9f5c95253ef1fe713c5010f0a8f740ef284109)
Gandiva failures fixed
(https://github.com/apache/arrow/commit/1d23ec42fd786141b7de58a057d91c74ca19c32e)
Centos7 failure fixed
(https://github.com/apache/arrow/commit/5a47c5e8c2d5dba5eac52
The parquet c++ implementation has all the facilities to expose the
required information to implement predicate pushdown. The experimental
Dataset API does make use of this with parquet. See [1] for an example
of the API. Or a real-life usage with the nyc-tlc taxi dataset [2].
The relevant implemen
I'm all for it. Created [1]; it would also enable an operator[] for
arrays of primitive types [2].
[1] https://issues.apache.org/jira/browse/ARROW-7178
[2] https://issues.apache.org/jira/browse/ARROW-6276
On Fri, Nov 15, 2019 at 12:40 AM Micah Kornfield wrote:
>
> I think there are potentially ot
Attendees:
- Projjal Chanda
- Uwe Korn
- Antoine Pitrou
- Prudhvi Porandla
- François Saint-Jacques
Discussion:
- Dataset API is going to be a first candidate for the Result
refactor (see https://github.com/apache/arrow/pull/5857)
- There's an overlap of dataset::Expression class and gandiva::Node
This notation is already used in some parts of the codebase [1]. I
think it was introduced when absorbing gandiva and then in a draft of
the logical operations in the compute module. I have no strong opinion
for/against. I find it convenient to reduce typing, but the style
guide argues against this.
I'll revert, some questions:
1. Should we revert only the pointer aliases, or also the Vector/Iterator?
2. Should we revert all modules, i.e. gandiva and compute?
François
It seems that the array_union_test.cc does the latter, look at how
`expected_types` is constructed. I opened
https://issues.apache.org/jira/browse/ARROW-7265 .
Wes, is the intended usage of type_ids to allow a producer to pass a
subset of the columns of unions without modifying the type codes?
François
Hello Maarten,
In theory, you could provide a custom mmap-allocator and use the
builder facility. Since the array is still in "build-phase" and not
sealed, it should be fine if mremap changes the pointer address. This
might fail in practice since the allocator is also used for auxiliary
data, e.g.
Attendees:
- Micah Kornfield, Google
- Praveen Kumar, Dremio
- Todd Hendricks
- François Saint-Jacques RStudio/Ursa Labs
Subject
- Bazel. Micah wants feedback on the PR. This is first aimed at
developer productivity, notably shorter link time and sandboxed build.
As a first PoC, parts of the python
Hello Hongze,
The C++ implementation of dataset, notably Dataset, DataSource,
DataSourceDiscovery, and Scanner classes are not ready/designed for
distributed computing. They don't serialize and they reference by
pointer all around, thus I highly doubt that you can implement parts
in Java, and some
Bravo!
On Mon, Dec 9, 2019 at 6:55 AM Wes McKinney wrote:
>
> On behalf of the Arrow PMC, I'm happy to announce that Joris has
> accepted an invitation to become a committer on Apache Arrow.
>
> Welcome, and thank you for your contributions!
It seems that LLVM can't auto vectorize. I don't have a debug build,
so I can't get the `-debug-only` information from llvm-opt/opt about
why it can't vectorize. The buffer address mangling should be hoisted
out of the loop (still doesn't enable auto vectorization) [1]. The
buffer juggling should b
Attendees:
- Antoine Pitrou, Ursa Labs/RStudio
- François Saint-Jacques, Ursa Labs/RStudio
- Ravindra Pindikura, Dremio
- Neville Dipale
- Rok Mihevc
Subjects:
- Arrow 1.0 release:
- Neville has been working on the Rust IPC bindings
(https://github.com/apache/arrow/pull/6013)
- Antoine is worki
functionality.
>
> On Wed, Dec 11, 2019 at 10:06 PM Francois Saint-Jacques <
> fsaintjacq...@gmail.com> wrote:
>
> > It seems that LLVM can't auto vectorize. I don't have a debug build,
> > so I can't get the `-debug-only` information from llvm-opt/opt ab
Missing [1] link.
[1] https://godbolt.org/z/S8tixP
On Wed, Dec 11, 2019 at 12:58 PM Francois Saint-Jacques
wrote:
>
> So, llvm _can_ auto-vectorize, I was just missing the `-mtriple`
> option [1]. That still requires hoisting the buffer juggling.
>
> François
>
> On Wed,
nk we can probably take an incremental approach of:
> 1. Eliminate *Ptr in src/arrow code (discuss similar changes in
> parquet/gandiva).
> 2. Decide on the Iterator/Vector.
>
> On Fri, Nov 22, 2019 at 10:47 AM Wes McKinney wrote:
>
> > hi Francois
> >
> > On Fri, No
The desired goal for this feature is trivial modifications, e.g.
within an editor, by data-scientists and researchers.
I'd go for the flatbuffer's json representation as it is stable and
has native support in almost any language or editor due to the
ubiquity of JSON. The C interface schema string
What's the point of having zero copy if the OS is doing the
decompression in the kernel (which trumps the zero-copy argument)? You
might as well just use parquet without filesystem compression. I
prefer to have a compression algorithm where the columnar engine can
benefit from it [1] than marginally impr
By filter, you mean a filter expression, or a selection vector/bitmap?
On Thu, Jan 23, 2020 at 11:38 PM Micah Kornfield wrote:
>
> One of the things that I think got overlooked in the conversation on having
> a slice offset in the C API was a suggestion from Jacques of perhaps
> generalizing the
Opened https://github.com/apache/arrow/pull/6342 to silence the OSX jar issue.
On Sun, Feb 2, 2020 at 8:31 AM Crossbow wrote:
>
>
> Arrow Build Report for Job nightly-2020-02-02-0
>
> All tasks:
> https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-02-0
>
> Failed Tasks:
> -
Whelp, gmail didn't help with the thread folding. I'll just approve
Krisz' patch :).
On Mon, Feb 3, 2020 at 8:22 AM Francois Saint-Jacques
wrote:
>
> Opened https://github.com/apache/arrow/pull/6342 to silence the OSX jar issue.
>
> On Sun, Feb 2, 2020
The debian buster failure seems to be a network issue with github
upload, we'll see tomorrow. The gandiva-jar will be gone in the next
nightly (https://github.com/apache/arrow/pull/6342).
On Mon, Feb 3, 2020 at 8:48 AM Crossbow wrote:
>
>
> Arrow Build Report for Job nightly-2020-02-03-0
>
> All
+1
Binaries verification didn't have any issues.
Sources verification worked with some local environment hiccups
François
On Mon, Feb 3, 2020 at 8:46 PM Andy Grove wrote:
>
> +1 (binding) based on running the Rust tests
>
> Thanks.
>
> On Thu, Jan 30, 2020 at 8:13 PM Krisztián Szűcs
> wrote:
>
Tested on ubuntu 18.04 for the source release.
On Mon, Feb 3, 2020 at 10:07 PM Francois Saint-Jacques
wrote:
>
> +1
>
> Binaries verification didn't have any issues.
> Sources verification worked with some local environment hiccups
>
> François
>
> On Mon, F
This is a first!
On Tue, Feb 4, 2020 at 8:47 AM Crossbow wrote:
>
>
> Arrow Build Report for Job nightly-2020-02-04-0
>
> All tasks:
> https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0
>
> Succeeded Tasks:
> - centos-6:
> URL:
> https://github.com/ursa-labs/crossbo
Arrow does have a Map type [1][2][3]. It is represented as a list of pairs.
François
[1]
https://github.com/apache/arrow/blob/762202418541e843923b8cae640d15b4952a0af6/format/Schema.fbs#L60-L87
[2]
https://github.com/apache/arrow/blob/762202418541e843923b8cae640d15b4952a0af6/cpp/src/arrow/type.h
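The "list of pairs" representation mentioned above can be pictured as an offsets buffer over flattened key and value buffers. The following is a plain C++ sketch under that assumption; `MapColumn` is a hypothetical name, the key and value types are fixed arbitrarily, and validity bitmaps are omitted. It is not the Arrow MapArray API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical columnar map: row i owns the entries in the half-open
// range [offsets[i], offsets[i + 1]) of the flattened keys/values.
struct MapColumn {
  std::vector<int32_t> offsets{0};
  std::vector<std::string> keys;
  std::vector<int64_t> values;

  // Append one row, given as a list of (key, value) pairs.
  void AppendRow(const std::vector<std::pair<std::string, int64_t>>& row) {
    for (const auto& kv : row) {
      keys.push_back(kv.first);
      values.push_back(kv.second);
    }
    offsets.push_back(static_cast<int32_t>(keys.size()));
  }

  std::size_t num_rows() const { return offsets.size() - 1; }

  // Linear scan over the row's pairs; returns true and sets *out on a hit.
  bool Lookup(std::size_t row, const std::string& key, int64_t* out) const {
    assert(row < num_rows());
    for (int32_t i = offsets[row]; i < offsets[row + 1]; ++i) {
      if (keys[i] == key) {
        *out = values[i];
        return true;
      }
    }
    return false;
  }
};
```

An empty map row simply repeats the previous offset, so it costs no entries in the key and value buffers.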
Hello Matthew,
The dplyr binding is just syntactic sugar on top of the dataset API.
There's no analytics capabilities yet [1], other than the select and
the limited projection supported by the dataset API. It looks like it
is doing analytics due to properly placed `collect()` calls, which
converts
>
> > > >>>> >> > > > > > > [1]
> > > >>>> >> https://bintray.com/apache/arrow/python-rc/0.16.0-rc2#files
> > > >>>> >> > > > > > > [2]
> > > >>>> >> > > > > >
> > > >>>> >>
+1
On Thu, Feb 13, 2020 at 9:08 PM Fan Liya wrote:
>
> +1 (binding)
>
> On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney wrote:
>
> > +1 (binding)
> >
> > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou wrote:
> > >
> > >
> > > Ah, you're right, it's PR 6040:
> > > https://github.com/apache/arrow/p
Hello,
the recent dataset and compute work has forced us to think about
schema projection. One problem that surfaced is referencing fields in
nested schemas and/or schemas where duplicate column names exist. We
currently have (C++) APIs that either pass a vector or a
vector to represent fields su
It seems the code for the naive Scalar example is not friendly with the
compiler auto-vectorization component. If you accumulate in a local state
(instead of SumState pointer), you'll get different results, at least with
clang++ 6.0.
benchmark-noavx (only SSE):
BM_SumInt32Scalar
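The difference between accumulating through the state pointer and accumulating in a local can be sketched as follows. The `SumState` struct here is a hypothetical stand-in with assumed field names, not the definition used in the benchmark, and whether the second loop actually vectorizes depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the SumState discussed in the thread.
struct SumState {
  int64_t total = 0;
  int64_t valid_count = 0;
};

// Updates the state through the pointer on every iteration, the shape
// the thread reports as unfriendly to the auto-vectorizer.
void SumThroughPointer(const int32_t* values, std::size_t n, SumState* state) {
  for (std::size_t i = 0; i < n; ++i) {
    state->total += values[i];
    state->valid_count++;
  }
}

// Accumulates in a local variable and merges into the state once at
// the end of the function.
void SumWithLocalState(const int32_t* values, std::size_t n, SumState* state) {
  int64_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += values[i];
  }
  state->total += total;
  state->valid_count += static_cast<int64_t>(n);
}
```

Both functions compute the same result; only the shape of the loop body that the optimizer sees differs.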
rks to do that (and merge with the
> SumState at the end of the function) for thoroughness. Thanks!
> On Wed, Oct 17, 2018 at 9:07 AM Francois Saint-Jacques
> wrote:
> >
> > It seems the code for the naive Scalar example is not friendly with the
> > compiler auto-vectori
One point in favor of separate repositories: vendoring Arrow for a C++
project with git submodules becomes awkward if it's a multi-lang monorepo.
On Tue, Oct 16, 2018 at 9:22 PM Wes McKinney wrote:
> I would also add -- Krisztian's recent work Dockerizing the project is
> setting us up to be able to de
Not the nesting, but pulling a lot of unused files.
On Wed, Oct 17, 2018 at 12:39 PM Wes McKinney wrote:
> Why would one level of directory nesting cause awkwardness (curious)?
>
> On Wed, Oct 17, 2018, 12:28 PM Francois Saint-Jacques <
> fsaintjacq...@networkdump.com> wro
Seems like the type combination you're using (int32 -> uint32) and (int32
-> uint64) don't match the following pattern-matching
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/cast.cc#L191-L192
which avoid using "safe" cast and revert to the following cast
implementation
Hello,
With JSON and other "typed" formats (msgpack, protobuf, ...) you need to
take unions into account, e.g.
{a: "herp", b: 10}
{a: true, c: "derp"}
The type for `a` would be a union.
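One way to picture how such a union surfaces during schema inference is to collect, per field, the set of value types seen across records. This is only a sketch with hypothetical names (`JsonType`, `Record`, `InferFieldTypes`); it assumes records were already parsed into field-name/type pairs and says nothing about how Arrow's readers actually do it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, reduced set of JSON value types.
enum class JsonType { kString, kBool, kNumber };

// One parsed record: field name -> observed value type.
using Record = std::map<std::string, JsonType>;

// Collect every type observed for each field across all records.
std::map<std::string, std::set<JsonType>> InferFieldTypes(
    const std::vector<Record>& records) {
  std::map<std::string, std::set<JsonType>> seen;
  for (const auto& record : records) {
    for (const auto& field : record) {
      seen[field.first].insert(field.second);
    }
  }
  return seen;
}

// A field observed with more than one type needs a union.
bool NeedsUnion(const std::set<JsonType>& types) { return types.size() > 1; }
```

For the two records shown above, `a` is seen as both a string and a boolean and so needs a union, while `b` and `c` each keep a single type.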
I think we should also evaluate investing in ingesting different
schema DSLs (protobuf idl, json-schema) to avoi
I'd also suggest that we extend Romain's effort to add labels to all
languages, review states, and maybe. While the string labeling with []
works, GitHub search/filtering is not very good compared to filtering by
labels.
lang-{R,c++,py,java,...}
review-{wip,ready}
comp-{doc,gandiva,parquet,plasma
Hello Darren,
what Uwe suggests is usually the way to go, your active process writes to a
new file every time. Then you have a parallel process/thread that does
compaction of smaller files in the background such that you don't have too
many files.
On Wed, Dec 19, 2018 at 7:59 AM Uwe L. Korn wrot
No issue with this.
When the final squash is done, which title/body is preserved?
On Wed, Dec 19, 2018 at 8:43 AM Wes McKinney wrote:
> hi folks,
>
> As the contributor base has grown, our development styles have grown
> increasingly diverse.
>
> Sometimes contributors are used to working in a
Is there a reason why Datum::ARRAY stores an ArrayData and not an Array?
I'm aware there's the `make_array` method to obtain the equivalent, but was
wondering if there was a deeper reason.
Notes from today's meeting
Attendees:
- François Saint-Jacques (Ursa Labs/RStudio)
- Wes McKinney (Ursa Labs/RStudio)
- Benchmark Project
- PR Backlog
- Ben Kietzman (Ursa Labs/RStudio)
- Neville Dipale
- Siddhart Teotia (Dremio)
- Andy Grove
- Ravindra (Dremio)
- Shyam Singh (Dremio)
- Li Jin
On Mon, Jan 28, 2019 at 12:53 AM Wes McKinney wrote:
> I was having a discussion recently about Arrow and the topic of
> server-side filtering vs. client-side filtering came up.
>
> The basic problem is this:
>
> If you have a RecordBatch that you wish to filter out some of the
> "rows", one way
This is also applicable on a per-repository basis by modifying the clone's
`.git/config` file instead of the global one in your home directory.
On Wed, Jan 30, 2019 at 1:49 PM Antoine Pitrou wrote:
>
> That will be activated for all repositories, though, not only Arrow?
>
> Regards
>
> Antoine.
>
>
> Le 30/
Hi,
I also agree that we should follow a model similar to what you propose. I
think the plan is, correct me if I'm wrong Wes, to write the logical plan
operators, then write a small execution engine prototype and produce a
proper design document out of this experiment. There's also a placeholder
t
Can you remind us what's the easiest way to get flight working with grpc?
clone + make install doesn't really work out of the box.
François
On Thu, Feb 21, 2019 at 10:41 AM Antoine Pitrou wrote:
>
> Hello,
>
> I've been trying to saturate several CPU cores using our Flight
> benchmark (which sp
al wrote:
> > > I like flamegraphs for investigating this sort of problem:
> > >
> > > https://github.com/brendangregg/FlameGraph
> > >
> > > There are likely many other techniques for inspecting where time
> is being spent but
+1 (non-binding)
* Validated sources on Ubuntu 18.04 with cmake 3.10.2
* Validated binaries
On Fri, Feb 22, 2019 at 6:33 AM Uwe L. Korn wrote:
> +1 (binding)
>
> * Checked sources on Ubuntu 16.04 with an updated CMake and Gandiva turned
> off.
> * Verified the uploaded signatures of sources and
>
> Regards
>
> Antoine.
>
>
> Le 21/02/2019 à 18:40, Francois Saint-Jacques a écrit :
> > You can compile with dwarf (-g/-ggdb) and use `--call-graph=dwarf` to
> perf,
> > it'll help the unwinding. Sometimes it's better than the stack pointer
> > meth
I think we're witnessing multiple issues.
1. Travis seems to be slow (is it an OOM issue?)
- https://travis-ci.org/apache/arrow/jobs/499122041#L1019
- https://travis-ci.org/apache/arrow/jobs/498906118#L3694
- https://travis-ci.org/apache/arrow/jobs/499146261#L2316
2. https://issues.apache.or
There's a good chance we end up using curl for the dataset project. Curl
has a new URL API https://github.com/curl/curl/wiki/URL-API , but it
requires a recent version (7.62.0, October 2018), which means vendoring.
François
On Wed, Feb 27, 2019 at 11:06 AM Antoine Pitrou wrote:
>
> Hello,
>
> As
LS library and who knows what else...
>
> Regards
>
> Antoine.
>
>
> On Wed, 27 Feb 2019 11:16:49 -0500
> Francois Saint-Jacques wrote:
> > There's a good chance we end up using curl for the dataset project. Curl
> > has a new url API https://github.co
Also just created https://issues.apache.org/jira/browse/ARROW-4728
On Thu, Feb 28, 2019 at 3:53 AM Ravindra Pindikura
wrote:
>
>
> > On Feb 28, 2019, at 2:10 PM, Antoine Pitrou wrote:
> >
> >
> > Le 28/02/2019 à 07:53, Ravindra Pindikura a écrit :
> >>
> >>
> >>> On Feb 27, 2019, at 1:48 AM, An
>
> Thoughts?
>
> -Micah
>
>
> On Fri, Mar 1, 2019 at 8:55 AM Francois Saint-Jacques <
> fsaintjacq...@gmail.com> wrote:
>
> > Also just created https://issues.apache.org/jira/browse/ARROW-4728
> >
> > On Thu, Feb 28, 2019 at 3:53 AM Ravindra Pindiku
an
> be tagged with the label
>
> On Fri, Mar 1, 2019 at 12:45 PM Francois Saint-Jacques
> wrote:
> >
> > I agree with adding a tag/label for this and even marking the failure as
> > critical.
> >
> >
> > On Fri, Mar 1, 2019 at 12:18 PM Micah Kornf
Could someone give me write/edit access to confluence?
Thank you,
François
On Fri, Mar 1, 2019 at 3:55 PM Francois Saint-Jacques <
fsaintjacq...@gmail.com> wrote:
> I'll take this.
>
> On Fri, Mar 1, 2019 at 3:55 PM Wes McKinney wrote:
>
>> We could create a pa
On Fri, Mar 1, 2019 at 8:09 PM Francois Saint-Jacques
> wrote:
> >
> > Could someone give me write/edit access to confluence?
> >
> > Thank you,
> > François
> >
> > On Fri, Mar 1, 2019 at 3:55 PM Francois Saint-Jacques <
> > fsaintjacq...@gm
Congrats, great addition!
On Fri, Mar 8, 2019 at 3:12 PM Philipp Moritz wrote:
> Congrats Micah!
>
> On Fri, Mar 8, 2019 at 11:28 AM Wes McKinney wrote:
>
> > On behalf of the Arrow PMC, I'm happy to announce that Micah has
> > accepted an invitation to become a committer on Apache Arrow.
> >
>
Greetings,
I noted that the current C++ API permits constructing core objects that
break those classes' invariants. The following recent issues were affected by this:
- ARROW-4766: segfault due to invalid ArrayData with nullptr buffer
- ARROW-4774: segfault due to invalid Table with columns of differ
then_ pass that to
> >> std::make_shared(..., vector_arg, ...).
> >>
> >> I do not agree with refactoring these methods to use "validating"
> >> constructors. Users of these C++ APIs should know what their
> >> requirements are, and we pro
Hello Felipe,
It's one bit per value, as per the memory layout documentation.
François
On Fri, Mar 22, 2019 at 10:48 AM Felipe Aramburu
wrote:
> In the builder base class I see this api
>
>
> https://github.com/apache/arrow/blob/ad1697e5d25eeaff5630421f55b0120f45cf0ce1/cpp/src/arrow/array/builder_b
Actually, this specific method seems to use a byte per value as you
questioned. I think it's worth adding documentation and an explicit warning
if it confused me. I'll let bkietz chime in to comment on the usage.
François
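The distinction between one byte per value and one bit per value can be made concrete with a small packing routine. This is a standalone sketch with hypothetical helper names, not the builder method under discussion; it assumes least-significant-bit-first ordering within each byte, which is what the Arrow memory layout uses for validity bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one validity byte per value (0 = null, non-zero = valid) into a
// bitmap holding one bit per value, least-significant bit first.
std::vector<uint8_t> PackValidity(const std::vector<uint8_t>& bytes) {
  std::vector<uint8_t> bitmap((bytes.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    if (bytes[i] != 0) {
      bitmap[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}

// Read the validity of value i back from the packed bitmap.
bool IsValid(const std::vector<uint8_t>& bitmap, std::size_t i) {
  assert(i / 8 < bitmap.size());
  return ((bitmap[i / 8] >> (i % 8)) & 1u) != 0;
}
```

Nine byte-sized flags thus shrink to two bitmap bytes, so confusing the two encodings at an API boundary changes both the buffer size and its meaning.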
On Fri, Mar 22, 2019 at 10:57 AM Francois Saint-Jacques <
to do this but I am
> trying to do things as the documentation suggest which I assumed was the
> preferred method of doing this.
>
>
>
> On Fri, Mar 22, 2019 at 8:13 AM Francois Saint-Jacques <
> fsaintjacq...@gmail.com> wrote:
>
> > Actually, this specific method