Re: [C++] parquet::arrow::WriteTable process exited with code 0177

2022-05-23 Thread Rares Vernica
> 2. (Less likely) It could have something to do with whether exceptions > have been enabled when compiling the parquet library (Arrow catches exceptions and > translates them to statuses). > > > -Micah > > On Sat, May 21, 2022 at 9:55 PM Rares Vernica wrote: > > > Hello, > >

[C++] parquet::arrow::WriteTable process exited with code 0177

2022-05-21 Thread Rares Vernica
Hello, We have a plugin that writes both Arrow and Parquet format files. We are experiencing an issue when using the Parquet format, while the Arrow format works just fine. More exactly, the process crashes in parquet::arrow::WriteTable. Using gdb we identified the line where the process crashes https://g

Re: [R] Install arrow package: arrow.so undefined symbol

2022-05-03 Thread Rares Vernica
Hi Dragos, It still fails after setting the environment variable. Here is the log. Cheers, Rares Setup: centos:7 Docker container, R and related packages installed with yum /> cat /etc/redhat-release CentOS Linux release 7.9.2009 (Core) /> export ARROW_R_DEV=true /> R R version 3.6.0 (2019-04-2

[R] Install arrow package: arrow.so undefined symbol

2022-04-29 Thread Rares Vernica
Hello, I'm trying to do install.packages("arrow") in R 3.6.0 on CentOS 7 and it errors out like this: $ cat /etc/redhat-release CentOS Linux release 7.9.2009 (Core) $ R R version 3.6.0 (2019-04-26) -- "Planting of a Tree" > install.packages("arrow") ... ** building package indices ** installing

Re: C++ Parquet thrift_ep No rule to make target install

2021-09-28 Thread Rares Vernica
cmake#L20 > > On Thu, Sep 23, 2021 at 6:18 PM Rares Vernica wrote: > > > > Hello, > > > > I managed to get Thrift 0.12.0 compiled and installed from source on my > > CentOS 7 setup. I configured it like so, mimicking what > > ThirdpartyToolchain

Re: C++ Boost GitHub URL in ThirdpartyToolchain.cmake

2021-09-27 Thread Rares Vernica
CentOS 7 On Mon, Sep 27, 2021 at 10:06 PM Benson Muite wrote: > Hi Rares, > What operating system are you using? > Benson > On 9/28/21 7:38 AM, Rares Vernica wrote: > > Hello, > > > > I'm still struggling to build Arrow with Parquet. I compiled Thrif

C++ Boost GitHub URL in ThirdpartyToolchain.cmake

2021-09-27 Thread Rares Vernica
Hello, I'm still struggling to build Arrow with Parquet. I compiled Thrift myself but I'm running into dependency issues with Boost. It looks like the Boost download URL provided in ThirdpartyToolchain.cmake here https://github.com/apache/arrow/blob/ef4e92982054fcc723729ab968296d799d3108dd/cpp/cm

Re: C++ Parquet thrift_ep No rule to make target install

2021-09-23 Thread Rares Vernica
a -rwxr-xr-x. 1 root root 939 Sep 23 23:09 libthriftz.la drwxr-xr-x. 2 root root 4096 Sep 23 23:09 pkgconfig How could I make Arrow's cmake pick them up instead of trying to get Thrift again? Thank you! Rares On Wed, Sep 22, 2021 at 5:33 PM Rares Vernica wrote: > Eduardo,

Re: C++ Parquet thrift_ep No rule to make target install

2021-09-22 Thread Rares Vernica
apache.org/jira/browse/THRIFT-2559 > [3] > https://centos.pkgs.org/7/epel-x86_64/thrift-0.9.1-15.el7.x86_64.rpm.html > [4] https://github.com/apache/arrow/pull/4558 > > On Mon, Sep 20, 2021 at 11:56 PM Rares Vernica wrote: > > > Hello,

C++ Parquet thrift_ep No rule to make target install

2021-09-20 Thread Rares Vernica
Hello, I'm compiling the C++ library for Arrow 3.0.0 on CentOS 7. It works fine, but it breaks if I set ARROW_PARQUET=ON. It stops while trying to build thrift_ep: > scl enable devtoolset-3 "cmake3 .. \ -DARROW_PARQUET=ON

C++ Determine Size of RecordBatch

2021-08-31 Thread Rares Vernica
Hello, I'm storing RecordBatch objects in a local cache to improve performance. I want to keep track of the memory usage to stay within bounds. The arrays stored in the batch are not nested. The best way I came up with to compute the size of a RecordBatch is: size_t arrowSize = 0;
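
For what it's worth, a minimal sketch of this approach, assuming flat (non-nested) columns: sum the sizes of the buffers referenced by each column's ArrayData. Because buffers can be shared or sliced, treat the result as an upper bound; the helper name is made up for illustration.

    #include <arrow/api.h>

    // Rough estimate of the memory held by a flat RecordBatch: sum the sizes
    // of the buffers referenced by each column's ArrayData. Shared buffers are
    // counted once per reference and sliced buffers are counted in full, so
    // treat the result as an upper bound. Nested children and dictionaries are
    // not visited here.
    int64_t EstimateRecordBatchSize(const arrow::RecordBatch& batch) {
      int64_t total = 0;
      for (int i = 0; i < batch.num_columns(); ++i) {
        const auto& data = batch.column_data(i);
        for (const auto& buffer : data->buffers) {
          if (buffer != nullptr) {
            total += buffer->size();
          }
        }
      }
      return total;
    }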

Re: [C++] Unable to getMutableValues from ArrayData

2021-08-02 Thread Rares Vernica
pos_buffers, delta_data->null_count); On Mon, Aug 2, 2021 at 5:45 PM Antoine Pitrou wrote: > On Fri, 30 Jul 2021 18:55:33 +0200 > Rares Vernica wrote: > > Hello, > > > > I have a RecordBatch that I read from an IPC file. I need to run a > > cumulative sum

[C++] Use RecordBatch::AddColumn to update RecordBatch

2021-08-02 Thread Rares Vernica
Hello, I'm using RecordBatch::AddColumn to update a RecordBatch. Something like this: std::shared_ptr<arrow::RecordBatch> rb; ... rb = rb->AddColumn(...) Since AddColumn creates a new RecordBatch, is the memory taken by rb before the assignment freed as expected? Thanks! Rares
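
For reference, a sketch of the ownership semantics in question, assuming a recent Arrow where RecordBatch::AddColumn returns arrow::Result; the AppendColumn helper below is made up for illustration. Reassigning rb destroys the previous RecordBatch object once nothing else holds it, while the buffers of the untouched columns stay alive because the new batch still shares them.

    #include <arrow/api.h>

    // Sketch only: assumes a recent Arrow where RecordBatch::AddColumn returns
    // arrow::Result<std::shared_ptr<RecordBatch>>. Reassigning `rb` destroys
    // the previous RecordBatch object (if nothing else holds it); the column
    // buffers it shared with the new batch stay alive through the new batch.
    arrow::Status AppendColumn(std::shared_ptr<arrow::RecordBatch>& rb,
                               const std::shared_ptr<arrow::Array>& extra) {
      ARROW_ASSIGN_OR_RAISE(
          rb, rb->AddColumn(rb->num_columns(),
                            arrow::field("extra", extra->type()), extra));
      return arrow::Status::OK();
    }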

[C++] Unable to getMutableValues from ArrayData

2021-07-30 Thread Rares Vernica
Hello, I have a RecordBatch that I read from an IPC file. I need to run a cumulative sum on one of the int64 arrays in the batch. I tried to do: std::shared_ptr<arrow::ArrayData> pos_data = batch->column_data(nAtts); auto pos_values = pos_data->GetMutableValues<int64_t>(1); for (auto i = 1; i < pos_data->length; i++) pos_v
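
A hedged sketch of the cumulative sum being attempted above, assuming buffer 1 of the int64 column is the values buffer. Note that buffers read from an IPC file may be immutable (for example when memory-mapped), which is worth checking before writing into them.

    #include <arrow/api.h>

    // Hedged sketch of the cumulative sum: buffer 1 of an int64 ArrayData is
    // the values buffer. Data read from an IPC file may sit in an immutable
    // (e.g. memory-mapped) buffer, so check mutability before writing.
    arrow::Status CumulativeSumInPlace(const std::shared_ptr<arrow::RecordBatch>& batch,
                                       int column_index) {
      std::shared_ptr<arrow::ArrayData> data = batch->column_data(column_index);
      if (data->buffers.size() < 2 || data->buffers[1] == nullptr ||
          !data->buffers[1]->is_mutable()) {
        return arrow::Status::Invalid("int64 values buffer is missing or immutable");
      }
      int64_t* values = data->GetMutableValues<int64_t>(1);
      for (int64_t i = 1; i < data->length; ++i) {
        values[i] += values[i - 1];
      }
      return arrow::Status::OK();
    }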

Re: Compute Functions: Modulo

2021-07-28 Thread Rares Vernica
PS a is an integer array and b is an integer scalar. On Wed, Jul 28, 2021 at 1:04 PM Rares Vernica wrote: > Hello, > > I'm making use of the Compute Functions to do some basic arithmetic. One > operation I need to perform is the modulo, i.e., a % b. I'm debating > b

Compute Functions: Modulo

2021-07-28 Thread Rares Vernica
Hello, I'm making use of the Compute Functions to do some basic arithmetic. One operation I need to perform is the modulo, i.e., a % b. I'm debating between two options: 1. Compute it using the available Compute Functions using a % b = a - a / b * b, where / is the integer division. I assume that
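
A sketch of option 1, composed from the existing "divide", "multiply" and "subtract" kernels (assuming integer "divide" truncates toward zero like C++ integer division); the Modulo helper is illustrative, not an Arrow API.

    #include <arrow/api.h>
    #include <arrow/compute/api.h>

    // Sketch of option 1: a % b == a - (a / b) * b, composed from the existing
    // "divide", "multiply" and "subtract" kernels. With integer inputs this
    // identity matches the C++ % operator for truncated division.
    arrow::Result<arrow::Datum> Modulo(const arrow::Datum& a, const arrow::Datum& b) {
      namespace cp = arrow::compute;
      ARROW_ASSIGN_OR_RAISE(arrow::Datum quotient, cp::CallFunction("divide", {a, b}));
      ARROW_ASSIGN_OR_RAISE(arrow::Datum product, cp::CallFunction("multiply", {quotient, b}));
      return cp::CallFunction("subtract", {a, product});
    }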

Re: C++ warning: missing initializer for member

2021-07-27 Thread Rares Vernica
this could be fixed by adding default values: > > struct DayMilliseconds { > int32_t days = 0; > int32_t milliseconds = 0; > ... > }; > > In the meantime, you would have to suppress the warning in the > compiler where it's happening > > On Tue, Jul 27, 2

C++ warning: missing initializer for member

2021-07-27 Thread Rares Vernica
Hello, I'm getting a handful of warnings when including arrow/builder.h. Is this expected? Should I use the suggested -W flag? In file included from /opt/apache-arrow/include/arrow/array/builder_dict.h:29:0, from /opt/apache-arrow/include/arrow/builder.h:26, /opt/apache-arrow/inc

C++ Datum::move returns ArrayData not Array

2021-07-27 Thread Rares Vernica
Hi, I'm trying the example in the Compute Functions user guide https://arrow.apache.org/docs/cpp/compute.html#invoking-functions std::shared_ptr<arrow::Array> numbers_array = ...; std::shared_ptr<arrow::Scalar> increment = ...; arrow::Datum incremented_datum; ARROW_ASSIGN_OR_RAISE(incremented_datum, arrow
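
For reference, a sketch of getting an arrow::Array back out of the Datum returned by such a call (the Datum itself holds ArrayData): Datum::make_array() re-boxes it. The AddIncrement helper is made up; the variable names follow the docs example quoted above.

    #include <arrow/api.h>
    #include <arrow/compute/api.h>

    // After a compute call the Datum holds ArrayData; Datum::make_array()
    // re-boxes it as an arrow::Array. Names follow the compute docs example.
    arrow::Result<std::shared_ptr<arrow::Array>> AddIncrement(
        const std::shared_ptr<arrow::Array>& numbers_array,
        const std::shared_ptr<arrow::Scalar>& increment) {
      ARROW_ASSIGN_OR_RAISE(
          arrow::Datum incremented_datum,
          arrow::compute::CallFunction("add", {numbers_array, increment}));
      return incremented_datum.make_array();
    }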

Re: Apache Arrow Cookbook

2021-07-07 Thread Rares Vernica
Awesome! We would find C++ versions of these recipes very useful. In our experience, the C++ API is much, much harder to deal with and more error-prone than the R/Python one. Cheers, Rares On Wed, Jul 7, 2021 at 9:07 AM Alessandro Molina < alessan...@ursacomputing.com> wrote: > Yes, that was mostly w

Re: Xenial 3.0.0 packages | Bintray

2021-07-06 Thread Rares Vernica
> sudo sed -i'' -e 's,bintray.com,jfrog.io/artifactory,' > /etc/apt/sources.list.d/apache-arrow.sources > sudo apt install -y -V libarrow-dev > > > Thanks, > -- > kou > > In > "Re: Xenial 3.0.0 packages | Bintray" on Tue, 6 Jul 2

Re: Xenial 3.0.0 packages | Bintray

2021-07-06 Thread Rares Vernica
. We don't provide newer packages for Xenial. > > > Thanks, > -- > kou > > In > "Xenial 3.0.0 packages | Bintray" on Tue, 6 Jul 2021 13:11:33 -0700, > Rares Vernica wrote: > > > Hello, > > > > I realize that newer packages are on jfrog.

Xenial 3.0.0 packages | Bintray

2021-07-06 Thread Rares Vernica
Hello, I realize that newer packages are on jfrog.io. Until last week, I was still able to use bintray.com for Xenial packages of 3.0.0. Today https://apache.bintray.com/arrow/ returns forbidden. Is this temporary? If not, are these Xenial packages available somewhere else? Thank you! Rares

Re: [C++] LZ4 Compression formats

2021-06-22 Thread Rares Vernica
are trying to use the buffer level compression described > by the specification? If so only LZ4_FRAME is currently allowed [1] > > [1] https://github.com/apache/arrow/blob/master/format/Message.fbs#L45 > > > On Tue, Jun 22, 2021 at 12:28 PM Rares Vernica wrote: > > > Hell

[C++] LZ4 Compression formats

2021-06-22 Thread Rares Vernica
Hello, Using Arrow 3.0.0 I tried to compress a stream with LZ4 and got this error message: NotImplemented: Streaming compression unsupported with LZ4 raw format. Try using LZ4 frame format instead. Is it because LZ4 raw was not enabled when the .so was compiled or actually not implemented? Is L
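
A sketch of the LZ4 frame route the error message suggests, using the generic Codec and CompressedOutputStream APIs; the helper and the payload are illustrative only.

    #include <arrow/api.h>
    #include <arrow/io/api.h>
    #include <arrow/util/compression.h>

    // Sketch: streaming compression only supports the LZ4 *frame* format, so
    // create the codec as LZ4_FRAME and wrap the raw output stream with it.
    // The codec is not owned by the stream, so keep it alive while writing.
    arrow::Status WriteLz4FrameCompressed(const std::string& path,
                                          const std::string& payload) {
      ARROW_ASSIGN_OR_RAISE(auto codec,
                            arrow::util::Codec::Create(arrow::Compression::LZ4_FRAME));
      ARROW_ASSIGN_OR_RAISE(auto raw, arrow::io::FileOutputStream::Open(path));
      ARROW_ASSIGN_OR_RAISE(auto compressed,
                            arrow::io::CompressedOutputStream::Make(codec.get(), raw));
      ARROW_RETURN_NOT_OK(compressed->Write(payload.data(), payload.size()));
      return compressed->Close();
    }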

Re: C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only

2021-06-11 Thread Rares Vernica
default compiler for that CentOS version. > > Regards > > Antoine. > > > > If there were the package > > maintainer bandwidth, having both devtoolset-gcc and system-gcc > > pre-built RPMs would be potentially interesting (but there are so many > > devtoolsets, whic

Re: C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only

2021-06-10 Thread Rares Vernica
> Thanks, > -- > kou > > In > "Re: C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only" > on Wed, 9 Jun 2021 21:39:04 -0700, > Rares Vernica wrote: > > > I got the apache-arrow-4.0.1 source and compiled it with the Debug flag. > No >

Re: C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only

2021-06-09 Thread Rares Vernica
source location on segmentation fault. > > Thanks, > -- > kou > > In > "C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only" on > Tue, 8 Jun 2021 12:01:27 -0700, > Rares Vernica wrote: > > > Hello, > > > > We recently migrated our C+

C++ Segmentation Fault RecordBatchReader::ReadNext in CentOS only

2021-06-08 Thread Rares Vernica
Hello, We recently migrated our C++ Arrow code from 0.16 to 3.0.0. The code works fine on Ubuntu, but we get a segmentation fault in CentOS while reading Arrow Record Batch files. We can successfully read the files from Python or Ubuntu so the files and the writer are fine. We use Record Batch St

Re: C++ Migrate from Arrow 0.16.0

2021-06-02 Thread Rares Vernica
THROW_NOT_OK((result_name).status()); \ > > > lhs = std::move(result_name).ValueUnsafe(); > > > #define ASSIGN_OR_THROW(lhs, rexpr) \ > > > ASSIGN_OR_THROW_IMPL(_maybe ## __COUNTER__, lhs, rexpr) > > > > > > Then lines such as >

C++ Migrate from Arrow 0.16.0

2021-05-27 Thread Rares Vernica
Hello, We are trying to migrate from Arrow 0.16.0 to a newer version, hopefully up to 4.0.0. The Arrow 0.17.0 change in AllocateBuffer from taking a shared_ptr to returning a unique_ptr is making things very difficult. We wonder if there is a strong reason behind the change from shared_ptr to uniq
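
For reference, a sketch of the post-0.17 pattern: AllocateBuffer returns arrow::Result<std::unique_ptr<arrow::Buffer>>, and where shared ownership is still needed the unique_ptr can simply be moved into a shared_ptr. The helper name is made up for illustration.

    #include <arrow/api.h>

    // Post-0.17 pattern (sketch): AllocateBuffer returns
    // arrow::Result<std::unique_ptr<arrow::Buffer>>. Where the old code relied
    // on shared ownership, the unique_ptr can be moved into a shared_ptr.
    arrow::Result<std::shared_ptr<arrow::Buffer>> MakeSharedBuffer(int64_t size) {
      ARROW_ASSIGN_OR_RAISE(std::unique_ptr<arrow::Buffer> buffer,
                            arrow::AllocateBuffer(size));
      return std::shared_ptr<arrow::Buffer>(std::move(buffer));
    }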

Re: C++ RecordBatch Debugging Segmentation Fault

2021-05-20 Thread Rares Vernica
create the > RecordBatch (from one thread) which will force the boxed columns to > materialize. > > -Weston > > On Thu, May 20, 2021 at 11:40 AM Wes McKinney wrote: > > > > Also, is it possible that the field is not an Int64Array? > > > > On Wed, May 19, 202

C++ Compression in RecordBatchStreamWriter

2021-05-20 Thread Rares Vernica
Hello, Just a clarifying question: when a CompressedOutputStream is used with RecordBatchStreamWriter, are the composing Arrow arrays compressed independently, or is the entire output file compressed at once? For example, if we use GZIP, is the resulting file a valid GZIP file that we can uncompres
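
A sketch of the whole-stream case asked about here, assuming GZIP wrapping of an IPC stream: the writer's entire output passes through one CompressedOutputStream, so the result should be a single GZIP stream (decompressible with plain gunzip) rather than independently compressed arrays. Buffer-level compression inside the IPC format is a separate mechanism (IpcWriteOptions::codec, limited to LZ4_FRAME/ZSTD).

    #include <arrow/api.h>
    #include <arrow/io/api.h>
    #include <arrow/ipc/api.h>
    #include <arrow/util/compression.h>

    // Sketch of the whole-stream case: everything the IPC writer emits passes
    // through one GZIP CompressedOutputStream, so the output is a single
    // compressed stream rather than independently compressed arrays.
    arrow::Status WriteGzippedIpc(const std::shared_ptr<arrow::RecordBatch>& batch,
                                  const std::string& path) {
      ARROW_ASSIGN_OR_RAISE(auto codec,
                            arrow::util::Codec::Create(arrow::Compression::GZIP));
      ARROW_ASSIGN_OR_RAISE(auto raw, arrow::io::FileOutputStream::Open(path));
      ARROW_ASSIGN_OR_RAISE(auto gz,
                            arrow::io::CompressedOutputStream::Make(codec.get(), raw));
      ARROW_ASSIGN_OR_RAISE(auto writer,
                            arrow::ipc::MakeStreamWriter(gz, batch->schema()));
      ARROW_RETURN_NOT_OK(writer->WriteRecordBatch(*batch));
      ARROW_RETURN_NOT_OK(writer->Close());
      return gz->Close();
    }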

Re: C++ RecordBatch Debugging Segmentation Fault

2021-05-19 Thread Rares Vernica
Is there a better (safer) way of accessing a specific Int64 cell in a RecordBatch? Currently I'm doing something like this: std::static_pointer_cast<arrow::Int64Array>(batch->column(i))->raw_values()[j] On Wed, May 19, 2021 at 3:09 PM Rares Vernica wrote: > > /opt/rh/devtoolset-3/root/usr/b
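
A slightly safer sketch (the GetInt64Cell helper is illustrative, not an Arrow API): cast once, check the type, bounds and nulls, and read through Value() rather than indexing raw_values() directly.

    #include <optional>

    #include <arrow/api.h>

    // Slightly safer sketch: cast once, check the type, the bounds and the
    // validity bitmap, then read through Value() instead of indexing
    // raw_values() directly.
    std::optional<int64_t> GetInt64Cell(const arrow::RecordBatch& batch,
                                        int i, int64_t j) {
      std::shared_ptr<arrow::Array> column = batch.column(i);
      if (column->type_id() != arrow::Type::INT64 || j < 0 || j >= column->length()) {
        return std::nullopt;
      }
      auto typed = std::static_pointer_cast<arrow::Int64Array>(column);
      if (typed->IsNull(j)) {
        return std::nullopt;
      }
      return typed->Value(j);
    }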

Re: C++ RecordBatch Debugging Segmentation Fault

2021-05-19 Thread Rares Vernica
vial caching which > uses std::atomic_load[1] which is not implemented properly on gcc < 5 > so our behavior is different depending on the compiler version. > > [1] https://en.cppreference.com/w/cpp/atomic/atomic_load > > On Wed, May 19, 2021 at 10:15 AM Rares Vernica wrote: >

C++ RecordBatch Debugging Segmentation Fault

2021-05-19 Thread Rares Vernica
Hello, I'm using Arrow for accessing data outside the SciDB database engine. It generally works fine but we are running into Segmentation Faults in a multi-threaded corner case. I identified two threads that work on the same Record Batch. I wonder if there is something internal about RecordBatch t
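
A sketch of the workaround that comes up later in this thread: touch every column once from a single thread before the batch is shared, so the lazily boxed column Arrays are materialized up front.

    #include <arrow/api.h>

    // Sketch of the workaround discussed later in this thread: RecordBatch
    // boxes column(i) lazily and caches the result, so touching every column
    // from a single thread before the batch is shared forces the boxed Arrays
    // to materialize up front.
    void MaterializeColumns(const std::shared_ptr<arrow::RecordBatch>& batch) {
      for (int i = 0; i < batch->num_columns(); ++i) {
        std::shared_ptr<arrow::Array> column = batch->column(i);  // populate the cache
        (void)column;
      }
    }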

Re: Python: Bad address when rewriting file

2020-12-14 Thread Rares Vernica
", line 97, in pyarrow.lib.check_status OSError: [Errno 14] Error writing bytes to file. Detail: [errno 14] Bad address Cheers, Rares On Mon, Dec 14, 2020 at 12:30 AM Antoine Pitrou wrote: > > Hello Rares, > > Is there a complete reproducer that we may try out? > > Re

Python: Bad address when rewriting file

2020-12-13 Thread Rares Vernica
Hello, As part of a test, I'm reading a record batch from an Arrow file, re-batching the data in smaller batches, and writing the result back to the same file. I'm getting an unexpected Bad address error and I wonder what I am doing wrong? reader = pyarrow.open_stream(fn) tbl = reader.read_all()

Re: C++: Cache RecordBatch

2020-11-17 Thread Rares Vernica
Hi Antoine, On Tue, Nov 17, 2020 at 2:34 AM Antoine Pitrou wrote: > > On 17/11/2020 03:34, Rares Vernica wrote: > > > > I'm using an arrow::io::BufferReader and > > arrow::ipc::RecordBatchStreamReader to read an arrow::RecordBatch from a > > file. There i

C++: Cache RecordBatch

2020-11-16 Thread Rares Vernica
Hello, I'm using an arrow::io::BufferReader and arrow::ipc::RecordBatchStreamReader to read an arrow::RecordBatch from a file. There is only one batch in the file so I do a single RecordBatchStreamReader::ReadNext call. I store the populated RecordBatch in memory for reuse (cache). The memory buffe

Sort int tuples across Arrow arrays in C++

2020-09-03 Thread Rares Vernica
Hello, I have a set of integer tuples that need to be collected and sorted at a coordinator. Here is an example with tuples of length 2: [(1, 10), (1, 15), (2, 10), (2, 15)] I am considering storing each column in an Arrow array, e.g., [1, 1, 2, 2] and [10, 15, 10, 15], and have the Arrow arr
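
One plain-C++ sketch of the coordinator-side sort, assuming two equal-length Int64 columns with no nulls: sort a vector of row indices with a comparator over both columns, then use the indices (for example with the "take" compute function) to materialize the sorted arrays.

    #include <algorithm>
    #include <numeric>
    #include <vector>

    #include <arrow/api.h>

    // Sketch: lexicographic sort of (a, b) tuples stored as two Int64 columns,
    // assuming equal lengths and no nulls. Sort a vector of row indices with a
    // comparator over both columns; the indices can then drive a "take" call
    // (or a manual copy) to materialize the sorted arrays.
    std::vector<int64_t> SortedTupleIndices(const arrow::Int64Array& a,
                                            const arrow::Int64Array& b) {
      std::vector<int64_t> indices(a.length());
      std::iota(indices.begin(), indices.end(), int64_t{0});
      std::sort(indices.begin(), indices.end(), [&](int64_t lhs, int64_t rhs) {
        if (a.Value(lhs) != a.Value(rhs)) return a.Value(lhs) < a.Value(rhs);
        return b.Value(lhs) < b.Value(rhs);
      });
      return indices;
    }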

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-16 Thread Rares Vernica
the fact that '1' is one byte > in Py2.7 and 'foo' is 3 bytes). Try passing an open file handle > instead > > On Tue, Jun 16, 2020 at 11:28 AM Rares Vernica wrote: > > > > Thank you for your help in getting to the bottom of this. It seems that >

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-16 Thread Rares Vernica
Thanks! Rares On Mon, Jun 15, 2020 at 10:55 PM Micah Kornfield wrote: > Hi Rares, > This last issue sounds like you are trying to write data from 0.16.0 > version of the library and read it from a pre-0.15.0 version of the python > library. If you want to do this you need to set "bool > wri

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-15 Thread Rares Vernica
4, in pyarrow.lib.check_status pyarrow.lib.ArrowInvalid: Expected to read 1886221359 metadata bytes, but only read 4 On Mon, Jun 15, 2020 at 10:08 PM Wes McKinney wrote: > On Mon, Jun 15, 2020 at 11:24 PM Rares Vernica wrote: > > > > I was able to reproduce my issue in a small,

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-15 Thread Rares Vernica
ional Section: libdevel Installed-Size: 38738 Maintainer: Apache Arrow Developers Architecture: amd64 Multi-Arch: same Source: apache-arrow Version: 0.17.1-1 Depends: libarrow17 (= 0.17.1-1) > g++ --version g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 Does this make sense? Cheers, Rares

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-15 Thread Rares Vernica
you should set a breakpoint in this function > and see if for some reason started_ is true on the first invocation > (in which case it makes me wonder if there is something > not-fully-C++11-compliant about your toolchain). > > Otherwise I'm a bit stumped since there are lots of

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-15 Thread Rares Vernica
owStream->Close()); On Mon, Jun 15, 2020 at 6:26 AM Wes McKinney wrote: > Can you show the code you are writing? The first thing the stream writer > does before writing any record batch is write the schema. It sounds like > you are using arrow::ipc::WriteRecordBatch somewhere. >

C++ Write Schema with RecordBatchStreamWriter

2020-06-14 Thread Rares Vernica
Hello, I have a RecordBatch that I would like to write to a file. I'm using FileOutputStream::Open to open the file and RecordBatchStreamWriter::Open to open the stream. I write a record batch with WriteRecordBatch. Finally, I close the RecordBatchWriter and OutputStream. The resulting file size
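
For reference, a sketch of the same write path with the current API (the thread itself used the 0.17-era RecordBatchStreamWriter::Open): the stream writer emits the schema first, then each batch, then an end-of-stream marker on Close().

    #include <arrow/api.h>
    #include <arrow/io/api.h>
    #include <arrow/ipc/api.h>

    // Sketch with the current API. The stream writer emits the schema first,
    // then each batch, then an end-of-stream marker on Close().
    arrow::Status WriteBatchToFile(const std::shared_ptr<arrow::RecordBatch>& batch,
                                   const std::string& path) {
      ARROW_ASSIGN_OR_RAISE(auto sink, arrow::io::FileOutputStream::Open(path));
      ARROW_ASSIGN_OR_RAISE(auto writer,
                            arrow::ipc::MakeStreamWriter(sink, batch->schema()));
      ARROW_RETURN_NOT_OK(writer->WriteRecordBatch(*batch));
      ARROW_RETURN_NOT_OK(writer->Close());
      return sink->Close();
    }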

Re: C++ IPC Array length did not match record batch length (5)

2020-06-14 Thread Rares Vernica
hen you run > ARROW_RETURN_NOT_OK(arrowBatch->Validate())? > > On Sun, Jun 14, 2020 at 2:09 PM Rares Vernica wrote: > > > > Hello, > > > > I'm porting a C++ program from Arrow 0.9.0 to 0.16.0. The *sender* uses > > BufferOutputStream and RecordBatchWri

C++ IPC Array length did not match record batch length (5)

2020-06-14 Thread Rares Vernica
Hello, I'm porting a C++ program from Arrow 0.9.0 to 0.16.0. The *sender* uses BufferOutputStream and RecordBatchWriter to serialize a set of Arrow arrays. The *receiver* uses BufferReader and RecordBatchReader to deserialize them. I get the runtime error *Array length did not match record batch l

Re: Read Arrow 0.9.0 output using newer pyarrow version

2019-05-13 Thread Rares Vernica
://issues.apache.org/jira/browse/ARROW-921 > some time ago about adding some tools to integration test one version > versus another to obtain hard proof of this, but this work has not > been completed yet (any takers?). > > Have you encountered any problems? > > Thanks,

Read Arrow 0.9.0 output using newer pyarrow version

2019-03-10 Thread Rares Vernica
Hello, I have a C++ library using Arrow 0.9.0 to serialize data. The code looks like this: std::shared_ptr<arrow::RecordBatch> arrowBatch; arrowBatch = arrow::RecordBatch::Make(_arrowSchema, nCells, _arrowArrays); std::shared_ptr arrowBuffer(new arrow::PoolBuffer(_arrowPool)); arrow::io::BufferOutputStream arrowStre

Re: C++ bindings and Regex issue

2018-12-11 Thread Rares Vernica
gs and Regex issue" on Tue, 11 Dec 2018 22:53:58 -0800, > Rares Vernica wrote: > > > Hi, > > > > Unfortunately we need to stay on CentOS 6 for now. > > > > We have a locally built libboost-devel-1.54 for CentOS 6 which installs > in > > a custom loca

Re: C++ bindings and Regex issue

2018-12-11 Thread Rares Vernica
on CentOS 6. Because system Boost > is old. It's better that you upgrade to CentOS 7. > > Thanks, > -- > kou > > In > "Re: C++ bindings and Regex issue" on Tue, 11 Dec 2018 22:07:20 -0800, > Rares Vernica wrote: > > Wes, > > >

Re: C++ bindings and Regex issue

2018-12-11 Thread Rares Vernica
n each part of your application, and in the Arrow > + Parquet libraries > > 0.9.0 is over 1000 patches ago. I'd recommend that you try to upgrade > > $ git hist apache-arrow-0.9.0..master | wc -l > 1540 > > - Wes > On Tue, Dec 11, 2018 at 10:58 PM Rares Vernica wrote: >

C++ bindings and Regex issue

2018-12-11 Thread Rares Vernica
Hello, We are using the C++ bindings of Arrow 0.9.0 on our system on CentOS. Once we load the Arrow library, our regular regex calls (outside of Arrow) misbehave and trigger some unknown crashes. We are still trying to figure things out but I was wondering if there are any known issues regarding re

Re: Install pyarrow 0.9.0.post1

2018-08-17 Thread Rares Vernica
on Linux? > > On Fri, Aug 17, 2018 at 11:47 PM, Rares Vernica > wrote: > > Hello, > > > > I see the latest 0.9.0 version of pyarrow is 0.9.0.post1 > > https://pypi.org/project/pyarrow/0.9.0.post1/#files but I can't convince > > pip to install it. Do

Install pyarrow 0.9.0.post1

2018-08-17 Thread Rares Vernica
Hello, I see the latest 0.9.0 version of pyarrow is 0.9.0.post1 https://pypi.org/project/pyarrow/0.9.0.post1/#files but I can't convince pip to install it. Do you have any clue on what might be going wrong? > pip --version pip 9.0.3 from /usr/lib/python2.7/site-packages (python 2.7) > pip install

RecordBatch with different-length Arrays

2018-08-02 Thread Rares Vernica
Hi, The docs suggest that a RecordBatch is a collection of equal-length array instances. It appears that this is not enforced and one could build a RecordBatch from arrays of different length. Is this intentional? Here is an example: >>> b = pyarrow.RecordBatch.from_arrays( [pyarrow.array([1,

Re: C++ RecordBatchWriter/ReadRecordBatch clarification

2018-04-22 Thread Rares Vernica
ple. > > See https://gist.github.com/alendit/c6cdd1adaf7007786392731152d3b6b9 > > Cheers, > Dimitri. > > On Tue, Apr 17, 2018 at 3:52 AM, Rares Vernica wrote: > > > Hi, > > > > I'm writing a batch of records to a stream and I want to read them > la

C++ RecordBatchWriter/ReadRecordBatch clarification

2018-04-16 Thread Rares Vernica
Hi, I'm writing a batch of records to a stream and I want to read them later. I notice that if I use the RecordBatchStreamWriter class to write them and then ReadRecordBatch function to read them, I get a Segmentation Fault. On the other hand, if I use the RecordBatchFileWriter class to write the
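
A sketch of the read side that pairs with the stream writer, using the current API and an illustrative helper: RecordBatchStreamReader consumes the schema message itself, whereas ReadRecordBatch expects a bare batch message plus a schema obtained elsewhere, which is an easy way to end up with a crash like the one described.

    #include <arrow/api.h>
    #include <arrow/io/api.h>
    #include <arrow/ipc/api.h>

    // Sketch of the read side that pairs with the stream writer (current API):
    // RecordBatchStreamReader consumes the schema message itself, whereas
    // ReadRecordBatch expects a bare batch message plus a schema obtained
    // elsewhere, so mixing the two is an easy way to crash.
    arrow::Result<std::shared_ptr<arrow::RecordBatch>> ReadFirstBatch(
        const std::string& path) {
      ARROW_ASSIGN_OR_RAISE(auto file, arrow::io::ReadableFile::Open(path));
      ARROW_ASSIGN_OR_RAISE(auto reader,
                            arrow::ipc::RecordBatchStreamReader::Open(file));
      std::shared_ptr<arrow::RecordBatch> batch;
      ARROW_RETURN_NOT_OK(reader->ReadNext(&batch));
      return batch;
    }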

Re: Upstream segmentation fault when using 0.9.0, OK with 0.8.0

2018-04-03 Thread Rares Vernica
ne. We actually should build > the packages with the system one: https://issues.apache.org/ > jira/browse/ARROW-2383 > > On Tue, Apr 3, 2018, at 7:06 AM, Rares Vernica wrote: > > Hi Wes, > > > > That did it! Thanks so much for the pointer. > > > > BTW, enjoy you

Re: Upstream segmentation fault when using 0.9.0, OK with 0.8.0

2018-04-02 Thread Rares Vernica
that's > part of -DARROW_ORC=on > > Wes > > On Sun, Apr 1, 2018 at 9:48 PM, Rares Vernica wrote: > > Hello, > > > > I'm using libarrow.so to build a simplified SciDB "plugin". The plugin > gets > > loaded dynamically into a running SciDB i

Upstream segmentation fault when using 0.9.0, OK with 0.8.0

2018-04-01 Thread Rares Vernica
Hello, I'm using libarrow.so to build a simplified SciDB "plugin". The plugin gets loaded dynamically into a running SciDB instance. The plugin just does arrow::default_memory_pool(). With Arrow 0.8.0, the plugin gets loaded successfully and can be used in SciDB. With Arrow 0.9.0, SciDB crashes wh

Re: C++ optimize stream output

2018-03-26 Thread Rares Vernica
ill update the issue if I have any news. > On Mon, Feb 26, 2018 at 10:22 AM, Rares Vernica wrote: > >> - On the coordinator side, do I really need to read and write a record >> batch? Could I copy the buffer directly somehow? > > No, you don't need to necessarily. The i

[jira] [Created] (ARROW-2351) [C++] StringBuilder::append(vector...) not implemented

2018-03-24 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-2351: Summary: [C++] StringBuilder::append(vector...) not implemented Key: ARROW-2351 URL: https://issues.apache.org/jira/browse/ARROW-2351 Project: Apache Arrow

C++ optimize stream output

2018-02-26 Thread Rares Vernica
Hello, I am using the C++ API to serialize and centralize data over the network. I am wondering if I am using the API in an efficient way. I have multiple nodes and a coordinator communicating over the network. I do not have fine control over the network communication. Individual nodes write one

[jira] [Created] (ARROW-2203) [C++] StderrStream class

2018-02-22 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-2203: Summary: [C++] StderrStream class Key: ARROW-2203 URL: https://issues.apache.org/jira/browse/ARROW-2203 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-2189) [C++] Seg. fault on make_shared

2018-02-19 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-2189: Summary: [C++] Seg. fault on make_shared Key: ARROW-2189 URL: https://issues.apache.org/jira/browse/ARROW-2189 Project: Apache Arrow Issue Type: Bug

C++ OutputStream for both StdoutStream and FileOutputStream

2018-02-19 Thread Rares Vernica
Hi, This might be more a C++ question, but I'm trying to have one variable store the output stream for both StdoutStream and FileOutputStream. I do this: shared_ptr f; if (fn == "stdout") f.reset(new StdoutStream()); else FileOutputStream::Open(fn, false, &f); As is, the code does not wo
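
A sketch of the same idea with the current Result-based API (the snippet above uses the older out-parameter form of FileOutputStream::Open): both streams derive from io::OutputStream, so either can be returned through a base-class shared_ptr.

    #include <arrow/api.h>
    #include <arrow/io/api.h>
    #include <arrow/io/stdio.h>

    // Sketch with the current Result-based API: both StdoutStream and
    // FileOutputStream derive from io::OutputStream, so either can be
    // returned through a base-class shared_ptr.
    arrow::Result<std::shared_ptr<arrow::io::OutputStream>> OpenSink(
        const std::string& fn) {
      if (fn == "stdout") {
        return std::make_shared<arrow::io::StdoutStream>();
      }
      ARROW_ASSIGN_OR_RAISE(auto file, arrow::io::FileOutputStream::Open(fn));
      return file;
    }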

[jira] [Created] (ARROW-2179) [C++] arrow/util/io-util.h missing from libarrow-dev

2018-02-19 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-2179: Summary: [C++] arrow/util/io-util.h missing from libarrow-dev Key: ARROW-2179 URL: https://issues.apache.org/jira/browse/ARROW-2179 Project: Apache Arrow

Merge multiple record batches

2018-02-13 Thread Rares Vernica
Hi, If I have multiple RecordBatchStreamReader inputs, what is the recommended way to get all the RecordBatches from all the inputs together, maybe in a Table? They all have the same schema. The sources for the readers are different files. So, I do something like: reader1 = pa.open_stream('foo') ta

[jira] [Created] (ARROW-1801) [Docs] Update install instructions to use red-data-tools repos

2017-11-11 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1801: Summary: [Docs] Update install instructions to use red-data-tools repos Key: ARROW-1801 URL: https://issues.apache.org/jira/browse/ARROW-1801 Project: Apache Arrow

[jira] [Created] (ARROW-1676) [C++] Feather inserts 0 in the beginning and trims one value at the end

2017-10-15 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1676: Summary: [C++] Feather inserts 0 in the beginning and trims one value at the end Key: ARROW-1676 URL: https://issues.apache.org/jira/browse/ARROW-1676 Project

[C++] raw_values() for BinaryArray

2017-09-17 Thread Rares Vernica
Hi, I have a question about the Array C++ API. BinaryArray has a raw_value_offsets() public member. Should it also have a raw_values() public member to give a pointer to the start of the raw data? Or is this not feasible? Thanks, Rares

Chunked arrays in Feather files

2017-09-15 Thread Rares Vernica
Hi, I have a question about chunks in Feather files. A TableReader can be used to read a Column. For a column, the data is in a ChunkedArray. For a Feather file, what is the chunk size? Can the chunk size be modified? Thanks! Rares

[jira] [Created] (ARROW-1545) Int64Builder should not need int64() as arg

2017-09-15 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1545: Summary: Int64Builder should not need int64() as arg Key: ARROW-1545 URL: https://issues.apache.org/jira/browse/ARROW-1545 Project: Apache Arrow Issue Type

Reuse Buffer or BufferOutputStream

2017-09-10 Thread Rares Vernica
Hi, During the life of the program, can/should the Buffer or BufferOutputStream be reused? If the data in them is no longer needed, can they be reset? Or should I not worry about this as they go out of scope and just create new instances? What is the intended pattern? Thanks! Rares

pipe Feather bytes between processes

2017-09-10 Thread Rares Vernica
Hi, I am having trouble piping Feather structures between two processes. On the receiving-process side, I get: pyarrow.lib.ArrowIOError: [Errno 29] Illegal seek I have process A and process B which communicate via pipes. Process A sends the bytes of a Feather structure to process B. Process A co

[jira] [Created] (ARROW-1520) [Docs] PyArrow docs missing Feather documentation

2017-09-10 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1520: Summary: [Docs] PyArrow docs missing Feather documentation Key: ARROW-1520 URL: https://issues.apache.org/jira/browse/ARROW-1520 Project: Apache Arrow Issue

[jira] [Created] (ARROW-1512) [Docs] NumericArray has no member named 'raw_data'

2017-09-09 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1512: Summary: [Docs] NumericArray has no member named 'raw_data' Key: ARROW-1512 URL: https://issues.apache.org/jira/browse/ARROW-1512 Project: Ap

[jira] [Created] (ARROW-1378) whl is not a supported wheel on this platform on Debian/Jessie

2017-08-19 Thread Rares Vernica (JIRA)
Rares Vernica created ARROW-1378: Summary: whl is not a supported wheel on this platform on Debian/Jessie Key: ARROW-1378 URL: https://issues.apache.org/jira/browse/ARROW-1378 Project: Apache Arrow