Re: Getting issues in cpp build

2023-02-15 Thread Shaheer Ahmad
[1/249] Building CXX object src/arrow/CMakeFiles/arrow_shared.dir/array/array_binary.cc.obj FAILED: src/arrow/CMakeFiles/arrow_shared.dir/array/array_binary.cc.obj C:\MinGW\bin\c++.exe -DARROW_EXPORTING -DARROW_EXTRA_ERROR_CONTEXT -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUN

Re: Getting issues in cpp build

2023-02-15 Thread Bryce Mecum
Hi Shaheer, welcome! I think the mailing list may have had an issue with your attachment. If the output is short, could you reply with it here? If it's more than 10-20 lines, you might put it in a Gist [1] or similar type of pastebin and reply with a link. [1] https://gist.github.com/

Getting issues in cpp build

2023-02-15 Thread Shaheer Ahmad
I am a beginner in open-source contributions and I am following the guide to build Arrow’s source code. Following the steps, I installed the requirements with vcpkg, installed cmake and ninja-debug to build using a preset, and finally when I ran the command “cmake --build .” within the “arrow/cpp” dire

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Aldrin
I think you can replace the schema metadata using [1]. You can perhaps also do the same for the field metadata, depending on where timezone metadata may be on a timestamp array [2]. [1]: https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.replace_schema_metadata [2]: ht

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Oh thanks that could be a workaround! I thought pa tables are supposed to be immutable, is there a safe way to just change the metadata? On Wed, Feb 15, 2023 at 5:44 PM Rok Mihevc wrote: > Well that's suboptimal. As a workaround I suppose you could just change the > metadata if the array is tim

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Rok Mihevc
Well that's suboptimal. As a workaround I suppose you could just change the metadata if the array is timezone aware. On Wed, Feb 15, 2023 at 10:37 PM Li Jin wrote: > Oh found this comment: > > https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_cast_temporal.cc#L156

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Oh found this comment: https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_cast_temporal.cc#L156 On Wed, Feb 15, 2023 at 4:23 PM Li Jin wrote: > Not sure if this is actually a bug or expected behavior - I filed > https://github.com/apache/arrow/issues/34210 > > On

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Not sure if this is actually a bug or expected behavior - I filed https://github.com/apache/arrow/issues/34210 On Wed, Feb 15, 2023 at 4:15 PM Li Jin wrote: > Hmm..something feels off here - I did the following experiment on Arrow 11 > and casting timestamp-naive to int64 is much faster than cas

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Hmm..something feels off here - I did the following experiment on Arrow 11 and casting timestamp-naive to int64 is much faster than casting timestamp-naive to timestamp-utc: In [16]: %time table.cast(schema_int) CPU times: user 114 µs, sys: 30 µs, total: 144 µs *Wall time: 231 µs* Out[16]: pyarrow

Re: [RESULT] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-15 Thread David Li
Thanks Kou, it worked! Post-release tasks: [x] Close the GitHub milestone/project [x] Add the new release to the Apache Reporter System [x] Upload source release artifacts to Subversion [x] Create the final GitHub release [x] Update website [x] Upload wheels/sdist to PyPI [x] Publish Maven package

Re: [RESULT] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-15 Thread Sutou Kouhei
Hi, > I am not an owner for the RubyGems package - would you mind adding me or > helping me do this? I've sent an invitation to you. If it doesn't work, I can update the RubyGems package. Thanks, -- kou In "[RESULT] Release Apache Arrow ADBC 0.2.0 - RC1" on Wed, 15 Feb 2023 13:52:57 -0500

Re: [VOTE] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-15 Thread Sutou Kouhei
> not finding /usr/bin/mkdir Could you show the log of this? In "Re: [VOTE] Release Apache Arrow ADBC 0.2.0 - RC1" on Wed, 15 Feb 2023 13:46:06 +0100, Joris Van den Bossche wrote: > +1 (binding) > > I ran the verification on Ubuntu 20.04 using conda: > > $ USE_CONDA=1 ARROW_TMPDIR=/tmp/

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Rok Mihevc
I'm not sure about (1) but I'm pretty sure for (2) doing a cast of tz-aware timestamp to tz-naive should be a metadata-only change. On Wed, Feb 15, 2023 at 4:19 PM Li Jin wrote: > Asking (2) because IIUC this is a metadata operation that could be zero > copy but I am not sure if this is actually

[RESULT] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-15 Thread David Li
The release is verified with 3 binding +1 votes, 3 non-binding +1 votes. I will handle post-release tasks: [x] Close the GitHub milestone/project [x] Add the new release to the Apache Reporter System [x] Upload source release artifacts to Subversion [x] Create the final GitHub release [x] Update

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Asking (2) because IIUC this is a metadata operation that could be zero copy but I am not sure if this is actually the case. On Wed, Feb 15, 2023 at 10:17 AM Li Jin wrote: > Hello! > > I have some questions about type casting memory usage with pyarrow Table. > Let's say I have a pyarrow Table wi

Re: [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements

2023-02-15 Thread David Li
The ADBC and Flight SQL proposals have been updated for Micah/Taeyun/Will's comments. On Wed, Feb 15, 2023, at 09:17, David Li wrote: > Hi Taeyun, > > Thanks for the detailed feedback! > > - I will clarify that PollFlightInfo should return as quickly as > possible on the first call, and that upd

Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Li Jin
Hello! I have some questions about type casting memory usage with pyarrow Table. Let's say I have a pyarrow Table with 100 columns. (1) If I want to cast n columns to a different type (e.g., float to int), what is the smallest memory overhead that I can achieve? (memory overhead of 1 column, n columns

Arrow community meeting February 15 at 17:00 UTC

2023-02-15 Thread Ian Cook
Hi all, Our biweekly Arrow community meeting is today at 17:00 UTC / 12:00 EST. Zoom meeting URL: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Meeting ID: 876 4903 3008 Passcode: 958092 The notes for this and future instances of this meeting will be captured in this Google

Re: [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements

2023-02-15 Thread David Li
Hi Taeyun, Thanks for the detailed feedback! - I will clarify that PollFlightInfo should return as quickly as possible on the first call, and that updates in progress value are also OK (though the server shouldn't spam updates). (I wanted to avoid streaming calls as it does not work as well wi

Re: [VOTE] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-15 Thread Joris Van den Bossche
+1 (binding) I ran the verification on Ubuntu 20.04 using conda: $ USE_CONDA=1 ARROW_TMPDIR=/tmp/adbc-verification ./dev/release/verify-release-candidate.sh 0.2.0 1 ... Release candidate looks good! I only had a problem with installing some ruby dependencies (for GLIB tests), not finding /usr/bi

[DataFusion] Discuss impact of formatting changes on upgrade

2023-02-15 Thread Andrew Lamb
Hi, The most recent version of arrow-rs standardizes the display of many types so they are consistent across pretty print, JSON and CSV output. However, this means that the pretty-print output changes for certain types, and this may cause a nontrivial burden when upgrading downstream. For example: