Below is a summary of the notes from this week's meeting.

Attendees:

 - Ian Cook
 - Raúl Cumplido
 - Dewey Dunnington
 - Ian Joiner
 - Will Jones
 - David Li
 - Bryce Mecum
 - Rok Mihevc
 - Sri Nadukudy
 - Weston Pace
 - Dane Pitkin


Discussion:

Fixed Shape Tensor canonical ExtensionType proposal
 - It seems like we have converged to the final state at this point,
but we are waiting for a few conversations to conclude
 - Alenka called a vote [1], but this sparked some additional feedback,
so she plans to give it a few more days and then open a new vote next week


PR automation workflow
 - Proposal discussed on mailing list [2] has been implemented
 - There are a few hiccups that Raúl is working out
 - Feedback welcome


Self-hosted arm64 runners [3]
 - Raúl has been working with ASF Infra and has set up a GitHub
integration to add self-hosted runners at the organization level,
which allows us to use them from multiple Arrow repos in the apache
organization on GitHub
 - This will allow us to retire some Travis CI jobs, but Travis CI
will continue to be used for some Crossbow jobs, e.g. for s390x
(big-endian)


Initial nanoarrow release candidate [4]
 - We are looking for people to verify the RC


Default Parquet row group size change [5]
 - This is specific to the Arrow C++ implementation and its bindings
 - Before this change, the default row group size was 64 million rows;
this was based on a misunderstanding and is much too large
 - Weston has changed the default to 1 million rows
 - There was some discussion about whether this should be something
smaller e.g. 100K rows, but overall there were no objections
 - This change caused a regression in write performance, which Weston
is investigating [6]
 - Is it possible to set the row group size based on bytes instead of
rows? Not yet, but a recent change [7] should enable this
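
To make the sizes above concrete, here is a rough sketch of how many row
groups a file ends up with under each default. It is plain arithmetic, not
the Arrow C++ code: the helper name `row_group_count` and the 10 million row
table are made up for illustration, and the defaults are the round figures
quoted in these notes.

```python
def row_group_count(num_rows: int, row_group_size: int) -> int:
    """Row groups needed to hold num_rows at no more than row_group_size rows each."""
    if row_group_size <= 0:
        raise ValueError("row_group_size must be positive")
    # Integer ceiling division; the last row group may be partially filled.
    return (num_rows + row_group_size - 1) // row_group_size


# Hypothetical table size, chosen only for illustration.
NUM_ROWS = 10_000_000

# Round figures taken from the discussion above.
CANDIDATES = {
    "old default (64 million rows)": 64_000_000,
    "new default (1 million rows)": 1_000_000,
    "suggested alternative (100K rows)": 100_000,
}

for label, size in CANDIDATES.items():
    print(f"{label}: {row_group_count(NUM_ROWS, size)} row group(s)")
```

With the old default, a table of this size lands in a single row group. In
PyArrow, the per-write setting is the `row_group_size` argument to
`pyarrow.parquet.write_table`, so callers who want something other than the
default can pass it explicitly.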


[1] https://lists.apache.org/thread/3cj0cr44hg3t2rn0kxly8td82yfob1nd
[2] https://lists.apache.org/thread/1rhsd8ovy4bfr8hcdohn0vh65frw0ggk
[3] https://lists.apache.org/thread/mskpqwpdq65t1wpj4f5klfq9217ljodw
[4] https://lists.apache.org/thread/slomdw52n9j7jq8zwl5v8cb4v8yfk9sj
[5] https://github.com/apache/arrow/pull/34281
[6] https://github.com/apache/arrow/issues/34374
[7] https://github.com/apache/arrow/pull/33897



On Tue, Feb 28, 2023 at 10:44 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> Hi all,
>
> Our biweekly Arrow community meeting is tomorrow at 17:00 UTC / 12:00 EST.
>
> Zoom meeting URL:
> https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> Meeting ID: 876 4903 3008
> Passcode: 958092
>
> The notes for this and future instances of this meeting will be
> captured in this Google Doc:
> https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/
> If you plan to attend the meeting tomorrow, you are welcome to edit the
> document to add the topics that you would like to discuss.
>
> Thanks,
> Ian
