Below is a summary of the notes from this week's meeting.

Attendees:
- Ian Cook
- Raúl Cumplido
- Dewey Dunnington
- Ian Joiner
- Will Jones
- David Li
- Bryce Mecum
- Rok Mihevc
- Sri Nadukudy
- Weston Pace
- Dane Pitkin

Discussion:

Fixed Shape Tensor canonical ExtensionType proposal
- It seems like we have converged to the final state at this point, but we are waiting for a few conversations to conclude
- Alenka called a vote [1], but this sparked some additional feedback, so she plans to give it a few more days and then open a new vote next week

PR automation workflow
- Proposal discussed on mailing list [2] has been implemented
- There are a few hiccups that Raúl is working out
- Feedback welcome

Self-hosted arm64 runners [3]
- Raúl has been working with ASF Infra and has set up a GitHub integration to add self-hosted runners at the organization level, which allows us to use them from multiple Arrow repos in the apache organization on GitHub
- This will allow us to retire some Travis CI jobs, but Travis CI will continue to be used for some Crossbow jobs, e.g. for s390x (big-endian)

Initial nanoarrow release candidate [4]
- We are looking for people to verify the RC

Default Parquet row group size change [5]
- This is specific to the Arrow C++ implementation and its bindings
- Before this change, the default row group size was 64 million rows; this was based on a misunderstanding and is much too large
- Weston has changed the default to 1 million rows
- There was some discussion about whether this should be something smaller, e.g. 100K rows, but overall there were no objections
- This change caused a regression in write performance, which Weston is investigating [6]
- Is it possible to set the row group size based on bytes instead of rows? Not yet, but there was a recent change that should enable this [7]

[1] https://lists.apache.org/thread/3cj0cr44hg3t2rn0kxly8td82yfob1nd
[2] https://lists.apache.org/thread/1rhsd8ovy4bfr8hcdohn0vh65frw0ggk
[3] https://lists.apache.org/thread/mskpqwpdq65t1wpj4f5klfq9217ljodw
[4] https://lists.apache.org/thread/slomdw52n9j7jq8zwl5v8cb4v8yfk9sj
[5] https://github.com/apache/arrow/pull/34281
[6] https://github.com/apache/arrow/issues/34374
[7] https://github.com/apache/arrow/pull/33897

On Tue, Feb 28, 2023 at 10:44 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> Hi all,
>
> Our biweekly Arrow community meeting is tomorrow at 17:00 UTC / 12:00 EST.
>
> Zoom meeting URL:
> https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> Meeting ID: 876 4903 3008
> Passcode: 958092
>
> The notes for this and future instances of this meeting will be
> captured in this Google Doc:
> https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/
> If you plan to attend the meeting tomorrow, you are welcome to edit the
> document to add the topics that you would like to discuss.
>
> Thanks,
> Ian