Can someone remove me from this thread, please. Thanks
Sent from my iPhone

> On Aug 18, 2016, at 8:54 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Thanks Julien for organizing the meeting and taking notes. I wrote up some
> initial thoughts on shared memory IPC on
> https://issues.apache.org/jira/browse/ARROW-263
>
> I'll try to flesh out a more concrete spec today/tomorrow.
>
> -Micah
>
>> On Thu, Aug 18, 2016 at 10:25 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>
>> My notes: (I'll schedule another one in 2 weeks but people should feel free
>> to do ad-hoc discussion in the meantime)
>>
>> Attendees and their topic of interest for today:
>> - Micah Kornfield: Dictionary encoding, Reusing dictionaries across record
>>   batches, Shared memory, memory management, releasing memory shared across
>>   processes
>> - Wes McKinney: Finalize types (Category, ...), File format, RPC format,
>>   IPC
>> - Julien Le Dem: finalize metadata (RPC, IPC, File), File format
>>   implementation, UDF use case
>> - Erol: Shared memory across Java and C++ to share large amounts of data
>>
>> Arrow IPC:
>> - Shared memory:
>>   - current version doesn’t do Schema negotiation yet.
>>   - all unit tests reading/writing out memory with a predefined schema
>>     and known base address.
>>   - no dictionary encoding yet.
>>   - issues to discuss:
>>     - communicating the base memory address:
>>       - possibly use RPC for coordination.
>>     - options for shared memory
>>       - forking a process: anonymous shared memory implicitly
>>       - starting a new process. Need to spawn alternate shared memory that
>>         needs to be cleaned up
>>       - direct memory mapped system call (communicate file name to
>>         subprocess).
>>   - Action (Micah): create a JIRA to sum this up
>>
>> - Memory management:
>>   - the process producing the data will allocate the memory and pass it
>>     read only. It needs to wait for the consumer to be done to release it.
>>   - one option is memory mapped file (persistent independent of the
>>     process)
>>   - each process responsible for its memory.
>>     Reader needs to release memory.
>>   - mechanism for handling too much memory allocation.
>>   - In the case of record batches over RPC this is not an issue (memory is
>>     copied over).
>>
>> - RPC transport
>>   definition of the protocol and how we send messages.
>> - File transport
>>
>> - Dictionary encoding:
>>   - start simple: simple buffer<int> layout
>>   - enable extension in the future (v2: bit packing?)
>>
>> - Category type:
>>   - Semantic difference with Dictionary encoded.
>>   - TODO(Julien): Add Category type in Parquet?
>>
>>
>>> On Thu, Aug 18, 2016 at 9:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>>
>>> Hi Nicole.
>>> Can you try again?
>>> I was accepting you but it did not seem to work.
>>> Julien
>>>
>>> On Thu, Aug 18, 2016 at 9:26 AM, Nicole Nemer <nicole.ne...@rms.com>
>>> wrote:
>>>
>>>> I am trying to join and it is not letting me in.
>>>> nn
>>>> --
>>>> Nicole Nemer, PhD
>>>> Technical Architect/Dev Manager
>>>>
>>>> 303-641-3340
>>>>
>>>>> On 8/18/16, 10:00 AM, "Julien Le Dem" <jul...@dremio.com> wrote:
>>>>>
>>>>> And this is starting now.
>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>>
>>>>>> On Wed, Aug 17, 2016 at 7:07 PM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>
>>>>>> Here is the hangout link for tomorrow:
>>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>>>
>>>>>> I have also added to a google calendar event everyone who replied to
>>>>>> that thread.
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 17, 2016 at 6:12 PM, Wes McKinney <wesmck...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> hi folks,
>>>>>>>
>>>>>>> Reminder that the Arrow sync is tomorrow morning at 09:00 Pacific
>>>>>>> (http://timesched.pocoo.org/?date=2016-08-18&tz=pacific-standard-time!&range=540,600).
>>>>>>> I believe Julien will send a public Google hangout link to the mailing
>>>>>>> list for you all to join.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Wes
>>>>>>>
>>>>>>> On Tue, Aug 16, 2016 at 11:07 AM, Wes McKinney <wesmck...@gmail.com>
>>>>>>> wrote:
>>>>>>>> +1. If there is demand for an Asia-friendly time we can change things
>>>>>>>> up from week to week.
>>>>>>>>
>>>>>>>>> On Aug 16, 2016, at 10:52 AM, Jacques Nadeau <jacq...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> sounds good
>>>>>>>>>
>>>>>>>>>> On Tue, Aug 16, 2016 at 10:39 AM, Julien Le Dem <jul...@dremio.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Based on the feedback I'm proposing Thursday Aug 18 at 4PM UTC as the
>>>>>>>>>> first Arrow sync.
>>>>>>>>>> That's:
>>>>>>>>>> - 9AM PDT (San Francisco)
>>>>>>>>>> - 12PM EDT (New York)
>>>>>>>>>> - 5PM BST (London)
>>>>>>>>>> - 6PM CEST (Paris, Berlin)
>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 9, 2016 at 6:45 AM, Uwe L. Korn <uw...@xhochy.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 for bi-weekly and European-friendly times: CET (GMT+1)
>>>>>>>>>>>
>>>>>>>>>>>> On Aug 9, 2016, at 00:39, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Also to all who are responding let me know your timezone as well.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 8, 2016 at 3:30 PM, Micah Kornfield <emkornfi...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds good to me as well. Biweekly would be preferred.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Monday, August 8, 2016, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> hi Julien -- this sounds like a good idea, also +1 for bi-weekly. I
>>>>>>>>>>>>>> will do my best to join when possible. So far we've mostly been
>>>>>>>>>>>>>> communicating via pull request, so I think periodic syncs will be
>>>>>>>>>>>>>> helpful.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Wes
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Aug 8, 2016 at 2:45 PM, P.
>>>>>>>>>>>>>> Taylor Goetz <ptgo...@gmail.com> wrote:
>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My preference would be for bi-weekly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Taylor
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 8, 2016, at 5:25 PM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>> My experience with Parquet is that a regular sync up over hangout helps
>>>>>>>>>>>>>>>> keep in touch and stay updated about what everyone is doing.
>>>>>>>>>>>>>>>> I was thinking of scheduling it weekly or bi-weekly.
>>>>>>>>>>>>>>>> Who would join?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The way it goes is first we do a round table where people introduce
>>>>>>>>>>>>>>>> themselves and list the topics they'd like to talk or hear about.
>>>>>>>>>>>>>>>> That makes the agenda and we go through it.
>>>>>>>>>>>>>>>> At the end we send notes to the mailing list with discussions and action
>>>>>>>>>>>>>>>> items (for example: open JIRA, comment on JIRA, review PR, etc).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Julien
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Julien
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Julien
>>>>>>
>>>>>> --
>>>>>> Julien
>>>>>
>>>>> --
>>>>> Julien
>>>
>>> --
>>> Julien
>>
>> --
>> Julien