William, you are subscribed to dev@arrow.apache.org. To unsubscribe, email dev-unsubscr...@arrow.apache.org
On Thu, Aug 18, 2016 at 7:57 PM, William Wood <willwo...@yahoo.com.invalid> wrote:
> Can someone remove me from this thread, please.
>
> Thanks
>
> Sent from my iPhone
>
> > On Aug 18, 2016, at 8:54 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > Thanks Julien for organizing the meeting and taking notes. I wrote up some
> > initial thoughts on shared memory IPC on
> > https://issues.apache.org/jira/browse/ARROW-263
> >
> > I'll try to flesh out a more concrete spec today/tomorrow.
> >
> > -Micah
> >
> >> On Thu, Aug 18, 2016 at 10:25 AM, Julien Le Dem <jul...@dremio.com> wrote:
> >>
> >> My notes: (I'll schedule another one in 2 weeks but people should feel free
> >> to do ad-hoc discussion in the meantime)
> >>
> >> Attendees and their topic of interest for today:
> >> - Micah Kornfield: Dictionary encoding, Reusing dictionaries across record
> >>   batches, Shared memory, memory management, releasing memory shared across
> >>   processes
> >> - Wes McKinney: Finalize types (Category, ...), File format, RPC format, IPC
> >> - Julien Le Dem: finalize metadata (RPC, IPC, File), File format
> >>   implementation, UDF use case
> >> - Erol: Shared memory across Java and C++ to share large amounts of data
> >>
> >> Arrow IPC:
> >> - Shared memory:
> >>   - current version doesn't do Schema negotiation yet.
> >>   - all unit tests read and write memory with a predefined schema and a
> >>     known base address.
> >>   - no dictionary encoding yet.
> >>   - issues to discuss:
> >>     - communicating the base memory address:
> >>       - possibly use RPC for coordination.
> >>     - options for shared memory
> >>       - forking a process: anonymous shared memory implicitly
> >>       - starting a new process. Need to spawn alternate shared memory that
> >>         needs to be cleaned up
> >>       - direct memory mapped system call (communicate file name to
> >>         subprocess).
> >>   - Action (Micah): create a JIRA to sum this up
> >>
> >> - Memory management:
> >>   - the process producing the data will allocate the memory and pass it
> >>     read only. It needs to wait for the consumer to be done to release it.
> >>   - one option is memory mapped file (persistent independent of the process)
> >>   - each process responsible for its memory. Reader needs to release memory.
> >>   - mechanism for handling too much memory allocation.
> >>   - In the case of record batches over RPC this is not an issue (memory is
> >>     copied over).
> >>
> >> - RPC transport
> >>   definition of the protocol and how we send messages.
> >> - File transport
> >>
> >> - Dictionary encoding:
> >>   - start simple: simple buffer<int> layout
> >>   - enable extension in the future (v2: bit packing?)
> >>
> >> - Category type:
> >>   - Semantic difference with Dictionary encoded.
> >>   - TODO(Julien): Add Category type in Parquet?
> >>
> >>> On Thu, Aug 18, 2016 at 9:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
> >>>
> >>> Hi Nicole.
> >>> Can you try again?
> >>> I was accepting you but it did not seem to work.
> >>> Julien
> >>>
> >>> On Thu, Aug 18, 2016 at 9:26 AM, Nicole Nemer <nicole.ne...@rms.com> wrote:
> >>>
> >>>> I am trying to join and it is not letting me in...
> >>>> nn
> >>>> --
> >>>> Nicole Nemer, PhD
> >>>> Technical Architect/Dev Manager
> >>>>
> >>>> 303-641-3340
> >>>>
> >>>>> On 8/18/16, 10:00 AM, "Julien Le Dem" <jul...@dremio.com> wrote:
> >>>>>
> >>>>> And this is starting now.
> >>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
> >>>>>
> >>>>>> On Wed, Aug 17, 2016 at 7:07 PM, Julien Le Dem <jul...@dremio.com> wrote:
> >>>>>
> >>>>>> Here is the hangout link for tomorrow:
> >>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
> >>>>>>
> >>>>>> I have also added to a google calendar event everyone who replied to that
> >>>>>> thread.
> >>>>>>
> >>>>>> On Wed, Aug 17, 2016 at 6:12 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> >>>>>>
> >>>>>>> hi folks,
> >>>>>>>
> >>>>>>> Reminder that the Arrow sync is tomorrow morning at 09:00 Pacific
> >>>>>>> (http://timesched.pocoo.org/?date=2016-08-18&tz=pacific-standard-time!&range=540,600).
> >>>>>>> I believe Julien will send a public Google hangout link to the mailing
> >>>>>>> list for you all to join.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Wes
> >>>>>>>
> >>>>>>> On Tue, Aug 16, 2016 at 11:07 AM, Wes McKinney <wesmck...@gmail.com> wrote:
> >>>>>>>> +1. If there is demand for an Asia-friendly time we can change things
> >>>>>>>> up from week to week.
> >>>>>>>>
> >>>>>>>>> On Aug 16, 2016, at 10:52 AM, Jacques Nadeau <jacq...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>> sounds good
> >>>>>>>>>
> >>>>>>>>>> On Tue, Aug 16, 2016 at 10:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Based on the feedback I'm proposing Thursday Aug 18 at 4PM UTC as the
> >>>>>>>>>> first Arrow sync.
> >>>>>>>>>> That's:
> >>>>>>>>>> - 9AM PDT (San Francisco)
> >>>>>>>>>> - 12PM EDT (New York)
> >>>>>>>>>> - 5PM BST (London)
> >>>>>>>>>> - 6PM CEST (Paris, Berlin)
> >>>>>>>>>>
> >>>>>>>>>>> On Tue, Aug 9, 2016 at 6:45 AM, Uwe L. Korn <uw...@xhochy.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> +1 for bi-weekly and European-friendly times: CET (GMT+1)
> >>>>>>>>>>>
> >>>>>>>>>>>> On 09.08.2016 at 00:39, Julien Le Dem <jul...@dremio.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Also to all who are responding let me know your timezone as well.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Aug 8, 2016 at 3:30 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Sounds good to me as well. Biweekly would be preferred.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Monday, August 8, 2016, Wes McKinney <wesmck...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> hi Julien -- this sounds like a good idea, also +1 for bi-weekly. I
> >>>>>>>>>>>>>> will do my best to join when possible. So far we've mostly been
> >>>>>>>>>>>>>> communicating via pull request, so I think periodic syncs will be
> >>>>>>>>>>>>>> helpful.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> - Wes
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Aug 8, 2016 at 2:45 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> >>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> My preference would be for bi-weekly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Taylor
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Aug 8, 2016, at 5:25 PM, Julien Le Dem <jul...@dremio.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>> My experience with Parquet is that a regular sync up over hangout
> >>>>>>>>>>>>>>>> helps keeping in touch and staying updated about what everyone is
> >>>>>>>>>>>>>>>> doing.
> >>>>>>>>>>>>>>>> I was thinking of scheduling it weekly or bi-weekly.
> >>>>>>>>>>>>>>>> Who would join?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The way it goes is first we do a round table where people introduce
> >>>>>>>>>>>>>>>> themselves and list the topics they'd like to talk or hear about.
> >>>>>>>>>>>>>>>> That makes the agenda and we go through it.
> >>>>>>>>>>>>>>>> At the end we send notes to the mailing list with discussions and
> >>>>>>>>>>>>>>>> action items (for example: open JIRA, comment on JIRA, review PR,
> >>>>>>>>>>>>>>>> etc).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>> Julien
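
A note on the "Dictionary encoding: start simple: simple buffer<int> layout" item in the meeting notes quoted above. The sketch below is only an illustration of the general idea, not the Arrow format or any Arrow API: each distinct value is stored once in a dictionary, and the column itself becomes a plain buffer of integer indices into that dictionary. The function names `dictionary_encode` and `dictionary_decode` are hypothetical, and using Python's `array("i")` as the stand-in for `buffer<int>` is an assumption made for the example.

```python
from array import array


def dictionary_encode(values):
    """Encode a list of hashable values as (dictionary, indices).

    Hypothetical helper for illustration only; not an Arrow API.
    """
    dictionary = []        # distinct values, in order of first appearance
    positions = {}         # value -> its index in `dictionary`
    indices = array("i")   # stand-in for the "simple buffer<int>" of the notes
    for value in values:
        index = positions.get(value)
        if index is None:
            # First time we see this value: append it to the dictionary.
            index = len(dictionary)
            positions[value] = index
            dictionary.append(value)
        indices.append(index)
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values by looking each index up in the dictionary."""
    return [dictionary[i] for i in indices]


if __name__ == "__main__":
    column = ["a", "b", "a", "c", "b"]
    dictionary, indices = dictionary_encode(column)
    print(dictionary, list(indices))
    print(dictionary_decode(dictionary, indices) == column)
```

Keeping the indices as full-width integers is the "start simple" choice; the "v2: bit packing?" item in the notes would presumably shrink that index buffer without changing the dictionary itself.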