hi William — you will need to send an email to dev-unsubscr...@arrow.apache.org
On Fri, Aug 19, 2016 at 11:57 AM, William Wood <willwo...@yahoo.com.invalid> wrote:
> Can someone remove me from this thread, please.
>
> Thanks
>
> Sent from my iPhone
>
>> On Aug 18, 2016, at 8:54 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>> Thanks Julien for organizing the meeting and taking notes. I wrote up some
>> initial thoughts on shared memory IPC on
>> https://issues.apache.org/jira/browse/ARROW-263
>>
>> I'll try to flesh out a more concrete spec today/tomorrow.
>>
>> -Micah
>>
>>> On Thu, Aug 18, 2016 at 10:25 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>>
>>> My notes: (I'll schedule another one in 2 weeks but people should feel free
>>> to do ad-hoc discussion in the meantime)
>>>
>>> Attendees and their topics of interest for today:
>>> - Micah Kornfield: Dictionary encoding, reusing dictionaries across record
>>>   batches, shared memory, memory management, releasing memory shared across
>>>   processes
>>> - Wes McKinney: Finalize types (Category, ...), File format, RPC format,
>>>   IPC
>>> - Julien Le Dem: finalize metadata (RPC, IPC, File), File format
>>>   implementation, UDF use case
>>> - Erol: Shared memory across Java and C++ to share large amounts of data
>>>
>>> Arrow IPC:
>>> - Shared memory:
>>>   - current version doesn’t do schema negotiation yet.
>>>   - all unit tests are reading/writing out memory with a predefined schema
>>>     and known base address.
>>>   - no dictionary encoding yet.
>>>   - issues to discuss:
>>>     - communicating the base memory address:
>>>       - possibly use RPC for coordination.
>>>     - options for shared memory:
>>>       - forking a process: anonymous shared memory implicitly
>>>       - starting a new process. Need to spawn alternate shared memory that
>>>         needs to be cleaned up
>>>       - direct memory mapped system call (communicate file name to
>>>         subprocess).
>>>   - Action (Micah): create a JIRA to sum this up
>>>
>>> - Memory management:
>>>   - the process producing the data will allocate the memory and pass it
>>>     read only.
>>>     It needs to wait for the consumer to be done to release it.
>>>   - one option is memory mapped file (persistent independent of the
>>>     process)
>>>   - each process responsible for its memory. Reader needs to release
>>>     memory.
>>>   - mechanism for handling too much memory allocation.
>>>   - In the case of record batches over RPC this is not an issue (memory is
>>>     copied over).
>>>
>>> - RPC transport:
>>>   definition of the protocol and how we send messages.
>>> - File transport
>>>
>>> - Dictionary encoding:
>>>   - start simple: simple buffer<int> layout
>>>   - enable extension in the future (v2: bit packing?)
>>>
>>> - Category type:
>>>   - Semantic difference with Dictionary encoded.
>>>   - TODO(Julien): Add Category type in Parquet?
>>>
>>>> On Thu, Aug 18, 2016 at 9:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>
>>>> Hi Nicole.
>>>> Can you try again?
>>>> I was accepting you but it did not seem to work.
>>>> Julien
>>>>
>>>> On Thu, Aug 18, 2016 at 9:26 AM, Nicole Nemer <nicole.ne...@rms.com> wrote:
>>>>
>>>>> I am trying to join and it is not letting me in...
>>>>> nn
>>>>> --
>>>>> Nicole Nemer, PhD
>>>>> Technical Architect/Dev Manager
>>>>>
>>>>> 303-641-3340
>>>>>
>>>>>> On 8/18/16, 10:00 AM, "Julien Le Dem" <jul...@dremio.com> wrote:
>>>>>>
>>>>>> And this is starting now.
>>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>>>
>>>>>>> On Wed, Aug 17, 2016 at 7:07 PM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>
>>>>>>> Here is the hangout link for tomorrow:
>>>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>>>>
>>>>>>> I have also added to a Google Calendar event everyone who replied to that
>>>>>>> thread.
>>>>>>>
>>>>>>> On Wed, Aug 17, 2016 at 6:12 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>
>>>>>>>> hi folks,
>>>>>>>>
>>>>>>>> Reminder that the Arrow sync is tomorrow morning at 09:00 Pacific
>>>>>>>> (http://timesched.pocoo.org/?date=2016-08-18&tz=pacific-standard-time!&range=540,600).
>>>>>>>> I believe Julien will send a public Google hangout link to the mailing
>>>>>>>> list for you all to join.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Wes
>>>>>>>>
>>>>>>>> On Tue, Aug 16, 2016 at 11:07 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>> +1. If there is demand for an Asia-friendly time we can change things
>>>>>>>>> up from week to week.
>>>>>>>>>
>>>>>>>>>> On Aug 16, 2016, at 10:52 AM, Jacques Nadeau <jacq...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> sounds good
>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 16, 2016 at 10:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Based on the feedback I'm proposing Thursday Aug 18 at 4PM UTC as the
>>>>>>>>>>> first Arrow sync.
>>>>>>>>>>> That's:
>>>>>>>>>>> - 9AM PDT (San Francisco)
>>>>>>>>>>> - 12PM EDT (New York)
>>>>>>>>>>> - 5PM BST (London)
>>>>>>>>>>> - 6PM CEST (Paris, Berlin)
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 9, 2016 at 6:45 AM, Uwe L. Korn <uw...@xhochy.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1 for bi-weekly and European-friendly times: CET (GMT+1)
>>>>>>>>>>>>
>>>>>>>>>>>>> On 09.08.2016 at 00:39, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also to all who are responding let me know your timezone as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Aug 8, 2016 at 3:30 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds good to me as well. Biweekly would be preferred.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Monday, August 8, 2016, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hi Julien -- this sounds like a good idea, also +1 for bi-weekly. I
>>>>>>>>>>>>>>> will do my best to join when possible. So far we've mostly been
>>>>>>>>>>>>>>> communicating via pull request, so I think periodic syncs will be
>>>>>>>>>>>>>>> helpful.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Wes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Aug 8, 2016 at 2:45 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My preference would be for bi-weekly.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Taylor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Aug 8, 2016, at 5:25 PM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>> My experience with Parquet is that a regular sync up over hangout helps
>>>>>>>>>>>>>>>>> with keeping in touch and staying updated about what everyone is doing.
>>>>>>>>>>>>>>>>> I was thinking of scheduling it weekly or bi-weekly.
>>>>>>>>>>>>>>>>> Who would join?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The way it goes is first we do a round table where people introduce
>>>>>>>>>>>>>>>>> themselves and list the topics they'd like to talk or hear about.
>>>>>>>>>>>>>>>>> That makes the agenda and we go through it.
>>>>>>>>>>>>>>>>> At the end we send notes to the mailing list with discussions and action
>>>>>>>>>>>>>>>>> items (for example: open JIRA, comment on JIRA, review PR, etc).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Julien
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Julien
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Julien
>>>>>>>
>>>>>>> --
>>>>>>> Julien
>>>>>>
>>>>>> --
>>>>>> Julien
>>>>
>>>> --
>>>> Julien
>>>
>>> --
>>> Julien
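
For anyone reading the notes above, the "start simple: simple buffer<int> layout" item under Dictionary encoding could look roughly like the sketch below. This is only an illustration under my own assumptions, not the Arrow layout or any Arrow API: the function names dictionary_encode and dictionary_decode are hypothetical, and plain Python lists stand in for the dictionary and the integer index buffer.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def dictionary_encode(values: Sequence[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Split values into (dictionary, indices).

    dictionary holds each distinct value once, in first-seen order.
    indices holds one integer per input value, pointing into dictionary.
    Hypothetical helper for illustration only.
    """
    dictionary: List[Hashable] = []
    positions: Dict[Hashable, int] = {}  # value -> its slot in dictionary
    indices: List[int] = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary: Sequence[Hashable], indices: Sequence[int]) -> List[Hashable]:
    """Rebuild the original values by looking each index up in the dictionary."""
    return [dictionary[i] for i in indices]


if __name__ == "__main__":
    column = ["a", "b", "a", "c", "b"]
    dictionary, indices = dictionary_encode(column)
    print(dictionary)  # ['a', 'b', 'c']
    print(indices)     # [0, 1, 0, 2, 1]
    print(dictionary_decode(dictionary, indices) == column)  # True
```

Keeping the dictionary separate from the index buffer is what would allow the same dictionary to be reused across record batches, and a later version could bit-pack the indices without changing the dictionary itself.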