Can someone remove me from this thread, please?

Thanks

Sent from my iPhone

> On Aug 18, 2016, at 8:54 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
> 
> Thanks Julien for organizing the meeting and taking notes.  I wrote up some
> initial thoughts on shared memory IPC on
> https://issues.apache.org/jira/browse/ARROW-263
> 
> I'll try to flesh out a more concrete spec today/tomorrow.
> 
> -Micah
> 
>> On Thu, Aug 18, 2016 at 10:25 AM, Julien Le Dem <jul...@dremio.com> wrote:
>> 
>> My notes: (I'll schedule another one in 2 weeks but people should feel free
>> to do ad-hoc discussion in the meantime)
>> 
>> Attendees and their topic of interest for today:
>> - Micah Kornfield: Dictionary encoding, Reusing dictionaries across record
>> batches, Shared memory, memory management, releasing memory shared across
>> processes
>> - Wes McKinney: Finalize types (Category, ...), File format, RPC format,
>> IPC
>> - Julien Le Dem: finalize metadata (RPC, IPC, File), File format
>> implementation, UDF use case
>> - Erol: Shared memory across Java and C++ to share large amounts of data
>> 
>> Arrow IPC:
>>  - Shared memory:
>>     - current version doesn’t do Schema negotiation yet.
>>     - all unit tests read and write memory with a predefined schema
>> and a known base address.
>>     - no dictionary encoding yet.
>>  - issues to discuss:
>>    - communicating the base memory address:
>>       - possibly use RPC for coordination.
>>    - options for shared memory
>>      - forking a process: anonymous shared memory implicitly
>>      - starting a new process. Need to spawn alternate shared memory that
>> needs to be cleaned up
>>      - direct memory mapped system call (communicate file name to
>> subprocess).
>>  - Action (Micah) create a JIRA to sum this up
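
The "direct memory mapped system call (communicate file name to subprocess)" option above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Arrow's IPC implementation: the `produce` and `consume` helpers, the `.shm` suffix, and the raw little-endian int32 payload are all hypothetical, and both sides run in one process here for brevity. In practice the file name and value count would be passed to the other process, for example over RPC as suggested in the notes.

```python
import mmap
import os
import struct
import tempfile


def produce(values):
    """Write int32 values into a file-backed mapping and return the file name."""
    payload = struct.pack(f"<{len(values)}i", *values)
    fd, path = tempfile.mkstemp(suffix=".shm")  # hypothetical naming scheme
    try:
        os.ftruncate(fd, len(payload))  # size the file before mapping it
        with mmap.mmap(fd, len(payload)) as region:
            region[:] = payload
            region.flush()
    finally:
        os.close(fd)
    return path


def consume(path, count):
    """Map the named file read-only and decode `count` int32 values."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return list(struct.unpack_from(f"<{count}i", region, 0))
```

Because the file outlives both mappings, someone still has to delete it (`os.remove(path)`) once the consumer is done, which is the cleanup concern raised in the notes.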
>> 
>> - Memory management:
>>  - the process producing the data will allocate the memory and pass it
>> read only. It needs to wait for the consumer to be done to release it.
>>     - one option is memory mapped file (persistent independent of the
>> process)
>>     - each process responsible for its memory. Reader needs to release
>> memory.
>>  - mechanism for handling too much memory allocation.
>>  - In the case of record batches over RPC this is not an issue (memory is
>> copied over).
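
One way to picture the rule that the producer "needs to wait for the consumer to be done to release" memory is a reader count on the shared region. The sketch below is a hypothetical, single-process illustration (the `SharedRegion` class and its method names are invented for this example); coordinating the count across real processes would need a separate mechanism that the notes leave open.

```python
import threading


class SharedRegion:
    """Tracks readers of a region and frees it once nobody needs it."""

    def __init__(self, on_free):
        self._lock = threading.Lock()
        self._readers = 0
        self._producer_done = False
        self._freed = False
        self._on_free = on_free  # callback that actually releases the memory

    def acquire(self):
        """A consumer starts reading the region."""
        with self._lock:
            if self._freed:
                raise RuntimeError("region already released")
            self._readers += 1

    def release(self):
        """A consumer is done reading the region."""
        with self._lock:
            self._readers -= 1
            self._maybe_free()

    def producer_done(self):
        """The producer no longer needs the region itself."""
        with self._lock:
            self._producer_done = True
            self._maybe_free()

    def _maybe_free(self):
        # Free only when the producer is finished and no reader remains.
        if self._producer_done and self._readers == 0 and not self._freed:
            self._freed = True
            self._on_free()
```

With this shape, the producer can call `producer_done()` as soon as it has handed the data off; the memory is only released after the last reader calls `release()`.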
>> 
>>  - RPC transport
>>     definition of the protocol and how we send messages.
>>  - File transport
>> 
>> - Dictionary encoding:
>>    - start simple: simple buffer<int> layout
>>    - enable extension in the future (v2: bit packing?)
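
As a rough illustration of the "simple buffer<int> layout" idea above, a dictionary-encoded column can be thought of as a list of distinct values plus a flat integer buffer of indices into it. This is a minimal sketch, not the layout Arrow settled on; the function names and the use of Python's `array("i")` as the index buffer are assumptions made for the example.

```python
from array import array


def dictionary_encode(values):
    """Return (dictionary, indices): distinct values plus an int index buffer."""
    dictionary = []        # distinct values, in order of first appearance
    positions = {}         # value -> its index in `dictionary`
    indices = array("i")   # flat buffer of signed ints, one per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values from the dictionary and index buffer."""
    return [dictionary[i] for i in indices]
```

A later extension such as bit packing would only change how `indices` is stored, which is why starting with a plain integer buffer keeps the first version simple.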
>> 
>> - Category type:
>>   - Semantic difference with Dictionary encoded.
>>   - TODO(Julien): Add Category type in Parquet?
>> 
>> 
>>> On Thu, Aug 18, 2016 at 9:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>> 
>>> Hi Nicole.
>>> Can you try again?
>>> I was accepting you but it did not seem to work.
>>> Julien
>>> 
>>> On Thu, Aug 18, 2016 at 9:26 AM, Nicole Nemer <nicole.ne...@rms.com>
>>> wrote:
>>> 
>>>> I am trying to join and it is not letting me in...
>>>> nn
>>>> --
>>>> Nicole Nemer, PhD
>>>> Technical Architect/Dev Manager
>>>> 
>>>> 303-641-3340
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 8/18/16, 10:00 AM, "Julien Le Dem" <jul...@dremio.com> wrote:
>>>>> 
>>>>> And this is starting now.
>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>> 
>>>>>> On Wed, Aug 17, 2016 at 7:07 PM, Julien Le Dem <jul...@dremio.com>
>>>>> wrote:
>>>>> 
>>>>>> Here is the hangout link for tomorrow:
>>>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>>>> 
>>>>>> I have also added to a Google Calendar event everyone who replied to
>>>>>> that thread.
>>>>>> 
>>>>>> 
>>>>>> On Wed, Aug 17, 2016 at 6:12 PM, Wes McKinney <wesmck...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> hi folks,
>>>>>>> 
>>>>>>> Reminder that the Arrow sync is tomorrow morning at 09:00 Pacific
>>>>>>> (http://timesched.pocoo.org/?date=2016-08-18&tz=pacific-standard-time!&range=540,600).
>>>>>>> I believe Julien will send a public Google hangout link to the mailing
>>>>>>> list for you all to join.
>>>>>>> 
>>>>>>> Thanks
>>>>>>> Wes
>>>>>>> 
>>>>>>> On Tue, Aug 16, 2016 at 11:07 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>> +1. If there is demand for an Asia-friendly time we can change things
>>>>>>>> up from week to week.
>>>>>>>> 
>>>>>>>>> On Aug 16, 2016, at 10:52 AM, Jacques Nadeau <jacq...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>> sounds good
>>>>>>>>> 
>>>>>>>>>> On Tue, Aug 16, 2016 at 10:39 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Based on the feedback I'm proposing Thursday Aug 18 at 4PM UTC as the first
>>>>>>>>>> Arrow sync.
>>>>>>>>>> That's:
>>>>>>>>>> - 9AM PDT (San Francisco)
>>>>>>>>>> - 12PM EDT (New York)
>>>>>>>>>> - 5PM BST (London)
>>>>>>>>>> - 6PM CEST (Paris, Berlin)
>>>>>>>>>> 
>>>>>>>>>>> On Tue, Aug 9, 2016 at 6:45 AM, Uwe L. Korn <uw...@xhochy.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> +1 for bi-weekly and European-friendly times: CET (GMT+1)
>>>>>>>>>>> 
>>>>>>>>>>>> On 09.08.2016 at 00:39, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Also to all who are responding let me know your timezone as well.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Aug 8, 2016 at 3:30 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Sounds good to me as well.  Biweekly would be preferred.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Monday, August 8, 2016, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> hi Julien -- this sounds like a good idea, also +1 for bi-weekly. I
>>>>>>>>>>>>>> will do my best to join when possible. So far we've mostly been
>>>>>>>>>>>>>> communicating via pull request, so I think periodic syncs will be
>>>>>>>>>>>>>> helpful.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Wes
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mon, Aug 8, 2016 at 2:45 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> My preference would be for bi-weekly.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -Taylor
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Aug 8, 2016, at 5:25 PM, Julien Le Dem <jul...@dremio.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>> My experience with Parquet is that a regular sync up over hangout
>>>>>>>>>>>>>>>> helps with keeping in touch and staying updated about what everyone
>>>>>>>>>>>>>>>> is doing.
>>>>>>>>>>>>>>>> I was thinking of scheduling it weekly or bi-weekly.
>>>>>>>>>>>>>>>> Who would join?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The way it goes is first we do a round table where people introduce
>>>>>>>>>>>>>>>> themselves and list the topics they'd like to talk or hear about.
>>>>>>>>>>>>>>>> That makes the agenda and we go through it.
>>>>>>>>>>>>>>>> At the end we send notes to the mailing list with discussions and
>>>>>>>>>>>>>>>> action items (for example: open JIRA, comment on JIRA, review PR,
>>>>>>>>>>>>>>>> etc).
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Julien
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Julien
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Julien
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Julien
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Julien
>>> 
>>> 
>>> --
>>> Julien
>> 
>> 
>> 
>> --
>> Julien
>> 
