My notes: (I'll schedule another one in 2 weeks but people should feel free
to do ad-hoc discussion in the meantime)

Attendees and their topic of interest for today:
 - Micah Kornfield: Dictionary encoding, Reusing dictionaries across record
batches, Shared memory, memory management, releasing memory shared across
processes
 - Wes McKinney: Finalize types (Category, ...), File format, RPC format, IPC
 - Julien Le Dem: finalize metadata (RPC, IPC, File), File format
implementation, UDF use case
 - Erol: Shared memory across Java and C++ to share large amounts of data

Arrow IPC:
  - Shared memory:
     - the current version doesn’t do schema negotiation yet.
     - all unit tests read and write memory with a predefined schema
and a known base address.
     - no dictionary encoding yet.
  - issues to discuss:
    - communicating the base memory address:
       - possibly use RPC for coordination.
    - options for shared memory
      - forking a process: anonymous shared memory is shared implicitly
      - starting a new process: need to spawn alternate shared memory that
needs to be cleaned up
      - direct memory-mapped system call (communicate the file name to the
subprocess).
  - Action (Micah): create a JIRA to sum this up
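
A minimal sketch of the memory-mapped file option, assuming a hypothetical
fixed layout of four int32 values (not an actual Arrow layout). The names
`RECORD`, `produce` and `consume` are made up for illustration. The producer
only has to communicate the file name; the consumer reads at an offset
relative to the start of its own mapping, so no absolute base address needs to
be shared. Both sides run in one process here to keep the example short.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed layout: four little-endian int32 values (16 bytes).
RECORD = struct.Struct("<4i")


def produce(values):
    """Write the values into a new file-backed region and return its file name."""
    fd, path = tempfile.mkstemp(suffix=".arrow-demo")
    with os.fdopen(fd, "r+b") as f:
        # A zero-length file cannot be mapped, so size it first.
        f.truncate(RECORD.size)
        with mmap.mmap(f.fileno(), RECORD.size) as region:
            RECORD.pack_into(region, 0, *values)
            region.flush()
    return path


def consume(path):
    """Map the file read-only and decode the values at offset 0 of the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return RECORD.unpack_from(region, 0)
```

Usage: `path = produce((1, 2, 3, 4))`, hand `path` to the reader, call
`consume(path)` there, then remove the file with `os.remove(path)` once the
reader is done. That last step is the cleanup concern raised above.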

 - Memory management:
  - the process producing the data will allocate the memory and pass it
read only. It needs to wait for the consumer to be done to release it.
     - one option is memory mapped file (persistent independent of the
process)
     - each process responsible for its memory. Reader needs to release
memory.
  - mechanism for handling too much memory allocation.
  - In the case of record batches over RPC this is not an issue (memory is
copied over).
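
The ownership rule above (the producer allocates, hands the memory out
read-only, and waits for the consumer before releasing) can be sketched with a
simple reader count. This is an in-process illustration only, assuming a
hypothetical `ProducerBuffer` class; a real cross-process version would need
the count to live somewhere both processes can see.

```python
class ProducerBuffer:
    """Hypothetical producer-owned buffer that is lent to consumers read-only."""

    def __init__(self, payload: bytes):
        self._data = bytearray(payload)
        self._readers = 0
        self.released = False

    def lend(self) -> memoryview:
        """Hand out a read-only view and remember that a reader is active."""
        if self.released:
            raise RuntimeError("buffer already released")
        self._readers += 1
        return memoryview(self._data).toreadonly()

    def give_back(self, view: memoryview) -> None:
        """Called by the reader when it is done with the view."""
        view.release()
        self._readers -= 1

    def try_release(self) -> bool:
        """Free the memory only if no reader still holds a view."""
        if self._readers > 0:
            return False
        self._data = bytearray()
        self.released = True
        return True
```

Usage: `view = buf.lend()` on the consumer side, `buf.give_back(view)` when
done; the producer calls `buf.try_release()` and retries while it returns
`False`.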

  - RPC transport
     definition of the protocol and how we send messages.
  - File transport

 - Dictionary encoding:
    - start simple: simple buffer<int> layout
    - enable extension in the future (v2: bit packing?)
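
To illustrate the "start simple" layout, here is a rough sketch of dictionary
encoding with a plain integer index buffer. The function names are
hypothetical and this is not the Arrow implementation; bit packing of the
indices (the possible v2 extension) is left out.

```python
from array import array


def dictionary_encode(values):
    """Return (dictionary, indices): unique values in first-seen order, plus
    one integer index per input value pointing into the dictionary."""
    dictionary = []
    positions = {}
    indices = array("i")  # signed C int, typically 32 bits
    for value in values:
        idx = positions.get(value)
        if idx is None:
            idx = len(dictionary)
            positions[value] = idx
            dictionary.append(value)
        indices.append(idx)
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values from the dictionary and the index buffer."""
    return [dictionary[i] for i in indices]
```

Usage: `dictionary_encode(["a", "b", "a"])` yields the dictionary and its
index buffer, and `dictionary_decode` reverses it. Because the dictionary is a
separate object from the indices, the same dictionary could in principle be
reused across record batches.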

- Category type:
   - Semantic difference with Dictionary encoded.
   - TODO(Julien): Add Category type in Parquet?


On Thu, Aug 18, 2016 at 9:39 AM, Julien Le Dem <jul...@dremio.com> wrote:

> Hi Nicole.
> Can you try again?
> I was accepting you but it did not seem to work.
> Julien
>
> On Thu, Aug 18, 2016 at 9:26 AM, Nicole Nemer <nicole.ne...@rms.com>
> wrote:
>
>> I am trying to join and it is not letting me in...
>> nn
>> --
>> Nicole Nemer, PhD
>> Technical Architect/Dev Manager
>>
>> 303-641-3340
>>
>>
>>
>>
>>
>>
>> On 8/18/16, 10:00 AM, "Julien Le Dem" <jul...@dremio.com> wrote:
>>
>> >And this is starting now.
>> >https://plus.google.com/hangouts/_/dremio.com/arrow
>> >
>> >On Wed, Aug 17, 2016 at 7:07 PM, Julien Le Dem <jul...@dremio.com>
>> wrote:
>> >
>> >> Here is the hangout link for tomorrow:
>> >> https://plus.google.com/hangouts/_/dremio.com/arrow
>> >>
>> >> I have also added to a google calendar event everyone who replied to
>> >>that
>> >> thread.
>> >>
>> >>
>> >> On Wed, Aug 17, 2016 at 6:12 PM, Wes McKinney <wesmck...@gmail.com>
>> >>wrote:
>> >>
>> >>> hi folks,
>> >>>
>> >>> Reminder that the Arrow sync is tomorrow morning at 09:00 Pacific
>> >>> (http://timesched.pocoo.org/?date=2016-08-18&tz=pacific-standard-time!&range=540,600).
>> >>> I believe Julien will send a public Google hangout link to the mailing
>> >>> list for you all to join.
>> >>>
>> >>> Thanks
>> >>> Wes
>> >>>
>> >>> On Tue, Aug 16, 2016 at 11:07 AM, Wes McKinney <wesmck...@gmail.com>
>> >>> wrote:
>> >>> > +1. If there is demand for an Asia-friendly time we can change
>> things
>> >>> up from week to week.
>> >>> >
>> >>> >> On Aug 16, 2016, at 10:52 AM, Jacques Nadeau <jacq...@apache.org>
>> >>> wrote:
>> >>> >>
>> >>> >> sounds good
>> >>> >>
>> >>> >>> On Tue, Aug 16, 2016 at 10:39 AM, Julien Le Dem <
>> jul...@dremio.com>
>> >>> wrote:
>> >>> >>>
>> >>> >>> Based on the feedback I'm proposing Thursday Aug 18 at 4PM UTC as
>> >>>the
>> >>> first
>> >>> >>> Arrow sync.
>> >>> >>> That's:
>> >>> >>> - 9AM PDT (San Francisco)
>> >>> >>> - 12PM EDT (New York)
>> >>> >>> - 5PM BST (London)
>> >>> >>> - 6PM CEST (Paris, Berlin)
>> >>> >>>
>> >>> >>>> On Tue, Aug 9, 2016 at 6:45 AM, Uwe L. Korn <uw...@xhochy.com>
>> >>> wrote:
>> >>> >>>>
>> >>> >>>> +1 for bi-weekly and European-friendly times: CET (GMT+1)
>> >>> >>>>
>> >>> >>>>> On 09.08.2016 at 00:39, Julien Le Dem <jul...@dremio.com> wrote:
>> >>> >>>>>
>> >>> >>>>> Also to all who are responding let me know your timezone as
>> well.
>> >>> >>>>>
>> >>> >>>>> On Mon, Aug 8, 2016 at 3:30 PM, Micah Kornfield <
>> >>> emkornfi...@gmail.com
>> >>> >>>>
>> >>> >>>>> wrote:
>> >>> >>>>>
>> >>> >>>>>> Sounds good to me as well.  Biweekly would be preferred.
>> >>> >>>>>>
>> >>> >>>>>>> On Monday, August 8, 2016, Wes McKinney <wesmck...@gmail.com>
>> >>> wrote:
>> >>> >>>>>>>
>> >>> >>>>>>> hi Julien -- this sounds like a good idea, also +1 for
>> >>>bi-weekly.
>> >>> I
>> >>> >>>>>>> will do my best to join when possible. So far we've mostly
>> been
>> >>> >>>>>>> communicating via pull request, so I think periodic syncs will
>> >>>be
>> >>> >>>>>>> helpful.
>> >>> >>>>>>>
>> >>> >>>>>>> - Wes
>> >>> >>>>>>>
>> >>> >>>>>>> On Mon, Aug 8, 2016 at 2:45 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>> >>> >>>>>>>> +1
>> >>> >>>>>>>>
>> >>> >>>>>>>> My preference would be for bi-weekly.
>> >>> >>>>>>>>
>> >>> >>>>>>>> -Taylor
>> >>> >>>>>>>>
>> >>> >>>>>>>>> On Aug 8, 2016, at 5:25 PM, Julien Le Dem <jul...@dremio.com> wrote:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Hi all,
>> >>> >>>>>>>>> My experience with Parquet is that a regular sync-up over hangout
>> >>> >>>>>>>>> helps with keeping in touch and staying updated about what everyone
>> >>> >>>>>>>>> is doing.
>> >>> >>>>>>>>> I was thinking of scheduling it weekly or bi-weekly.
>> >>> >>>>>>>>> Who would join?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> The way it goes is first we do a round table where people introduce
>> >>> >>>>>>>>> themselves and list the topics they'd like to talk or hear about.
>> >>> >>>>>>>>> That makes the agenda and we go through it.
>> >>> >>>>>>>>> At the end we send notes to the mailing list with
>> discussions
>> >>> and
>> >>> >>>>>> action
>> >>> >>>>>>>>> items (for example: open JIRA, comment on JIRA, review PR,
>> >>>etc).
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> --
>> >>> >>>>>>>>> Julien
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> --
>> >>> >>>>> Julien
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> --
>> >>> >>> Julien
>> >>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Julien
>> >>
>> >
>> >
>> >
>> >--
>> >Julien
>>
>>
>
>
> --
> Julien
>



-- 
Julien
