Re: Hadoop File API v.s. Commons VFS

David Mollitor Tue, 10 Mar 2020 18:26:23 -0700

And by "wow" I mean to say, "your input was awesome and generous," not "wow
that's a lot of work."


On Tue, Mar 10, 2020, 9:24 PM David Mollitor <dam6...@gmail.com> wrote:

> Wow. Thanks for that starting point.
>
> On Tue, Mar 10, 2020, 8:48 PM Owen O'Malley <owen.omal...@gmail.com>
> wrote:
>
>> It would be a lot of work. Of course there is a lot of overlap, but they
>> have different use cases, so there are significant differences. From the
>> big data side, there are a lot of blockers.
>>
>>    1. CVFS does not have the concept of replication, so there is no way
>>    to get or set a file's replication.
>>    2. It doesn't look like CVFS supports appending to files.
>>    3. CVFS doesn't support data locality.
>>    4. CVFS positioned reads are difficult/inefficient. The equivalent of
>>    file.readFully(seekPos, buffer, offset, length) is
>>       1. FileContent fc = file.getContent();
>>       2. RandomAccessContent random =
>>          fc.getRandomAccessContent(RandomAccessMode.READ);
>>       3. random.seek(seekPos);
>>       4. InputStream stream = random.getInputStream();
>>       5. loop on stream.read(buffer, offset, length), advancing offset
>>       by each count returned, until length bytes have been read.
>>
>> .. Owen
>>
>> On Tue, Mar 10, 2020 at 3:57 PM David Mollitor <dam6...@gmail.com> wrote:
>>
>>> I just see a lot of overlap and doubling of effort here.  Would be nice if
>>> we can all be working in tandem.
>>>
>>> On Tue, Mar 10, 2020, 6:36 PM Aaron Fabbri <ajfab...@gmail.com> wrote:
>>>
>>> > It is a good question. I'm not familiar with Apache commons VFS (which I
>>> > assume you are talking about, versus the BSD/Unix VFS layer). There no
>>> > doubt will be semantic differences between Hadoop FS interface and VFS.
>>> > It would be an interesting exercise to implement a connector that
>>> > bridges the gap, running a Hadoop FileSystem etc. on top of VFS
>>> > libraries. Anyone else looked at this or have experience with Apache VFS?
>>> >
>>> > On Fri, Feb 28, 2020 at 6:42 AM David Mollitor <dam6...@gmail.com>
>>> > wrote:
>>> >
>>> >> Hello,
>>> >>
>>> >> I'm curious to know what the history of Hadoop File API is in
>>> >> relationship to VFS.  Hadoop supports several file schemes and so does
>>> >> VFS.  Why are there two projects working on this same effort and what
>>> >> are the pros/cons of each?
>>> >>
>>> >> Thanks.
>>> >>
>>> >
>>>
>>
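
For reference, the read loop in the last step of Owen's positioned-read
list can be sketched in plain Java as below. This is only an illustration
under stated assumptions: PositionedReadSketch and its readFully helper are
hypothetical names, not part of Commons VFS or Hadoop, and a plain
java.io.InputStream with skip() stands in for the stream that
random.seek(seekPos) followed by random.getInputStream() would provide.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not a Commons VFS or Hadoop API.
final class PositionedReadSketch {

    // Reads exactly `length` bytes starting at stream position `seekPos`
    // into buffer[offset .. offset + length - 1].
    static void readFully(InputStream in, long seekPos,
                          byte[] buffer, int offset, int length) throws IOException {
        // Assumption: skipping stands in for random.seek(seekPos) on a
        // stream that does not support seeking.
        long toSkip = seekPos;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                // skip() may return 0 without being at EOF, so probe one byte.
                if (in.read() < 0) {
                    throw new EOFException("end of stream before position " + seekPos);
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }

        // A single read() may return fewer bytes than requested, so keep
        // reading and advance the write position until the request is filled.
        int done = 0;
        while (done < length) {
            int n = in.read(buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("end of stream after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] buffer = new byte[4];
        readFully(new ByteArrayInputStream(data), 3L, buffer, 0, 4);
        System.out.println(new String(buffer, StandardCharsets.US_ASCII)); // prints 3456
    }
}
```

The point of the sketch is the bookkeeping the caller has to do by hand:
every short read moves the target offset forward and shrinks the remaining
length, and hitting end of stream early has to be reported explicitly.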
