Sorry for the late answer,
but thanks for the tips, guys!

Now the hard part, I think, will be understanding the HDFS wire
protocol in the ipc package.

thanks!!


On Tue, Apr 6, 2010 at 4:50 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
> Hey Jay,
>
> I think, if you're experienced in implementing transfer protocols, it is not 
> difficult to implement the HDFS wire protocol.  As you point out, they are 
> subject to change between releases (especially between 0.20, 0.21, and 0.22) 
> and basically documented in fragments in the java source code.  At least, I 
> looked at doing this for the read portions, and it wasn't horrible.
>
> However, the *really hard part* is the client retry/recovery logic.  That's 
> where a lot of the intelligence is, in very large classes, and not incredibly 
> well-documented.
>
> I've had lots of luck with scaling libhdfs - we average >20TB / day and 
> billions of I/O operations a day with it.  I'd strongly advise not 
> re-inventing the wheel, unless it's for a research project.
>
> Brian
>
> On Apr 6, 2010, at 8:53 AM, Jay Booth wrote:
>
>> A pure C library to communicate with HDFS?
>>
>> Certainly possible, but it would be a lot of work, and the HDFS wire
>> protocols are ad hoc, only somewhat documented and subject to change between
>> releases right now so you'd be chasing a moving target.  I'd try to think of
>> another way to accomplish what you want to do before attempting a client
>> reimplementation in C right now.  If you only need to talk to the namenode
>> and not the datanodes it might be a little easier but still, lots of work
>> that will probably be obsolete after another release or two.
>>
>>
>> On Tue, Apr 6, 2010 at 9:47 AM, Alberich de megres 
>> <alberich...@gmail.com> wrote:
>>
>>> Thanks!
>>>
>>> I'm already using Eclipse to browse the code.
>>> In this scenario, as I understand it, Java serializes the object and
>>> its parameters and sends them over the network.  Is that right?
>>>
>>> For example, if I want to write a pure C library (with no JNI
>>> interfaces), is that possible/feasible? Or would it be like trying to
>>> freeze hell?
>>>
>>> Thanks once again!!!
>>>
>>>
>>> On Sat, Apr 3, 2010 at 1:54 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>>>> If you look at the getProxy code it passes an "Invoker" (or something
>>>> like that) which the proxy code uses to delegate calls TO.  The
>>>> Invoker will call another class "Client" which has nested classes like
>>>> Call and Connection which wrap the actual Java IO.  This all lives in
>>>> the org.apache.hadoop.ipc package.
>>>>
>>>> Be sure to use a good IDE like IJ or Eclipse to browse the code, it
>>>> makes following all this stuff much easier.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Apr 2, 2010 at 4:39 PM, Alberich de megres
>>>> <alberich...@gmail.com> wrote:
>>>>> Hi again!
>>>>>
>>>>> Could anyone help me?
>>>>> I cannot understand how the RPC class works. To me, it only seems to
>>>>> instantiate an interface that has no implementation for methods
>>>>> like blockReport. But then it uses RPC.getProxy to get a new class which
>>>>> exchanges messages with the namenode.
>>>>>
>>>>> I'm sorry for this silly question, but I am really lost at this point.
>>>>>
>>>>> Thanks for the patience.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 2, 2010 at 2:11 AM, Alberich de megres
>>>>> <alberich...@gmail.com> wrote:
>>>>>> Hi Jay!
>>>>>>
>>>>>> Thanks for the answer, but what I'm asking is how it actually sends it.
>>>>>> blockReport is a method of the DatanodeProtocol interface that has no
>>>>>> implementation.
>>>>>>
>>>>>> thanks!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 1, 2010 at 5:50 PM, Jay Booth <jaybo...@gmail.com> wrote:
>>>>>>> In DataNode:
>>>>>>> public DatanodeProtocol namenode
>>>>>>>
>>>>>>> It's not a reference to an actual namenode, it's a wrapper for a network
>>>>>>> protocol created by that RPC.waitForProxy call -- so when it calls
>>>>>>> namenode.blockReport, it's sending that information over RPC to the
>>>>>>> namenode instance over the network.
>>>>>>>
>>>>>>> On Thu, Apr 1, 2010 at 5:50 AM, Alberich de megres
>>>>>>> <alberich...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi everyone!
>>>>>>>>
>>>>>>>> Sailing through the HDFS source code that comes with Hadoop 0.20.2, I
>>>>>>>> could not understand how HDFS sends the block report to the NameNode.
>>>>>>>>
>>>>>>>> As far as I can see, in
>>>>>>>> src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java we
>>>>>>>> create the this.namenode interface with the RPC.waitForProxy call (I
>>>>>>>> could not understand which class it instantiates, or how it works).
>>>>>>>>
>>>>>>>> After that, the datanode generates the block list report (blockListAsLongs)
>>>>>>>> with data.getBlockReport and calls this.namenode.blockReport(..);
>>>>>>>> namenode.blockReport in turn calls namesystem.processReport.
>>>>>>>> This leads to an update of the block lists inside the namenode.
>>>>>>>>
>>>>>>>> But how does it send this block report over the network?
>>>>>>>>
>>>>>>>> Can anyone shed some light on this?
>>>>>>>>
>>>>>>>> Thanks for everything!
>>>>>>>> (and sorry for the newbie question)
>>>>>>>>
>>>>>>>> Alberich
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
>

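For readers following the thread: the point that confused the original poster (an interface such as DatanodeProtocol gets "instantiated" without any implementing class on the datanode side) can be illustrated with the JDK's java.lang.reflect.Proxy. What follows is only a minimal sketch, not Hadoop's actual code. The names ReportProtocol, RecordingInvoker and newProxy are hypothetical, the byte layout is invented for illustration and is not the HDFS wire format, and no socket is opened: the handler just records what a real client might send and returns a canned reply.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxySketch {

    /** Hypothetical stand-in for a protocol interface like DatanodeProtocol. */
    public interface ReportProtocol {
        int blockReport(String datanodeId, long[] blocks);
    }

    /** Hypothetical stand-in for the "Invoker": every call on the proxy lands here. */
    public static class RecordingInvoker implements InvocationHandler {
        public final ByteArrayOutputStream wire = new ByteArrayOutputStream();
        public String lastMethod;

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            lastMethod = method.getName();
            DataOutputStream out = new DataOutputStream(wire);

            // Invented layout: method name, parameter count, then each parameter.
            out.writeUTF(method.getName());
            Object[] params = (args == null) ? new Object[0] : args;
            out.writeInt(params.length);
            for (Object p : params) {
                if (p instanceof String) {
                    out.writeUTF((String) p);
                } else if (p instanceof long[]) {
                    long[] values = (long[]) p;
                    out.writeInt(values.length);
                    for (long v : values) {
                        out.writeLong(v);
                    }
                } else {
                    throw new IllegalArgumentException("unsupported parameter: " + p);
                }
            }
            out.flush();

            // A real client would now push these bytes down a socket and wait
            // for the server's reply; this sketch returns a canned value instead.
            return Integer.valueOf(0);
        }
    }

    /** Builds an object that implements ReportProtocol purely via the handler. */
    public static ReportProtocol newProxy(InvocationHandler handler) {
        return (ReportProtocol) Proxy.newProxyInstance(
                ReportProtocol.class.getClassLoader(),
                new Class<?>[] { ReportProtocol.class },
                handler);
    }

    public static void main(String[] args) {
        RecordingInvoker invoker = new RecordingInvoker();
        ReportProtocol namenode = newProxy(invoker);

        // Looks like a local call, but no blockReport body exists anywhere:
        // the call is routed to RecordingInvoker.invoke.
        int reply = namenode.blockReport("dn-1", new long[] { 1L, 2L, 3L });
        System.out.println("reply=" + reply + ", bytes queued=" + invoker.wire.size());
    }
}
```

The retry/recovery logic that Brian calls the really hard part is exactly what a toy handler like this leaves out, so it should be read as an aid for navigating the ipc package, not as a starting point for a client.
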