Sorry for the late answer, but thanks for the tips guys!!! Now I think the hard work will be trying to understand the HDFS wire protocol in the ipc package.

Thanks!!
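To check that I have the proxy part straight before digging into the wire format itself, I put together a tiny plain-JDK sketch of the idea behind org.apache.hadoop.ipc. None of the names below (ToyProtocol, ToyInvoker) are real Hadoop classes; it only shows how a java.lang.reflect.Proxy turns an interface call into a method name plus serialized arguments, which is roughly what RPC.Invoker and Client do before the bytes hit the socket:

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;

    // Made-up stand-in for DatanodeProtocol, just for this experiment.
    interface ToyProtocol {
        long blockReport(String datanodeId, long[] blocks) throws IOException;
    }

    public class ToyRpcSketch {

        // Plays the role of org.apache.hadoop.ipc.RPC.Invoker: every call on
        // the proxy lands here, where it gets turned into bytes for the wire.
        static class ToyInvoker implements InvocationHandler {
            public Object invoke(Object proxy, Method method, Object[] args) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                try {
                    DataOutputStream out = new DataOutputStream(buf);
                    out.writeUTF(method.getName());              // method name first
                    out.writeInt(args == null ? 0 : args.length); // then the arg count
                    // The real Invoker writes each parameter as a Writable and
                    // hands the frame to Client.Connection, which owns the socket.
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
                System.out.println("would send " + buf.size() + " header bytes for "
                    + method.getName());
                return 0L; // fake the namenode's reply
            }
        }

        public static void main(String[] args) throws IOException {
            ToyProtocol namenode = (ToyProtocol) Proxy.newProxyInstance(
                ToyProtocol.class.getClassLoader(),
                new Class<?>[] { ToyProtocol.class },
                new ToyInvoker());
            // Looks like a local method call, but only builds an RPC frame.
            namenode.blockReport("dn-1", new long[] { 1L, 2L, 3L });
        }
    }

Running it only prints the size of the fake request header, but it makes it clear why a call like blockReport can have no local implementation at all: everything goes through the invoke() hook. Please correct me if that mental model is wrong.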
On Tue, Apr 6, 2010 at 4:50 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
> Hey Jay,
>
> I think, if you're experienced in implementing transfer protocols, it is not
> difficult to implement the HDFS wire protocol. As you point out, they are
> subject to change between releases (especially between 0.20, 0.21, and 0.22)
> and basically documented in fragments in the Java source code. At least, I
> looked at doing this for the read portions, and it wasn't horrible.
>
> However, the *really hard part* is the client retry/recovery logic. That's
> where a lot of the intelligence is, in very large classes, and not incredibly
> well documented.
>
> I've had lots of luck with scaling libhdfs - we average >20TB / day and
> billions of I/O operations a day with it. I'd strongly advise not
> re-inventing the wheel, unless it's for a research project.
>
> Brian
>
> On Apr 6, 2010, at 8:53 AM, Jay Booth wrote:
>
>> A pure C library to communicate with HDFS?
>>
>> Certainly possible, but it would be a lot of work, and the HDFS wire
>> protocols are ad hoc, only somewhat documented, and subject to change
>> between releases right now, so you'd be chasing a moving target. I'd try
>> to think of another way to accomplish what you want to do before
>> attempting a client reimplementation in C right now. If you only need to
>> talk to the namenode and not the datanodes it might be a little easier,
>> but it's still a lot of work that will probably be obsolete after another
>> release or two.
>>
>> On Tue, Apr 6, 2010 at 9:47 AM, Alberich de megres <alberich...@gmail.com> wrote:
>>
>>> Thanks!
>>>
>>> I'm already using Eclipse to browse the code.
>>> In this scenario, do I understand correctly that Java serializes the
>>> object and its parameters over the network?
>>>
>>> For example, if I want to make a pure C library (with no JNI
>>> interfaces), is it possible/feasible, or would it be a frozen hell?
>>>
>>> Thanks once again!!!
>>>
>>> On Sat, Apr 3, 2010 at 1:54 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>>>> If you look at the getProxy code, it passes an "Invoker" (or something
>>>> like that) which the proxy code uses to delegate calls TO. The Invoker
>>>> will call another class, "Client", which has inner classes like Call
>>>> and Connection that wrap the actual Java IO. This all lives in the
>>>> org.apache.hadoop.ipc package.
>>>>
>>>> Be sure to use a good IDE like IJ or Eclipse to browse the code; it
>>>> makes following all this stuff much easier.
>>>>
>>>> On Fri, Apr 2, 2010 at 4:39 PM, Alberich de megres <alberich...@gmail.com> wrote:
>>>>> Hi again!
>>>>>
>>>>> Could anyone help me?
>>>>> I still can't understand how the RPC class works. To me, it only seems
>>>>> to instantiate a single interface with no implementation for some
>>>>> methods like blockReport. But then it uses RPC.getProxy to get a new
>>>>> class which exchanges messages with the namenode.
>>>>>
>>>>> I'm sorry for this silly question, but I am really lost at this point.
>>>>>
>>>>> Thanks for the patience.
>>>>>
>>>>> On Fri, Apr 2, 2010 at 2:11 AM, Alberich de megres <alberich...@gmail.com> wrote:
>>>>>> Hi Jay!
>>>>>>
>>>>>> Thanks for the answer, but what I'm asking is: what exactly does it
>>>>>> send? blockReport is a method of the DatanodeProtocol interface that
>>>>>> has no implementation there.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Thu, Apr 1, 2010 at 5:50 PM, Jay Booth <jaybo...@gmail.com> wrote:
>>>>>>> In DataNode:
>>>>>>>
>>>>>>>     public DatanodeProtocol namenode
>>>>>>>
>>>>>>> It's not a reference to an actual namenode, it's a wrapper for a
>>>>>>> network protocol created by that RPC.waitForProxy call -- so when it
>>>>>>> calls namenode.blockReport, it's sending that information over RPC to
>>>>>>> the namenode instance over the network.
>>>>>>>
>>>>>>> On Thu, Apr 1, 2010 at 5:50 AM, Alberich de megres <alberich...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi everyone!
>>>>>>>>
>>>>>>>> Sailing through the HDFS source code that comes with Hadoop 0.20.2, I
>>>>>>>> could not understand how HDFS sends block reports to the NameNode.
>>>>>>>>
>>>>>>>> As far as I can see, in
>>>>>>>> src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java we
>>>>>>>> create the this.namenode interface with an RPC.waitForProxy call
>>>>>>>> (and I could not work out which class it instantiates, or how it
>>>>>>>> works).
>>>>>>>>
>>>>>>>> After that, the datanode generates the block list report
>>>>>>>> (blockListAsLongs) with data.getBlockReport and calls
>>>>>>>> this.namenode.blockReport(..); inside namenode.blockReport it then
>>>>>>>> calls namesystem.processReport, which leads to an update of the
>>>>>>>> block lists inside the nameserver.
>>>>>>>>
>>>>>>>> But how does it send this block report over the network?
>>>>>>>>
>>>>>>>> Can anyone shed some light on this?
>>>>>>>>
>>>>>>>> Thanks for all!
>>>>>>>> (And sorry for the newbie question.)
>>>>>>>>
>>>>>>>> Alberich
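Putting Jay's and Ryan's explanations above together, this is the condensed picture I now have of the block report path. It is only a sketch against the 0.20 API: the namenode address is a placeholder, the versionID constant and import paths are from my checkout (please double-check them), and the actual blockReport arguments are left in comments, so treat this as my reading of DataNode.java rather than real datanode code:

    // BlockReportSketch.java -- my reading of the 0.20 code path, not real DataNode code.
    import java.net.InetSocketAddress;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol;
    import org.apache.hadoop.ipc.RPC;

    public class BlockReportSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder address; a real datanode reads this from its configuration.
            InetSocketAddress nnAddr = new InetSocketAddress("namenode.example.com", 8020);

            // 1. There is no NameNode object on the datanode side. waitForProxy
            //    keeps retrying until it can reach the namenode, then hands back
            //    a java.lang.reflect.Proxy that implements DatanodeProtocol; its
            //    InvocationHandler (RPC.Invoker) serializes every call and gives
            //    it to ipc.Client, which owns the socket.
            DatanodeProtocol namenode = (DatanodeProtocol) RPC.waitForProxy(
                DatanodeProtocol.class, DatanodeProtocol.versionID, nnAddr, conf);

            // 2. The real DataNode then builds its block list
            //    (data.getBlockReport() -> blockListAsLongs) and does, roughly:
            //
            //        namenode.blockReport(dnRegistration, blockListAsLongs);
            //
            //    That call never runs locally: the proxy ships the method name
            //    and parameters over the wire, the RPC server on the namenode
            //    deserializes them and invokes NameNode.blockReport(), which
            //    calls namesystem.processReport(), and the return value travels
            //    back the same way.
            System.out.println("got a DatanodeProtocol proxy for " + nnAddr);
        }
    }

So, as far as I can tell, the "sending over the network" part never appears in DataNode.java at all; it lives entirely in org.apache.hadoop.ipc (Invoker and Client on the datanode side, the server half of the same package on the namenode side), which is exactly why I want to dig into that package next.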