Got that. Was just reading about that and other options. Thanks again!

From: Nitin Pawar <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Sunday, January 13, 2013 11:39 PM
To: "[email protected]" <[email protected]>
Subject: Re: Unable to setup HDFS sink

That might be the default HDFS sink rollover time. You can always configure it the way you want:
1) Number of events in each file
2) How much time you want until the file gets rolled over
etc.

On Mon, Jan 14, 2013 at 1:06 PM, Vikram Kulkarni <[email protected]> wrote:

Thanks for your prompt replies. I had switched my core-site.xml and was now using 8020. That worked; however, I am getting the following output on the console. Once I send the event to the Flume source, it correctly outputs it to the console but displays the following messages in the log:

2013-01-13 23:28:20,178 (hdfs-hdfssink-call-runner-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:208)] Creating hdfs://localhost:8020/usr/FlumeData.1358148499961.tmp
2013-01-13 23:28:50,237 (hdfs-hdfssink-roll-timer-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter.renameBucket(BucketWriter.java:427)] Renaming hdfs://localhost:8020/usr/FlumeData.1358148499961.tmp to hdfs://localhost:8020/usr/FlumeData.1358148499961

Notice the time difference between the 'Creating...' and 'Renaming...' lines. Is about 30 secs normal? Then when I actually go to the DFS file system I do find the FlumeData.1358148499961 file as expected.

-Vikram

From: Nitin Pawar <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Sunday, January 13, 2013 11:07 PM
To: "[email protected]" <[email protected]>
Subject: Re: Unable to setup HDFS sink

It's a jobtracker URI. There should be a conf in your hdfs-site.xml and core-site.xml which looks like hdfs://localhost:9100/
You need to use that value.

On Jan 14, 2013 12:34 PM, "Vikram Kulkarni" <[email protected]> wrote:

I was able to write using the same hdfs conf from a different sink.
Also, I can open the MapRed administration page successfully at http://localhost:50030/jobtracker.jsp
So that should indicate that the hdfs path below is valid, right? Any other way to check?

Thanks.

On 1/13/13 10:57 PM, "Alexander Alten-Lorenz" <[email protected]> wrote:

>Hi,
>
>Check your HDFS cluster, it's not responding on localhost/127.0.0.1:50030
>
>- Alex
>
>On Jan 14, 2013, at 7:43 AM, Vikram Kulkarni <[email protected]> wrote:
>
>> I am trying to set up a sink for HDFS for HTTPSource. But I get the
>> following exception when I try to send a simple JSON event. I am also
>> using a logger sink and I can clearly see the event output to the
>> console window, but it fails to write to HDFS. I have also, in a separate
>> conf file, successfully written to an HDFS sink.
>>
>> Thanks,
>> Vikram
>>
>> Exception:
>> [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:456)] HDFS IO error
>> java.io.IOException: Call to localhost/127.0.0.1:50030 failed on local exception: java.io.EOFException
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
>>
>> My conf file is as follows:
>> # flume-httphdfs.conf: A single-node Flume with Http Source and hdfs sink configuration
>>
>> # Name the components on this agent
>> agent1.sources = r1
>> agent1.channels = c1
>>
>> # Describe/configure the source
>> agent1.sources.r1.type = org.apache.flume.source.http.HTTPSource
>> agent1.sources.r1.port = 5140
>> agent1.sources.r1.handler = org.apache.flume.source.http.JSONHandler
>> agent1.sources.r1.handler.nickname = random props
>>
>> # Describe the sink
>> agent1.sinks = logsink hdfssink
>> agent1.sinks.logsink.type = logger
>>
>> agent1.sinks.hdfssink.type = hdfs
>> agent1.sinks.hdfssink.hdfs.path = hdfs://localhost:50030/flume/events
>> agent1.sinks.hdfssink.hdfs.file.Type = DataStream
>>
>> # Use a channel which buffers events in memory
>> agent1.channels.c1.type = memory
>> agent1.channels.c1.capacity = 1000
>> agent1.channels.c1.transactionCapacity = 100
>>
>> # Bind the source and sink to the channel
>> agent1.sources.r1.channels = c1
>> agent1.sinks.logsink.channel = c1
>> agent1.sinks.hdfssink.channel = c1
>>
>
>--
>Alexander Alten-Lorenz
>http://mapredit.blogspot.com
>German Hadoop LinkedIn Group: http://goo.gl/N8pCF

--
Nitin Pawar
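
For reference, the rollover options mentioned in the thread correspond to the HDFS sink's roll settings. The fragment below is a minimal sketch, not a tested configuration. It reuses the agent and sink names from the conf file quoted above and assumes the NameNode listens on port 8020, as reported later in the thread. The /flume/events path and the 300-second and 1000-event values are illustrative assumptions only. hdfs.rollInterval, hdfs.rollCount and hdfs.rollSize are documented HDFS sink properties; the default roll interval of 30 seconds would be consistent with the gap between the 'Creating' and 'Renaming' log lines.

```properties
# Sketch only: the values below are illustrative assumptions, not taken from the thread
agent1.sinks.hdfssink.type = hdfs

# Assumes the NameNode RPC address uses port 8020 (check core-site.xml for the actual value)
agent1.sinks.hdfssink.hdfs.path = hdfs://localhost:8020/flume/events

# Roll the file after 300 seconds (default is 30; 0 disables time-based rolling)
agent1.sinks.hdfssink.hdfs.rollInterval = 300

# Roll the file after 1000 events (default is 10; 0 disables count-based rolling)
agent1.sinks.hdfssink.hdfs.rollCount = 1000

# Disable size-based rolling (default is 1024 bytes)
agent1.sinks.hdfssink.hdfs.rollSize = 0
```

With settings along these lines, whichever enabled trigger fires first closes the .tmp file and renames it, so raising only one of the values may not change the observed behavior if another trigger is still at its default.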
