See comment below. -- Hari Shreedharan
On Monday, February 18, 2013 at 7:43 PM, 周梦想 wrote:
> hello,
> I changed the conf file like this:
> [zhouhh@Hadoop48 flume1.3.1]$ cat conf/testhdfs.conf
> syslog-agent.sources = Syslog
> syslog-agent.channels = MemoryChannel-1
> syslog-agent.sinks = HDFS-LAB
>
> syslog-agent.sources.Syslog.type = syslogTcp
> syslog-agent.sources.Syslog.port = 5140
>
> syslog-agent.sources.Syslog.channels = MemoryChannel-1
> syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1
>
> syslog-agent.sinks.HDFS-LAB.type = hdfs
>
> syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
> syslog-agent.sinks.HDFS-LAB.hdfs.file.Prefix = syslogfiles
> syslog-agent.sinks.HDFS-LAB.hdfs.file.rollInterval = 60
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream
>

You need to uncomment the above line and change it to:
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream

> #syslog-agent.sinks.HDFS-LAB.hdfs.file.writeFormat = Text
> syslog-agent.channels.MemoryChannel-1.type = memory
>
> and I tested again:
> [zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
> Connection to hadoop48 5140 port [tcp/*] succeeded!
> [zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒▒▒ʣ
> g▒▒C%< <▒▒)Mon Feb 18 18:25:26 2013 hello world zhh
> [zhouhh@Hadoop47 ~]$
>
> There is still some text that looks wrong.
>
> Andy
>
> 2013/2/19 Hari Shreedharan <hshreedha...@cloudera.com (mailto:hshreedha...@cloudera.com)>
> > This is because the data is written out by default in Hadoop's SequenceFile
> > format. Use the DataStream file format (as in the Flume docs) to get the
> > event written out as is (if you use the default serializer, the headers will
> > not be serialized, so make sure you select the correct serializer).
> >
> > Hari
> >
> > --
> > Hari Shreedharan
> >
> > On Monday, February 18, 2013 at 7:09 PM, 周梦想 wrote:
> > > hello,
> > > I put some data into HDFS via Flume 1.3.1, but it changed!
> > >
> > > source data:
> > > [zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
> > > Connection to hadoop48 5140 port [tcp/*] succeeded!
> > >
> > > the flume agent received:
> > > 13/02/19 10:43:46 INFO hdfs.BucketWriter: Creating hdfs://Hadoop48:54310/flume//FlumeData.1361241606972.tmp
> > > 13/02/19 10:44:16 INFO hdfs.BucketWriter: Renaming hdfs://Hadoop48:54310/flume/FlumeData.1361241606972.tmp to hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
> > >
> > > the content in hdfs:
> > > [zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
> > > SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒.FI▒Z▒Q{2▒,\<▒U▒Y)Mon Feb 18 18:25:26 2013 hello world zhh
> > > [zhouhh@Hadoop47 ~]$
> > >
> > > Why is there data like "org.apache.hadoop.io.LongWritable" in the file? Is this a bug?
> > >
> > > Best Regards,
> > > Andy
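
For reference, a minimal sketch of the corrected sink section the thread converges on. The hdfs.filePrefix and hdfs.rollInterval property names follow the Flume HDFS sink documentation (the original conf used hdfs.file.Prefix and hdfs.file.rollInterval, which is why the output file still got the default FlumeData prefix), and the serializer line is an assumption: header_and_text keeps event headers such as the syslog host in the output, while the default text serializer writes only the event body. Verify both against the docs for your Flume version.

syslog-agent.sinks.HDFS-LAB.type = hdfs
syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1
syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.filePrefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.rollInterval = 60
# Write raw events instead of a Hadoop SequenceFile
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
# Assumed serializer; keeps headers (e.g. host) alongside the body
syslog-agent.sinks.HDFS-LAB.serializer = header_and_text

With hdfs.fileType = DataStream, running hadoop fs -cat on the rolled file should show the plain syslog line rather than the SEQ!org.apache.hadoop.io.LongWritable... SequenceFile header seen above.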