Ok, I pulled over all of the Hadoop jar files. Now I am seeing this:

30 Sep 2014 19:39:26,973 INFO  [Twitter Stream consumer-1[initializing]] (twitter4j.internal.logging.SLF4JLogger.info:83) - Establishing connection.
30 Sep 2014 19:39:28,204 INFO  [Twitter Stream consumer-1[Establishing connection]] (twitter4j.internal.logging.SLF4JLogger.info:83) - Connection established.
30 Sep 2014 19:39:28,205 INFO  [Twitter Stream consumer-1[Establishing connection]] (twitter4j.internal.logging.SLF4JLogger.info:83) - Receiving status stream.
30 Sep 2014 19:39:28,442 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSDataStream.configure:58) - Serializer = TEXT, UseRawLocalFileSystem = false
30 Sep 2014 19:39:28,591 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:261) - Creating hdfs://10.0.0.14/tmp//twitter.1412105968443.ds.tmp
30 Sep 2014 19:39:28,690 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:467) - process failed
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
	at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:214)
	at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2365)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:270)
	at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
	at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
	at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
	at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
	at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
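For what it's worth, here is the kind of sanity check I'm doing on the jars I pulled over. This is just a throwaway sketch: the install path is an assumption about my box, and `jar_version` is my own helper, not a Flume or Hadoop tool. The idea is that every hadoop-*.jar on the Flume classpath should carry the same version string.

```shell
# Throwaway helper (assumption: jars follow the usual name-version.jar pattern):
# strips the artifact name, e.g. hadoop-common-2.5.0.jar -> 2.5.0
jar_version() {
  basename "$1" .jar | sed 's/^[a-z-]*-//'
}

jar_version hadoop-common-2.5.0.jar
jar_version hadoop-hdfs-2.5.0.jar

# On the agent itself, something along these lines (the lib path is a guess
# for my install, adjust as needed):
#   for j in /usr/lib/flume-ng/lib/hadoop-*.jar; do
#     echo "$j -> $(jar_version "$j")"
#   done
```

If any of the printed versions disagree, that would point at a mixed set of Hadoop jars rather than a problem on the Hadoop node.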
Is there something misconfigured on my Hadoop node? Thanks.

On Sep 30, 2014, at 2:51 PM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:

> You actually need to add all of Hadoop’s dependencies to the Flume classpath.
> Looks like Apache Commons Configuration is missing from the classpath.
>
> Thanks,
> Hari
>
>
> On Tue, Sep 30, 2014 at 11:48 AM, Ed Judge <ejud...@gmail.com> wrote:
>
> Thank you. I am using Hadoop 2.5, which I think uses protobuf-java-2.5.0.jar.
> I am getting the following error even after adding those 2 jar files to my
> flume-ng classpath:
>
> 30 Sep 2014 18:27:03,269 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) - Configuration provider starting
> 30 Sep 2014 18:27:03,278 INFO  [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) - Reloading configuration file:./src.conf
> 30 Sep 2014 18:27:03,288 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,289 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:930) - Added sinks: k1 Agent: a1
> 30 Sep 2014 18:27:03,289 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,292 WARN  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.<init>:101) - Configuration property ignored: i# = Describe the sink
> 30 Sep 2014 18:27:03,292 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,292 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,293 INFO  [conf-file-poller-0]
> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,293 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,293 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,293 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,293 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
> 30 Sep 2014 18:27:03,312 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - Post-validation flume configuration contains configuration for agents: [a1]
> 30 Sep 2014 18:27:03,312 INFO  [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - Creating channels
> 30 Sep 2014 18:27:03,329 INFO  [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating instance of channel c1 type memory
> 30 Sep 2014 18:27:03,351 INFO  [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - Created channel c1
> 30 Sep 2014 18:27:03,352 INFO  [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating instance of source r1, type org.apache.flume.source.twitter.TwitterSource
> 30 Sep 2014 18:27:03,363 INFO  [conf-file-poller-0] (org.apache.flume.source.twitter.TwitterSource.configure:110) - Consumer Key: 'tobhMtidckJoe1tByXDmI4pW3'
> 30 Sep 2014 18:27:03,363 INFO  [conf-file-poller-0] (org.apache.flume.source.twitter.TwitterSource.configure:111) - Consumer Secret: '6eZKRpd6JvGT3Dg9jtd9fG9UMEhBzGxoLhLUGP1dqzkKznrXuQ'
> 30 Sep 2014 18:27:03,363 INFO  [conf-file-poller-0]
> (org.apache.flume.source.twitter.TwitterSource.configure:112) - Access Token: '1588514408-o36mOSbXYCVacQ3p6Knsf6Kho17iCwNYLZyA9V5'
> 30 Sep 2014 18:27:03,364 INFO  [conf-file-poller-0] (org.apache.flume.source.twitter.TwitterSource.configure:113) - Access Token Secret: 'vBtp7wKsi2BOQqZSBpSBQSgZcc93oHea38T9OdckDCLKn'
> 30 Sep 2014 18:27:03,825 INFO  [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance of sink: k1, type: hdfs
> 30 Sep 2014 18:27:03,874 ERROR [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:145) - Failed to start agent because dependencies were not found in classpath. Error follows.
> java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
> 	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
> 	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
> 	at org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:106)
> 	at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:208)
> 	at org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:553)
> 	at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:272)
> 	at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
> 	at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:418)
> 	at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103)
> 	at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> 	at
> 	java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.Configuration
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> 	... 17 more
> 30 Sep 2014 18:27:33,491 INFO  [agent-shutdown-hook] (org.apache.flume.lifecycle.LifecycleSupervisor.stop:79) - Stopping lifecycle supervisor 10
> 30 Sep 2014 18:27:33,493 INFO  [agent-shutdown-hook] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop:83) - Configuration provider stopping
> [vagrant@localhost 6]$
>
> Is there another jar file I need?
>
> Thanks.
>
> On Sep 29, 2014, at 9:04 PM, shengyi.pan <shengyi....@gmail.com> wrote:
>
>> You need hadoop-common-x.x.x.jar and hadoop-hdfs-x.x.x.jar under your
>> flume-ng classpath, and the dependent Hadoop jar versions must match your
>> Hadoop system.
>>
>> If sinking to hadoop-2.0.0, you should use "protobuf-java-2.4.1.jar"
>> (by default, flume-1.5.0 uses "protobuf-java-2.5.0.jar"; the jar file is
>> under the Flume lib directory), because the pb interface of hdfs-2.0 is
>> compiled with protobuf-2.4, and with protobuf-2.5 the flume-ng agent will
>> fail to start....
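Just to make sure I'm reading the protobuf advice right, here it is paraphrased as a tiny shell helper. This is only my restatement of the rule above (hadoop-2.0.x wants protobuf-java-2.4.1.jar, otherwise Flume 1.5.0's bundled 2.5.0 is fine); `needed_protobuf` is a made-up name, not anything shipped with Flume:

```shell
# My paraphrase of the advice above: which protobuf jar should sit on the
# flume-ng classpath for a given Hadoop target version.
needed_protobuf() {
  case "$1" in
    2.0.*) echo "protobuf-java-2.4.1.jar" ;;  # hdfs-2.0 pb interface built with protobuf-2.4
    *)     echo "protobuf-java-2.5.0.jar" ;;  # flume-1.5.0 default, bundled in its lib dir
  esac
}

needed_protobuf 2.0.0
needed_protobuf 2.5.0
```

So for my Hadoop 2.5 target the bundled jar should already be the right one, if I've understood correctly.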
>>
>> 2014-09-30
>> shengyi.pan
>>
>> From: Ed Judge <ejud...@gmail.com>
>> Date: 2014-09-29 22:38
>> Subject: HDFS sink to a remote HDFS node
>> To: "user@flume.apache.org" <user@flume.apache.org>
>> Cc:
>>
>> I am trying to run the flume-ng agent on one node with an HDFS sink pointing
>> to an HDFS filesystem on another node.
>> Is this possible? What packages/jar files are needed on the flume agent
>> node for this to work? A secondary goal is to install only what is needed on
>> the flume-ng node.
>>
>> # Describe the sink
>> a1.sinks.k1.type = hdfs
>> a1.sinks.k1.hdfs.path = hdfs://<remote IP address>/tmp/
>>
>>
>> Thanks,
>> Ed
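For anyone landing on this thread later: the fuller agent config implied by the logs earlier in the thread would look roughly like the sketch below. The component names (a1, r1, c1, k1) and types come from the log output; the keys and IP are placeholders, and treat the exact property set as a minimal guess rather than a verified working config.

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Twitter source (type taken from the "Creating instance of source r1" log line)
a1.sources.r1.type = org.apache.flume.source.twitter.TwitterSource
a1.sources.r1.consumerKey = <consumer key>
a1.sources.r1.consumerSecret = <consumer secret>
a1.sources.r1.accessToken = <access token>
a1.sources.r1.accessTokenSecret = <access token secret>
a1.sources.r1.channels = c1

# Memory channel (type taken from the "channel c1 type memory" log line)
a1.channels.c1.type = memory

# HDFS sink pointing at the remote namenode
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://<remote IP address>/tmp/
a1.sinks.k1.channel = c1
```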