Error when streaming Twitter data into Hadoop using Flume

kcugc4gi, posted on 2021-05-30 in Hadoop

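I am using hadoop-1.2.1 on Ubuntu 14.04.
I am trying to stream data from Twitter into HDFS with flume-1.6.0. I downloaded flume-sources-1.0-SNAPSHOT.jar and placed it in the flume/lib folder. In conf/flume-env.sh I set FLUME_CLASSPATH to the path of flume-sources-1.0-SNAPSHOT.jar. Here is my Flume agent configuration file: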


# setting properties of agent

Twitter-agent.sources=source1
Twitter-agent.channels=channel1
Twitter-agent.sinks=sink1

# configuring sources

Twitter-agent.sources.source1.type=com.cloudera.flume.source.TwitterSource
Twitter-agent.sources.source1.channels=channel1
Twitter-agent.sources.source1.consumerKey=<consumer-key>
Twitter-agent.sources.source1.consumerSecret=<consumer-secret>
Twitter-agent.sources.source1.accessToken=<access-token>
Twitter-agent.sources.source1.accessTokenSecret=<access-token-secret>
Twitter-agent.sources.source1.keywords= morning, night, hadoop, bigdata

# configuring channels

Twitter-agent.channels.channel1.type=memory
Twitter-agent.channels.channel1.capacity=10000
Twitter-agent.channels.channel1.transactionCapacity=100

# configuring sinks

Twitter-agent.sinks.sink1.channel=channel1
Twitter-agent.sinks.sink1.type=hdfs
Twitter-agent.sinks.sink1.hdfs.path=flume/twitter/logs
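# the HDFS sink reads roll, batch, and format settings under the hdfs. prefix
Twitter-agent.sinks.sink1.hdfs.rollSize=0
Twitter-agent.sinks.sink1.hdfs.rollCount=1000
Twitter-agent.sinks.sink1.hdfs.batchSize=100
Twitter-agent.sinks.sink1.hdfs.fileType=DataStream
Twitter-agent.sinks.sink1.hdfs.writeFormat=Text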

When I run this agent, I get the following error:

15/06/22 14:14:49 INFO source.DefaultSourceFactory: Creating instance of source source1, type com.cloudera.flume.source.TwitterSource
15/06/22 14:14:49 ERROR node.PollingPropertiesFileConfigurationProvider: Unhandled error
java.lang.NoSuchMethodError: twitter4j.conf.Configuration.isStallWarningsEnabled()Z
	at twitter4j.TwitterStreamImpl.<init>(TwitterStreamImpl.java:60)
	at twitter4j.TwitterStreamFactory.<clinit>(TwitterStreamFactory.java:40)
	at com.cloudera.flume.source.TwitterSource.<init>(TwitterSource.java:64)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at java.lang.Class.newInstance(Class.java:442)
	at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:44)
	at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:322)
	at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:97)
	at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

My flume/lib folder already contains twitter4j-core-3.0.3.jar. How can I fix this error?

pod7payv #1

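I found a solution to this problem. flume-sources-1.0-SNAPSHOT.jar and twitter4j-stream-3.0.3.jar both contain the same FilterQuery.class, which produces a jar conflict. A NoSuchMethodError like the one in the question typically means that classes compiled against one version of a library are running against another. Every twitter4j 3.x.x release ships this class, so the better option is to download version 2.2.6 of the twitter4j jars (twitter4j-core, twitter4j-stream, twitter4j-media-support) and use them to replace the 3.x.x jars in the flume/lib directory.
Run the agent again and the Twitter data will stream into HDFS.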
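
One way to check for this kind of conflict before swapping jars is to list which class files appear in more than one jar under flume/lib. The following is only a rough sketch: the function name find_duplicate_classes and the default twitter4j/ prefix are made up for illustration, and it only inspects jar contents, not the order in which Flume actually loads them.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_classes(lib_dir, package_prefix="twitter4j/"):
    """Return {class file: [jar names]} for classes found in more than one jar.

    Hypothetical helper: only .class entries whose path starts with
    package_prefix are considered.
    """
    owners = defaultdict(list)
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = set(zf.namelist())
        except zipfile.BadZipFile:
            continue  # skip files that are not readable zip/jar archives
        for name in names:
            if name.startswith(package_prefix) and name.endswith(".class"):
                owners[name].append(jar.name)
    # keep only the classes that more than one jar provides
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

Calling find_duplicate_classes("/path/to/flume/lib") and printing the result should show whether twitter4j/FilterQuery.class (or any other twitter4j class) is provided by both the Cloudera sources jar and a twitter4j jar; the path here is a placeholder.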

tnkciper #2

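Change

Twitter-agent.sources.source1.type=com.cloudera.flume.source.TwitterSource

to

TwitterAgent.sources.Twitter.type=org.apache.flume.source.twitter.TwitterSource

(With the agent and source names from the question, the key stays Twitter-agent.sources.source1.type; only the class name on the right-hand side changes.)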
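
To confirm which source class an agent is actually configured with after an edit like this, the .type keys can be read back out of the properties file. This is a minimal sketch, assuming plain key=value lines and # comments only; source_types is a made-up helper name, not part of Flume.

```python
def source_types(config_text):
    """Return {(agent, source): class name} for each '<agent>.sources.<name>.type' key.

    Hypothetical helper: handles only simple 'key=value' lines and '#' comments,
    not the full Java properties syntax.
    """
    types = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        # expect exactly: agent, "sources", source name, "type"
        if len(parts) == 4 and parts[1] == "sources" and parts[3] == "type":
            types[(parts[0], parts[2])] = value
    return types
```

Passing in the text of the agent's configuration file returns one entry per source, so a leftover com.cloudera.flume.source.TwitterSource value is easy to spot.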
