apacheflume将数据流到hdfs中

avwztpqn  于 2021-05-27  发布在  Hadoop
关注(0)|答案(0)|浏览(301)

目前,我正在使用apacheflume获取twitter数据,并希望将数据放入hadoop hdfs中。下面是我的配置为推特抓取


# Naming the components on the current agent.

TwitterAgent.sources = Twitter 
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Describing/Configuring the source

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = 
TwitterAgent.sources.Twitter.consumerSecret = 
TwitterAgent.sources.Twitter.accessToken = 
TwitterAgent.sources.Twitter.accessTokenSecret = 
TwitterAgent.sources.Twitter.keywords = hadoop

# Describing/Configuring the sink

# TwitterAgent.sinks.LoggerSink.type = logger

# Describing/Configuring the sink

TwitterAgent.sinks.HDFS.type = hdfs 
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:9000/user/hadoop/twitter_data/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream 
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text 
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0 
TwitterAgent.sinks.HDFS.hdfs.rollCount = 100000 

# Describing/Configuring the channel

TwitterAgent.channels.MemChannel.type = memory 
TwitterAgent.channels.MemChannel.capacity = 100000 
TwitterAgent.channels.MemChannel.transactionCapacity = 10000

# Binding the source and sink to the channel

TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.HDFS.channel = MemChannel

但是当我在flume文件夹中运行下面的脚本来运行apacheflume抓取时

bin/flume-ng agent --conf conf --conf-file conf/twitter.conf --name TwitterAgent -Dflume.root.logger=INFO,console

面对hdfs的错误

2020-09-18 18:13:54,900 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:459)] process failed
java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1361)
    at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1703)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:221)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:572)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:748)
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1361)
    at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1703)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:221)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:572)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:748)

任何人都可以建议,谢谢

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题