Unable to ingest log data into HDFS (Hadoop) with Flume

txu3uszq  posted on 2021-05-30  in  Hadoop

I am using the following configuration to push data from a log file to HDFS.

agent.channels.memory-channel.type = memory
agent.channels.memory-channel.capacity=5000
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt
agent.sources.tail-source.channels = memory-channel
agent.sinks.log-sink.channel = memory-channel
agent.sinks.log-sink.type = logger
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.batchSize=10
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data/log.txt
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.channels = memory-channel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink

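I do not get any error message, but I still cannot find the output in HDFS. When I interrupt the agent, I can see a sink interrupted exception and some of the data from that log file. I am running the following command:

flume-ng agent --conf /etc/flume-ng/conf/ --conf-file /etc/flume-ng/conf/flume.conf -Dflume.root.logger=DEBUG,console -n agent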
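
A few details of the configuration above may be worth ruling out. Both sinks drain the same memory-channel, and Flume delivers each event taken from a channel to only one of the sinks attached to it, so the logger sink can consume events that then never reach HDFS. In addition, batchSize is written without the hdfs. prefix that the HDFS sink expects, and hdfs.path names a directory rather than a file, so log.txt would be created as a directory. The following is a minimal sketch of one way to address these points, not a fix confirmed in this thread; the channel names hdfs-channel and log-channel are hypothetical.

```properties
# Sketch only: hdfs-channel and log-channel are hypothetical names.
agent.sources = tail-source
agent.channels = hdfs-channel log-channel
agent.sinks = hdfs-sink log-sink

agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt
# replicating is the default selector; it copies every event to each listed channel
agent.sources.tail-source.selector.type = replicating
agent.sources.tail-source.channels = hdfs-channel log-channel

# One channel per sink, so the two sinks no longer compete for events
agent.channels.hdfs-channel.type = memory
agent.channels.hdfs-channel.capacity = 5000
agent.channels.log-channel.type = memory
agent.channels.log-channel.capacity = 5000

agent.sinks.log-sink.type = logger
agent.sinks.log-sink.channel = log-channel

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = hdfs-channel
# hdfs.path is a directory; Flume creates the files inside it
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
# HDFS sink settings carry the hdfs. prefix
agent.sinks.hdfs-sink.hdfs.batchSize = 10
```

If the console logger is only needed for debugging, dropping log-sink entirely and keeping a single channel would be a simpler variant of the same idea.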


bf1o4zei1#

I had a similar problem.
In my case it is working now. Here is the conf file:


# Exec Source

execAgent.sources=e
execAgent.channels=memchannel
execAgent.sinks=HDFS

# channels

execAgent.channels.memchannel.type=file
execAgent.channels.memchannel.capacity = 20000
execAgent.channels.memchannel.transactionCapacity = 1000

# Define Source

execAgent.sources.e.type=org.apache.flume.source.ExecSource
execAgent.sources.e.channels=memchannel
execAgent.sources.e.shell=/bin/bash -c
execAgent.sources.e.fileHeader=false
execAgent.sources.e.fileSuffix=.txt
execAgent.sources.e.command=cat /home/sample.txt

# Define Sink

execAgent.sinks.HDFS.type=hdfs
execAgent.sinks.HDFS.hdfs.path=hdfs://localhost:8020/user/flume/
execAgent.sinks.HDFS.hdfs.fileType=DataStream
execAgent.sinks.HDFS.hdfs.writeFormat=Text
execAgent.sinks.HDFS.hdfs.batchSize=1000
execAgent.sinks.HDFS.hdfs.rollSize=268435
execAgent.sinks.HDFS.hdfs.rollInterval=0

# Bind Source Sink Channel

execAgent.sources.e.channels=memchannel
execAgent.sinks.HDFS.channel=memchannel

I hope this helps.
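
Note that the channel in the configuration above is named memchannel but its type is file, so events are buffered on disk rather than in memory. As far as the Flume user guide documents them, fileHeader and fileSuffix are spooling-directory source settings and appear to have no effect on an exec source. A minimal sketch of the channel section with the file channel's storage locations made explicit follows; the two directories are hypothetical, and when they are omitted Flume falls back to ~/.flume/file-channel/checkpoint and ~/.flume/file-channel/data.

```properties
# Sketch only: the /var/flume/... directories are hypothetical and must be
# writable by the user running the agent.
execAgent.channels = memchannel
execAgent.channels.memchannel.type = file
execAgent.channels.memchannel.checkpointDir = /var/flume/checkpoint
execAgent.channels.memchannel.dataDirs = /var/flume/data
execAgent.channels.memchannel.capacity = 20000
# Must be at least as large as the sink's hdfs.batchSize (1000 above)
execAgent.channels.memchannel.transactionCapacity = 1000
```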


3b6akqbq2#

I suggest using the prefix setting when placing files in HDFS:
agent.sinks.hdfs-sink.hdfs.filePrefix = logout
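
To show how the prefix fits with the rest of the HDFS sink settings, here is a minimal sketch based on the sink from the question. The prefix value log, the .txt suffix, and the 60-second roll interval are illustrative assumptions, not values taken from the thread.

```properties
# Sketch only: prefix, suffix and roll interval are illustrative choices.
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = memory-channel
# hdfs.path is the target directory; file names are built from prefix and suffix
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data
agent.sinks.hdfs-sink.hdfs.filePrefix = log
agent.sinks.hdfs-sink.hdfs.fileSuffix = .txt
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
# Close the current file and start a new one every 60 seconds;
# 0 disables rolling by size and by event count
agent.sinks.hdfs-sink.hdfs.rollInterval = 60
agent.sinks.hdfs-sink.hdfs.rollSize = 0
agent.sinks.hdfs-sink.hdfs.rollCount = 0
```

With these settings the files under /user/flume/data should be named log.<counter>.txt, where the counter is generated by Flume. A file that is still being written carries the default in-use suffix .tmp until it is rolled, which is worth keeping in mind when looking for output.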


byqmnocz3#

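@bhavesh - Are you sure that the log file (agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt) is continuously being appended to? Because you are using the tail command with -F, only data newly written to the file will be dumped into HDFS.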
