How do I use Flume to transfer logs from a CentOS server to a Hadoop server without duplication?

pgky5nke · posted 2021-05-29 in Hadoop
Follow (0) | Answers (0) | Views (302)

I need some help with a Flume configuration. I have a CentOS server on which Django logs are generated, and I want to read those logs and ship them to another server, so I have two configurations, one per server. In the sending configuration I use an exec source with the command `tail -f`. It does transfer the logs from the CentOS server to the Hadoop server successfully, but each log line that appears once on CentOS shows up twice on the Hadoop server. Can anyone help me with this? What am I doing wrong? Thank you.
Configuration on the CentOS server:


# configure the agent

    agent.sources = r1
    agent.channels = k1
    agent.sinks = c1

# using memory channel to hold up to 1000 events

    agent.channels.k1.type = memory
    agent.channels.k1.capacity = 1000
    agent.channels.k1.transactionCapacity = 100

# connect source, channel, sink

    agent.sources.r1.channels = k1
    agent.sinks.c1.channel = k1

# tail the file

    agent.sources.r1.type = exec
    agent.sources.r1.command = tail -f  /home/bluedata/mysite/log/debug.log

# connect to another box using AVRO and send the data

    agent.sinks.c1.type = avro
    agent.sinks.c1.hostname = x.x.x.x
                        #NOTE: use your Server 2's IP address here
    agent.sinks.c1.port = 9049
                        #NOTE: This port should be open on Server 2
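
For reference, this sending agent would be started with the standard `flume-ng` launcher. A minimal sketch, assuming the configuration above is saved as `sender.conf` (file name and paths are placeholders); `--name agent` must match the `agent.` prefix used in the properties:

    # start the sending agent on the CentOS server (paths are placeholders)
    flume-ng agent \
        --conf /path/to/flume/conf \
        --conf-file /path/to/sender.conf \
        --name agent \
        -Dflume.root.logger=INFO,console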

Configuration on the Hadoop server:


# THIS ONE WRITES TO A FILE

# configure the agent

   agent.sources = r1
   agent.channels = k1
   agent.sinks = c1

# using memory channel to hold up to 1000 events

   agent.channels.k1.type = memory
   agent.channels.k1.capacity = 1000
   agent.channels.k1.transactionCapacity = 100

# connect source, channel, sink

   agent.sources.r1.channels = k1
   agent.sinks.c1.channel = k1

# here source is listening at the specified port using AVRO for data

   agent.sources.r1.type = avro
   agent.sources.r1.bind = 0.0.0.0
   agent.sources.r1.port = 9049

# this is what’s different.

# We use file_roll and write files to the specified directory.

   agent.sinks.c1.type = file_roll
   agent.sinks.c1.sink.directory = /bdaas/debug
                                #Note: Change this path to your server path
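
Similarly, a minimal sketch of launching the receiving agent, assuming the configuration above is saved as `receiver.conf` (placeholder name). Since the Avro sink on Server 1 connects to the Avro source on this box, the receiving agent is normally started first so that port 9049 is already listening:

    # start the receiving agent on the Hadoop server (paths are placeholders)
    flume-ng agent \
        --conf /path/to/flume/conf \
        --conf-file /path/to/receiver.conf \
        --name agent \
        -Dflume.root.logger=INFO,console

Note that with the configuration above the file_roll sink rolls to a new file in /bdaas/debug every 30 seconds (the default `sink.rollInterval`).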

Logs on the CentOS server:

"GET / HTTP/1.1" 200 5533 
"GET /contact/ HTTP/1.1" 200 1833    
"GET /blog/ HTTP/1.1" 200 1909

Logs on the Hadoop server: each line is duplicated twice.

"GET / HTTP/1.1" 200 5533
"GET / HTTP/1.1" 200 5533
"GET /contact/ HTTP/1.1" 200 1833
"GET /contact/ HTTP/1.1" 200 1833
"GET /blog/ HTTP/1.1" 200 1909
"GET /blog/ HTTP/1.1" 200 1909

No answers yet!

