I am running a Flume agent with the following custom configuration:
# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1
# Describe the source
agent.sources.r1.type = org.apache.flume.source.AvroSource
agent.sources.r1.bind = 192.168.1.31
agent.sources.r1.port = 43999
# Describe the sink
agent.sinks.k1.type = com.zaloni.bedrock.collection.flume.sink.BedrockAvroHDFSEventSink
agent.sinks.k1.hdfs.path = /user/bedrock/sentimentAnalysis/TweetData
agent.sinks.k1.hdfs.rollInterval = 300
agent.sinks.k1.hdfs.rollSize = 1000
agent.sinks.k1.hdfs.rollCount = 100
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.writeFormat = Text
# Describe the channel
agent.channels.c1.type = org.apache.flume.channel.MemoryChannel
# bind the source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
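With this configuration, I send streaming data from a Java program to the Avro source. When the Flume agent writes the output to HDFS, it appends an extra '\n' character to the end of every line.
Here is a sample of the output: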
@VermaAmrutaRT @AnjneyaParashar: IBM Watson can now transcribe speech perfectly #ibm #watson #transcription http://t.co/pm5iyLXOOe06-17-2015 13:35:00 +0530 #IBM1
@ThomasLaceyEire#IBM @IBM_DS_Europe https://t.co/c3ybimNkc606-17-2015 13:35:00 +0530#CSCO1
@INQRT @IBMPowerSystems: #IBM and @OpenPOWERorg encouraging #OpenSource all around the world: http://t.co/duyPrzaZL6 via @ChrisTheDJ @INQ06-17-2015 13:35:00 +0530 #IBM1
In the output above, each extra blank line holds a '\n' character.
Question: why am I getting the extra '\n' character, and what are the possible fixes?
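One possible cause, offered as an assumption and not a confirmed diagnosis: the records sent by the Java program may already end with a line break, and the sink may then add its own newline after each event, as the stock Flume text serializer does by default. Whether the custom BedrockAvroHDFSEventSink behaves the same way is not known from the post. Under that assumption, a minimal sketch of a client-side fix is to strip trailing line breaks before the record becomes the event body. The class and method names below are hypothetical and the sample record is a placeholder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: not part of Flume or of the original client code.
class EventBodyCleaner {

    /** Removes every trailing '\n' or '\r' from a record, leaving inner text untouched. */
    static String stripTrailingLineBreaks(String record) {
        int end = record.length();
        while (end > 0) {
            char last = record.charAt(end - 1);
            if (last != '\n' && last != '\r') {
                break;
            }
            end--;
        }
        return record.substring(0, end);
    }

    /** Builds the UTF-8 payload that would be used as the Flume event body. */
    static byte[] toEventBody(String record) {
        return stripTrailingLineBreaks(record).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder record that ends with a line break, as assumed above.
        String record = "sample record\n";
        byte[] body = toEventBody(record);
        boolean endsWithNewline = body.length > 0 && body[body.length - 1] == '\n';
        System.out.println("body ends with newline: " + endsWithNewline);
    }
}
```

The cleaned bytes would then be passed to whatever call the client already uses to build its events. If the sink does use the standard text serializer, the Flume user guide also documents a serializer option named `appendNewline` that could be set to false instead; whether the custom sink honors it is an open question.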