我目前在flume中有以下配置:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'TwitterAgent'
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = YPTxqtRamIZ1bnJXYwGW
TwitterAgent.sources.Twitter.consumerSecret = Wjyw9714OBzao7dktH0csuTByk4iLG9Zu4ddtI6s0ho
TwitterAgent.sources.Twitter.accessToken = 2340010790-KhWiNLt63GuZ6QZNYuPMJtaMVjLFpiMP4A2v
TwitterAgent.sources.Twitter.accessTokenSecret = x1pVVuyxfvaTbPoKvXqh2r5xUA6tf9einoByLIL8rar
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://hadoop1:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
twitter应用程序验证密钥正确。我在flume日志文件中不断发现这个错误:
ERROR org.apache.flume.SinkRunner
Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop1
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:446)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop1
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2310)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2344)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2326)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:353)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:227)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:221)
at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:589)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:161)
at org.apache.flume.sink.hdfs.BucketWriter.access$800(BucketWriter.java:57)
at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:586)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
... 1 more
Caused by: java.net.UnknownHostException: hadoop1
... 23 more
这里有人知道原因并能向我解释吗?提前谢谢。
1条答案
按热度按时间ibps3vxo1#
根据异常,问题是主机hadoop1未知。
根据flume配置文件,您给出的路径是
应该可以通过Flume代理从机器上访问。由于机器名不能用于访问不在同一域中的hdfs,因此需要使用中设置的ip地址访问hdfs
core-site.xml