Storm: storm-hdfs HdfsBolt fails after 24 hours

Posted by mpbci0fu on 2021-06-02 in Hadoop

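My Storm topology reads from Kafka and writes to Hadoop HDFS, but it fails after 24 hours!
I suspect that the topology cannot renew the token, or cannot find the keytab to renew it with. Please share your thoughts and help me resolve this.
Below is the code used to configure the HDFS bolt.
Config object: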

//building a 'map' with hdfs related configuration for key tab
Map<String, Object> hdfsSecConfigMap = new HashMap<String, Object>();
hdfsSecConfigMap.put("hdfs.keytab.file", ktPath);
hdfsSecConfigMap.put("hdfs.kerberos.principal", ktPrincipal);

//building a 'map' with hbase related configuration
Map<String, Object> hbaseConfigMap = new HashMap<String, Object>();
hbaseConfigMap.put("hbase.rootdir", hbaseRootDir);
hbaseConfigMap.put("storm.keytab.file", ktPath);
hbaseConfigMap.put("storm.kerberos.principal", ktPrincipal);

Config configured = new Config();
configured.setDebug(true);
configured.put(hdfsConfKey, hdfsSecConfigMap);
configured.put(hbaseConfKey, hbaseConfigMap);
configured.setNumWorkers(2);
configured.setMaxSpoutPending(300);
configured.setNumAckers(30);
configured.setMessageTimeoutSecs(1200);

configured.put(HdfsSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HdfsSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);

configured.put(HBaseSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HBaseSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);

HDFS bolt setup:

HdfsBolt hdfsbolt = new HdfsBolt()
        .withFsUrl(hdfsuri)
        .withRecordFormat(recFormat)
        .withFileNameFormat(fileNameWithPath)
        .withRotationPolicy(fileRotationSize)
        .withSyncPolicy(syncPolicy)
        .withConfigKey(secBypassConfigKey);

The topology builder is set up as follows:

builder.setBolt("hdfsBolt", avroHDFSBolt, 1)
        .setNumTasks(1)
        .shuffleGrouping("kafka-spout");

The exception is as follows:

java.io.IOException: IOException flush:java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "**********"; destination host is: "***************":8020;
        at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2082) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.DFSOutputStream.hsync(DFSOutputStream.java:1969) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.client.HdfsDataOutputStream.hsync(HdfsDataOutputStream.java:95) ~[stormjar.jar:?]
        at org.apache.storm.hdfs.bolt.HdfsBolt.execute(HdfsBolt.java:100) [stormjar.jar:?]
        at backtype.storm.daemon.executor$fn__3697$tuple_action_fn__3699.invoke(executor.clj:670) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$mk_task_receiver$fn__3620.invoke(executor.clj:426) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$clojure_handler$reify__3196.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$fn__3697$fn__3710$fn__3761.invoke(executor.clj:808) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.util$async_loop$fn__544.invoke(util.clj:475) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_73]
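
The trace shows the "Failed to find any Kerberos tgt" error being raised inside HdfsBolt.execute, that is, in the worker process. One thing worth ruling out is whether the keytab path passed in the config is actually present and readable on the worker host. The following is a minimal sketch using only the Java standard library. The class KeytabCheck and its method isUsable are hypothetical helpers, not part of Storm or Hadoop, and a readable file does not prove that the keytab holds valid keys for the principal.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: checks that a keytab path is usable on this host.
class KeytabCheck {

    /** Returns true if the path names an existing, readable, non-empty regular file. */
    static boolean isUsable(String keytabPath) {
        if (keytabPath == null || keytabPath.isEmpty()) {
            return false;
        }
        try {
            Path p = Paths.get(keytabPath);
            return Files.isRegularFile(p) && Files.isReadable(p) && Files.size(p) > 0;
        } catch (IOException | InvalidPathException e) {
            // Treat any failure to inspect the file as "not usable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the same value that the topology uses for ktPath.
        String ktPath = args.length > 0 ? args[0] : "";
        System.out.println("keytab usable: " + isUsable(ktPath));
    }
}
```

Running this as the same OS user that runs the Storm workers, on each supervisor host, would show whether the file is missing or unreadable there.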
Answer #1, by wecizke3

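I was able to resolve this by rebuilding the application against the same Hadoop version that is used in the Hadoop cluster.
The problem was caused by a version mismatch, and it was fixed after rebuilding with the correct version!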
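
To spot such a mismatch, it may help to print the Hadoop client version that is actually bundled in the topology jar and compare it with the output of `hadoop version` on the cluster. The sketch below looks up Hadoop's `org.apache.hadoop.util.VersionInfo.getVersion()` through reflection, so it compiles without Hadoop on the classpath. The class HadoopVersionProbe and its method names are hypothetical and used only for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;

// Hypothetical helper: reports the Hadoop client version found on the classpath.
class HadoopVersionProbe {

    /** Calls a public static no-arg method by name; empty if it cannot be called. */
    static Optional<String> staticStringCall(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className);
            Method m = c.getMethod(methodName);
            if (!Modifier.isStatic(m.getModifiers())) {
                return Optional.empty();
            }
            Object value = m.invoke(null);
            return Optional.ofNullable(value).map(Object::toString);
        } catch (ReflectiveOperationException | LinkageError e) {
            // Class or method not present, or it failed to load or run.
            return Optional.empty();
        }
    }

    /** Version of the Hadoop client libraries on the classpath, if any. */
    static Optional<String> clientHadoopVersion() {
        return staticStringCall("org.apache.hadoop.util.VersionInfo", "getVersion");
    }

    public static void main(String[] args) {
        System.out.println("Hadoop client version on classpath: "
                + clientHadoopVersion().orElse("not found"));
    }
}
```

Run it with the topology jar on the classpath, so that it reports the version the bolt would load at runtime.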
