Zookeeper CuratorFrameworkImpl: Background exception was not retry-able or retry gave up

iq0todco  posted on 2022-12-09 in Apache

Curator framework version: 4.3.0, Zookeeper version: 5.5.0

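We have deployed Apache Atlas on Kubernetes. It uses Zookeeper to elect one of the two Atlas instances as leader. We run three Zookeeper pods (a 3-node cluster), so losing one pod should not cause any problem. When one Zookeeper pod goes down, the Zookeeper cluster is still healthy and a Zookeeper leader is available. I verified this by exec-ing into one of the Zookeeper pods and checking the Zookeeper status. However, the Curator framework throws the following error: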

[main:] ~ Background exception was not retry-able or retry gave up (CuratorFrameworkImpl:685)
java.net.UnknownHostException: zookeeper-2.zookeeper-headless.atlas.svc.cluster.local: Name or service not known
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
    at java.net.InetAddress.getAllByName(InetAddress.java:1193)
    at java.net.InetAddress.getAllByName(InetAddress.java:1127)
    at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
    at org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:196)
    at org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:101)
    at org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:57)
    at org.apache.curator.ConnectionState.reset(ConnectionState.java:201)
    at org.apache.curator.ConnectionState.start(ConnectionState.java:111)
    at org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:214)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:314)
    at org.apache.atlas.web.service.CuratorFactory.initializeCuratorFramework(CuratorFactory.java:88)
    at org.apache.atlas.web.service.CuratorFactory.<init>(CuratorFactory.java:78)
    at org.apache.atlas.web.service.CuratorFactory.<init>(CuratorFactory.java:73)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:142)
    at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:89)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateBean(AbstractAutowireCapableBeanFactory.java:1152)

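zookeeperConnectionString="zookeeper-0.zookeeper-headless.atlas.svc.cluster.local:2181,zookeeper-1.zookeeper-headless.atlas.svc.cluster.local:2181,zookeeper-2.zookeeper-headless.atlas.svc.cluster.local:2181"

The problem we are facing is that when we call leaderLatch.start(), it does not return any error, but the corresponding znode is not created in Zookeeper.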

2cmtqfgy  #1

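You see this error because, on Kubernetes, when a pod restarts its DNS record is also removed for a short time, until the pod is up again. In your case this should not be a problem, because Curator will connect to another ZK server in your connection string.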
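
To confirm that the failure is a DNS lookup and not a quorum problem, you can check from inside the Atlas pod which hosts in the connection string currently resolve. The stack trace shows `StaticHostProvider.<init>` calling `InetAddress.getAllByName`, so the sketch below makes the same lookup for each host. `ZkHostCheck` and its methods are hypothetical helpers written for this illustration; they are not part of Curator, Zookeeper, or Atlas. It assumes plain `host:port` entries (no IPv6 literals).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical diagnostic helper; not part of Curator, Zookeeper, or Atlas.
public class ZkHostCheck {

    // Extracts host names from "host1:port,host2:port[/chroot]".
    static List<String> hosts(String connectionString) {
        int chroot = connectionString.indexOf('/');
        String servers = chroot >= 0 ? connectionString.substring(0, chroot) : connectionString;
        List<String> result = new ArrayList<>();
        for (String server : servers.split(",")) {
            String s = server.trim();
            if (s.isEmpty()) {
                continue;
            }
            int colon = s.lastIndexOf(':');
            result.add(colon >= 0 ? s.substring(0, colon) : s);
        }
        return result;
    }

    // Returns the hosts for which the given resolver reports failure.
    static List<String> unresolvable(String connectionString, Predicate<String> resolves) {
        List<String> missing = new ArrayList<>();
        for (String host : hosts(connectionString)) {
            if (!resolves.test(host)) {
                missing.add(host);
            }
        }
        return missing;
    }

    // Real DNS lookup, the same JDK call that appears in the stack trace.
    static boolean resolvesViaDns(String host) {
        try {
            InetAddress.getAllByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cs = "zookeeper-0.zookeeper-headless.atlas.svc.cluster.local:2181,"
                + "zookeeper-1.zookeeper-headless.atlas.svc.cluster.local:2181,"
                + "zookeeper-2.zookeeper-headless.atlas.svc.cluster.local:2181";
        System.out.println("Unresolvable hosts: " + unresolvable(cs, ZkHostCheck::resolvesViaDns));
    }
}
```

If only the restarting pod's host shows up as unresolvable, and only while that pod is down, that matches the explanation above. The resolver is passed in as a `Predicate` so the parsing logic can be exercised without a live cluster.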
