Kafka pod does not come back up after pod deletion when using NFS

wtlkbnrh asked on 2021-06-07 in Kafka

We are trying to run a Kafka cluster on Kubernetes using the nfs provisioner. The cluster comes up fine. However, when we kill one of the Kafka pods, the replacement pod never comes up.
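For context, a check along the following lines can confirm the brokers and their claims are healthy before the deletion (the PVC name below is an assumption based on the usual <volume>-<pod-name> naming and is illustrative only; check kubectl get pvc for the real names):

kubectl get pods | grep kafka-iced-unicorn
kubectl get pvc | grep kafka-iced-unicorn
kubectl describe pvc datadir-kafka-iced-unicorn-1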
The persistent volume before the pod is deleted:


# mount

10.102.32.184:/export/pvc-ce1461b3-1b38-11e8-a88e-005056073f99 on /opt/kafka/data type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.133.40.245,local_lock=none,addr=10.102.32.184)

# ls -al /opt/kafka/data/logs

total 4
drwxr-sr-x 2 99 99 152 Feb 26 21:07 .
drwxrwsrwx 3 99 99  18 Feb 26 21:07 ..
-rw-r--r-- 1 99 99   0 Feb 26 21:07 .lock
-rw-r--r-- 1 99 99   0 Feb 26 21:07 cleaner-offset-checkpoint
-rw-r--r-- 1 99 99  57 Feb 26 21:07 meta.properties
-rw-r--r-- 1 99 99   0 Feb 26 21:07 recovery-point-offset-checkpoint
-rw-r--r-- 1 99 99   0 Feb 26 21:07 replication-offset-checkpoint

# cat /opt/kafka/data/logs/meta.properties
#
#Mon Feb 26 21:07:08 UTC 2018
version=0
broker.id=1003

Deleting the pod:

kubectl delete pod kafka-iced-unicorn-1

The persistent volume re-attached in the newly created pod:


# ls -al /opt/kafka/data/logs

total 4
drwxr-sr-x 2 99 99 180 Feb 26 21:10 .
drwxrwsrwx 3 99 99  18 Feb 26 21:07 ..
-rw-r--r-- 1 99 99   0 Feb 26 21:10 .kafka_cleanshutdown
-rw-r--r-- 1 99 99   0 Feb 26 21:07 .lock
-rw-r--r-- 1 99 99   0 Feb 26 21:07 cleaner-offset-checkpoint
-rw-r--r-- 1 99 99  57 Feb 26 21:07 meta.properties
-rw-r--r-- 1 99 99   0 Feb 26 21:07 recovery-point-offset-checkpoint
-rw-r--r-- 1 99 99   0 Feb 26 21:07 replication-offset-checkpoint

# cat /opt/kafka/data/logs/meta.properties
#
#Mon Feb 26 21:07:08 UTC 2018
version=0
broker.id=1003

We see the following error in the Kafka logs:

[2018-02-26 21:26:40,606] INFO [ThrottledRequestReaper-Produce], Starting (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-02-26 21:26:40,711] FATAL [Kafka Server 1002], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.io.IOException: Invalid argument
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createNewFile(File.java:1012)
    at kafka.utils.FileLock.<init>(FileLock.scala:28)
    at kafka.log.LogManager$$anonfun$lockLogDirs$1.apply(LogManager.scala:104)
    at kafka.log.LogManager$$anonfun$lockLogDirs$1.apply(LogManager.scala:103)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.log.LogManager.lockLogDirs(LogManager.scala:103)
    at kafka.log.LogManager.<init>(LogManager.scala:65)
    at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:648)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:208)
    at io.confluent.support.metrics.SupportedServerStartable.startup(SupportedServerStartable.java:102)
    at io.confluent.support.metrics.SupportedKafka.main(SupportedKafka.java:49)
[2018-02-26 21:26:40,713] INFO [Kafka Server 1002], shutting down (kafka.server.KafkaServer)
[2018-02-26 21:26:40,715] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)

The only way to get around this seems to be to delete the persistent volume claim and then force-delete the pod again. Or to use a storage provisioner other than NFS (Rook works fine in this scenario).
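In practice that workaround looks something like the following (the PVC name is again an assumption based on typical <volume>-<pod-name> naming; verify it with kubectl get pvc first):

kubectl delete pvc datadir-kafka-iced-unicorn-1
kubectl delete pod kafka-iced-unicorn-1 --grace-period=0 --force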
Has anyone else run into this issue with the nfs provisioner?
