**Problem**
I'm trying to deploy a pod, but it fails with an error I can't make sense of. The pod is run through Airflow to execute a specific task. Airflow shows the pod as failed, with no logs at all. When I run `kubectl describe pod my-pod`, I get the output below.
*How should I go about finding the root cause of the problem?*
Failing container section:
```
base:
  Container ID:  <ID>
  Image:         <IMAGE>
  Image ID:      <ID>
  Port:          <none>
  Host Port:     <none>
  Command:
    airflow
    run
    /var/airflow/my_dag_name.py
    task_name
    2023-02-20T23:15:00+00:00
    --local
    --pool
    default_pool
    -sd
    /var/airflow/my_dag_name.py
  State:       Terminated
    Reason:    Error
    Exit Code: 1
    Started:   Mon, 20 Feb 2023 20:55:07 -0600
    Finished:  Mon, 20 Feb 2023 20:55:11 -0600
  Ready:          False
  Restart Count:  0
  Limits:
    cpu:                1
    ephemeral-storage:  100Gi
    memory:             8Gi
  Requests:
    cpu:                500m
    ephemeral-storage:  1Gi
    memory:             8Gi
  Environment:
    <ENV VARS>
  Mounts:
    <VARIOUS MOUNTS>
```
Events section (complete):
```
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---  ----               -------
  Normal  Scheduled  58s  default-scheduler  Successfully assigned <TASK> to <IP>
  Normal  Pulled     58s  kubelet            Container image <SIDECAR IMAGE 1> already present on machine
  Normal  Created    57s  kubelet            Created container <SIDECAR CONTAINER 1>
  Normal  Started    57s  kubelet            Started container <SIDECAR CONTAINER 1>
  Normal  Pulling    54s  kubelet            Pulling image <SIDECAR IMAGE 2>
  Normal  Pulled     53s  kubelet            Successfully pulled image <SIDECAR IMAGE 2> in 125.691281ms
  Normal  Created    53s  kubelet            Created container <SIDECAR CONTAINER 2>
  Normal  Started    53s  kubelet            Started container <SIDECAR CONTAINER 2>
  Normal  Pulled     52s  kubelet            Container image <FAILING POD IMAGE> already present on machine
  Normal  Created    52s  kubelet            Created container <FAILING POD CONTAINER>
  Normal  Started    52s  kubelet            Started container <FAILING POD CONTAINER>
  Normal  Pulled     52s  kubelet            Container image <SIDECAR IMAGE 3> already present on machine
  Normal  Created    52s  kubelet            Created container <SIDECAR CONTAINER 3>
  Normal  Started    52s  kubelet            Started container <SIDECAR CONTAINER 3>
  Normal  Pulled     52s  kubelet            Container image <SIDECAR IMAGE 4> already present on machine
  Normal  Created    52s  kubelet            Created container <SIDECAR CONTAINER 4>
  Normal  Started    51s  kubelet            Started container <SIDECAR CONTAINER 4>
```
**Context**
The pod uses these ephemeral sidecars to connect to systems, inject information, and so on.
1 Answer
In Kubernetes, container exit codes are very useful for diagnosing pod problems. When a pod is unhealthy, `kubectl describe pod <pod-name>` is the command to start with; you have already provided its output, which shows the following:
Since the container terminated with exit code 1, the container and its application need a thorough investigation, because this code is mostly caused by an application error or an invalid reference.
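The termination state in the `describe` output can also be extracted directly. A minimal sketch, assuming the pod is named `my-pod` (replace with your own); the `kubectl` call is guarded so the snippet is harmless where no cluster is configured:

```shell
# Hypothetical pod name -- replace with your own.
POD=my-pod

# JSONPath query that prints name, termination reason, and exit code for
# every container in the pod, instead of scanning the full describe output.
QUERY='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.terminated.reason}{"\t"}{.state.terminated.exitCode}{"\n"}{end}'

if command -v kubectl >/dev/null 2>&1; then
  kubectl get pod "$POD" -o jsonpath="$QUERY"
fi
```

On the pod described above, the `base` row would show `Error` and `1`, while healthy sidecars show empty terminated fields.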
As a first step, as Harsh Manvar suggested, retrieve the logs of the first container in the pod with `kubectl logs <pod-name> -p` to check the logs of the pod in question. The `-p` flag stands for `--previous`, meaning that if the pod has restarted, it returns the logs of the pod's previous instance.
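A sketch of that log check, assuming the pod is named `my-pod` and the failing container is the `base` container from the `describe` output (both names are illustrative); the sidecar loop is worth running too, since a sidecar may hold the real error:

```shell
# Hypothetical pod name -- replace with your own.
POD=my-pod

if command -v kubectl >/dev/null 2>&1; then
  # Logs of the failed task container ("base" in the describe output).
  # --previous returns logs from the prior instance if the pod restarted;
  # it is a no-op failure if there was no restart, hence the || true.
  kubectl logs "$POD" -c base --previous --tail=100 || true

  # Check each container's logs in turn, sidecars included.
  for c in $(kubectl get pod "$POD" -o jsonpath='{.spec.containers[*].name}'); do
    echo "--- $c ---"
    kubectl logs "$POD" -c "$c" --tail=50
  done
fi
```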
The logs will reveal the root cause behind exit code 1, and that information can be used to fix the `command` field in the pod's YAML file. Once updated, reapply it to the cluster with `kubectl apply`.
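For example, assuming the corrected spec lives in a hypothetical `my-pod.yaml`. Note that a running pod's `command` field is immutable, so the failed pod has to be deleted before the corrected spec is applied:

```shell
# Hypothetical manifest file holding the corrected pod spec.
MANIFEST=my-pod.yaml

if command -v kubectl >/dev/null 2>&1 && [ -f "$MANIFEST" ]; then
  # A pod's command field cannot be patched in place: delete, reapply,
  # then watch the replacement come up.
  kubectl delete pod my-pod --ignore-not-found
  kubectl apply -f "$MANIFEST"
  kubectl get pod my-pod -w
fi
```

If the pod is created by Airflow's KubernetesExecutor rather than a standalone manifest, the equivalent fix belongs in the executor's pod template instead.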
The information above is sourced from a link written by James Walker.