I have installed metrics-server on Kubernetes v1.11.2.
I am running a bare-metal cluster with 3 nodes and 1 master.
The metrics-server logs show the following errors:
E0907 14:29:51.774592 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:vps01: unable to fetch metrics from Kubelet vps01 (vps01): Get https://vps01:10250/stats/summary/: dial tcp: lookup vps01 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:vps04: unable to fetch metrics from Kubelet vps04 (vps04): Get https://vps04:10250/stats/summary/: dial tcp: lookup vps04 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:vps03: unable to fetch metrics from Kubelet vps03 (vps03): Get https://vps03:10250/stats/summary/: dial tcp: lookup vps03 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:vps02: unable to fetch metrics from Kubelet vps02 (vps02): Get https://vps02:10250/stats/summary/: dial tcp: lookup vps02 on 10.96.0.10:53: no such host]
E0907 14:30:01.694794 1 reststorage.go:98] unable to fetch pod metrics for pod boxweb/boxweb-deployment-7756c49688-fz625: no metrics known for pod "boxweb/boxweb-deployment-7756c49688-fz625"
E0907 14:30:10.517886 1 reststorage.go:112] unable to fetch node metrics for node "vps01": no metrics known for node "vps01"
I also cannot get any metrics with kubectl top node vps01.
The same goes for autoscaling, which does not work either:
unable to get metrics for resource cpu: unable to fetch metrics from
resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
4 Answers
q35jwt9p1#
I found the following solution: edit the metrics-server-deployment.yaml file and add:
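The lines to add are missing here. A common fix for this exact "no such host" lookup error, given as a sketch rather than the answerer's exact change, is to make metrics-server contact kubelets by IP instead of hostname. The fragment below assumes metrics-server 0.3 or later, where the `--kubelet-preferred-address-types` flag exists, and it assumes the container is started through a `command:` list. The script only prints the fragment so it can be merged into the container entry by hand.

```shell
#!/bin/sh
# Hypothetical fragment for the metrics-server container in
# metrics-server-deployment.yaml. The indentation is an assumption and must
# match the container entry under spec.template.spec.containers.
FRAGMENT=$(cat <<'EOF'
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP
EOF
)

# Print the fragment for review; nothing is applied to the cluster.
printf '%s\n' "$FRAGMENT"
```

If the kubelet serving certificates are not signed by the cluster CA, `--kubelet-insecure-tls` is often added as well. It disables certificate verification, so treat it as a workaround.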
kq0g1dla2#
Your metrics-server pod appears to have a DNS problem. You can connect to the pod:
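The command itself is missing here. One way to run the check, as a sketch: it assumes metrics-server lives in the `kube-system` namespace and that its image ships a `ping` binary (minimal images may not, in which case a throwaway busybox pod is an alternative). The pod name is a placeholder and the `run` helper is a hypothetical convenience that prints commands unless `APPLY=1` is set.

```shell
#!/bin/sh
NS="${NS:-kube-system}"   # assumed namespace
# Hypothetical pod name; find the real one with: kubectl -n kube-system get pods
POD="${POD:-metrics-server-REPLACE-ME}"

# Print each command by default; set APPLY=1 to execute it against the cluster.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Try to reach every node by hostname from inside the pod.
for node in vps01 vps02 vps03 vps04; do
    run kubectl -n "$NS" exec "$POD" -- ping -c 3 "$node"
done
```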
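If the ping fails, the pod cannot resolve your nodes.
CoreDNS or kube-dns also uses /etc/resolv.conf on your nodes, so I would check whether the nodes can resolve each other. For example, can you ping vps01 from vps02 or vps03, and so on?
gz5pxeao3#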
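I had the same problem and solved it by adding the hostnames to /etc/hosts on every node. To collect metrics (CPU and memory usage), metrics-server tries to reach the nodes. However, it cannot resolve the hostnames (vps01, vps02, vps03 and vps04) because they are not registered in DNS. As you mentioned, you cannot register the hostnames in DNS, so you have to add them to /etc/hosts on the node that runs the metrics-server pod. The autoscaler does not work because metrics-server is not working, so there is no metrics data.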
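The /etc/hosts change described above could look like the sketch below. The IP addresses are hypothetical (taken from the 192.0.2.0/24 documentation range) and must be replaced with the real node addresses. The `add_host` helper is an illustrative name. By default the script writes to a scratch file so it is safe to try out.

```shell
#!/bin/sh
# Defaults to a scratch file; set HOSTS_FILE=/etc/hosts (as root) on a real node.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append "ip name" unless the name is already present, so reruns add no duplicates.
add_host() {
    if ! grep -qw -- "$2" "$HOSTS_FILE" 2>/dev/null; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

# Hypothetical addresses; use the real IP of each node.
add_host 192.0.2.11 vps01
add_host 192.0.2.12 vps02
add_host 192.0.2.13 vps03
add_host 192.0.2.14 vps04

# Show the resulting file for review.
cat "$HOSTS_FILE"
```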
lnvxswe24#
Patch the metrics-server deployment:
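The patch command is missing here. A possible form, offered as a sketch and not as the answerer's exact command, uses `kubectl patch` with a JSON Patch. It assumes the Deployment is named `metrics-server` in the `kube-system` namespace, that metrics-server is 0.3 or later, and that metrics-server is the first container in the pod template. The `run` helper is a hypothetical convenience that prints the command unless `APPLY=1` is set.

```shell
#!/bin/sh
NS="${NS:-kube-system}"              # assumed namespace
DEPLOY="${DEPLOY:-metrics-server}"   # assumed Deployment name

# JSON Patch that sets the first container's args. "add" creates the field if
# it is absent and replaces it if it exists, so any existing args are overwritten.
PATCH='[{"op":"add","path":"/spec/template/spec/containers/0/args","value":["--kubelet-preferred-address-types=InternalIP"]}]'

# Print the command by default; set APPLY=1 to execute it against the cluster.
# The printed form drops shell quoting and is meant for inspection only.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

run kubectl -n "$NS" patch deployment "$DEPLOY" --type=json -p "$PATCH"
```

After the patch is applied, the Deployment rolls out a new pod, and `kubectl top node` can be used to check whether metrics arrive.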