Kubernetes: kube-prometheus-stack Helm chart not working, many-to-many error

Posted by z9smfwbn on 2023-08-03 in Kubernetes


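I installed the kube-prometheus-stack Helm chart. The chart appeared to install without problems on my 3-node Kubernetes cluster (1.27.3). All deployments and pods seem to reach the ready state. I port-forwarded the Prometheus pod and tried to connect, but could not.
When I looked at the Prometheus pod's logs, there was an error or warning indicating a problem with the rule set it generated: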

ts=2023-07-15T22:26:26.110Z caller=manager.go:663 level=warn component="rule manager" file=/etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0/default-prometheus-kube-prometheus-kubernetes-system-kubelet-e76e1f61-e704-4f1c-a9f8-87d91012dd7c.yaml group=kubernetes-system-kubelet name=KubeletPodStartUpLatencyHigh index=5 msg="Evaluating rule failed" rule="alert: KubeletPodStartUpLatencyHigh\nexpr: histogram_quantile(0.99, sum by (cluster, instance, le) (rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\",metrics_path=\"/metrics\"}[5m])))\n  * on (cluster, instance) group_left (node) kubelet_node_name{job=\"kubelet\",metrics_path=\"/metrics\"}\n  > 60\nfor: 15m\nlabels:\n  severity: warning\nannotations:\n  description: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds\n    on node {{ $labels.node }}.\n  runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletpodstartuplatencyhigh\n  summary: Kubelet Pod startup latency is too high.\n" err="found duplicate series for the match group {instance=\"192.168.2.10:10250\"} on the right hand-side of the operation: [{__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"192.168.2.10:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"kubenode03\", service=\"prometheus-prime-kube-prom-kubelet\"}, {__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"192.168.2.10:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"kubenode03\", service=\"prometheus-kube-prometheus-kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side

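I am new to Prometheus, so I am not sure what to look for or how to fix this. Can someone help me understand:
1. What is causing this error?
2. How do I fix this error?
It appears to be something in the kube-system namespace, but I only have 3 pods in that namespace, and they all have unique names:
CoreDNS, Local Path Provisioner, metrics-server
I appreciate any help or advice on how to resolve this.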

kubernetes

Source: https://stackoverflow.com/questions/76696048/kube-prometheus-stack-helm-not-working-many-to-many-error

2 Answers


fgw7neuy #1

What is causing this error?
The problem is in the rule's expression:

histogram_quantile(0.99, 
 sum by (cluster, instance, le) (
  rate(
   kubelet_pod_worker_duration_seconds_bucket{job="kubelet",metrics_path="/metrics"}
   [5m])))
* on (cluster, instance) group_left (node) kubelet_node_name{job="kubelet",metrics_path="/metrics"}
> 60

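It seems that more than one series on either side of on shares the same values for the labels cluster and instance (the log reports the duplicates on the right-hand side, kubelet_node_name). That makes the match many-to-many, which Prometheus does not allow.
How do I fix this error?
Go to the Prometheus web UI and run each side of the on clause as its own query. You will see the labels each side produces. Adjust the on clause so that no many-to-many match occurs.
I am not familiar with this exact metric, so it is hard to say anything more specific.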
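
The check described above can also be sketched offline. The following is a minimal illustration, not Prometheus code: it groups the two kubelet_node_name series quoted in the log by the on labels and reports which labels differ inside a duplicated group. The function names are hypothetical. A label that is absent (here, cluster) is treated as an empty string, which is how Prometheus treats missing labels. In the web UI, a query such as `count by (cluster, instance) (kubelet_node_name{job="kubelet",metrics_path="/metrics"}) > 1` should surface the same groups.

```python
from collections import defaultdict

# Labels shared by both series in the error message (copied from the log).
BASE = {
    "__name__": "kubelet_node_name",
    "endpoint": "https-metrics",
    "instance": "192.168.2.10:10250",
    "job": "kubelet",
    "metrics_path": "/metrics",
    "namespace": "kube-system",
    "node": "kubenode03",
}

# The two right-hand-side series differ only in their `service` label.
RIGHT_HAND_SERIES = [
    dict(BASE, service="prometheus-prime-kube-prom-kubelet"),
    dict(BASE, service="prometheus-kube-prometheus-kubelet"),
]


def duplicate_match_groups(series, on_labels):
    """Group series by their `on` label values; keep groups with more than one series."""
    groups = defaultdict(list)
    for labels in series:
        # A missing label counts as the empty string, as in PromQL matching.
        key = tuple(labels.get(name, "") for name in on_labels)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


def differing_labels(members):
    """Return the label names whose values are not identical across the group."""
    names = set().union(*(m.keys() for m in members))
    return sorted(n for n in names if len({m.get(n) for m in members}) > 1)


if __name__ == "__main__":
    dupes = duplicate_match_groups(RIGHT_HAND_SERIES, ("cluster", "instance"))
    for key, members in dupes.items():
        print(key, "differs in:", differing_labels(members))
```

For the series in the log, the only differing label is service, which suggests that two Services are exposing the same kubelet endpoint.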

2023-08-03

bbmckpt7 #2

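I did some more searching and found this GitHub issue:
https://github.com/prometheus-community/helm-charts/issues/635#issuecomment-774771566
It turns out I had a leftover Prometheus service from a previous failed install, which helm uninstall had not removed. That appears to have been the duplicate. Once I deleted it, everything started working.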
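
As a rough sketch of that cleanup, the snippet below takes the two service label values from the error message, picks out the ones that do not belong to the current release, and prints the matching kubectl delete commands without running them. Which Service is the current one is an assumption here: `prometheus-kube-prometheus-kubelet` is inferred from the rule file path in the log, so verify it against your own release (for example with `kubectl get services --namespace kube-system`) before deleting anything. The helper names are hypothetical.

```python
# `service` label values of the duplicate series, copied from the error message.
SERVICES_IN_ERROR = [
    "prometheus-prime-kube-prom-kubelet",
    "prometheus-kube-prometheus-kubelet",
]

# ASSUMPTION: the Service owned by the currently installed release.
# Inferred from the rule file path in the log; confirm it in your cluster.
CURRENT_SERVICE = "prometheus-kube-prometheus-kubelet"


def stale_services(services, current_service):
    """Return every service name except the one the current release owns."""
    return sorted(set(services) - {current_service})


def delete_commands(services, namespace="kube-system"):
    """Build (but do not execute) one kubectl delete command per stale Service."""
    return [
        f"kubectl delete service {name} --namespace {namespace}"
        for name in services
    ]


if __name__ == "__main__":
    for command in delete_commands(stale_services(SERVICES_IN_ERROR, CURRENT_SERVICE)):
        print(command)
```

Printing the commands instead of executing them leaves room to review each one before it touches the cluster.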

2023-08-03
