Elasticsearch: why does ES show the error log "readiness probe failed"?

Posted by sqyvllje on 2023-10-17 in ElasticSearch

I am deploying an Elasticsearch cluster on AWS EKS. Below is the k8s spec YAML file.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: datasource
spec:
  version: 7.14.0
  nodeSets:
  - name: node
    count: 3
    config:
      node.store.allow_mmap: true
      xpack.security.http.ssl.enabled: false
      xpack.security.transport.ssl.enabled: false
      xpack.security.enabled: false
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          readinessProbe:
              exec:
                command:
                - bash
                - -c
                - /mnt/elastic-internal/scripts/readiness-probe-script.sh
              failureThreshold: 3
              initialDelaySeconds: 10
              periodSeconds: 12
              successThreshold: 1
              timeoutSeconds: 12
          env:
          - name: READINESS_PROBE_TIMEOUT
            value: "30"
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: ebs-sc
        resources:
          requests:
            storage: 1024Gi

After deploying, I see the following error on all three pods:

{"type": "server", "timestamp": "2021-10-05T05:19:37,041Z", "level": "INFO", "component": "o.e.c.m.MetadataMappingService", "cluster.name": "datasource", "node.name": "datasource-es-node-0", "message": "[.kibana/g5_90XpHSI-y-I7MJfBZhQ] update_mapping [_doc]", "cluster.uuid": "xJ00drroT_CbJPfzi8jSAg", "node.id": "qmtgUZHbR4aTWsYaoIEDEA"  }
{"type": "server", "timestamp": "2021-10-05T05:19:37,622Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "datasource", "node.name": "datasource-es-node-0", "message": "Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.kibana][0]]]).", "cluster.uuid": "xJ00drroT_CbJPfzi8jSAg", "node.id": "qmtgUZHbR4aTWsYaoIEDEA"  }
{"timestamp": "2021-10-05T05:19:40+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:19:45+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:19:50+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:19:55+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:20:00+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:20:05+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:20:10+00:00", "message": "readiness probe failed", "curl_rc": "35"}
{"timestamp": "2021-10-05T05:20:15+00:00", "message": "readiness probe failed", "curl_rc": "35"}

The log above first shows "Cluster health status changed from [YELLOW] to [GREEN]", and then the error "readiness probe failed" appears. How can I fix this? Is it an Elasticsearch error or a k8s error?

irlmq6kh1#

You can declare READINESS_PROBE_TIMEOUT in your spec like this:

...
env:
- name: READINESS_PROBE_TIMEOUT
  value: "30"

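If necessary, you can also customize the readiness probe itself. The podTemplate in the latest elasticsearch.k8s.elastic.co/v1 API spec is the same K8s PodTemplateSpec, so you can use any of its fields in your Elasticsearch spec.
Update: curl exit code 35 is an SSL connect error, meaning the TLS handshake in the readiness probe script failed. Could you remove the following settings from your spec and re-run: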

xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
xpack.security.enabled: false
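
As a rough sketch of that suggestion, the config block of the nodeSet in the question would shrink to the fragment below. It is copied from the question's manifest with only the three xpack.security lines dropped. It assumes the operator's default security and TLS settings are acceptable for the cluster, which the question does not confirm.

```yaml
# Fragment of the question's manifest with the three
# xpack.security.* overrides removed (assumption: the operator's
# default security/TLS settings are wanted).
spec:
  version: 7.14.0
  nodeSets:
  - name: node
    count: 3
    config:
      node.store.allow_mmap: true
```

Clients would then need to connect over HTTPS with credentials, so this only fits if disabling security was not a hard requirement.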

sqougxex2#

You should set READINESS_PROBE_PROTOCOL to http.
For example:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: data
spec:
  version: 7.15.2
  nodeSets:
  - name: data
    count: 1
    config:
      node.remote_cluster_client: false
      xpack.ml.enabled: false
      xpack.security.enabled: false
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -Xms1g -Xmx1g
          - name: READINESS_PROBE_TIMEOUT
            value: "30"
          - name: READINESS_PROBE_PROTOCOL
            value: http
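
Applied to the manifest in the question, the same change might look like the sketch below. It assumes the cluster should keep serving plain HTTP, so xpack.security.http.ssl.enabled: false stays in the config. Only the READINESS_PROBE_PROTOCOL entry is new; everything else is copied from the question.

```yaml
# Fragment of the nodeSet from the question; only the last env entry is added.
podTemplate:
  spec:
    containers:
    - name: elasticsearch
      env:
      - name: READINESS_PROBE_TIMEOUT
        value: "30"
      # Added, following the answer above: have the readiness probe
      # script use http, matching xpack.security.http.ssl.enabled: false.
      - name: READINESS_PROBE_PROTOCOL
        value: http
```

If the curl_rc value in the probe log stops being 35 after this change, that would support the reading that the probe was attempting HTTPS against an HTTP endpoint.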
