ElasticSearch CrashLoopBackoff when deploying with ECK on Kubernetes OKD 4.11

Posted by krugob8w on 2022-10-06 in Kubernetes

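I am running Kubernetes using OKD 4.11 (on vSphere) and have verified basic functionality, including dynamic volume provisioning, with applications such as nginx.

I also applied

oc adm policy add-scc-to-group anyuid system:authenticated

to allow authenticated users to use anyuid (this seemed to be required to deploy the nginx example I was testing with).

I then installed ECK following this quickstart, using kubectl to install the CRD and RBAC manifests. This seemed to work.

I then deployed the most basic ElasticSearch quickstart example with kubectl apply -f quickstart.yaml, using the following manifest: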

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false

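The deployment proceeds as expected, pulling the image and starting the container, but it ends in CrashLoopBackoff, with ElasticSearch reporting the following error at the end of the log: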

"elasticsearch.cluster.name":"quickstart",
 "error.type":"java.lang.IllegalStateException",
 "error.message":"failed to obtain node locks, tried 
 [/usr/share/elasticsearch/data]; maybe these locations 
 are not writable or multiple nodes were started on the same data path?"

Looking at the storage, the PV and PVC were created successfully. The output of kubectl get pv,pvc,sc -A -n my-namespace is:

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                  STORAGECLASS   REASON   AGE
  persistentvolume/pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241   1Gi        RWO            Delete           Bound    my-namespace/elasticsearch-data-quickstart-es-default-0   thin                    41m

  NAMESPACE                       NAME                                                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
  my-namespace                       persistentvolumeclaim/elasticsearch-data-quickstart-es-default-0   Bound    pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241   1Gi        RWO            thin           41m

  NAMESPACE   NAME                                         PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
              storageclass.storage.k8s.io/thin (default)   kubernetes.io/vsphere-volume   Delete          Immediate              false                  19d
              storageclass.storage.k8s.io/thin-csi         csi.vsphere.vmware.com         Delete          WaitForFirstConsumer   true                   19d

Looking at the pod YAML, the volume appears to be attached correctly:

volumes:
  - name: elasticsearch-data
    persistentVolumeClaim:
      claimName: elasticsearch-data-quickstart-es-default-0
  - name: downward-api
    downwardAPI:
      items:
        - path: labels
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.labels
      defaultMode: 420
  ....
  volumeMounts:
    ...
    - name: elasticsearch-data
      mountPath: /usr/share/elasticsearch/data

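I don't understand why the volume would be read-only, or rather, why ES cannot create the lock.

I did find this similar issue, but I am not sure how to apply the UID permissions when using ECK (overall, I am fairly naive about how permissions work in OKD).

Does anyone with deeper knowledge of K8S/OKD or ECK/ElasticSearch have an idea how to better isolate and/or resolve this issue?

Update: I believe this is related to this issue, and I am researching options related to OKD.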

mm9b1k5b (answer #1)

For posterity: ECK starts an init container that "should" take care of the chown on the data volume, but it can only do so if it runs as root.
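To confirm that this is what is happening, it can help to look at which SCC the pod was admitted under and which security context the init container actually received. The following is only a sketch and not part of the original fix: inspect_es_pod is a made-up helper name, and the pod name used in the usage note is inferred from the PVC name in the question. The openshift.io/scc annotation is the one OKD sets on admitted pods.

```shell
# Hypothetical helper: print the security-related settings of an ECK-managed pod.
# Usage: inspect_es_pod <namespace> <pod>
inspect_es_pod() {
  # SCC that OKD admitted the pod under (annotation written at admission time)
  oc -n "$1" get pod "$2" \
    -o jsonpath='{.metadata.annotations.openshift\.io/scc}{"\n"}'

  # Pod-level security context (applies to all containers unless overridden)
  oc -n "$1" get pod "$2" -o jsonpath='{.spec.securityContext}{"\n"}'

  # Security context of the ECK init container that prepares the filesystem
  oc -n "$1" get pod "$2" \
    -o jsonpath='{.spec.initContainers[?(@.name=="elastic-internal-init-filesystem")].securityContext}{"\n"}'

  # Whatever the init container printed while preparing the data volume
  oc -n "$1" logs "$2" -c elastic-internal-init-filesystem
}
```

For the setup in the question this would be called as inspect_es_pod my-namespace quickstart-es-default-0. If the init container shows a non-zero runAsUser, it presumably could not chown the volume.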

My solution is documented here: https://repo1.dso.mil/dsop/elastic/elasticsearch/elasticsearch/-/issues/7

The manifest now looks like this:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
    # run init container as root to chown the volume to uid 1000
    podTemplate:
      spec:
        securityContext:
          runAsUser: 1000
          runAsGroup: 0
        initContainers:
        - name: elastic-internal-init-filesystem
          securityContext:
            runAsUser: 0
            runAsGroup: 0
With this, the pod starts and can write to the volume as UID 1000.
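To double-check the result, something along these lines should show the UID of the running process and the ownership of the data directory. It is a sketch under assumptions: verify_es_data_dir is a hypothetical helper, and it assumes the main container is named elasticsearch, which is the name ECK uses as far as I know.

```shell
# Hypothetical helper: check that Elasticsearch can own and write its data path.
# Usage: verify_es_data_dir <namespace> <pod> <elasticsearch-resource-name>
verify_es_data_dir() {
  # UID/GID the Elasticsearch container actually runs as
  kubectl -n "$1" exec "$2" -c elasticsearch -- id

  # Numeric owner, group, and mode of the mounted data directory
  kubectl -n "$1" exec "$2" -c elasticsearch -- \
    ls -ldn /usr/share/elasticsearch/data

  # Health and phase of the cluster as reported by the ECK operator
  kubectl -n "$1" get elasticsearch "$3"
}
```

For the manifest above this would be verify_es_data_dir my-namespace quickstart-es-default-0 quickstart. The directory should be owned by UID 1000, matching the runAsUser set in the pod template.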
