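I am running a Django application in a Kubernetes cluster on gcloud. I implemented the database migration as a helm pre-install hook that launches my app container and runs the migration. I use cloud-sql-proxy in the sidecar pattern recommended in the official tutorial: https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine
Basically, this launches my app container and the cloud-sql-proxy container inside the pod described by the job. The problem is that cloud-sql-proxy never terminates after my app has completed the migration, so the pre-install job times out and my deployment is cancelled. How do I gracefully exit the cloud-sql-proxy container after my app container completes, so that the job can complete?
Here is my helm pre-install hook template definition: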
apiVersion: batch/v1
kind: Job
metadata:
  name: database-migration-job
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": hook-succeeded,hook-failed
spec:
  activeDeadlineSeconds: 230
  template:
    metadata:
      name: "{{ .Release.Name }}"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      restartPolicy: Never
      containers:
        - name: db-migrate
          image: {{ .Values.my-project.docker_repo }}{{ .Values.backend.image }}:{{ .Values.my-project.image.tag}}
          imagePullPolicy: {{ .Values.my-project.image.pullPolicy }}
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "{{ .Values.backend.django_settings_module }}"
            - name: SENDGRID_API_KEY
              valueFrom:
                secretKeyRef:
                  name: sendgrid-api-key
                  key: sendgrid-api-key
            - name: DJANGO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: django-secret-key
                  key: django-secret-key
            - name: DB_USER
              value: {{ .Values.postgresql.postgresqlUsername }}
            - name: DB_PASSWORD
              {{- if .Values.postgresql.enabled }}
              value: {{ .Values.postgresql.postgresqlPassword }}
              {{- else }}
              valueFrom:
                secretKeyRef:
                  name: database-password
                  key: database-pwd
              {{- end }}
            - name: DB_NAME
              value: {{ .Values.postgresql.postgresqlDatabase }}
            - name: DB_HOST
              {{- if .Values.postgresql.enabled }}
              value: "postgresql"
              {{- else }}
              value: "127.0.0.1"
              {{- end }}
          workingDir: /app-root
          command: ["/bin/sh"]
          args: ["-c", "python manage.py migrate --no-input"]
        {{- if eq .Values.postgresql.enabled false }}
        - name: cloud-sql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.17
          command:
            - "/cloud_sql_proxy"
            - "-instances=<INSTANCE_CONNECTION_NAME>=tcp:<DB_PORT>"
            - "-credential_file=/secrets/service_account.json"
          securityContext:
            #fsGroup: 65532
            runAsNonRoot: true
            runAsUser: 65532
          volumeMounts:
            - name: db-con-mnt
              mountPath: /secrets/
              readOnly: true
      volumes:
        - name: db-con-mnt
          secret:
            secretName: db-service-account-credentials
      {{- end }}
Interestingly, if I kill the job with "kubectl delete jobs database-migration-job" after the migration is done, the helm upgrade completes and my new app version gets installed.
2 Answers
Answer 1 (tyu7yeag)
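I have a solution that will work but might be hacky. First of all, this is a feature that Kubernetes lacks, and it is discussed in an issue.
As of Kubernetes v1.17, containers in the same Pod can share process namespaces. This enables us to kill the proxy container from the app container. Since this is a Kubernetes job, there shouldn't be any anomalies in enabling postStop handlers for the app container.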
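The idea above could look roughly like the following pod spec fragment. This is a sketch under assumptions, not a tested configuration: it assumes pkill is available in the app image, that the proxy process is named cloud_sql_proxy (as in the question's command), and that the app container is allowed to signal the proxy process (the two containers may run as different users). As far as I know, Kubernetes only documents postStart and preStop lifecycle hooks, and preStop is not called when a container exits on its own, so the sketch sends the signal from the container's own shell command rather than from a handler.

```yaml
# Sketch only: fragment of the migration Job, not a complete manifest.
spec:
  template:
    spec:
      restartPolicy: Never
      # Lets containers in this pod see (and, if permitted, signal)
      # each other's processes.
      shareProcessNamespace: true
      containers:
        - name: db-migrate
          # image, env and workingDir as in the question
          command: ["/bin/sh", "-c"]
          args:
            - |
              python manage.py migrate --no-input
              rc=$?
              # Assumption: pkill exists in this image and the proxy
              # process is named cloud_sql_proxy.
              pkill -TERM cloud_sql_proxy
              # Report the migration result, not the result of pkill.
              exit $rc
        - name: cloud-sql-proxy
          # image, command, securityContext and volumeMounts as in the question
```

It is worth checking which exit code your proxy version returns after SIGTERM, since a non-zero exit from any container marks the pod as failed.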
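With this solution, when your application finishes, normally or abnormally, and exits, Kubernetes will run one last command from the dying container, which in this case would be "kill another process". This should cause the job to complete, with success or failure depending on how you kill the process. The process exit code will be the container exit code, which in turn essentially becomes the job exit code.
Answer 2 (wvt8vs2t)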
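For cloud-sql-proxy v2, the quitquitquit endpoint should be used. Run cloud-sql-proxy with the --quitquitquit flag, then run the migrate command and call that endpoint once the migration has finished.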
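A minimal sketch of that sequence, written as a fragment of the Job's container list. The assumptions are mine, not from the answer: curl is installed in the app image, the proxy's admin server listens on its default port 9091, and <V2_TAG> is a placeholder for whichever v2 image tag you use.

```yaml
# Sketch only: containers fragment for the migration Job with proxy v2.
containers:
  - name: db-migrate
    # image, env and workingDir as in the question
    command: ["/bin/sh", "-c"]
    args:
      - |
        python manage.py migrate --no-input
        rc=$?
        # Assumption: curl is available; 9091 is the default admin port.
        # Asks the proxy to shut down now that the migration is done.
        curl -s -X POST http://localhost:9091/quitquitquit
        # Report the migration result, not the result of curl.
        exit $rc
  - name: cloud-sql-proxy
    # <V2_TAG> is a placeholder, not a real tag.
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:<V2_TAG>
    args:
      # Enables the quitquitquit endpoint on the localhost admin server.
      - "--quitquitquit"
      - "--credentials-file=/secrets/service_account.json"
      - "--port=<DB_PORT>"
      - "<INSTANCE_CONNECTION_NAME>"
    # securityContext and volumeMounts as in the question
```

Capturing the migration's exit code before calling the endpoint keeps a failed migration visible as a failed job, even though the proxy is shut down either way.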