Hi, we are using Velero in our DR planning and are putting together a cross-region backup/restore strategy. We back up workloads, PVs, and PVCs, and we are running into a problem restoring a backup taken in one region (us-east-2) into a second region (us-west-2).
Installation completes cleanly on both clusters with the following command:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.4.0 \
--bucket velerobucket \
--backup-location-config region=us-east-2 \
--snapshot-location-config region=us-east-2 \
--secret-file secret-file
Backup creation also completes without any errors:
velero backup create zookeeperbkp --include-namespaces zookeeper --snapshot-volumes
Restoring on the us-west-2 cluster from the us-east-2 backup completes successfully, with no errors in the Velero restore logs, but the ZooKeeper pods go into Pending:
velero restore create --from-backup zookeeperbkp
kubectl get pods -n zookeeper
NAME READY STATUS RESTARTS AGE
zookeeper-0 0/2 Pending 0 3m24s
zookeeper-1 0/2 Pending 0 3m24s
zookeeper-2 0/2 Pending 0 3m24s
Describing the pods shows the complaint:
0/1 nodes are available: 1 node(s) had volume node affinity conflict.
Describing the PVs shows that they were created with us-east-2 labels, when they should be labeled for us-west-2 (the restore cluster).
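The mismatch shows up on the PV's topology labels (label names vary by Kubernetes version; older clusters use failure-domain.beta.kubernetes.io/* instead). A minimal self-contained sketch, using a stand-in manifest in place of running kubectl get pv <name> -o yaml against a live cluster:

```shell
# Stand-in for `kubectl get pv <name> -o yaml` on the restore cluster;
# this sample manifest mimics what the restored PV looked like.
cat > /tmp/pv-sample.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-261b9803-8e55-4880-bb31-b29ca3a6c323
  labels:
    topology.kubernetes.io/region: us-east-2
    topology.kubernetes.io/zone: us-east-2b
EOF

# A PV still labeled us-east-2 can never match a node in us-west-2,
# which is exactly the "volume node affinity conflict" above.
grep 'topology.kubernetes.io/' /tmp/pv-sample.yaml
```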
After all this, I read more about Velero's limitations around restoring PVs and PVCs across regions.
I tried to work around it by modifying the Velero snapshot JSON files in S3:
aws s3 cp s3://velerobkpxyz/backups/zookeeper/ ./ --recursive
gunzip zookeeper-volumesnapshots.json.gz
sed -i "s/us-east-2/us-west-2/g" zookeeper-volumesnapshots.json
gzip zookeeper-volumesnapshots.json
aws s3 cp zookeeper-volumesnapshots.json.gz s3://velerobkp/backups/zookeeper/zookeeper-volumesnapshots.json.gz
I made the same change to zookeeper.tar.gz:
mkdir zookeeper-temp
tar xzf zookeeper.tar.gz -C zookeeper-temp/
cd zookeeper-temp/
find . -name '*.json' -exec sed -i 's/us-east-2/us-west-2/g' {} \;
tar czf ../zookeeper.tar.gz *
cd ..
aws s3 cp zookeeper.tar.gz s3://velerobkp/backups/zookeeper/
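For reference, the whole rewrite cycle can be dry-run as a self-contained sketch against throwaway local files, in place of the real backup objects downloaded from S3; note that the snapshot list must be re-gzipped before it is uploaded back:

```shell
# Self-contained dry run of the edits above, on stand-in files.
rm -rf /tmp/velero-sketch && mkdir /tmp/velero-sketch && cd /tmp/velero-sketch

# Stand-in for zookeeper-volumesnapshots.json.gz
printf '{"spec":{"location":"us-east-2"}}\n' > zookeeper-volumesnapshots.json
gzip zookeeper-volumesnapshots.json

# Stand-in for zookeeper.tar.gz with one resource manifest inside
mkdir -p resources/persistentvolumes
printf '{"labels":{"topology.kubernetes.io/zone":"us-east-2b"}}\n' \
  > resources/persistentvolumes/pv.json
tar czf zookeeper.tar.gz resources

# 1. Rewrite the snapshot list; re-gzip it before uploading back to S3
gunzip zookeeper-volumesnapshots.json.gz
sed -i 's/us-east-2/us-west-2/g' zookeeper-volumesnapshots.json
gzip zookeeper-volumesnapshots.json

# 2. Rewrite every JSON manifest inside the resource tarball
mkdir zookeeper-temp
tar xzf zookeeper.tar.gz -C zookeeper-temp/
find zookeeper-temp -name '*.json' -exec sed -i 's/us-east-2/us-west-2/g' {} \;
tar czf zookeeper.tar.gz -C zookeeper-temp .
```

The real uploads are then the two aws s3 cp commands shown above.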
After this, velero backup describe shows the correct region names for the PVs:
velero backup describe zookeeper9 --details
Name: zookeeper9
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.5-eks-bc4871b
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21+
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: zookeeper
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: true
TTL: 720h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2022-03-30 20:37:53 +0530 IST
Completed: 2022-03-30 20:37:57 +0530 IST
Expiration: 2022-04-29 20:37:53 +0530 IST
Total items to be backed up: 52
Items backed up: 52
Resource List:
apiextensions.k8s.io/v1/CustomResourceDefinition:
- servicemonitors.monitoring.coreos.com
apps/v1/ControllerRevision:
- zookeeper/zookeeper-596cddb599
- zookeeper/zookeeper-5977bdccb6
- zookeeper/zookeeper-5cd569cbf9
- zookeeper/zookeeper-6585c9bc89
- zookeeper/zookeeper-6bf55cfd99
- zookeeper/zookeeper-856646d9f6
- zookeeper/zookeeper-8cdd5f46
- zookeeper/zookeeper-ccf87988c
apps/v1/StatefulSet:
- zookeeper/zookeeper
discovery.k8s.io/v1/EndpointSlice:
- zookeeper/zookeeper-headless-2tnx5
- zookeeper/zookeeper-mzdlc
monitoring.coreos.com/v1/ServiceMonitor:
- zookeeper/zookeeper-exporter
policy/v1/PodDisruptionBudget:
- zookeeper/zookeeper
v1/ConfigMap:
- zookeeper/kube-root-ca.crt
- zookeeper/zookeeper
v1/Endpoints:
- zookeeper/zookeeper
- zookeeper/zookeeper-headless
v1/Namespace:
- zookeeper
v1/PersistentVolume:
- pvc-261b9803-8e55-4880-bb31-b29ca3a6c323
- pvc-89cfd5b9-65da-4fd1-a095-83d21d1d21db
- pvc-9e027e4c-cc9e-11ea-9ce3-061b42a2865e
- pvc-a835d78d-9dfd-41f7-92bd-7f2e752dbeb7
- pvc-c0e454f7-cc9e-11ea-9ce3-061b42a2865e
- pvc-ee6aad46-cc9e-11ea-9ce3-061b42a2865e
v1/PersistentVolumeClaim:
- zookeeper/data-zookeeper-0
- zookeeper/data-zookeeper-1
- zookeeper/data-zookeeper-2
- zookeeper/data-zookeeper-3
- zookeeper/data-zookeeper-4
- zookeeper/data-zookeeper-5
v1/Pod:
- zookeeper/zookeeper-0
- zookeeper/zookeeper-1
- zookeeper/zookeeper-2
- zookeeper/zookeeper-3
- zookeeper/zookeeper-4
- zookeeper/zookeeper-5
v1/Secret:
- zookeeper/default-token-kcl4m
- zookeeper/sh.helm.release.v1.zookeeper.v1
- zookeeper/sh.helm.release.v1.zookeeper.v10
- zookeeper/sh.helm.release.v1.zookeeper.v11
- zookeeper/sh.helm.release.v1.zookeeper.v12
- zookeeper/sh.helm.release.v1.zookeeper.v13
- zookeeper/sh.helm.release.v1.zookeeper.v4
- zookeeper/sh.helm.release.v1.zookeeper.v5
- zookeeper/sh.helm.release.v1.zookeeper.v6
- zookeeper/sh.helm.release.v1.zookeeper.v7
- zookeeper/sh.helm.release.v1.zookeeper.v8
- zookeeper/sh.helm.release.v1.zookeeper.v9
v1/Service:
- zookeeper/zookeeper
- zookeeper/zookeeper-headless
v1/ServiceAccount:
- zookeeper/default
Velero-Native Snapshots:
pvc-9e027e4c-cc9e-11ea-9ce3-061b42a2865e:
Snapshot ID: snap-0f81f2f62e476584a
Type: gp2
Availability Zone: us-west-2b
IOPS: <N/A>
pvc-c0e454f7-cc9e-11ea-9ce3-061b42a2865e:
Snapshot ID: snap-0c689771f3dbfa361
Type: gp2
Availability Zone: us-west-2a
IOPS: <N/A>
pvc-ee6aad46-cc9e-11ea-9ce3-061b42a2865e:
Snapshot ID: snap-068c63f1bb31af3cc
Type: gp2
Availability Zone: us-west-2b
IOPS: <N/A>
pvc-89cfd5b9-65da-4fd1-a095-83d21d1d21db:
Snapshot ID: snap-050e2e51eac92bd74
Type: gp2
Availability Zone: us-west-2a
IOPS: <N/A>
pvc-261b9803-8e55-4880-bb31-b29ca3a6c323:
Snapshot ID: snap-08e45396c99e7aac3
Type: gp2
Availability Zone: us-west-2b
IOPS: <N/A>
pvc-a835d78d-9dfd-41f7-92bd-7f2e752dbeb7:
Snapshot ID: snap-07ad93657b0bdc1a6
Type: gp2
Availability Zone: us-west-2a
IOPS: <N/A>
But the restore then fails:
velero restore create --from-backup zookeeper9
velero restore describe zookeeper9-20220331145320
Name: zookeeper9-20220331145320
Namespace: velero
Labels: <none>
Annotations: <none>
Phase: PartiallyFailed (run 'velero restore logs zookeeper9-20220331145320' for more information)
Total items to be restored: 52
Items restored: 52
Started: 2022-03-31 14:53:24 +0530 IST
Completed: 2022-03-31 14:53:36 +0530 IST
Warnings:
Velero: <none>
Cluster: <none>
Namespaces:
zookeeper: could not restore, ConfigMap "kube-root-ca.crt" already exists. Warning: the in-cluster version is different than the backed-up version.
Errors:
Velero: <none>
Cluster: error executing PVAction for persistentvolumes/pvc-261b9803-8e55-4880-bb31-b29ca3a6c323: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2b' does not exist.
status code: 400, request id: 2b5ae55c-dfd5-4c52-8494-105e46bce78b
error executing PVAction for persistentvolumes/pvc-89cfd5b9-65da-4fd1-a095-83d21d1d21db: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2a' does not exist.
status code: 400, request id: ed91b698-d3b9-450f-b7b4-a3869cbae6ae
error executing PVAction for persistentvolumes/pvc-9e027e4c-cc9e-11ea-9ce3-061b42a2865e: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2b' does not exist.
status code: 400, request id: 2b493106-84c6-4210-9663-4d00f47c06de
error executing PVAction for persistentvolumes/pvc-a835d78d-9dfd-41f7-92bd-7f2e752dbeb7: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2a' does not exist.
status code: 400, request id: 387c6c27-6b18-4bc6-9bb8-3ed152cb49d1
error executing PVAction for persistentvolumes/pvc-c0e454f7-cc9e-11ea-9ce3-061b42a2865e: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2a' does not exist.
status code: 400, request id: 7d7d2931-e7d9-4bc5-8cb1-20e3b2849fe2
error executing PVAction for persistentvolumes/pvc-ee6aad46-cc9e-11ea-9ce3-061b42a2865e: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2b' does not exist.
status code: 400, request id: 75648031-97ca-4e2a-a079-8f6618902b2a
Namespaces: <none>
Backup: zookeeper9
Namespaces:
Included: all namespaces found in the backup
Excluded: <none>
Resources:
Included: *
Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Restore PVs: auto
Preserve Service NodePorts: auto
It complains:
Cluster: error executing PVAction for persistentvolumes/pvc-261b9803-8e55-4880-bb31-b29ca3a6c323: rpc error: code = Unknown desc = InvalidZone.NotFound: The zone 'us-west-2b' does not exist.
status code: 400, request id: 2b5ae55c-dfd5-4c52-8494-105e46bce78b
I don't understand why this happens; is there something I have missed?
This makes me wonder whether something also needs to be done with the snapshots themselves: the backed-up snapshot IDs live in the source region and are not available in the target region, and the error itself suggests the plugin is still calling the us-east-2 EC2 endpoint, where a zone named us-west-2b genuinely does not exist.
1 Answer
Since Velero currently has no built-in support for this, I had to solve it with a workaround.
Thanks to jglick, who submitted PRs for this feature against the velero repo and the velero-plugin-for-aws repo. After building images from those two forks, I was able to restore PVs and PVCs into a different region.
As noted, this is not an upstream solution, since the PRs have not been merged; I would advise against using it in production without thorough testing, and the PRs' author advises the same.
Please read through the discussion and steps in https://github.com/vmware-tanzu/velero-plugin-for-aws/pull/90
Step 1: Build images from these two PR branches: https://github.com/jglick/velero/tree/concurrent-snapshot and https://github.com/jglick/velero-plugin-for-aws/tree/x-region
The steps below use AWS ECR; substitute the registry of your choice.
Steps for AWS ECR:
1. Create the velero and velero-plugin-for-aws repositories
ex:
aws ecr create-repository --repository-name testing/velero --region $region || echo already exists
aws ecr create-repository --repository-name testing/velero-plugin-for-aws --region $region || echo already exists
2. Build the velero container
command:
make -C /path/to/velero REGISTRY=$registry/testing VERSION=testing container
ex:
make -C . REGISTRY=123456789.dkr.ecr.us-west-2.amazonaws.com/testing VERSION=0.1 container
3. Build the velero-plugin-for-aws container
command:
docker build -t $registry/testing/velero-plugin-for-aws /path/to/patched/velero-plugin-for-aws
ex:
docker build -t 123456789.dkr.ecr.us-west-2.amazonaws.com/testing/velero-plugin-for-aws velero-plugin-for-aws
4. Log in to AWS ECR in the region the images are to be pushed to
command:
aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $registry
ex:
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 123456789.dkr.ecr.us-west-2.amazonaws.com
5. Push the velero and velero-plugin-for-aws images
command:
docker push $registry/testing/velero
docker push $registry/testing/velero-plugin-for-aws
ex:
docker push 123456789.dkr.ecr.us-west-2.amazonaws.com/testing/velero:main
docker push 123456789.dkr.ecr.us-west-2.amazonaws.com/testing/velero-plugin-for-aws
Your images are now in the registry and can be used for backups and restores in whichever regions you need.
Now install Velero in the region you back up from and in the region you restore into.
Create a values file with the current region and alt_region, so that when a backup runs in the current region, the volumes of StatefulSets with PVs are copied to the alternate region you specify. As an example, take us-east-2 as the source region and us-west-2 as the alternate region: a backup taken in us-east-2 then has its snapshots copied to the us-west-2 region.
Install Velero in the us-east-2 source region using helm:
helm install velero vmware-tanzu/velero --version 2.24.0 --namespace velero --create-namespace -f /tmp/velero.yaml
Similarly, configure the region the backups are to be restored into, us-west-2 in our case, with another helm install:
helm install velero vmware-tanzu/velero --version 2.24.0 --namespace velero --create-namespace -f /tmp/velero-restore-us-west-2.yaml
Now check that the Velero backup location is configured correctly.
With both clusters set up, we can run the backup and restore commands on us-east-2 and us-west-2 accordingly:
velero backup create zookeeper-z --include-namespaces zookeeper
Check the status with:
velero backup describe zookeeper-z --details
Restore from the us-east-2 region into us-west-2:
velero restore create --from-backup zookeeper-z
The restore should succeed, and the pods should come up running and attached to the volumes they need.
We assume you have already run through all the steps for creating the IAM user for Velero and the S3 bucket; see the velero-plugin-for-aws README for the IAM user and S3 bucket configuration.