Since Prefect work pools and workers are now generally available as of 2.11.0, I am trying to switch from Prefect agents to Prefect workers.
I deployed Prefect Server with:
helm upgrade \
prefect-server \
prefect-server \
--install \
--repo=https://prefecthq.github.io/prefect-helm \
--namespace=hm-prefect \
--create-namespace \
--values=prefect-server-values.yaml
prefect-server-values.yaml:
server:
  image:
    repository: docker.io/prefecthq/prefect
    prefectTag: 2.11.0-python3.11-kubernetes
  publicApiUrl: https://prefect.mydomain.com/api
I deployed the Prefect Worker with:
helm upgrade \
prefect-worker \
prefect-worker \
--install \
--repo=https://prefecthq.github.io/prefect-helm \
--namespace=hm-prefect \
--create-namespace \
--values=prefect-worker-values.yaml
prefect-worker-values.yaml:
worker:
  image:
    repository: docker.io/prefecthq/prefect
    prefectTag: 2.11.0-python3.11-kubernetes
  apiConfig: server
  config:
    workPool: hm-kubernetes-pool
  serverApiConfig:
    apiUrl: http://prefect-server.hm-prefect.svc:4200/api
➜ helm list -n hm-prefect
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
prefect-server hm-prefect 1 2023-07-31 17:07:50.401888 -0700 PDT deployed prefect-server-2023.07.27 2.11.1
prefect-worker hm-prefect 1 2023-07-31 17:18:57.586027 -0700 PDT deployed prefect-worker-2023.07.27 2.11.1
➜ kubectl get deployment -n hm-prefect
NAME READY UP-TO-DATE AVAILABLE AGE
prefect-server 1/1 1 1 17m
prefect-worker 1/1 1 1 6m2s
I can see the Prefect Worker in the UI.
Then I generated the deployment YAML file with:
➜ prefect deployment build src/main.py:print_platform --name=print-platform --infra-block=kubernetes-job/print-platform-kubernetes-job-block --apply --pool=hm-kubernetes-pool
Found flow 'print-platform'
Deployment YAML created at '/Users/hongbo-miao/Clouds/Git/hongbomiao.com/hm-prefect/workflows/print-platform/print_platform-deployment.yaml'.
Deployment storage None does not have upload capabilities; no files uploaded. Pass --skip-upload to suppress this warning.
Deployment 'print-platform/print-platform' successfully created with id '7f7603ca-697c-4dca-9bcb-28a889165fe8'.
The generated file print_platform-deployment.yaml has the following content:
###
### A complete description of a Prefect Deployment for flow 'print-platform'
###
name: print-platform
description: null
version: e4da5dae95465f73a0e3e0bece1555bb
# The work queue that will handle this deployment's runs
work_queue_name: default
work_pool_name: hm-kubernetes-pool
tags: []
parameters: {}
schedule: null
is_schedule_active: true
infra_overrides: {}
###
### DO NOT EDIT BELOW THIS LINE
###
flow_name: print-platform
manifest_path: null
infrastructure:
  type: kubernetes-job
  env: {}
  labels: {}
  name: null
  command: null
  image: ghcr.io/hongbo-miao/hm-prefect-print-platform:latest
  namespace: hm-prefect
  service_account_name: null
  image_pull_policy: Always
  cluster_config: null
  job:
    apiVersion: batch/v1
    kind: Job
    metadata:
      labels: {}
    spec:
      template:
        spec:
          parallelism: 1
          completions: 1
          restartPolicy: Never
          containers:
          - name: prefect-job
            env: []
  customizations: []
  job_watch_timeout_seconds: null
  pod_watch_timeout_seconds: 60
  stream_output: true
  finished_job_ttl: null
  _block_document_id: 1f5b585c-581d-4ca4-adfa-c69dc5319941
  _block_document_name: print-platform-kubernetes-job-block
  _is_anonymous: false
  block_type_slug: kubernetes-job
  _block_type_slug: kubernetes-job
storage: null
path: /opt/prefect/flows
entrypoint: src/main.py:print_platform
parameter_openapi_schema:
  title: Parameters
  type: object
  properties: {}
  required: null
  definitions: null
timestamp: '2023-08-01T00:32:45.975410+00:00'
triggers: []
Next, I tried to run it:
➜ prefect deployment run print-platform/print-platform
Creating flow run for deployment 'print-platform/print-platform'...
Created flow run 'onyx-fennec'.
└── UUID: 065326e7-1d3e-455a-86fb-b15d553af5bd
└── Parameters: {}
└── Scheduled start time: 2023-07-31 17:32:50 PDT (now)
└── URL: https://prefect.mydomain.com/flow-runs/flow-run/065326e7-1d3e-455a-86fb-b15d553af5bd
However, this gave me an error:
Worker 'KubernetesWorker 180550e0-fe47-4a0d-998d-b772d53e14b0' submitting flow run '065326e7-1d3e-455a-86fb-b15d553af5bd'
Creating Kubernetes job...
Failed to submit flow run '065326e7-1d3e-455a-86fb-b15d553af5bd' to infrastructure.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/prefect_kubernetes/worker.py", line 628, in _create_job
job = batch_client.create_namespaced_job(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 276, in POST
return self.request("POST", url,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 235, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '7871421d-254d-4d72-9a30-a7ff3306822b', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'e5d21bfa-f8ff-4689-965a-2c8efc99569b', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f86dde2c-b36e-4c12-a44c-31e36a8ecf05', 'Date': 'Tue, 01 Aug 2023 00:32:51 GMT', 'Content-Length': '321'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:hm-prefect:prefect-worker\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/prefect/workers/base.py", line 834, in _submit_run_and_capture_errors
result = await self.run(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect_kubernetes/worker.py", line 506, in run
job = await run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect_kubernetes/worker.py", line 637, in _create_job
message += ": " + exc.body["message"]
~~~~~~~~^^^^^^^^^^^
TypeError: string indices must be integers, not 'str'
Completed submission of flow run '065326e7-1d3e-455a-86fb-b15d553af5bd'
Reported flow run '065326e7-1d3e-455a-86fb-b15d553af5bd' as crashed: Flow run could not be submitted to infrastructure
The problem seems to be in this line:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:hm-prefect:prefect-worker\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
I don't know why it tries to create the job in the namespace default instead of hm-prefect. Any ideas? Thanks!
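As a side note on reading the traceback: the trailing TypeError looks like a secondary failure. The 403 response body appears to reach the worker as a raw JSON string rather than a dict, so indexing it with "message" fails and hides the real cause. Below is a minimal Python sketch, using only the standard library and the body copied from the log above, of how to decode that body. The helper name status_message is hypothetical and is not part of Prefect or the kubernetes client.

```python
import json

# Body of the 403 response, copied from the worker log above.
# It is a raw JSON string, not a dict.
body = (
    r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    r'"message":"jobs.batch is forbidden: User '
    r'\"system:serviceaccount:hm-prefect:prefect-worker\" cannot create '
    r'resource \"jobs\" in API group \"batch\" in the namespace \"default\"",'
    r'"reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}'
)


def status_message(raw_body: str) -> str:
    """Hypothetical helper: decode a Kubernetes Status body and return its message."""
    return json.loads(raw_body)["message"]


# Indexing the undecoded string reproduces the TypeError from the traceback.
try:
    body["message"]
except TypeError as exc:
    print(f"TypeError: {exc}")

# Decoding first exposes the actual problem: the job targets namespace "default".
message = status_message(body)
print(message)
```

The decoded message confirms that the request went to the default namespace, where the prefect-worker service account has no permission to create jobs.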
Update 1:
➜ prefect init
? Would you like to initialize your deployment configuration with a recipe? [Use arrows to move; enter to select; n to select none]
┏━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ┃ Name ┃ Description ┃
┡━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ │ s3 │ Store code within an S3 bucket │
│ > │ docker │ Store code within a custom docker image alongside its runtime environment │
│ │ docker-s3 │ Store code within S3 and build a custom docker image for runtime │
│ │ docker-azure │ Store code within an Azure Blob Storage container and build a custom docker image for runtime │
│ │ azure │ Store code within an Azure Blob Storage container │
│ │ docker-gcs │ Store code within GCS and build a custom docker image for runtime │
│ │ docker-git │ Store code within a git repository and build a custom docker image for runtime │
│ │ local │ Store code on a local filesystem │
│ │ git │ Store code within git repository │
│ │ gcs │ Store code within a GCS bucket │
└────┴──────────────┴───────────────────────────────────────────────────────────────────────────────────────────────┘
No, I'll use the default deployment configuration.
Required inputs for 'docker' recipe
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Field Name ┃ Description ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ image_name │ The image name, including repository, to give the built Docker image │
│ tag │ The tag to give the built Docker image │
└────────────┴──────────────────────────────────────────────────────────────────────┘
image_name: ghcr.io/hongbo-miao/hm-prefect-print-platform
tag: latest
---------------
Created project in /Users/hongbo-miao/Clouds/Git/hongbomiao.com/hm-prefect/workflows/print-platform with the following new files:
prefect.yaml
I removed the build and push sections because my Docker image is already built. Here is my updated prefect.yaml:
name: print-platform
prefect-version: 2.11.1
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /opt/prefect/print-platform
deployments:
- name: print-platform
  version: null
  tags: []
  description: null
  schedule: {}
  flow_name: null
  entrypoint: src/main.py:print_platform
  parameters: {}
  work_pool:
    name: hm-kubernetes-pool
    work_queue_name: null
    job_variables:
      image: ghcr.io/hongbo-miao/hm-prefect-print-platform
I want to avoid the prompts, so this is how I deploy:
➜ prefect --no-prompt deploy
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Deployment 'print-platform/print-platform' successfully created with id 'e5bb4249-3a9f-4d62-bee2-fc9dce69fbd8'. │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
View Deployment in UI: https://prefect.hongbomiao.com/deployments/deployment/e5bb4249-3a9f-4d62-bee2-fc9dce69fbd8
To execute flow runs from this deployment, start a worker in a separate terminal that pulls work from the 'hm-kubernetes-pool' work pool:
$ prefect worker start --pool 'hm-kubernetes-pool'
To schedule a run for this deployment, use the following command:
$ prefect deployment run 'print-platform/print-platform'
Next, I ran it:
➜ prefect deployment run print-platform/print-platform
Creating flow run for deployment 'print-platform/print-platform'...
Created flow run 'charming-chimpanzee'.
└── UUID: 1f83d2ee-2584-424e-96ff-11e236ff7f1b
└── Parameters: {}
└── Scheduled start time: 2023-08-01 13:57:20 PDT (now)
└── URL: https://prefect.hongbomiao.com/flow-runs/flow-run/1f83d2ee-2584-424e-96ff-11e236ff7f1b
Worker 'KubernetesWorker 59a0fab6-b9c8-4668-b626-9a5cc0311250' submitting flow run '1f83d2ee-2584-424e-96ff-11e236ff7f1b'
Creating Kubernetes job...
Failed to submit flow run '1f83d2ee-2584-424e-96ff-11e236ff7f1b' to infrastructure.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 714, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 403, in _make_request
self._validate_conn(conn)
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1053, in _validate_conn
conn.connect()
File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 363, in connect
self.sock = conn = self._new_conn()
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0xffffac3a3c10>: Failed to establish a new connection: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/prefect/workers/base.py", line 834, in _submit_run_and_capture_errors
result = await self.run(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect_kubernetes/worker.py", line 506, in run
job = await run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect_kubernetes/worker.py", line 628, in _create_job
job = batch_client.create_namespaced_job(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 276, in POST
return self.request("POST", url,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 169, in request
r = self.pool_manager.request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/request.py", line 78, in request
return self.request_encode_body(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/request.py", line 170, in request_encode_body
return self.urlopen(method, url, **extra_kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/poolmanager.py", line 376, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 826, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 826, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 826, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 798, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='10.43.0.1', port=443): Max retries exceeded with url: /apis/batch/v1/namespaces/default/jobs (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0xffffac3a3c10>: Failed to establish a new connection: [Errno 113] No route to host'))
Completed submission of flow run '1f83d2ee-2584-424e-96ff-11e236ff7f1b'
Reported flow run '1f83d2ee-2584-424e-96ff-11e236ff7f1b' as crashed: Flow run could not be submitted to infrastructure
The error is a bit different this time. However, I am still lost.
1 Answer
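The problem here is how you're creating your deployment. This is a common point of confusion, so we're working to make it clearer in the docs.

TL;DR: when using workers, use prefect deploy instead of prefect deployment build ....

The basic problem is that deployments created with prefect deployment build are assumed to be executed by an agent, so they don't correctly inherit the infra config from your work pool. Set namespace on the k8s work pool that your new deployments point at, and create those new deployments with prefect deploy.

With prefect deploy (without any additional flags), an interactive wizard finds the flow entrypoints in your project. You select one, and the wizard then populates the deployment config according to the work pool you want, whether you want a schedule, and so on. At the end of the wizard, you can save the deployment's config for later use in CI or other non-interactive settings.

With Prefect workers and work pools, you can pass extra information about each deployment to the server, such as a pull step, which executes an arbitrary process when preparing the flow run (if needed), most commonly prefect.deployments.steps.git_clone. To define this information for each deployment, run prefect init in the root of your project directory. A prefect.yaml file will be created, where you can define your deployments along with their steps.

You can edit your k8s work pool (which replaces the KubernetesJob infra block as far as defining job variables goes) and then override values on a per-deployment basis (such as image or namespace).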
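To make the override idea concrete, here is a rough Python sketch of how per-deployment job variables could layer on top of work pool defaults. It is a simplified mental model, not Prefect's actual resolution code. The function resolve_job_variables is hypothetical, and the pool default of namespace "default" is an assumption inferred from the /namespaces/default/jobs URL in the error logs.

```python
# Simplified model of job-variable resolution; NOT Prefect's real implementation.

# Assumed work pool defaults: the namespace falls back to "default" unless overridden.
WORK_POOL_DEFAULTS = {"namespace": "default", "image": None}

IMAGE = "ghcr.io/hongbo-miao/hm-prefect-print-platform"


def resolve_job_variables(pool_defaults: dict, deployment_overrides: dict) -> dict:
    """Hypothetical helper: deployment-level values win over work pool defaults."""
    resolved = dict(pool_defaults)
    resolved.update(deployment_overrides)
    return resolved


# job_variables as in the prefect.yaml from the question: only image is set.
without_override = resolve_job_variables(WORK_POOL_DEFAULTS, {"image": IMAGE})

# Same deployment with a namespace override added under job_variables.
with_override = resolve_job_variables(
    WORK_POOL_DEFAULTS, {"image": IMAGE, "namespace": "hm-prefect"}
)

print(without_override["namespace"])
print(with_override["namespace"])
```

Under this model, either setting namespace on the work pool itself or adding namespace: hm-prefect next to image under job_variables in prefect.yaml should send the job to hm-prefect. Treat the exact key placement as an assumption to check against your work pool's base job template.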