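I can't figure out what is causing the trouble.
My setup:
- A Kubernetes (v1.26) cluster with one master node and one worker node, self-deployed on virtual machines
- An Nginx reverse proxy (currently on the master node)
- A basic FastAPI pod with Deployment, Service, and Ingress YAML (below)
I deployed exactly the same environment on another cloud provider with no trouble at all.
Here, everything works fine for a while and the API is reachable from the browser, then requests fail with a 504 Gateway Timeout error. Restarting the Nginx pod fixes the problem for an unpredictable amount of time before it comes back. I have seen the connection fail and recover a few minutes apart; at the time of writing, it has been working for an hour without interruption.
Here are the nginx logs between a successful request and a timeout: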
X.X.X.X - - [09/Feb/2023:12:30:18 +0000] "GET /docs HTTP/1.1" 200 952 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0" 373 0.019 [my-app-8005] [] 172.16.180.6:8005 952 0.019 200 22cd1b13ef2dcbf4b1be2983649f658c
X.X.X.X - - [09/Feb/2023:12:30:19 +0000] "GET /openapi.json HTTP/1.1" 200 5868 "http://xxxx/docs" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0" 323 0.003 [my-app-8005] [] 172.16.180.6:8005 5868 0.003 200 46551c8481d446ec69de2399f49b7f86
I0209 12:31:13.983933 7 queue.go:87] "queuing" item="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ManagedFields:[]ManagedFieldsEntry{},}"
I0209 12:31:13.984018 7 queue.go:128] "syncing" key="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ManagedFields:[]ManagedFieldsEntry{},}"
I0209 12:31:13.990418 7 status.go:275] "skipping update of Ingress (no change)" namespace="namespace" ingress="app-ingress-xxxx"
I0209 12:32:13.983857 7 queue.go:87] "queuing" item="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ManagedFields:[]ManagedFieldsEntry{},}"
I0209 12:32:13.983939 7 queue.go:128] "syncing" key="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ManagedFields:[]ManagedFieldsEntry{},}"
I0209 12:32:13.990895 7 status.go:275] "skipping update of Ingress (no change)" namespace="namespace" ingress="app-ingress-xxxx"
2023/02/09 12:32:59 [error] 30#30: *4409 upstream timed out (110: Operation timed out) while connecting to upstream, client: X.X.X.X , server: xxxx, request: "GET /docs HTTP/1.1", upstream: "http://172.16.180.6:8005/docs", host: "xxxx"
2023/02/09 12:33:04 [error] 30#30: *4409 upstream timed out (110: Operation timed out) while connecting to upstream, client: X.X.X.X , server: xxxx, request: "GET /docs HTTP/1.1", upstream: "http://172.16.180.6:8005/docs", host: "xxxx"
2023/02/09 12:33:09 [error] 30#30: *4409 upstream timed out (110: Operation timed out) while connecting to upstream, client: X.X.X.X , server: xxxx, request: "GET /docs HTTP/1.1", upstream: "http://172.16.180.6:8005/docs", host: "xxxx"
X.X.X.X - - [09/Feb/2023:12:33:09 +0000] "GET /docs HTTP/1.1" 504 160 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0" 373 15.004 [my-app-8005] [] 172.16.180.6:8005, 172.16.180.6:8005, 172.16.180.6:8005 0, 0, 0 5.001, 5.001, 5.001 504, 504, 504 56fb622d8d89d8d7b3cdbc4a094215c3
YAML configuration files:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress-xxxx
spec:
  ingressClassName: nginx
  rules:
    - host: xxxx
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 8005
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: namespace
spec:
  progressDeadlineSeconds: 3600
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: backend
          image: xxxx
          imagePullPolicy: Always
          ports:
            - containerPort: 8005
      imagePullSecrets:
        - name: xxxx
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: namespace
  labels:
    app: my-app
spec:
  type: NodePort
  ports:
    - nodePort: 30008
      port: 8005
      protocol: TCP
  selector:
    app: my-app
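I changed the application names and IPs for this post.
To be clear: while requests through nginx are timing out, I can still reach the API at the worker-ip:nodePort address, and I can SSH into the master and curl the FastAPI pod using the ClusterIP.
My first guess is a memory issue, although nothing else is running on the servers right now. I just installed the Kubernetes metrics API and am waiting for the next outage; no problems so far.
What could be causing this behavior? Any suggestions on what to check further would be appreciated!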
1 Answer
yeotifhr #1
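If you are getting a 504 Gateway Timeout error, your system may be running low on resources. Try increasing the resources available to your environment. A 504 error means nginx waited too long for a response and timed out. In addition, you need to add ingress annotations to your YAML configuration file. By default, proxy_read_timeout is 60s:
Defines a timeout for reading a response from the proxied server. The timeout is set only between two successive read operations, not for the transmission of the whole response. If the proxied server does not transmit anything within this time, the connection is closed. See the documentation for details.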
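As a rough illustration of the annotations mentioned above, the Ingress from the question could carry the per-Ingress timeout annotations that ingress-nginx provides. This is only a sketch: the numeric values are assumptions chosen for illustration, not recommendations, and the rest of the manifest is copied from the question. Also note that the error lines in the question read "while connecting to upstream", with three attempts of about 5 seconds each, so the connect timeout may be more relevant here than the read timeout. That is an interpretation of the log, not something confirmed.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress-xxxx
  annotations:
    # Values are in seconds and must be quoted strings.
    # The numbers below are illustrative assumptions, not tuned settings.
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
spec:
  ingressClassName: nginx
  rules:
    - host: xxxx
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 8005
```

Raising these timeouts only gives nginx more time to wait. If the pod cannot be reached from the controller at all, longer timeouts will delay the 504 rather than remove it, so it is still worth checking connectivity between the nginx pod and the application pod.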