I have a fairly simple FastAPI application that loads a numpy array and defines a few API endpoints.
import numpy as np
import pandas as pd
import logging
from fastapi import FastAPI
app = FastAPI()
logging.basicConfig(level=logging.DEBUG)
logging.info('Loading texts')
texts = pd.read_csv('cleaned.csv')
logging.info('Loading embeddings')
embeddings = np.load('laser-2020-04-30.npy') # 3.7G
logging.info('Loading completed!')
# some API endpoints below...
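As an aside, when an array this large only needs to be read, one way to keep worker startup cheap is to memory-map the file instead of loading it eagerly. This is not part of the original setup, just a sketch; the file path here is a stand-in for `laser-2020-04-30.npy`:

```python
import os
import tempfile

import numpy as np

# A small array written to disk stands in for the real 3.7 GB file.
path = os.path.join(tempfile.mkdtemp(), "embeddings.npy")
np.save(path, np.arange(12, dtype=np.float32).reshape(4, 3))

# mmap_mode='r' maps the file into the process instead of reading it
# all at once; pages are pulled in lazily on access, so startup is
# fast and multiple worker processes can share the same file cache.
embeddings = np.load(path, mmap_mode="r")

print(embeddings.shape)         # (4, 3)
print(float(embeddings[2, 1]))  # 7.0
```

The object behaves like a regular read-only ndarray for indexing and slicing, which is usually all an embedding-lookup endpoint needs.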
I can run this application with plain Python 3.7 without any problems, and it also runs fine under vanilla gunicorn. The problem appears when everything runs inside a Docker container (still using gunicorn): it seems to get stuck while loading the large numpy array and keeps booting new workers.
[2020-05-11 08:33:20 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-05-11 08:33:20 +0000] [1] [DEBUG] Arbiter booted
[2020-05-11 08:33:20 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2020-05-11 08:33:20 +0000] [1] [INFO] Using worker: sync
[2020-05-11 08:33:20 +0000] [7] [INFO] Booting worker with pid: 7
[2020-05-11 08:33:20 +0000] [1] [DEBUG] 1 workers
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:33:35 +0000] [18] [INFO] Booting worker with pid: 18
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:33:51 +0000] [29] [INFO] Booting worker with pid: 29
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:05 +0000] [40] [INFO] Booting worker with pid: 40
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:19 +0000] [51] [INFO] Booting worker with pid: 51
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:36 +0000] [62] [INFO] Booting worker with pid: 62
I set the number of workers to 1 and increased the timeout to 900 seconds, yet it still boots a new worker every 10-15 seconds.
The command that runs the application in my Dockerfile looks like this (each flag and its value must be a separate list element in the exec form, so "-b 0.0.0.0:8080" and "--timeout 900" are split here):
CMD ["gunicorn", "-b", "0.0.0.0:8080", "main:app", "--timeout", "900", "--log-level", "debug", "--workers", "1", "--graceful-timeout", "900"]
1 Answer
To solve this, I simply increased the amount of RAM the Docker container is allowed to use. On my 2019 MacBook's Docker installation the default is 2 GB; since the numpy array is 3.7 GB, that is why it could not be loaded. The workers were being killed for running out of memory during the load, and gunicorn kept restarting them.
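On Docker Desktop the limit is raised in the GUI (Preferences → Resources), but when running a container directly the same effect can be had per container with the `-m` flag. A sketch, where the image name and sizes are assumptions:

```shell
# Allow enough memory for the 3.7 GB array plus interpreter overhead;
# "my-api" is a placeholder image name, 8080 matches the gunicorn bind.
docker run -m 6g -p 8080:8080 my-api
```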