Unable to deploy FastAPI with Celery on Heroku

busg9geu  posted on 2022-11-13 in Other

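I'm building an application whose main job is to scrape text from LinkedIn profiles, process it, and return a dictionary of the most common words in those profiles. Everything runs fine on my local machine, but problems appeared once I deployed the app to Heroku. The scraping takes about 5-7 minutes, so I was hitting Heroku's request timeout. To avoid that, I added Celery to the project so the scraping runs in the background. Now I can't get this setup to deploy and run smoothly on Heroku.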

Project structure:

web-sourcing-tools
├── app
│   ├── agents
│   │   ├── __init__.py
│   │   ├── data_processing.py
│   │   ├── scraper.py
│   │   └── string_builder.py
│   ├── library
│   │   └── helpers.py
│   ├── pages
│   │   ├── __init__.py
│   │   └── home.md
│   └── __init__.py
├── static
│   ├── css
│   │   ├── mystyle.css
│   │   └── style3.css
│   └── images
│       └── favicon.ico
├── templates
│   ├── include
│   │   ├── sidebar.html
│   │   └── topnav.html
│   ├── base.html
│   ├── form.html
│   └── page.html
├── .gitignore
├── __init__.py
├── main.py
├── nltk.txt
├── Procfile
├── README.md
├── requirements.txt
├── runtime.txt
└── tasks.py

Procfile:

web: gunicorn -w 4 -k uvicorn.workers.UvicornWorker main:app
worker: celery worker --app=tasks.app

runtime.txt:

runtime.txt

tasks.py:

from celery import Celery
import os
from app.agents.scraper import Scraper

app = Celery(__name__)
app.conf.update(
    BROKER_URL=os.environ["REDIS_URL"],
    CELERY_RESULT_BACKEND=os.environ["REDIS_URL"]
)

@app.task(name="scraper")
def scraper(username, password, query, n_pages):
    results = Scraper(username, password, query, n_pages)
    return results

main.py:

from fastapi import FastAPI, Request, Form
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles
from app.library.helpers import *
from app.agents.string_builder import string_builder
from tasks import scraper

LOGIN = os.environ.get("LOGIN")
PASS = os.environ.get("PASS")

app = FastAPI()
templates = Jinja2Templates(directory="templates")
app.mount("/static", StaticFiles(directory="static"), name="static")

@app.get("/", response_class=HTMLResponse)
async def home(request: Request):
    data = openfile("home.md")
    return templates.TemplateResponse("page.html", {"request": request, "data": data})

@app.post("/common-words")
def form_post(
    request: Request,
    string_or: str = Form(...),
    string_and: str = Form(...),
    string_not: str = Form(...),
):
    query = string_builder(OR=string_or, AND=string_and, NOT=string_not)
    n_page = 2
    task = scraper.delay(LOGIN, PASS, query, n_page)
    return templates.TemplateResponse(
        "form.html", context={"request": request, "result": task.get()}
    )

@app.get("/common-words")
def form_post(request: Request):

    result = ""
    return templates.TemplateResponse(
        "form.html", context={"request": request, "result": result}
    )

if __name__ == "__main__":
    app.run()

Errors from the Heroku console:

2022-01-17T23:42:38.383531+00:00 heroku[router]: at=info method=GET path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=21dd948a-b7e5-46f8-8c1c-9b5a3f091592 fwd="95.175.20.47" dyno=web.1 connect=0ms service=7ms status=200 bytes=6691 protocol=https
2022-01-17T23:43:11.505703+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=POST path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=59465f6f-27d0-4583-83e8-40e6e6e5bd8d fwd="95.175.20.47" dyno=web.1 connect=0ms service=30000ms status=503 bytes=0 protocol=https
2022-01-17T23:43:12.148229+00:00 app[web.1]: 95.175.20.47:0 - "GET /favicon.ico HTTP/1.1" 404
2022-01-17T23:43:12.149208+00:00 heroku[router]: at=info method=GET path="/favicon.ico" host=web-sourcing-tools.herokuapp.com request_id=e079f8a2-a58b-4b3c-8bda-c2d4acd362ef fwd="95.175.20.47" dyno=web.1 connect=0ms service=3ms status=404 bytes=173 protocol=https
2022-01-17T23:44:10.922495+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=POST path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=2664e65d-30a9-485f-8048-f67515d624a4 fwd="95.175.20.47" dyno=web.1 connect=0ms service=30000ms status=503 bytes=0 protocol=https
2022-01-17T23:44:15.101837+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=POST path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=7c88d428-e3e5-4b0e-88f9-4769ac229c24 fwd="95.175.20.47" dyno=web.1 connect=0ms service=30000ms status=503 bytes=0 protocol=https
2022-01-17T23:42:56.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=5 sample#load-avg-1m=0.16 sample#load-avg-5m=0.205 sample#load-avg-15m=0.215 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15619140kB sample#memory-free=10414152kB sample#memory-cached=2560180kB sample#memory-redis=433568bytes sample#hit-rate=0.21569 sample#evicted-keys=0
2022-01-17T23:46:40.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=8 sample#load-avg-1m=0.095 sample#load-avg-5m=0.15 sample#load-avg-15m=0.185 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15619140kB sample#memory-free=10413852kB sample#memory-cached=2560192kB sample#memory-redis=499248bytes sample#hit-rate=0.21053 sample#evicted-keys=0
2022-01-17T23:50:40.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=4 sample#load-avg-1m=0.175 sample#load-avg-5m=0.14 sample#load-avg-15m=0.17 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15619140kB sample#memory-free=10414380kB sample#memory-cached=2560276kB sample#memory-redis=415400bytes sample#hit-rate=0.21053 sample#evicted-keys=0
2022-01-17T23:54:36.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=4 sample#load-avg-1m=0.09 sample#load-avg-5m=0.1 sample#load-avg-15m=0.145 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15619140kB sample#memory-free=10418696kB sample#memory-cached=2560544kB sample#memory-redis=415400bytes sample#hit-rate=0.21053 sample#evicted-keys=0
2022-01-17T23:58:20.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=4 sample#load-avg-1m=0.18 sample#load-avg-5m=0.135 sample#load-avg-15m=0.145 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15619140kB sample#memory-free=10418720kB sample#memory-cached=2560560kB sample#memory-redis=415400bytes sample#hit-rate=0.21053 sample#evicted-keys=0
2022-01-18T00:02:16.000000+00:00 app[heroku-redis]: source=REDIS addon=redis-closed-93849 sample#active-connections=4 sample#load-avg-1m=0.355 sample#load-avg-5m=0.315 sample#load-avg-15m=0.215 sample#read-iops=0 sample#write-iops=0.063241 sample#memory-total=15619140kB sample#memory-free=10421644kB sample#memory-cached=2560532kB sample#memory-redis=415400bytes sample#hit-rate=0.21053 sample#evicted-keys=0
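The router entries above can be read mechanically: every failing line is a POST to /common-words with code=H12 and service=30000ms, meaning the request was cut off after 30 seconds. Below is a minimal sketch of pulling those fields out, assuming the key=value layout seen in these lines. The names `parse_router_line` and `timed_out_requests` are made up for illustration, and the two sample lines are copied from the log above.

```python
import re

# Matches key=value pairs in a Heroku router log line; values may be quoted.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')

# Two router lines copied verbatim from the log above.
SAMPLE = [
    '2022-01-17T23:42:38.383531+00:00 heroku[router]: at=info method=GET path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=21dd948a-b7e5-46f8-8c1c-9b5a3f091592 fwd="95.175.20.47" dyno=web.1 connect=0ms service=7ms status=200 bytes=6691 protocol=https',
    '2022-01-17T23:43:11.505703+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=POST path="/common-words" host=web-sourcing-tools.herokuapp.com request_id=59465f6f-27d0-4583-83e8-40e6e6e5bd8d fwd="95.175.20.47" dyno=web.1 connect=0ms service=30000ms status=503 bytes=0 protocol=https',
]


def parse_router_line(line):
    """Return the key=value fields of one router log line as a dict."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def timed_out_requests(lines):
    """Yield (method, path, service_ms) for every router line with code=H12."""
    for line in lines:
        if "heroku[router]" not in line:
            continue  # skip app and add-on lines such as heroku-redis metrics
        fields = parse_router_line(line)
        if fields.get("code") == "H12":
            # "30000ms" -> 30000
            service_ms = int(fields["service"].rstrip("ms"))
            yield fields["method"], fields["path"], service_ms


if __name__ == "__main__":
    for method, path, service_ms in timed_out_requests(SAMPLE):
        print(method, path, service_ms)
```

Only the POST handler times out. The GET for the same path returns in 7 ms, which suggests the web dyno itself is up and the delay comes from whatever the POST waits on.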

In tasks.py, I import my main script with from app.agents.scraper import Scraper. That class returns a dict of word and quantity values.
In Heroku, I added config vars as follows:

Do you have any idea where I'm going wrong?

Answer #1 (pkmbmrz7)

LinkedIn may be blocking or rate-limiting the IP ranges of cloud hosting services. See https://github.com/spinlud/linkedin-jobs-scraper/issues/10#issuecomment-692537789

Answer #2 (bxfogqkk)

Did you activate the worker dyno on the Resources page in Heroku? In my case, that was the step I had missed.

You can also update your Procfile like this:

web: gunicorn -w 4 -k uvicorn.workers.UvicornWorker main:app
worker: celery -A tasks worker
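Separately from the Procfile, one more thing may be worth checking. In the question's main.py, form_post calls task.get(), which waits for the worker's result, so the POST can still run as long as the scrape and hit the 30-second H12 limit even with a healthy worker. A common shape is to return a job id immediately and let the client poll for the result. The sketch below shows only that shape, and it uses a ThreadPoolExecutor as a stand-in for Celery and Redis so it runs on its own. `slow_scrape`, `submit`, and `status` are hypothetical names. In the real app the rough equivalents would be `scraper.delay(...).id` and checking `AsyncResult(task_id).ready()`.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the Celery worker and result backend (assumption: used here
# only so the sketch runs without a broker; it is not the asker's setup).
executor = ThreadPoolExecutor(max_workers=1)
jobs = {}  # job id -> Future


def slow_scrape(query):
    """Hypothetical placeholder for the long-running Scraper call."""
    time.sleep(0.2)
    return {"query": query, "words": {}}


def submit(query):
    """Start the job and return its id right away, without waiting for it."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(slow_scrape, query)
    return job_id


def status(job_id):
    """Report the job state without blocking; include the result once done."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "PENDING"}
    return {"state": "SUCCESS", "result": future.result()}


if __name__ == "__main__":
    job = submit("python")
    print(status(job))           # returns immediately while the job runs
    jobs[job].result(timeout=5)  # only the demo waits; a web handler would not
    print(status(job))
```

With this shape the POST handler would respond with the id straight away, and a second endpoint or page refresh would call the status check until the result is ready.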
