postgresql 第一次执行查询时，Sqlalchemy很慢

rekjcdws 于 2023-02-22 发布在 PostgreSQL

关注(0)|答案(1)|浏览(508)

我在python3.10中使用Sqlalchemy（2.0.3），在新的容器 Boot 后，执行特定的查询需要大约2.2s，执行同一个查询的所有连续调用需要大约70 ms。我使用PostgreSQL，在DataGrip中执行原始查询需要40- 70 ms。

self._Session = async_sessionmaker(self._engine, expire_on_commit=False)

...

@property
def session(self):
   return self._Session

...

async with PostgreSQL().session.begin() as session:
    total_functions = aliased(db_models.Function)
    finished_functions = aliased(db_models.Function)
    failed_functions = aliased(db_models.Function)
    stmt = (
        select(
            db_models.Job,
            func.count(distinct(total_functions.id)).label("total"),
            func.count(distinct(finished_functions.id)).label("finished"),
            func.count(distinct(failed_functions.id)).label("failed")
        )
        .where(db_models.Job.project_id == project_id)
        .outerjoin(db_models.Job.packages)
        .outerjoin(db_models.Package.modules)
        .outerjoin(db_models.Module.functions.of_type(total_functions))
        .outerjoin(finished_functions, and_(
               finished_functions.module_id == db_models.Module.id,
               finished_functions.progress == db_models.FunctionProgress.FINISHED
            ))
        .outerjoin(failed_functions, and_(
            failed_functions.module_id == db_models.Module.id,
            or_(
                failed_functions.state == db_models.FunctionState.FAILED,
                failed_functions.state == db_models.FunctionState.TERMINATED,
            ))
           )
        .group_by(db_models.Job.id)
    )
    start = time.time()
    yappi.set_clock_type("WALL")
    with yappi.run():
        job_infos = await session.execute(stmt)
    yappi.get_func_stats().print_all()
    end = time.time()

我尝试并发现的事情：

问题与连接或查询数据库数据库无关。在服务 Boot 时，我建立连接并进行一些其他查询。
问题很可能与缓存无关。我用query_cache_size=0禁用了缓存，但是我不能100%确定它是否有效，因为文档中说：

与工作单元持久性相关的ORM函数以及一些属性加载策略将使用主缓存之外的单独的每个Map器缓存。

Profiler没有显示任何引起我注意的内容：

..urrency_py3k.py:130 greenlet_spawn  2/1    0.000000  2.324807  1.162403  
..rm/session.py:2168 Session.execute  1      0.000028  2.324757  2.324757  
..0 _UnixSelectorEventLoop._run_once  11     0.000171  2.318555  0.210778  
..syncpg_cursor._prepare_and_execute  1      0.000054  2.318187  2.318187  
..cAdapt_asyncpg_connection._prepare  1      0.000020  2.316333  2.316333  
..nnection.py:533 Connection.prepare  1      0.000003  2.316154  2.316154  
..nection.py:573 Connection._prepare  1      0.000017  2.316151  2.316151  
..n.py:359 Connection._get_statement  2/1    0.001033  2.316122  1.158061  
..ectors.py:452 EpollSelector.select  11     0.000094  2.315352  0.210487  
..y:457 Connection._introspect_types  1      0.000025  2.314904  2.314904  
..ction.py:1669 Connection.__execute  1      0.000027  2.314879  2.314879  
..ion.py:1699 Connection._do_execute  1      2.314095  2.314849  2.314849  
...py:2011 Session._execute_internal  1      0.000034  0.006174  0.006174

我还看到可以禁用每个连接的缓存：

with engine.connect().execution_options(compiled_cache=None) as conn:
    conn.execute(table.select())

然而，我的工作与ORM层，不知道如何应用于我的情况下。
你知道为什么会推迟吗？

postgresql

来源：https://stackoverflow.com/questions/75477373/sqlalchemy-is-slow-when-doing-query-the-first-time

1条答案

按热度按时间

dl5txlt91#

经过几个小时的谷歌搜索，我发现了这篇文章。简而言之，问题是与缺乏依赖关系（在一些阿尔卑斯山 Docker 图像），这是由Postgres使用的JIT所需的。有关细节，我真的建议阅读后和real-life impact作者提供。
Sqlalchemy的实际解决方案是关闭JIT：

engine = create_async_engine(
    "postgresql+asyncpg://user:password@localhost/tmp",
    connect_args={"server_settings": {"jit": "off"}},
)

参考文件

赞(0）回复(0）举报 2023-02-22

我来回答

postgresql 第一次执行查询时，Sqlalchemy很慢

1条答案

相关问题

热门标签

最新问答