PostgreSQL: how do I sum all attributes in a JSONB field?

dhxwm5r4  posted on 2023-03-04 in PostgreSQL

I am using Postgres 9.4. I have a table with a JSONB field:

     Column      │         Type         │                             Modifiers
─────────────────┼──────────────────────┼────────────────────────────────────────────────────────────────────
 id              │ integer              │ not null default
 practice_id     │ character varying(6) │ not null
 date            │ date                 │ not null
 pct_id          │ character varying(3) │
 astro_pu_items  │ double precision     │ not null
 astro_pu_cost   │ double precision     │ not null
 star_pu         │ jsonb                │

I can query the raw value of the JSONB field just fine:

SELECT star_pu FROM mytable limit 1;
star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877}

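Now I want to get the SUM of the JSONB values across the whole table, but I can't work out how to do it. Ideally I would get back a dictionary with the keys shown above, where each value is the sum for that key.
I can explicitly SUM a single JSONB key like this: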

SELECT date, SUM(total_list_size) as total_list_size, 
    SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items
    FROM mytable GROUP BY date ORDER BY date

But how do I sum all attributes in the JSONB field, preferably returning the whole field as a dictionary? Ideally I would get back something like this:

star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ...

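I suppose I could get each field manually by explicitly summing every key, but the whole reason I am using a JSONB field is that there are a lot of keys, and they may change.
It is safe to assume that the JSONB field contains only keys and values, i.e. it has a depth of 1.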

ru9i0ody1#

The following query should do the job:

select date, json_object_agg(key, val)
from (
    select date, key, sum(value::numeric) val
    from mytable t, jsonb_each_text(star_pu)
    group by date, key
    ) s
group by date;

The keys in the resulting json value will be sorted alphabetically (a side effect of json_object_agg()); I don't know whether that matters.
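
The query above returns one object per date. Since the question also asks for totals over the whole table, here is a minimal sketch of the same approach with the date grouping dropped. It assumes the mytable table and star_pu column from the question; the star_pu_total alias is only illustrative.

```sql
-- Sketch: sum every key of star_pu over the whole table (no grouping by date).
-- jsonb_each_text expands each row's object into (key, value) text pairs;
-- rows whose star_pu is NULL produce no pairs and are skipped.
SELECT json_object_agg(key, val) AS star_pu_total
FROM (
    SELECT key, sum(value::numeric) AS val
    FROM mytable, jsonb_each_text(star_pu)
    GROUP BY key
) s;
```

This should return a single row holding one json object with one summed value per key. If a jsonb result is needed, the result can be cast with ::jsonb.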

yuvru6vn2#

I have written a Postgres extension that does this. Once you have installed it, you can run:

SELECT jsonb_deep_sum(star_pu) FROM mytable;

In a benchmark on 2 million rows this took 4 seconds, compared with 11 seconds for @klin's answer.

ezykj2lf3#

There may be a better way, but at least this one works:

WITH
  keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
  sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key)
  SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums

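Basically, it breaks the jsonb apart into rows, takes the key names from those rows, sums the values for each key, aggregates the results into arrays, and then builds a jsonb structure from them. Unfortunately there is no jsonb_object() function in Postgres 9.4, so we have to build the result as json and then cast it to jsonb.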
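
For readers on a newer server, here is a hedged variant of the same query. It assumes PostgreSQL 9.5 or later, where jsonb_object_agg is available, and the same mytable table and star_pu column; the star_pu_total alias is only illustrative.

```sql
-- Assumes PostgreSQL 9.5+ (jsonb_object_agg is not available in 9.4).
WITH
  -- every distinct key that appears in any row's star_pu object
  keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
  -- for each key, sum its value over all rows (missing keys yield NULL, which sum ignores)
  sums AS (SELECT key, sum((star_pu->>key)::float) AS total
           FROM keys, mytable
           GROUP BY key)
SELECT jsonb_object_agg(key, total) AS star_pu_total FROM sums;
```

Besides avoiding the json-to-jsonb cast, this keeps the totals as JSON numbers. json_object() built from two text arrays stores every value as a string.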

brc7rcf04#

If you need to sum the values in a JSONB column in PostgreSQL, you can use a custom aggregate function. I have written one for this task, defined below.

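-- Adds two flat jsonb objects key by key.
create function custom_jsonb_add(a jsonb, b jsonb) returns jsonb as $$
    -- put the (key, value) pairs of both objects into one row set
    with exploded as (select *
                from jsonb_each(a) as t(key, value)
                union all
                select *
                from jsonb_each(b) as t(key, value)),
         -- sum the values that share a key
         folded as (select key, sum(value::numeric) as value
                from exploded
                group by key)
    -- rebuild a single jsonb object from the per-key sums
    select jsonb_object_agg(key, value)
    from folded
$$ language SQL immutable strict;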

create aggregate custom_jsonb_add_agg(jsonb) (
    sfunc = custom_jsonb_add,
    stype = jsonb,
    initcond = '{}'
);
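
Once both objects exist, the aggregate can presumably be called like any built-in aggregate. This is a minimal usage sketch, assuming the mytable table and star_pu column from the question; the star_pu_total alias is only illustrative. Note that jsonb_object_agg needs PostgreSQL 9.5 or later, and the direct jsonb-to-numeric cast in the function body needs, as far as I know, PostgreSQL 11 or later, so this will not run on the 9.4 server mentioned in the question.

```sql
-- Totals over the whole table, returned as a single jsonb object.
SELECT custom_jsonb_add_agg(star_pu) AS star_pu_total
FROM mytable;

-- Totals per date, one jsonb object per group.
SELECT date, custom_jsonb_add_agg(star_pu) AS star_pu_total
FROM mytable
GROUP BY date
ORDER BY date;
```

Because the state function is declared strict, rows whose star_pu is NULL should simply be skipped by the aggregate.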
