表架构
我有一个按天分区的分区表“solutions”,如下所示:
解决方案:
- 解决方案_20230115
- 解决方案_20230116
- 解决方案_20230117
...
- 解决方案_20230314
- 解决方案_20230315
- 解决方案_旧版
每天都会为新数据创建新分区'solutuons_yyyymmdd'。分区'solutions_legacy'存储表'solutions'分区之前的时间段内的旧数据,当前为从MINVALUE到'2023-01- 15'。
我的目标
我必须为旧数据创建新分区,并将数据从分区“solutions_legacy”迁移到新创建的分区。
我的困难
问题是我必须在线完成,以便在迁移数据时最终用户可以使用表“解决方案”(至少是最近的分区)。
我的解决方案
这个想法是将'solutions_legacy'从'solutions'中分离出来,并将其作为一个单独的表使用,而'solutions'则可供最终用户使用。
- 分离“解决方案_旧版”
- 创建新表'solutions_yyyymmdd'
- 将相应的数据从“解决方案_旧版”迁移到“解决方案_yyyymmdd”
- 将两个表作为分区附加到“解决方案”
这样,我想只有在第一步分离“solutions_legacy”时才会有一些停机时间。
我的代码
create or replace PROCEDURE attach_partition(p_name text, p_start date default '2020-01-01', p_end date default '2099-01-01')
as $$
begin
RAISE NOTICE '%: Attaching partition ''%''...', clock_timestamp(), p_name;
execute format(
'ALTER TABLE solutions ATTACH PARTITION %I
FOR VALUES FROM (%L) TO (%L)', p_name, p_start, p_end);
RAISE NOTICE '% ''%'' has been attached', clock_timestamp(), p_name;
end;
$$ LANGUAGE plpgsql;
create or replace PROCEDURE detach_partition(p_name text)
as $$
begin
RAISE NOTICE '%: Detaching ''%''...', clock_timestamp(), p_name;
execute format('ALTER TABLE solutions DETACH PARTITION %I', p_name);
RAISE NOTICE '%: ''%'' has been detached', clock_timestamp(), p_name;
end;
$$ LANGUAGE plpgsql;
create or replace PROCEDURE create_table(p_name text, p_start date, p_end date)
as $$
begin
RAISE NOTICE '%: Creating new table ''%'' for values from % to %...', clock_timestamp(), p_name, p_start, p_end;
execute format('CREATE TABLE IF NOT EXISTS %I
(LIKE solutions INCLUDING DEFAULTS)', p_name);
RAISE NOTICE '%: Table ''%'' has been created', clock_timestamp(), p_name;
end;
$$ LANGUAGE plpgsql;
create or replace PROCEDURE migrate_data(p_name text, p_start date, p_end date)
as $$
begin
RAISE NOTICE '%: Migrating data to ''%''...', clock_timestamp(), p_name;
execute format('
WITH moved_rows AS (
DELETE FROM public.solutions_legacy sl
WHERE created_at >= %L
and created_at < %L
RETURNING sl.*
)
INSERT INTO %I
SELECT * FROM moved_rows', p_start, p_end, p_name);
RAISE NOTICE '%: Data has been migrated', clock_timestamp();
end;
$$ LANGUAGE plpgsql;
do
$$
declare
new_partitions_count int := 3;
rec record;
partition_start date;
partition_end date;
partition_name text;
begin
call detach_partition('solutions_legacy');
-- !!! HERE IS WHERE I TRY TO COMMIT
RAISE NOTICE '%: Querying ''solutions_legacy'' to determine % new partitions...', clock_timestamp(), new_partitions_count;
for rec in
select distinct date_trunc('day', created_at)::date d
from public.solutions_legacy
order by d desc limit new_partitions_count
loop
partition_start := rec.d;
partition_end := rec.d + 1;
partition_name := 'solutions_' || TO_CHAR(partition_start, 'yyyymmdd') ;
call create_table(partition_name, partition_start, partition_end);
call migrate_data(partition_name, partition_start, partition_end);
call attach_partition(partition_name, partition_start, partition_end);
end loop;
call attach_partition('solutions_legacy', p_end => partition_start);
end;
$$ LANGUAGE plpgsql;
下面是一个日志生成器示例,以便更好地理解:
NOTICE: 2023-03-15 10:28:02.306654+00: Detaching 'solutions_legacy'...
NOTICE: 2023-03-15 10:31:07.60235+00: 'solutions_legacy' has been detached
NOTICE: 2023-03-15 10:31:07.602576+00: Querying 'solutions_legacy' to determine 3 new partitions...
NOTICE: 2023-03-15 10:33:29.410183+00: Creating new table 'solutions_20230114' for values from 2023-01-14 to 2023-01-15...
NOTICE: 2023-03-15 10:33:29.416459+00: Table 'solutions_20230114' has been created
NOTICE: 2023-03-15 10:33:29.417196+00: Migrating data to 'solutions_20230114'...
NOTICE: 2023-03-15 10:36:14.731551+00: Data has been migrated
NOTICE: 2023-03-15 10:36:14.732235+00: Attaching partition 'solutions_20230114'...
NOTICE: 2023-03-15 10:37:51.100635+00 'solutions_20230114' has been attached
NOTICE: 2023-03-15 10:37:51.100761+00: Creating new table 'solutions_20230113' for values from 2023-01-13 to 2023-01-14...
NOTICE: 2023-03-15 10:37:51.102272+00: Table 'solutions_20230113' has been created
NOTICE: 2023-03-15 10:37:51.102315+00: Migrating data to 'solutions_20230113'...
NOTICE: 2023-03-15 10:40:10.749074+00: Data has been migrated
NOTICE: 2023-03-15 10:40:10.749182+00: Attaching partition 'solutions_20230113'...
NOTICE: 2023-03-15 10:41:47.559556+00 'solutions_20230113' has been attached
NOTICE: 2023-03-15 10:41:47.559661+00: Creating new table 'solutions_20230112' for values from 2023-01-12 to 2023-01-13...
NOTICE: 2023-03-15 10:41:47.561249+00: Table 'solutions_20230112' has been created
NOTICE: 2023-03-15 10:41:47.561291+00: Migrating data to 'solutions_20230112'...
NOTICE: 2023-03-15 10:44:07.43721+00: Data has been migrated
NOTICE: 2023-03-15 10:44:07.43729+00: Attaching partition 'solutions_20230112'...
NOTICE: 2023-03-15 10:45:41.37163+00 'solutions_20230112' has been attached
NOTICE: 2023-03-15 10:45:41.371922+00: Attaching partition 'solutions_legacy'...
NOTICE: 2023-03-15 10:47:56.991827+00 'solutions_legacy' has been attached
DO
Query returned successfully in 20 min.
我的问题
即使数据成功迁移到新创建的分区,在迁移过程中,最终用户也无法使用表“solutions”,甚至SELECT查询也在等待锁。我想这是因为整个脚本是一个事务。因此,我尝试在调用detach_partition('solutions_legacy')
之后或在此过程的最后提交更改,但在大多数情况下,我会收到错误消息
错误:事务终止无效
我试过在一些地方添加COMMIT,也试过使用开始/END语句。
解决这个问题的正确方法是什么?
基于x86_64-pc-linux-musl的PostgreSQL 15.2,由gcc编译(阿尔卑斯山12.2.1_git20220924-r4)12.2.1 20220924,64位
1条答案
按热度按时间t30tvxxf1#
DO
块允许事务控制当且仅当您没有将其 Package 在事务中时。我建议在事务的外部执行do块,
commit
语句应该按计划工作。编辑:参见https://www.postgresql.org/docs/current/sql-do.html#id-1.9.3.102.8获取文档。