SQLite FTS5触发器对行的子集进行索引

wmomyfyw  于 2023-05-18  发布在  SQLite
关注(0)|答案(1)|浏览(256)

我的目标是一种类似wiki的笔记本,使用SQLite作为数据存储。这个想法是有一个大的页面表,每个页面的每个修订都存储在其中。每个页面只有三个属性:
1.页面标题
1.时间戳(timestamp)
1.页面的文本
(It将是非常罕见的,一个页面被删除,但我想允许它,如果可能的话。

我只关心索引和搜索每个页面的最新版本,所以只有每个页面的最新版本应该在FTS5索引中。我正在尝试使用TRIGGER来管理索引。然而,我得到的结果是所有的修订,或以前的修订。

测试用例:

-- The pages table
CREATE TABLE IF NOT EXISTS pages (
    id INTEGER PRIMARY KEY,
    revision TEXT NOT NULL,
    title TEXT NOT NULL,
    body TEXT
);

-- The FTS5 table
CREATE virtual TABLE pages_fts USING FTS5(
    id,
    title,
    body,
    content='pages',
    content_rowid=id
);

-- Before a page is inserted into the pages table,
-- delete the old matching entry in the FTS table
CREATE TRIGGER pages_before_insert BEFORE INSERT ON pages
BEGIN
    INSERT INTO pages_fts (pages_fts, id, title, body)
        SELECT 'delete', id, title, body
        FROM pages
        WHERE title = new.title
        ORDER BY revision
        DESC
        LIMIT 1;
END;

-- After a page is inserted into the pages table,
-- insert a matching entry into the FTS table
CREATE TRIGGER pages_after_insert AFTER INSERT ON pages
BEGIN
    INSERT INTO pages_fts (id, title, body)
    VALUES (new.id, new.title, new.body);
END;

-- After a page is deleted from the pages table,
-- delete the matching entry from the FTS table
CREATE TRIGGER pages_after_delete AFTER DELETE ON pages
BEGIN
    INSERT INTO pages_fts (pages_fts, id, title, body)
    VALUES ('delete', old.id, old.title, old.body);
END;

INSERT INTO pages (revision, title, body) VALUES ('2023-04-26T22:48:35.582797', 'home', 'body version one');
INSERT INTO pages (revision, title, body) VALUES ('2023-04-26T22:48:40.250981', 'home', 'body version two');
INSERT INTO pages (revision, title, body) VALUES ('2023-04-26T22:48:46.205782', 'home', 'body version one');

-- should return no errors
INSERT INTO pages_fts(pages_fts, rank) VALUES ('integrity-check', 1);

-- title search
-- should return only one result
SELECT title, snippet(pages_fts, 1, '>', '<', '...', 10)
AS snippet
FROM pages_fts
WHERE pages_fts
MATCH 'title:home'
ORDER BY rank
LIMIT 50;

-- body search
-- should only return one result
SELECT title, snippet(pages_fts, 2, '>', '<', '...', 64)
AS snippet
FROM pages_fts
WHERE pages_fts
MATCH 'body:version'
ORDER BY rank
LIMIT 50;

最后两个搜索中的每一个都应该只显示一个结果,但是我得到了多个结果,如下所示:
输出:

home|>home<
home|>home<
home|>home<
home|body >version< one
home|body >version< two
home|body >version< one

我以前在错误的帐户下发布了这个问题,后来删除了它。所以仍然认为这是一个重复,所以我不得不添加这段文字,以便张贴问题。

4jb9z9bj

4jb9z9bj1#

事实证明,当将索引链接到外部内容(使用content=<table_name>)时,您无法挑选要索引的内容表的行。对于外部内容索引,您必须 * 始终 * 保持索引和外部表同步,否则会导致数据库损坏。
为此,只需从FTS5表定义中省略contentcontent_rowid列。这允许您使用常规的SQL语句以任何方式构建搜索索引。(SQLite文档花了很多时间讨论外部内容和无内容模式,但没有命名甚至似乎没有定义这种操作模式,我只是称之为“正常”模式。

CREATE TABLE IF NOT EXISTS pages (
    revision TEXT NOT NULL,
    title TEXT NOT NULL,
    body TEXT
);

CREATE VIRTUAL TABLE pages_fts USING FTS5(
    title,
    body
);

CREATE TRIGGER pages_after_insert AFTER INSERT ON pages
BEGIN
    DELETE FROM pages_fts WHERE title = new.title;
    INSERT INTO pages_fts (title, body)
    VALUES (new.title, new.body);
END;

相关问题