Regular expression: change the matching method from OR to AND

qaxu7uf2 posted on 2022-11-18 in Other

Here is a regular expression (it runs in Oracle's regexp_like(), although the question is not Oracle-specific):

abc|bcd|def|xyz

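This matches against a tags field in the database to check whether the field contains abc OR bcd OR def OR xyz when the user enters the search query "abc bcd def xyz". The tags field stores space-separated keywords, for example "cdefg abcd xyz".
In Oracle, this looks like: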

select ... from ... where 
   regexp_like(tags, 'abc|bcd|def|xyz');

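This works fine, but I want to give users an extra option to search for results that match all of the keywords. How should I change the regular expression so that it matches abc AND bcd AND def AND xyz?
Note: because I don't know in advance which keywords the user will enter, I can't pre-build the query in PL/SQL like this: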

select ... from ... where 
   tags like '%abc%' AND
   tags like '%bcd%' AND
   tags like '%def%' AND
   tags like '%xyz%';

wfsdck30 #1

You can split the input pattern and check that every part of it matches:

SELECT t.*
FROM   table_name t
       CROSS APPLY(
         WITH input (match) AS (
           SELECT 'abc bcd def xyz' FROM DUAL
         )
         SELECT 1
         FROM   input
         CONNECT BY LEVEL <= REGEXP_COUNT(match, '\S+')
         HAVING COUNT(
                  REGEXP_SUBSTR(
                    t.tags,
                    REGEXP_SUBSTR(match, '\S+', 1, LEVEL)
                  )
                ) = REGEXP_COUNT(match, '\S+')
       )

Alternatively, if Java is enabled in the database, you can create a Java function to match the regular expression:

CREATE AND COMPILE JAVA SOURCE NAMED RegexParser AS
import java.util.regex.Pattern;

public class RegexpMatch {
  public static int match(
    final String value,
    final String regex
  ){
    final Pattern pattern = Pattern.compile(regex);

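    // find() rather than matches(): a pattern made only of lookaheads is
    // zero-width, so matches() would fail on any non-empty value.
    return pattern.matcher(value).find() ? 1 : 0;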
  }
}
/

Then wrap it in a SQL function:

CREATE FUNCTION regexp_java_match(value IN VARCHAR2, regex IN VARCHAR2) RETURN NUMBER
AS LANGUAGE JAVA NAME 'RegexpMatch.match( java.lang.String, java.lang.String ) return int';
/

Then use it in SQL:

SELECT *
FROM   table_name
WHERE  regexp_java_match(tags, '(?=.*abc)(?=.*bcd)(?=.*def)(?=.*xyz)') = 1;
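
Because the keywords come from user input, the lookahead pattern has to be assembled at run time. The following is a minimal sketch of how that might be done in plain Java; it is not part of the original answer, and the class name `LookaheadBuilder` and its methods are hypothetical. It assumes the keywords are separated by whitespace, quotes each one with `Pattern.quote` so that regex metacharacters in the input are matched literally, and uses `find()` because a pattern made only of lookaheads is zero-width.

```java
import java.util.regex.Pattern;

public class LookaheadBuilder {
  // Hypothetical helper: turns "abc bcd" into "(?=.*\Qabc\E)(?=.*\Qbcd\E)".
  public static String buildAllOf(final String keywords) {
    final StringBuilder regex = new StringBuilder();
    for (final String keyword : keywords.trim().split("\\s+")) {
      if (!keyword.isEmpty()) {
        // Pattern.quote makes the keyword a literal, so "a.c" does not match "abc".
        regex.append("(?=.*").append(Pattern.quote(keyword)).append(")");
      }
    }
    return regex.toString();
  }

  // Returns true when every keyword occurs somewhere in tags.
  public static boolean containsAll(final String tags, final String keywords) {
    return Pattern.compile(buildAllOf(keywords)).matcher(tags).find();
  }

  public static void main(final String[] args) {
    System.out.println(containsAll("cdefg abcd xyz", "abc bcd def xyz")); // prints true
    System.out.println(containsAll("cba lmnop xyz", "abc bcd def xyz"));  // prints false
  }
}
```

The string returned by `buildAllOf` could be passed as the second argument of a function such as `regexp_java_match`. Note that an empty keyword list produces an empty pattern, which matches every row, so the caller may want to handle that case separately.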

v2g6jxz6 #2

Try this; the idea is to check that the number of matches equals the number of patterns:

with data(val) AS (
    select 'cdefg abcd xyz' from dual union all
    select 'cba lmnop xyz' from dual
),
targets(s) as (
    select regexp_substr('abc bcd def xyz', '[^ ]+', 1, LEVEL)  from dual
    connect by regexp_substr('abc bcd def xyz', '[^ ]+', 1, LEVEL) is not null
)
select val from data d
join targets t on 
    regexp_like(val,s)
group by val having(count(*) = (select count(*) from targets))
;

Result:

cdefg abcd xyz
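
The counting idea used in the query above is independent of SQL. As a rough illustration only, here is the same logic sketched in Java: a row is kept when the number of keywords that match it equals the total number of keywords. The class name `MatchCounter` and the method `matchAll` are hypothetical, and the sketch assumes a non-empty, space-separated keyword string whose keywords are valid regular expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MatchCounter {
  // Keeps the rows for which the count of matching keywords equals the keyword count.
  public static List<String> matchAll(final List<String> rows, final String keywords) {
    final String[] targets = keywords.trim().split(" +");
    final List<String> result = new ArrayList<>();
    for (final String row : rows) {
      int matched = 0;
      for (final String target : targets) {
        // Like regexp_like(val, s): the keyword is a regex that may match anywhere.
        if (Pattern.compile(target).matcher(row).find()) {
          matched++;
        }
      }
      if (matched == targets.length) {
        result.add(row);
      }
    }
    return result;
  }

  public static void main(final String[] args) {
    final List<String> rows = Arrays.asList("cdefg abcd xyz", "cba lmnop xyz");
    System.out.println(matchAll(rows, "abc bcd def xyz")); // prints [cdefg abcd xyz]
  }
}
```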

jmp7cifd #3

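I think dynamic SQL is needed. A match-all option requires separate matching logic to ensure that each individual keyword is found.
A simple approach is to build a condition for each keyword, concatenate those conditions into a single string, and execute that string as a query using dynamic SQL.
The example below uses the customers table from the sample schemas provided by Oracle.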

DECLARE

  -- match string should be just the values to match with spaces in between
  p_match_string           VARCHAR2(200) := 'abc bcd def xyz';
  -- need logic to determine match one (OR) versus match all (AND)
  p_match_type             VARCHAR2(3) := 'OR';
  l_sql_statement          VARCHAR2(4000);
  -- create type if bulk collect is needed
  TYPE t_email_address_tab IS TABLE OF customers.EMAIL_ADDRESS%TYPE INDEX BY PLS_INTEGER;
  l_email_address_tab      t_email_address_tab;

BEGIN

  WITH sql_clauses(row_idx,sql_text) AS
  (SELECT 0 row_idx -- build select plus beginning of where clause
         ,'SELECT email_address '
         || 'FROM customers '
         || 'WHERE 1 = '
         || DECODE(p_match_type, 'AND', '1', '0') sql_text
   FROM DUAL
   UNION
   SELECT LEVEL row_idx -- build joins for each keyword
         ,DECODE(p_match_type, 'AND', ' AND ', ' OR ') 
          || 'email_address'
          || ' LIKE ''%' 
          || REGEXP_SUBSTR( p_match_string,'[^ ]+',1,level) 
          || '%''' sql_text
   FROM   DUAL
   CONNECT BY LEVEL <= LENGTH(p_match_string) - LENGTH(REPLACE( p_match_string, ' ' )) + 1
  )
  -- put it all together by row_idx
  SELECT LISTAGG(sql_text, '') WITHIN GROUP (ORDER BY row_idx)
  INTO l_sql_statement
  FROM sql_clauses;

  dbms_output.put_line(l_sql_statement);

  -- can use execute immediate (or ref cursor) for dynamic sql
  EXECUTE IMMEDIATE l_sql_statement
  BULK COLLECT
  INTO   l_email_address_tab;

END;

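| Variable | Value |
| --- | --- |
| p_match_string | abc bcd def xyz |
| p_match_type | AND |
| l_sql_statement | SELECT email_address FROM customers WHERE 1 = 1 AND email_address LIKE '%abc%' AND email_address LIKE '%bcd%' AND email_address LIKE '%def%' AND email_address LIKE '%xyz%' |

| Variable | Value |
| --- | --- |
| p_match_string | abc bcd def xyz |
| p_match_type | OR |
| l_sql_statement | SELECT email_address FROM customers WHERE 1 = 0 OR email_address LIKE '%abc%' OR email_address LIKE '%bcd%' OR email_address LIKE '%def%' OR email_address LIKE '%xyz%' |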
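
One caveat with concatenating user-supplied keywords into the statement text is that it leaves the query open to SQL injection. The following is a hedged sketch, not part of the original answer, of building a statement of the same shape with `?` placeholders so the keywords can be bound as parameters. It is written in Java for use with a JDBC `PreparedStatement`; the class name `KeywordQueryBuilder` and its methods are hypothetical, and whitespace-separated keywords are assumed.

```java
import java.util.ArrayList;
import java.util.List;

public class KeywordQueryBuilder {
  // Builds one "%keyword%" bind value per whitespace-separated keyword.
  public static List<String> buildParams(final String keywords) {
    final List<String> params = new ArrayList<>();
    for (final String keyword : keywords.trim().split("\\s+")) {
      if (!keyword.isEmpty()) {
        params.add("%" + keyword + "%");
      }
    }
    return params;
  }

  // Builds the statement text with one LIKE ? predicate per keyword.
  public static String buildSql(final String keywords, final boolean matchAll) {
    final String joiner = matchAll ? " AND " : " OR ";
    final StringBuilder sql =
        new StringBuilder("SELECT email_address FROM customers WHERE 1 = ")
            .append(matchAll ? "1" : "0");
    final int count = buildParams(keywords).size();
    for (int i = 0; i < count; i++) {
      sql.append(joiner).append("email_address LIKE ?");
    }
    return sql.toString();
  }

  public static void main(final String[] args) {
    System.out.println(buildSql("abc bcd", true));
    System.out.println(buildParams("abc bcd"));
  }
}
```

Each entry of `buildParams` would then be bound in order with `PreparedStatement.setString`. The characters `%` and `_` inside a keyword still act as LIKE wildcards in this sketch.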
