Regexp expression from Oracle SQL to BigQuery

Posted by wkyowqbh on 2022-11-03 in Oracle


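I previously got help here with a regexp expression in Oracle SQL, and it worked great. However, our organization is now moving to BigQuery, and the regexp no longer seems to work.
My table contains data like the following: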

WC 12/10 change FC from 24 to 32
W/C 12/10 change fc from 401 to 340
W/C12/10 18-26

This Oracle SQL splits each comment into the before number (24), the after number (32), and the date (12/10).

cast(REGEXP_SUBSTR(Line_Comment, '((\d+ |\d+)(change )?(- |-|to |to|too|too )(\d+))', 1, 1, 'i',2) as Int) as Before,
cast(REGEXP_SUBSTR(Line_Comment, '((\d+ |\d+)(change )?(- |-|to |to|too|too )(\d+))', 1, 1, 'i', 5) as Int) as After,
REGEXP_SUBSTR(Line_Comment, '((\d+)(\/|-|.| )(\d+)(\/|-|.| )(\d+))|(\d+)(\/|-|.| )(\d+)', 1, 1, 'i') as WC_Date,

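I fully understand that the comments are inconsistent and that this may not always work, but if it works more than 80% of the time, we are fine with that.
Since moving to BigQuery, I have been getting the error message below. In Oracle the column was a VARCHAR, but when it was migrated to BigQuery it became a STRING. Could that be why it broke? Can anyone help with this? It is beyond my understanding.
No matching signature for function REGEXP_SUBSTR for argument types: STRING, STRING, INT64, INT64, STRING, INT64. Supported signatures: REGEXP_SUBSTR(STRING, STRING, [INT64], [INT64]); REGEXP_SUBSTR(BYTES, BYTES, [INT64], [INT64]) at [69:12]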

oracle

Source: https://stackoverflow.com/questions/73966299/regexp-expression-from-oracle-sql-to-big-query

3 Answers


irlmq6kh1#

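Google BigQuery's REGEXP_SUBSTR does not support the subexpr parameter (or the match parameter such as 'i') that Oracle's REGEXP_SUBSTR accepts, so you need to modify your regex to take advantage of the following documented behavior:
"If the regular expression contains a capturing group, the function returns the substring that is matched by that capturing group."
So for each value you want to extract, make it the only capturing group in the regex: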

cast(REGEXP_SUBSTR(Line_Comment, r'(?:(\d+ |\d+)(?:change )?(?:- |-|to |to|too|too )(?:\d+))', 1, 1) as Int) as Before,
cast(REGEXP_SUBSTR(Line_Comment, r'(?:(?:\d+ |\d+)(?:change )?(?:- |-|to |to|too|too )(\d+))', 1, 1) as Int) as After,
REGEXP_SUBSTR(Line_Comment, r'(?:\d+)(?:\/|-|.| )(?:\d+)(?:\/|-|.| )(?:\d+)|(?:\d+)(?:\/|-|.| )(?:\d+)', 1, 1) as WC_Date,

Note that you can simplify your regexes substantially, like this:

(\d+) ?(?:change )?(?:-|too?) ?(?:\d+)
(?:\d+) ?(?:change )?(?:-|too?) ?(\d+)
(?:\d+)(?:[\/.-](?:\d+)){1,2}

Regex demos on regex101: numbers, date
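To show how the simplified patterns might slot into a complete BigQuery query, here is a minimal sketch run against the three sample comments from the question. The CTE name `comments` is hypothetical, the inline `(?i)` flag is an assumed stand-in for Oracle's 'i' match parameter, and `SAFE_CAST` is swapped in for `cast` so that a non-matching row yields NULL instead of an error. Raw string literals (`r'...'`) are used so that `\d` reaches the regex engine unchanged.

```sql
-- Sample rows copied from the question; `comments` is a hypothetical CTE name.
WITH comments AS (
  SELECT 'WC 12/10 change FC from 24 to 32' AS Line_Comment
  UNION ALL SELECT 'W/C 12/10 change fc from 401 to 340'
  UNION ALL SELECT 'W/C12/10 18-26'
)
SELECT
  Line_Comment,
  -- Exactly one capturing group per pattern: that group is what gets returned.
  -- (?i) makes the match case-insensitive, replacing Oracle's 'i' argument.
  SAFE_CAST(REGEXP_EXTRACT(Line_Comment,
    r'(?i)(\d+) ?(?:change )?(?:-|too?) ?(?:\d+)') AS INT64) AS Before,
  SAFE_CAST(REGEXP_EXTRACT(Line_Comment,
    r'(?i)(?:\d+) ?(?:change )?(?:-|too?) ?(\d+)') AS INT64) AS After,
  -- No capturing group here, so the whole match is returned.
  REGEXP_EXTRACT(Line_Comment,
    r'(?:\d+)(?:[\/.-](?:\d+)){1,2}') AS WC_Date
FROM comments
```

On the three sample rows these patterns should pick out 24/32, 401/340 and 18/26 as the before/after pairs, with 12/10 as the date in each case. Comments that follow a different layout would need their own checks.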

2022-11-03

tvokkenx2#

Based on the sample data you provided in the comments section, you can try the query below:

with t1 as (
  select 'WC 12/10 change FC from 24 to 32' as Comment
  union all select 'W/C 12/10 change fc from 401 to 340' as Comment
  union all select 'W/C12/10 18-26' as Comment
)

select Comment,
regexp_extract(t1.Comment, r'(\d+\/\d+)') as WC,
regexp_extract(t1.Comment, r'.+\s(\d{1,3})[\s|\-]') as Before,
regexp_extract(t1.Comment, r'.+[\sto\s|\-](\d{1,3})$') as After
from t1

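Output:

Comment                              WC     Before  After
WC 12/10 change FC from 24 to 32     12/10  24      32
W/C 12/10 change fc from 401 to 340  12/10  401     340
W/C12/10 18-26                       12/10  18      26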
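One detail worth noting in the query above: `[\sto\s|\-]` is a character class, so it matches any single character out of whitespace, `t`, `o`, `|` or `-`, rather than the word "to". If the intent is "the word to, or a hyphen", an alternation inside a non-capturing group expresses that more directly. The following is a sketch of that variant, not the answerer's query; the `(?i)` flag and the anchoring on the end of the comment are assumptions about what the data looks like.

```sql
WITH t1 AS (
  SELECT 'WC 12/10 change FC from 24 to 32' AS Comment
  UNION ALL SELECT 'W/C 12/10 change fc from 401 to 340'
  UNION ALL SELECT 'W/C12/10 18-26'
)
SELECT
  Comment,
  REGEXP_EXTRACT(Comment, r'(\d+/\d+)') AS WC,
  -- (?:\s+to\s+|-) is an alternation: either "to" surrounded by whitespace, or a hyphen.
  -- The pair of numbers is assumed to sit at the end of the comment.
  REGEXP_EXTRACT(Comment, r'(?i)(\d{1,3})(?:\s+to\s+|-)\d{1,3}\s*$') AS Before,
  REGEXP_EXTRACT(Comment, r'(?i)\d{1,3}(?:\s+to\s+|-)(\d{1,3})\s*$') AS After
FROM t1
```

Both versions return strings; wrapping the Before and After expressions in `SAFE_CAST(... AS INT64)` would give integers, as in the original Oracle query.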

2022-11-03

zzzyeukh3#

Consider the super simple approach below:

select Comment, 
  format('%s/%s', arr[offset(0)], arr[safe_offset(1)]) as wc,
  arr[safe_offset(2)] as before,
  arr[safe_offset(3)] as after
from your_table, unnest([struct(regexp_extract_all(Comment, r'\d+') as arr)])

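If applied to the sample data in your question, the output is:

Comment                              wc     before  after
WC 12/10 change FC from 24 to 32     12/10  24      32
W/C 12/10 change fc from 401 to 340  12/10  401     340
W/C12/10 18-26                       12/10  18      26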
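The positional approach relies on every comment containing its numbers in the same order: the two date parts first, then the before and after values. As a rough sketch of how one might run it end to end and flag rows that break that assumption, the version below inlines the sample data as a CTE. The `has_expected_shape` column is a hypothetical addition, and `SAFE_CAST` is an assumed convenience for getting integers back.

```sql
-- `your_table` is stubbed with the sample rows from the question.
WITH your_table AS (
  SELECT 'WC 12/10 change FC from 24 to 32' AS Comment
  UNION ALL SELECT 'W/C 12/10 change fc from 401 to 340'
  UNION ALL SELECT 'W/C12/10 18-26'
)
SELECT
  Comment,
  -- SAFE_OFFSET returns NULL instead of raising an error when the index is out of range.
  FORMAT('%s/%s', arr[SAFE_OFFSET(0)], arr[SAFE_OFFSET(1)]) AS wc,
  SAFE_CAST(arr[SAFE_OFFSET(2)] AS INT64) AS before,
  SAFE_CAST(arr[SAFE_OFFSET(3)] AS INT64) AS after,
  -- Hypothetical sanity flag: the positional logic assumes exactly four numbers per comment.
  ARRAY_LENGTH(arr) = 4 AS has_expected_shape
FROM your_table,
  UNNEST([STRUCT(REGEXP_EXTRACT_ALL(Comment, r'\d+') AS arr)])
```

Rows where the flag is false (for example a comment with a three-part date, or with an extra number in the free text) would have shifted positions and are worth reviewing by hand.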

2022-11-03
