RegEx to find string between occurrences of character

Posted by s71maibg on 2023-02-25 in Other


I have pipe-delimited files, something like this:

col1|col2|col3||col5|col6||||col10

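(some columns may be empty, as shown above)
I would like to get the string between the 5th and 6th occurrence of the pipe. In this example that is "col6".
How can I do that with RegEx?
I would like to load such a file into an Oracle database and then do this with REGEXP_SUBSTR, but I can also do it with a different tool (e.g. Notepad++); I just need to know the RegEx pattern.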

regex

Source: https://stackoverflow.com/questions/75558335/regex-to-find-string-between-occurences-of-character

3 Answers


1dkrff031#

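You can use the pattern '(.*?)(\||$)' to look for any characters (.*) in a non-greedy way (?), followed by either a pipe symbol (which has to be escaped as \|) or (the unescaped |) the end of the string ($). If you left out the end-of-string alternative it would still work for position 6, but it would not find the last element if you needed it, because col10 is not followed by a pipe delimiter.
You would then use it as: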

select regexp_substr('col1|col2|col3||col5|col6||||col10',
  '(.*?)(\||$)', 1, 6, null, 1) as col6
from dual;

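| COL6 |
|------|
| col6 |

where the 6 says that you want the sixth occurrence of the match, and the final 1 asks for the first capture group, i.e. the field without its trailing pipe.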
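The question also mentions tools such as Notepad++, which have no occurrence argument. One way to cope, sketched here in Python's `re` module only to check the idea, is to write the count into the pattern itself: skip five "field plus pipe" pairs, then capture the next field. The pattern and the helper name `field_pattern` are illustrative assumptions, not part of the original answer, and the pattern has not been tested in Notepad++ or Oracle.

```python
import re

# Hypothetical helper: skip n-1 "field + pipe" pairs from the start of the
# line, then capture the n-th field (which may be empty).
def field_pattern(n):
    return re.compile(r'^(?:[^|]*\|){%d}([^|]*)' % (n - 1))

line = 'col1|col2|col3||col5|col6||||col10'

m = field_pattern(6).match(line)
print(m.group(1) if m else None)  # col6
```

For the sixth field the literal pattern is `^(?:[^|]*\|){5}([^|]*)`, with the result in capture group 1. In an engine where `[^|]` can also match line breaks, `[^|\r\n]` may be the safer character class.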
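Putting the sample string in a CTE to simplify the query slightly, you can see how it extracts every element (including the nulls) as the occurrence number changes: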

-- cte for sample data
with your_table (str) as (
  select 'col1|col2|col3||col5|col6||||col10' from dual
)
  -- actual query
select
  regexp_substr(str, '(.*?)(\||$)', 1, 1, null, 1) as col1,
  regexp_substr(str, '(.*?)(\||$)', 1, 2, null, 1) as col2,
  regexp_substr(str, '(.*?)(\||$)', 1, 3, null, 1) as col3,
  regexp_substr(str, '(.*?)(\||$)', 1, 4, null, 1) as col4,
  regexp_substr(str, '(.*?)(\||$)', 1, 5, null, 1) as col5,
  regexp_substr(str, '(.*?)(\||$)', 1, 6, null, 1) as col6,
  regexp_substr(str, '(.*?)(\||$)', 1, 7, null, 1) as col7,
  regexp_substr(str, '(.*?)(\||$)', 1, 8, null, 1) as col8,
  regexp_substr(str, '(.*?)(\||$)', 1, 9, null, 1) as col9,
  regexp_substr(str, '(.*?)(\||$)', 1, 10, null, 1) as col10
from your_table;

| COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 | COL8 | COL9 | COL10 |
|------|------|------|------|------|------|------|------|------|-------|
| col1 | col2 | col3 | null | col5 | col6 | null | null | null | col10 |
This kind of pattern is also often used to split delimited strings into multiple rows.
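The splitting use mentioned above can be sketched outside the database. The example below is an illustration in Python's `re` module, not the Oracle idiom itself: it walks occurrences 1..n of the same pattern and yields one item per field. The helper name `split_fields` is an assumption, and Python returns empty strings where Oracle would return NULL.

```python
import re
from itertools import islice

FIELD = re.compile(r'(.*?)(\||$)')  # same pattern as in the answer

def split_fields(line):
    # A line with k pipes has k + 1 fields. Stop there, because the regex
    # can also produce one extra empty match at the very end of the string.
    n = line.count('|') + 1
    return [m.group(1) for m in islice(FIELD.finditer(line), n)]

sample = 'col1|col2|col3||col5|col6||||col10'
for position, value in enumerate(split_fields(sample), start=1):
    print(position, repr(value))
```

Capping the number of matches at the field count is the design choice worth noting: a loop that simply runs until the pattern stops matching may report one phantom empty field after the last real one.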

Answered 2023-02-25

a7qyws3x2#

I'm not an Oracle maven, so there may be better ways, but you should be able to use the following expression:

(\w*)\|

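It matches every group of word characters (\w; the * also captures empty groups) followed by a pipe (\|, escaped because the pipe has a special meaning in regular expressions), so you can simply extract the 6th occurrence and take its capture group.
Working example: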

select
  regexp_substr('col1|col2|col3||col5|col6||||col10', '(\w*)\|', 1, 6, NULL, 1)
from dual;
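Two limits of this pattern are worth checking before relying on it: `\w` only covers letters, digits and the underscore, and every match must end in a pipe. The sketch below probes both in Python's `re` module rather than in Oracle, so treat it as an illustration; the helper `nth_match`, the input `'a b|c|d'` and the alternative `[^|]*` are assumptions added here.

```python
import re

def nth_match(pattern, text, n):
    """Return group 1 of the n-th (1-based) match of pattern, or None."""
    for i, m in enumerate(re.finditer(pattern, text), start=1):
        if i == n:
            return m.group(1)
    return None

line = 'col1|col2|col3||col5|col6||||col10'

print(nth_match(r'(\w*)\|', line, 6))         # col6
print(nth_match(r'(\w*)\|', line, 10))        # None: col10 has no trailing pipe
print(nth_match(r'(\w*)\|', 'a b|c|d', 1))    # b: the space cuts the field short
print(nth_match(r'([^|]*)\|', 'a b|c|d', 1))  # a b
```

If the fields can contain spaces or punctuation, a negated class such as `[^|]*` is the more forgiving choice, and the last field still needs the `$` alternative described in the first answer.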

Answered 2023-02-25

rekjcdws3#

If it doesn't have to be a regular expression, I'd suggest the good old substr + instr approach:

with test (col) as
  (select 'col1|col2|col3||col5|col6||||col10' from dual)
-- start one character after the 5th pipe;
-- length = distance between the 5th and 6th pipes, minus 1
select substr(col, instr(col, '|', 1, 5) + 1,
                   instr(col, '|', 1, 6) - instr(col, '|', 1, 5) - 1
             ) result
from test;

RESULT
----------
col6
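The position arithmetic in that query (the `+ 1` and `- 1`) can be followed step by step in a small Python sketch. The function `instr` below is a hypothetical stand-in that imitates only the 1-based "n-th occurrence, 0 if absent" behaviour used here, and `field_between` is likewise an illustrative name, not an Oracle function.

```python
def instr(text, sub, occurrence):
    """1-based position of the given occurrence of sub, or 0 if there is none."""
    pos = -1
    for _ in range(occurrence):
        pos = text.find(sub, pos + 1)
        if pos == -1:
            return 0
    return pos + 1

def field_between(text, sep, n):
    """Text between the n-th and (n+1)-th separator, or None if either is missing."""
    start = instr(text, sep, n)      # 1-based position of the n-th separator
    end = instr(text, sep, n + 1)    # 1-based position of the next one
    if start == 0 or end == 0:
        return None
    # The field starts one character after `start` and stops just before `end`.
    return text[start:end - 1]

line = 'col1|col2|col3||col5|col6||||col10'
print(instr(line, '|', 5), instr(line, '|', 6))  # 21 26
print(field_between(line, '|', 5))               # col6
```

With positions 21 and 26, the field starts at character 22 and has length 26 - 21 - 1 = 4, which is exactly what the substr call computes.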

Answered 2023-02-25
