Regex to match characters that are not inside quotes

r9f1avp5  posted on 2023-08-08  in Other

I have a file containing custom code, for example:

#include "folder/file.txt"
#include "folder with spaces/file.txt"
#include "$variable/file.txt"
#define $foo joe
#define $bar 34

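I want to extract the values from each line using a RegEx match. A JavaScript RegEx such as [^#"\s]+ matches most values correctly, except that it splits on spaces inside quotes: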

include
folder/file.txt
include
folder
with
spaces/file.txt
include
$variable/file.txt
define
$foo
joe
define
$bar
34


https://regex101.com/r/AmAh3T/1
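
For reference, the behaviour described above can be reproduced with a short script. This is only a minimal sketch, assuming the sample is held in a string; the names `source` and `tokens` are illustrative and not part of the original code.

```javascript
// Sample input from the question, joined into one multi-line string.
const source = [
  '#include "folder/file.txt"',
  '#include "folder with spaces/file.txt"',
  '#include "$variable/file.txt"',
  '#define $foo joe',
  '#define $bar 34',
].join('\n');

// First attempt: runs of characters that are not '#', '"' or whitespace.
// Whitespace always ends a match, even inside a quoted string.
const tokens = source.match(/[^#"\s]+/g);

console.log(tokens);
```

The quoted path on the second line comes back as three separate tokens (`folder`, `with`, `spaces/file.txt`), which is the problem being asked about.
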
I want to ignore spaces inside quotes, so that the list of matched values becomes:

include
folder/file.txt
include
folder with spaces/file.txt
include
$variable/file.txt
define
$foo
joe
define
$bar
34


I tried using a lookahead to ignore spaces inside quotes, [^\s]+(?=(?:[^"]|["][^"]*["])*$), but this drops any text that comes before a space inside quotes:

include
"folder/file.txt"
include
spaces/file.txt"
include
"$variable/file.txt"
define
$foo
joe
define
$bar
34


https://regex101.com/r/WbU0X2/2
How can I do this in a better way?

Answer #1, by r55awzrz

Consider matching the regular expression

(?<=")[^#"]+(?=")|[^# \r\n"]+

The g ("global", do not stop after the first match) flag is set.
Demo
The expression can be broken down as follows.

(?<=")       # positive lookbehind asserts that the match is preceded
             # by a double quote
[^#"]+       # match one or more (`+`) characters other than (`^`) those
             # listed in the character class
(?=")        # positive lookahead asserts that the match is followed
             # by a double quote
|            # or
[^# \r\n"]+  # match one or more (`+`) characters other than (`^`) those
             # listed in the character class
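
A minimal sketch of how this expression could be applied in JavaScript, assuming the file contents are already available as a string (the names `source` and `values` are made up for illustration). Note that lookbehind assertions need an ES2018-capable engine.

```javascript
// Sample input from the question, joined into one multi-line string.
const source = [
  '#include "folder/file.txt"',
  '#include "folder with spaces/file.txt"',
  '#include "$variable/file.txt"',
  '#define $foo joe',
  '#define $bar 34',
].join('\n');

// First alternative: text that sits directly between two double quotes.
// Second alternative: a run of characters excluding '#', space, CR, LF and '"'.
// The g flag makes match() return every match instead of only the first.
const pattern = /(?<=")[^#"]+(?=")|[^# \r\n"]+/g;

const values = source.match(pattern);

console.log(values);
```

For the sample input this should give the twelve values listed in the question, with `folder with spaces/file.txt` kept as a single value and no surrounding quotes.
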


Alternatively (or additionally), hover the cursor over each part of the expression at the link to see an explanation of what it does.
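
One limitation worth noting: the lookarounds do not consume the quotes, so unquoted text that sits between two quoted strings on the same line (for example the ` b ` in `"a" b "c"`) is also preceded and followed by a double quote, and the first alternative matches it with its spaces. If that matters, a different technique (not the one used in this answer) is to consume the quotes and capture what lies between them. A rough sketch, where `extractValues` is a hypothetical helper name:

```javascript
// Hypothetical helper: returns quoted strings (without their quotes) and
// unquoted tokens, in document order.
function extractValues(text) {
  // Alternative 1 consumes a whole quoted string and captures its contents.
  // Alternative 2 matches a run of characters excluding '#', '"' and whitespace.
  const pattern = /"([^"]*)"|[^#"\s]+/g;
  return Array.from(text.matchAll(pattern), (m) =>
    m[1] !== undefined ? m[1] : m[0]
  );
}

const sample = [
  '#include "folder with spaces/file.txt"',
  '#define $foo joe',
].join('\n');

console.log(extractValues(sample));
console.log(extractValues('"a" b "c"'));
```

Because each quoted string is consumed as a unit, the closing quote of one string can no longer act as the opening quote of the next match. This sketch assumes quotes are never escaped or nested.
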
