I wrote some regex code that tries to match nested begin-end code blocks, like those in Pascal. Example:
begin
begin
stuff1
end
end
begin
stuff2
end
When I run it through the code below, I want to match:
begin
begin
stuff1
end
end
leaving out the final "begin stuff2 end" pair...
Here is my attempt, using Windows PowerShell as the scripting language:
### POWERSHELL
$ErrorActionPreference = 'Stop';
# Want to match a balanced begin-end block, including nested begin-end:
# begin begin stuff1 end end
$t1 = ""
$t1 += " begin"
$t1 += " begin"
$t1 += " stuff1"
$t1 += " end"
$t1 += " end"
$t1 += ""
$t1 += " begin"
$t1 += " stuff2"
$t1 += " end"
write-host "LINE: $t1"
$rxop = [Text.RegularExpressions.RegexOptions]::IgnorePatternWhitespace -bor
        [Text.RegularExpressions.RegexOptions]::IgnoreCase
$rx = [Regex]::new(
    "^(
        \bbegin\b
        (?:
            (?<openp> \bbegin\b )
            |
            (?<-openp> \bend\b )
            |
            [^\t]+
        )+
        (?(openp)(?!))
        \bend\b)
    ",
    $rxop
)
$match = $rx.Match($t1)
if ($match.Success) {
    $name = $match.Groups[1].Value
    write-host "matched: $name"
}
else {
    write-host "no-match"
}
Basically, it doesn't work.
1 Answer
jtjikinw #1
As **@jdweng** commented:
Regular expressions are not designed to handle recursive structures in one chunk. Use a proper tool for recursive parsing.
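One way to read that advice is to let the regex do nothing more than find the `begin` and `end` tokens, and to track the nesting depth in ordinary code. The following is a minimal sketch of that idea, not the approach from the comment or the answer. `Get-FirstBalancedBlock` is a hypothetical helper name, and the sketch assumes that `begin` and `end` never occur inside strings or comments.

```powershell
# Hypothetical helper (not from the original post): returns the first balanced
# begin...end block found in $Text, or $null if there is none.
function Get-FirstBalancedBlock {
    param([string]$Text)

    $depth = 0    # current nesting level
    $start = -1   # index of the outermost 'begin'
    $opts  = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

    # The regex only tokenizes; the counter does the nesting bookkeeping.
    foreach ($m in [regex]::Matches($Text, '\b(begin|end)\b', $opts)) {
        if ($m.Value -ieq 'begin') {
            if ($depth -eq 0) { $start = $m.Index }
            $depth++
        }
        elseif ($depth -gt 0) {
            # An 'end' closes the innermost open block; a stray 'end' is ignored.
            $depth--
            if ($depth -eq 0) {
                return $Text.Substring($start, $m.Index + $m.Length - $start)
            }
        }
    }
    return $null
}

$t1 = ' begin begin stuff1 end end begin stuff2 end'
Get-FirstBalancedBlock $t1   # begin begin stuff1 end end
```

Because the function returns as soon as the depth drops back to zero, the trailing `begin stuff2 end` pair is never included in the result.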
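But fortunately, you have PowerShell. Using this `SelectString` prototype, you can create a recursive function, or simply call it a few times depending on the nesting depth: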