regex PowerShell - Get the content between two strings across multiple lines

xtfmy6hx  posted on 2023-06-25  in Shell

File 1.wpl:

<?wpl version="1.0"?>
<smil>
    <head>
        <meta name="Generator" content="Microsoft Windows Media Player -- 12.0.22621.1"/>
        <meta name="ItemCount" content="2"/>
        <title>Untitled playlist</title>
    </head>
    <body>
        <seq>
            <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>
            <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>
        </seq>
    </body>
</smil>

I want to get the content between <seq> and </seq>, which spans multiple lines:

Expected output:

<media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>
<media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>

I have this code, but it gives me the output on a single line:

$fileName = "C:\Users\user\Music\Playlists\1.wpl"
# Get content from file
$file = Get-Content $fileName

# Regex pattern to compare two strings
$pattern = "<seq>(.*?)</seq>"

# Perform the operation
$results = [regex]::Match($file,$pattern).Groups[1].Value -split [System.Environment]::NewLine

return $results

Actual output:

<media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>             <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>

bvjxkvbb1#

When what you have is valid XML, there is no reason to use a regular expression:

($xml = [xml]::new()).Load('C:\Users\user\Music\Playlists\1.wpl')
$xml.SelectNodes('smil/body/seq/media') | ForEach-Object OuterXml

# Outputs:
# <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3" />
# <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3" />

Or use a wildcard in the XPath:

($xml = [xml]::new()).Load('C:\Users\user\Music\Playlists\1.wpl')
$xml.SelectNodes("//seq/*") | ForEach-Object OuterXml
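
If only the file paths are needed rather than the <media> markup, the same XPath approach can read the src attribute directly. The following is a minimal sketch, not part of the original answer: it inlines a shortened copy of the playlist (an assumption, so that it runs without the file on disk) and uses the illustrative variable names $wpl and $paths.

```powershell
# Shortened copy of 1.wpl, inlined so the sketch is self-contained
$wpl = @'
<?wpl version="1.0"?>
<smil>
    <body>
        <seq>
            <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>
            <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>
        </seq>
    </body>
</smil>
'@

# Parse the string instead of loading from disk
$xml = [xml]::new()
$xml.LoadXml($wpl)

# Collect the src attribute of every <media> element under <seq>
$paths = @($xml.SelectNodes('//seq/media') | ForEach-Object { $_.GetAttribute('src') })
$paths
```

For the real file, replacing LoadXml($wpl) with Load('C:\Users\user\Music\Playlists\1.wpl') should behave the same way.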

qgzx9mmu2#

You could try using a "switch":

Get-Content -Path "C:\Users\user\Music\Playlists\1.wpl" |
    ForEach-Object {
        switch -Regex ($_) {
            '\<seq\>$' {
                $break = 1
            }
            '\<\/seq\>$' {
                $break = 0
            }
            default {
                if ($break -eq 1) {
                    $_ -replace '^\s+'
                }
            }
        }
    }
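
The code above is a small state machine: a flag is set when the opening <seq> line is seen, cleared at the closing </seq> line, and every other line is emitted only while the flag is set. The sketch below shows the same idea with switch applied directly to an array of lines. The inlined $lines array and the flag name $inSeq are assumptions made for illustration, so that it runs without the playlist file.

```powershell
# Lines of the playlist body, inlined so the sketch is self-contained
$lines = @(
    '    <body>',
    '        <seq>',
    '            <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>',
    '            <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>',
    '        </seq>',
    '    </body>'
)

$inSeq = $false
# switch iterates over each array element; $_ is the current line
$result = switch -Regex ($lines) {
    '<seq>\s*$'  { $inSeq = $true;  continue }   # opening tag: start collecting
    '</seq>\s*$' { $inSeq = $false; continue }   # closing tag: stop collecting
    default      { if ($inSeq) { $_.Trim() } }   # emit trimmed lines in between
}
$result
```

Like the original, this assumes <seq> and </seq> each sit on their own line. It would miss content that shares a line with either tag.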

py49o6xq3#

Found a solution:

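# Read the whole file as one string so the regex can span line breaks
$fileContent = Get-Content -Path "C:\Users\user\Music\Playlists\1.wpl" -Raw
# (?s) makes . match newline characters as well
$regexPattern = "(?s)<seq>(.*?)</seq>"
# Use $match rather than $matches, which is a PowerShell automatic variable
$match = [regex]::Match($fileContent, $regexPattern)

if ($match.Success) {
    $seqContent = $match.Groups[1].Value
    # Split on LF or CRLF, strip the indentation, and drop blank lines
    $lines = $seqContent -split '\r?\n'
    $output = ($lines | ForEach-Object { $_.Trim() } | Where-Object { $_ -ne '' }) -join "`n"
    Write-Output $output
} else {
    Write-Output "No <seq> content found in the file."
}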
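
The two changes that matter compared with the code in the question appear to be -Raw and (?s). Without -Raw, Get-Content returns an array of lines, and passing that array to [regex]::Match converts it to a single string joined by spaces, so the line breaks are already gone before the regex runs. With -Raw the line breaks survive, but then . no longer matches across them unless (?s) is set. The sketch below illustrates both effects on an inlined array. The short file names a.mp3 and b.mp3 and the variable names are placeholders, not taken from the original.

```powershell
# Stand-in for what Get-Content returns without -Raw: one string per line
$lines = @('<seq>', '  <media src="a.mp3"/>', '  <media src="b.mp3"/>', '</seq>')

# Converting the array to a string joins the elements with spaces (default $OFS)
$joined = [string]$lines
$hasNewline = $joined.Contains("`n")        # False: the line breaks are lost

# Stand-in for Get-Content -Raw: one string that still contains line breaks
$raw = $lines -join "`n"

# Without (?s), . stops at a newline, so the pattern cannot span lines
$plainMatch  = [regex]::Match($raw, '<seq>(.*?)</seq>').Success       # False
# With (?s), . also matches newlines
$singleMatch = [regex]::Match($raw, '(?s)<seq>(.*?)</seq>').Success   # True

$hasNewline, $plainMatch, $singleMatch
```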
