PowerShell: selecting different information from multiple similar/identical XML nodes

Posted by nxagd54h on 2023-03-12 in Shell

I have an XML file that looks like the one below. It is a fixed format that I cannot change, and the number of <OrderLine> elements per file is not fixed.

<?xml version="1.0" encoding="utf-8"?>
<DTD_ORDER>
   <OrderHead>
      <OrderReferences>12345</OrderReferences>
      <OrderRecipient>
         <OrderLine>
            <LineNumber>1</LineNumber>
            <Product>ProductA</Product>
            <Quantity>1</Quantity>
            <Price>17.50</Price>
            <Discount>0.00</Discount>
            <LineGross>17.50</LineGross>
            <LineNet>14.58</LineNet>
            <LineTax>2.92</LineTax>
         </OrderLine>
         <OrderLine>
            <LineNumber>2</LineNumber>
            <Product>ProductB</Product>
            <Quantity>1</Quantity>
            <Price>27.50</Price>
            <Discount>10.00</Discount>
            <LineGross>27.50</LineGross>
            <LineNet>22.92</LineNet>
            <LineTax>4.58</LineTax>
         </OrderLine>
      </OrderRecipient>
   </OrderHead>
</DTD_ORDER>

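These XML files are currently imported into the relevant PowerShell script via [XML] (Get-Content $XMLFile), and the script also performs other operations on the file (deduplication, removing values, etc.).
The data from each <OrderLine> block needs to be appended to a CSV file that is created the first time the script runs on a given date. Information from elsewhere in the XML is currently appended to the CSV file using the -f format operator and Add-Content.
The end result should be a CSV like the following: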

OrderReferences, LineNumber, Product, DateCreated, Price, LineGross, LineNet, LineTax, OrderLineVoucherValue, Discount
12345, 1, ProductA, 20/02/2023, 17.50, 17.50, 14.58, 2.92, 0.00, 0.00
12345, 2, ProductB, 20/02/2023, 27.50, 27.50, 22.92, 2.92, 0.00, 0.00

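I need to parse the file in PowerShell, query each <OrderLine> node, read all the tags inside it, and extract them to the CSV file.
I appreciate that I can use the SelectNodes("//OrderLine") method in PowerShell, but as far as I can tell this returns all the tags (i.e. LineNumber, Product) in their own arrays, possibly out of order, whereas I actually need to loop over the OrderLine tags and process their child tags. PowerShell probably has a *very* simple way to do this, but I'm struggling to find it...
All help appreciated!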


ulydmbyx1#

Here is a way to do it with Select-Xml:

$dateCreated = '20/02/2023'

Select-Xml -Path input.xml -XPath '//OrderLine' | ForEach-Object Node | 
    Select-Object @(
        @{ n='OrderReferences'; e={ $_.ParentNode.ParentNode.OrderReferences }}
        'LineNumber'
        'Product'
        @{ n='DateCreated'; e={ $dateCreated }}
        'Quantity', 'Price', 'Discount', 'LineGross', 'LineNet', 'LineTax'
    ) | Export-Csv output.csv -NoTypeInformation

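The Select-Object call takes an array that specifies which properties (columns) should appear in the output. The first array element is a calculated property that gets the OrderReferences value from the parent node two levels up. The DateCreated column is another calculated property, which references a variable.
The other properties are copied directly from the OrderLine element, so they can simply be specified by name. You can change the property order to your liking, and you can put each name on its own line or separate names with commas.
If you already have the XML document in a variable, just drop the -Path parameter and pipe the variable to Select-Xml: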

$xml | Select-Xml -XPath '//OrderLine' …

To append to an existing CSV, just add the -Append parameter to the Export-Csv call.
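As a rough illustration of the -Append behaviour, the sketch below runs the same export twice against one file, the way a script might run twice on the same date. The trimmed-down XML, the `$csvPath` location, and the file name `orders.csv` are assumptions made for the example and are not part of the original question.

```powershell
# Assumption: a shortened copy of the question's XML, kept inline so the sketch is self-contained.
$xml = [xml]@'
<DTD_ORDER>
   <OrderHead>
      <OrderReferences>12345</OrderReferences>
      <OrderRecipient>
         <OrderLine><LineNumber>1</LineNumber><Product>ProductA</Product></OrderLine>
         <OrderLine><LineNumber>2</LineNumber><Product>ProductB</Product></OrderLine>
      </OrderRecipient>
   </OrderHead>
</DTD_ORDER>
'@

# Hypothetical output location; start from a clean slate.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'orders.csv'
Remove-Item $csvPath -ErrorAction SilentlyContinue

# Simulate two script runs. The first run creates the file with a header row,
# the second run only adds data rows.
foreach ($run in 1..2) {
    $xml | Select-Xml -XPath '//OrderLine' | ForEach-Object Node |
        Select-Object @{ n='OrderReferences'; e={ $_.ParentNode.ParentNode.OrderReferences } },
                      LineNumber, Product |
        Export-Csv $csvPath -NoTypeInformation -Append
}

# Read the file back: one header, four data rows.
$rows = Import-Csv $csvPath
$rows | Format-Table
```

Because -Append creates the file when it does not exist yet, the same call can serve both the first run of the day and every later run, provided the selected columns stay the same between runs.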

Output:

"OrderReferences","LineNumber","Product","DateCreated","Quantity","Price","Discount","LineGross","LineNet","LineTax"
"12345","1","ProductA","20/02/2023","1","17.50","0.00","17.50","14.58","2.92"
"12345","2","ProductB","20/02/2023","1","27.50","10.00","27.50","22.92","4.58"

bxjv4tth2#

Using PowerShell's XML dot notation and member-access enumeration, this may be simpler than you think:

$Xml = [xml]@'
<?xml version="1.0" encoding="utf-8"?>
<DTD_ORDER>
   <OrderHead>
      <OrderReferences>12345</OrderReferences>
      <OrderRecipient>
         <OrderLine>
            <LineNumber>1</LineNumber>
            <Product>ProductA</Product>
            <Quantity>1</Quantity>
            <Price>17.50</Price>
            <Discount>0.00</Discount>
            <LineGross>17.50</LineGross>
            <LineNet>14.58</LineNet>
            <LineTax>2.92</LineTax>
         </OrderLine>
         <OrderLine>
            <LineNumber>2</LineNumber>
            <Product>ProductB</Product>
            <Quantity>1</Quantity>
            <Price>27.50</Price>
            <Discount>10.00</Discount>
            <LineGross>27.50</LineGross>
            <LineNet>22.92</LineNet>
            <LineTax>4.58</LineTax>
         </OrderLine>
      </OrderRecipient>
   </OrderHead>
</DTD_ORDER>
'@
$Xml.DTD_ORDER.OrderHead.OrderRecipient.OrderLine |ConvertTo-Csv # or: ... |Export-Csv .\My.csv

"LineNumber","Product","Quantity","Price","Discount","LineGross","LineNet","LineTax"
"1","ProductA","1","17.50","0.00","17.50","14.58","2.92"
"2","ProductB","1","27.50","10.00","27.50","22.92","4.58"

If you want to customize or format the output yourself, you can use something along these lines:

$Xml.DTD_ORDER.OrderHead.OrderRecipient.OrderLine |Foreach-Object { $_.Product }
ProductA
ProductB

Or use Select-Object (and pipe the result to Export-Csv):

$Xml.DTD_ORDER.OrderHead.OrderRecipient.OrderLine |Select-Object LineNumber, Product

LineNumber Product
---------- -------
1          ProductA
2          ProductB
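Since the question mentions building CSV lines with the -f format operator and Add-Content, the custom-formatting route could look roughly like the sketch below. The shortened XML, the `$dateCreated` value, the choice of five columns, and the file name in the commented-out Add-Content call are assumptions for illustration only.

```powershell
# Assumption: a shortened copy of the question's XML.
$Xml = [xml]@'
<DTD_ORDER>
   <OrderHead>
      <OrderReferences>12345</OrderReferences>
      <OrderRecipient>
         <OrderLine><LineNumber>1</LineNumber><Product>ProductA</Product><Price>17.50</Price></OrderLine>
         <OrderLine><LineNumber>2</LineNumber><Product>ProductB</Product><Price>27.50</Price></OrderLine>
      </OrderRecipient>
   </OrderHead>
</DTD_ORDER>
'@

$dateCreated = '20/02/2023'                              # assumed fixed value
$orderRef    = $Xml.DTD_ORDER.OrderHead.OrderReferences  # shared by every line

# Loop over the OrderLine elements in document order; each $line keeps its own
# child values together, so they cannot get mixed up between order lines.
$lines = foreach ($line in $Xml.DTD_ORDER.OrderHead.OrderRecipient.OrderLine) {
    '{0}, {1}, {2}, {3}, {4}' -f $orderRef, $line.LineNumber, $line.Product, $dateCreated, $line.Price
}

$lines
# 12345, 1, ProductA, 20/02/2023, 17.50
# 12345, 2, ProductB, 20/02/2023, 27.50

# To append to a file instead (hypothetical file name):
# Add-Content -Path .\orders.csv -Value $lines
```

This keeps full control over the line layout, at the cost of handling quoting and escaping yourself; Export-Csv or ConvertTo-Csv takes care of that automatically.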

To include the common OrderReferences value, you can start from the grandparent and use a calculated property:

$Xml.DTD_ORDER.OrderHead |ForEach-Object {
    $OrderReferences = $_.OrderReferences
    $_.OrderRecipient.OrderLine |Select-Object `
        @{ l='OrderReferences'; e={ $OrderReferences } },
        LineNumber,
        Product,
        DateCreated,
        Price,
        LineGross
} |ConvertTo-Csv

"OrderReferences","LineNumber","Product","DateCreated","Price","LineGross"
"12345","1","ProductA",,"17.50","17.50"
"12345","2","ProductB",,"27.50","27.50"
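The DateCreated column in the output above is empty because the XML has no DateCreated element. One possible way to fill it is a second calculated property, sketched below with the columns arranged roughly as in the question's target CSV. The `$dateCreated` value is an assumption, and OrderLineVoucherValue is left out because the sample XML does not contain it.

```powershell
# Assumption: a shortened copy of the question's XML with the fields used below.
$Xml = [xml]@'
<DTD_ORDER>
   <OrderHead>
      <OrderReferences>12345</OrderReferences>
      <OrderRecipient>
         <OrderLine>
            <LineNumber>1</LineNumber><Product>ProductA</Product><Price>17.50</Price>
            <Discount>0.00</Discount><LineGross>17.50</LineGross><LineNet>14.58</LineNet><LineTax>2.92</LineTax>
         </OrderLine>
         <OrderLine>
            <LineNumber>2</LineNumber><Product>ProductB</Product><Price>27.50</Price>
            <Discount>10.00</Discount><LineGross>27.50</LineGross><LineNet>22.92</LineNet><LineTax>4.58</LineTax>
         </OrderLine>
      </OrderRecipient>
   </OrderHead>
</DTD_ORDER>
'@

# Assumed fixed value; in a real script this might come from Get-Date -Format 'dd/MM/yyyy'.
$dateCreated = '20/02/2023'

$csv = $Xml.DTD_ORDER.OrderHead | ForEach-Object {
    $orderReferences = $_.OrderReferences
    $_.OrderRecipient.OrderLine | Select-Object @(
        @{ n='OrderReferences'; e={ $orderReferences } }   # value from the grandparent
        'LineNumber'
        'Product'
        @{ n='DateCreated'; e={ $dateCreated } }           # value from a variable
        'Price', 'LineGross', 'LineNet', 'LineTax', 'Discount'
    )
} | ConvertTo-Csv -NoTypeInformation

$csv
# "OrderReferences","LineNumber","Product","DateCreated","Price","LineGross","LineNet","LineTax","Discount"
# "12345","1","ProductA","20/02/2023","17.50","17.50","14.58","2.92","0.00"
# "12345","2","ProductB","20/02/2023","27.50","27.50","22.92","4.58","10.00"
```

Swapping ConvertTo-Csv for Export-Csv with -Append would write the same rows to the daily file.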
