How to delete/edit elements in an XML file using Python

xe55xuns  posted on 2023-01-29  in  Python

I have the following XML file:

<annotation>
    <folder>JPEGImages</folder>
    <filename>01FQ0YY92XRX5MDWGYC2RJ1CP4.jpeg</filename>
    <path>D:\aVisionData\PVL Pilot Project\test\Annotation\JPEGImages\01FQ0YY92XRX5MDWGYC2RJ1CP4.jpeg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>601</width>
        <height>844</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>smallObject</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>329</xmin>
            <ymin>199</ymin>
            <xmax>376</xmax>
            <ymax>242</ymax>
        </bndbox>
    </object>
</annotation>

I want to delete <path>, and I also want to edit <source> </source>, so that the result looks like this:

<annotation>
    <folder>JPEGImages</folder>
    <filename>01FQ0YY92XRX5MDWGYC2RJ1CP4.jpeg</filename>
    <source>
        <database>objects</database>
        <annotation>custom</annotation>
        <image>custom</image>
    </source>
    <size>
        <width>601</width>
        <height>844</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>smallObject</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>329</xmin>
            <ymin>199</ymin>
            <xmax>376</xmax>
            <ymax>242</ymax>
        </bndbox>
    </object>
</annotation>

To delete <path>, I used the following code:

import os
import sys
import xml.etree.ElementTree as Et

file_path = os.path.join(inputAnnotationPath, annotation)
tr = Et.parse(file_path)
for element in tr.iter():
    for subElement in element:
        print(subElement)
        if subElement.tag == "path":
            se = subElement.get("path")
            element.remove(subElement)
tr.write(sys.stdout)

It runs without errors, but <path> is not deleted. What should I change to delete <path> and modify <source>?

cczfrluj #1

If you can use lxml, this is quite simple:

from lxml import etree

# file_path is the path to the annotation file, as in the question
parser = etree.XMLParser(recover=True)
tr = etree.parse(file_path, parser=parser)

# select both <path> and <source> for removal
targets = [tr.xpath('//path')[0], tr.xpath('//source')[0]]

# select the element after which the new <source> will be inserted
destination = tr.xpath('//filename')[0]

# recreate <source>
new_source = """
    <source>
        <database>objects</database>
        <annotation>custom</annotation>
        <image>custom</image>
    </source>"""

# remove what needs to be removed
for target in targets:
    target.getparent().remove(target)

# insert the new <source> element right after <filename>
destination.addnext(etree.fromstring(new_source))

# save to file
with open("output.xml", "wb") as f:
    f.write(etree.tostring(tr))
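
If lxml is not an option, roughly the same result can probably be had with the standard library alone. The sketch below is not part of the original answer. It assumes <path> and <source> are direct children of the root, as in the sample file, and it edits <source> in place instead of recreating it. `SAMPLE` (a shortened copy of the file, with a made-up path) and the helper `clean_annotation` are names introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical copy of the annotation file from the question.
SAMPLE = """<annotation>
    <folder>JPEGImages</folder>
    <filename>01FQ0YY92XRX5MDWGYC2RJ1CP4.jpeg</filename>
    <path>D:/data/01FQ0YY92XRX5MDWGYC2RJ1CP4.jpeg</path>
    <source>
        <database>Unknown</database>
    </source>
</annotation>"""


def clean_annotation(root):
    """Drop <path> and rewrite <source> in place (illustrative helper)."""
    # <path> is a direct child of the root, so find() is enough here.
    path = root.find("path")
    if path is not None:
        root.remove(path)

    source = root.find("source")
    if source is not None:
        # Reuse the existing <database> element, or create it if missing.
        database = source.find("database")
        if database is None:
            database = ET.SubElement(source, "database")
        database.text = "objects"
        # Append the two new children only if they are not already present.
        for tag in ("annotation", "image"):
            if source.find(tag) is None:
                ET.SubElement(source, tag).text = "custom"
    return root


root = clean_annotation(ET.fromstring(SAMPLE))
if hasattr(ET, "indent"):  # ET.indent exists only on Python 3.9+
    ET.indent(root, space="    ")
result = ET.tostring(root, encoding="unicode")
print(result)
```

When working with a real file, parse it with `ET.parse(file_path)`, pass `tree.getroot()` to the helper, and save with `tree.write(...)`. Note that the code in the question only writes to `sys.stdout`, so the file on disk is never updated, which may be why <path> appeared to survive.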
