Reading and parsing nested XML tags with Python and saving to CSV: unable to read the nested tags

Posted by 0g0grzrc on 2022-12-15 in Python


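I have looked around and tried many things, but I have not been able to read the nested tags in my XML file. I can extract the outer tag values, but not the nested STREET and CITY tags under the ADDRESS tag. I am short on time and, after trying a bunch of things, I still cannot read the nested tags. Please help!
The expected result I am trying to get is:
common botanical zone light price street city
Bloodroot Sanguinaria canadensis 4 Mostly Shady 2.44 1 toronto
and so on.
However, I cannot retrieve the street and city columns because my code does not pick up the nested tags.
By removing the code that deals with the city and street tags, I have been able to get the following output:
common botanical zone light price
Bloodroot Sanguinaria canadensis 4 Mostly Shady 2.44
Below is my XML file, with 2 entries for testing purposes only. I am trying to create a column for each piece of text under the PLANT tag mentioned above. I am reading from the Databricks file system. I open and create a CSV, write to it, and then close it. The indentation is correct; it may have been mangled when I copied and pasted.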

<?xml version="1.0" encoding="UTF-8"?>
<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <BOTANICAL>Sanguinaria canadensis</BOTANICAL>
    <ZONE>4</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$2.44</PRICE>
    <ADDRESS>
      <STREET>1</STREET>
      <CITY>toronto</CITY>
    </ADDRESS>
    <AVAILABILITY>031599</AVAILABILITY>
  </PLANT>
  <PLANT>
    <COMMON>Columbine</COMMON>
    <BOTANICAL>Aquilegia canadensis</BOTANICAL>
    <ZONE>3</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$9.37</PRICE>
    <ADDRESS>
      <STREET>2</STREET>
      <CITY>montreal</CITY>
    </ADDRESS>
    <AVAILABILITY>030699</AVAILABILITY>
  </PLANT>
</CATALOG>

-----------This is the code I have used ---------------
from xml.etree import ElementTree
import csv
import os

xml = ElementTree.parse("/dbfs/mnt/ods-outbound/xml_test/plant_catalog.xml")

# creating a file
csvfile = open("/dbfs/mnt/ods-outbound/xml_test/plant_catalog.csv", 'w', encoding='utf-8')
csvfile_writer = csv.writer(csvfile)

# ADD THE HEADER TO CSV FILE
csvfile_writer.writerow(["common","botanical","zone","light","price","availability","street","city"])

# FOR EACH PLANT
for plant in xml.findall("PLANT"):
    if(plant):
      # EXTRACT PLANT DETAILS
      common = plant.find("COMMON")
      botanical = plant.find("BOTANICAL")
      zone = plant.find("ZONE")
      light = plant.find("LIGHT")
      price = plant.find("PRICE")
      availability = plant.find("AVAILABILITY")
      street = plant.find("STREET")
      city = plant.find("CITY")

      csv_line = [common.text, botanical.text, zone.text, light.text, price.text, availability.text,street.text,city.text]

      # ADD A NEW ROW TO CSV FILE
      csvfile_writer.writerow(csv_line)

csvfile.close()


Source: https://stackoverflow.com/questions/74766278/reading-and-parsing-nested-xml-tags-and-save-to-csv-using-python-unable-to-read

1 Answer


q1qsirdb1#

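According to the XML file, the STREET and CITY values sit inside the ADDRESS tag, so they are not direct children of PLANT.
Changes:

Look for the ADDRESS tag inside each PLANT element
If an ADDRESS tag is found, look up STREET and CITY inside it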
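
To illustrate why the original lookup fails, here is a minimal sketch, assuming a trimmed inline copy of one PLANT entry (the SAMPLE string is only for illustration and is not part of the original files). It shows that find() with a bare tag name only searches direct children, so STREET has to be reached through ADDRESS.

```python
from xml.etree import ElementTree

# Trimmed inline copy of one PLANT entry from the question (illustrative only)
SAMPLE = """
<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <ADDRESS>
      <STREET>1</STREET>
      <CITY>toronto</CITY>
    </ADDRESS>
  </PLANT>
</CATALOG>
"""

root = ElementTree.fromstring(SAMPLE)
plant = root.find("PLANT")

# A bare tag name only matches direct children of PLANT,
# and STREET is a grandchild, so this lookup finds nothing.
direct = plant.find("STREET")
print(direct)  # None

# Going through ADDRESS first reaches the nested tags.
addr = plant.find("ADDRESS")
street = addr.find("STREET").text
city = addr.find("CITY").text
print(street, city)  # 1 toronto
```

Calling .text on the None returned by the first lookup is what raises the AttributeError in the question's code.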

a.xml (input file):

<?xml version="1.0" encoding="UTF-8" ?>
<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <BOTANICAL>Sanguinaria canadensis</BOTANICAL>
    <ZONE>4</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$2.44</PRICE>
    <ADDRESS>
         <STREET>1</STREET>
         <CITY>toronto</CITY>
    </ADDRESS>
    <AVAILABILITY>031599</AVAILABILITY>
  </PLANT>
  <PLANT>
    <COMMON>Columbine</COMMON>
    <BOTANICAL>Aquilegia canadensis</BOTANICAL>
    <ZONE>3</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$9.37</PRICE>
    <ADDRESS>
         <STREET>2</STREET>
         <CITY>montreal</CITY>
    </ADDRESS>
    <AVAILABILITY>030699</AVAILABILITY>
  </PLANT>
</CATALOG>

main.py:

import csv
from xml.etree import ElementTree

xml = ElementTree.parse("a.xml")

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("plant_catalog.csv", "w", encoding="utf-8", newline="") as csvfile:
    csvfile_writer = csv.writer(csvfile)
    csvfile_writer.writerow(["common", "botanical", "zone", "light", "price", "availability", "street", "city"])

    for plant in xml.findall("PLANT"):
        common = plant.find("COMMON").text
        botanical = plant.find("BOTANICAL").text
        zone = plant.find("ZONE").text
        light = plant.find("LIGHT").text
        price = plant.find("PRICE").text
        availability = plant.find("AVAILABILITY").text

        # STREET and CITY are children of ADDRESS, not of PLANT.
        # Reset them for every plant so a missing ADDRESS gives empty cells
        # instead of reusing the previous plant's values.
        street = city = None
        addr = plant.find("ADDRESS")
        if addr is not None:
            street = addr.find("STREET").text
            city = addr.find("CITY").text

        csvfile_writer.writerow([common, botanical, zone, light, price, availability, street, city])

plant_catalog.csv (output file):

common,botanical,zone,light,price,availability,street,city
Bloodroot,Sanguinaria canadensis,4,Mostly Shady,$2.44,031599,1,toronto
Columbine,Aquilegia canadensis,3,Mostly Shady,$9.37,030699,2,montreal
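
As a possible alternative, the nested tags can also be addressed with a path such as ADDRESS/STREET, which xml.etree.ElementTree supports in find() and findtext(). The sketch below is not the answer's method, only a compact variant under assumptions: the COLUMNS mapping and the plants_to_csv helper are hypothetical names, the XML is an inline sample, and the CSV is written to an in-memory buffer rather than to a file.

```python
import csv
import io
from xml.etree import ElementTree

# Inline sample with one PLANT entry from the question (illustrative only)
SAMPLE = """<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <BOTANICAL>Sanguinaria canadensis</BOTANICAL>
    <ZONE>4</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$2.44</PRICE>
    <ADDRESS>
      <STREET>1</STREET>
      <CITY>toronto</CITY>
    </ADDRESS>
    <AVAILABILITY>031599</AVAILABILITY>
  </PLANT>
</CATALOG>"""

# Hypothetical mapping: CSV column name -> path relative to each PLANT element
COLUMNS = {
    "common": "COMMON",
    "botanical": "BOTANICAL",
    "zone": "ZONE",
    "light": "LIGHT",
    "price": "PRICE",
    "availability": "AVAILABILITY",
    "street": "ADDRESS/STREET",
    "city": "ADDRESS/CITY",
}


def plants_to_csv(xml_text):
    """Return CSV text with one row per PLANT element (hypothetical helper)."""
    root = ElementTree.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(COLUMNS.keys()))
    for plant in root.findall("PLANT"):
        # findtext() returns the default when the path matches nothing,
        # so a missing tag becomes an empty cell instead of an exception.
        writer.writerow([plant.findtext(path, default="") for path in COLUMNS.values()])
    return out.getvalue()


result = plants_to_csv(SAMPLE)
print(result)
```

Keeping the column-to-path mapping in one place means adding another nested field is a one-line change, at the cost of being slightly less explicit than the step-by-step version.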

Answered 2022-12-15
