Parsing XML with Python: how do I get the values?

z0qdvdin, posted on 2023-03-31 in Python

I'm trying to parse XML data with Python and am having trouble extracting the values. The data looks like this:

[<generic:Obs>
<generic:ObsDimension value="2020-01-02"/>
<generic:ObsValue value="1.1193"/>
<generic:Attributes>
<generic:Value id="OBS_STATUS" value="A"/>
<generic:Value id="OBS_CONF" value="F"/>
</generic:Attributes>
</generic:Obs>, <generic:Obs>
<generic:ObsDimension value="2020-01-03"/>
<generic:ObsValue value="1.1147"/>
<generic:Attributes>
<generic:Value id="OBS_STATUS" value="A"/>
<generic:Value id="OBS_CONF" value="F"/>
</generic:Attributes>
</generic:Obs>]

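I want to create a Pandas DataFrame with the columns ['Date', 'Value']. The date should be the value attribute of <generic:ObsDimension value="2020-01-03"/>, and the value should be the value attribute of <generic:ObsValue value="1.1147"/>. When I run this code: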

soup = BeautifulSoup(response.text, 'xml')
dates = soup.find_all("ObsDimension")

the result I get is:

[<generic:ObsDimension value="2020-01-02"/>,
 <generic:ObsDimension value="2020-01-03"/>,
 <generic:ObsDimension value="2020-01-06"/>,
 <generic:ObsDimension value="2020-01-07"/>,
 <generic:ObsDimension value="2020-01-08"/>,
 <generic:ObsDimension value="2020-01-09"/>]

But how can I get each date together with its corresponding value?

ccrfmcuu #1

Try this:

import pandas as pd
from bs4 import BeautifulSoup

xml_doc = '''\
<data>
<generic:Obs>
<generic:ObsDimension value="2020-01-02"/>
<generic:ObsValue value="1.1193"/>
<generic:Attributes>
<generic:Value id="OBS_STATUS" value="A"/>
<generic:Value id="OBS_CONF" value="F"/>
</generic:Attributes>
</generic:Obs>

<generic:Obs>
<generic:ObsDimension value="2020-01-03"/>
<generic:ObsValue value="1.1147"/>
<generic:Attributes>
<generic:Value id="OBS_STATUS" value="A"/>
<generic:Value id="OBS_CONF" value="F"/>
</generic:Attributes>
</generic:Obs>

</data>'''

soup = BeautifulSoup(xml_doc, 'xml')

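# Collect one row per <generic:Obs> element.
all_data = []
for obs in soup.select('Obs'):
    # obs.ObsDimension is the first ObsDimension tag inside this Obs;
    # ['value'] reads that tag's "value" attribute as a string.
    date = obs.ObsDimension['value']
    value = obs.ObsValue['value']
    all_data.append({'Date': date, 'Value': value})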

df = pd.DataFrame(all_data)
print(df)

This prints:

         Date   Value
0  2020-01-02  1.1193
1  2020-01-03  1.1147
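
Note that both columns in this frame hold strings, because attribute values come back as text. If you need real dates and numbers, a minimal sketch of the conversion could look like this. The `rows` list is only a stand-in for the `all_data` list built in the loop above, and using `pd.to_datetime` / `pd.to_numeric` here is a suggestion, not part of the original answer:

```python
import pandas as pd

# Stand-in for the all_data list from the loop above (all values are strings).
rows = [
    {'Date': '2020-01-02', 'Value': '1.1193'},
    {'Date': '2020-01-03', 'Value': '1.1147'},
]

df = pd.DataFrame(rows)

# Parse the ISO-formatted dates into datetimes and the observations into floats.
df['Date'] = pd.to_datetime(df['Date'])
df['Value'] = pd.to_numeric(df['Value'])

print(df.dtypes)
```

After this, `df['Value']` can be used in arithmetic and `df['Date']` can be used for sorting or resampling.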

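
If you would rather not depend on BeautifulSoup, the same extraction can be sketched with the standard library's `xml.etree.ElementTree`. This is a different technique from the one in the answer: ElementTree requires the `generic` prefix to be declared in the document, and lookups need a prefix-to-URI mapping. The namespace URI below is a placeholder (an assumption), so substitute the URI that your actual response binds to `generic`:

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption): use the URI bound to "generic" in your document.
NS = {'generic': 'http://example.com/generic'}

xml_doc = '''\
<data xmlns:generic="http://example.com/generic">
<generic:Obs>
<generic:ObsDimension value="2020-01-02"/>
<generic:ObsValue value="1.1193"/>
</generic:Obs>
<generic:Obs>
<generic:ObsDimension value="2020-01-03"/>
<generic:ObsValue value="1.1147"/>
</generic:Obs>
</data>'''

root = ET.fromstring(xml_doc)

rows = []
# './/' searches all descendants, so the nesting depth of Obs does not matter.
for obs in root.iterfind('.//generic:Obs', NS):
    # find() returns the first matching child; get() reads an attribute as a string.
    date = obs.find('generic:ObsDimension', NS).get('value')
    value = obs.find('generic:ObsValue', NS).get('value')
    rows.append({'Date': date, 'Value': value})

print(rows)
```

The resulting `rows` list has the same shape as `all_data` in the answer, so it can be passed to `pd.DataFrame` in the same way.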