XPath: how do I check whether a child element exists under a given parent node?

ilmyapht  posted on 2021-08-20  in Java
Follow (0) | Answers (2) | Views (721)

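I want to loop over parent nodes, check whether each parent has a particular child node, and extract data from it.
The site's markup looks like this: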

<div class="reviews">
  <div id="user1">
    <div class="name"> Will </div>
    <div class="weight"> 50kg </div>
    <div class="height"> 160cm </div>
  </div>
  <div id="user2">
    <div class="weight"> 55kg </div>
    <div class="height"> 170cm </div>
  </div>
  <div id="user3">
    <div class="name"> Ben </div>
    <div class="height"> 180cm </div>
  </div>
</div>

So far, my code looks like this:

import csv
import os
import pandas as pd
from selenium import webdriver

chromedriver = "path to chromedriver"
driver = webdriver.Chrome(chromedriver)
driver.get( 'url of a website')

name_row = []
weight_row = []
height_row = []
for i in range(len(driver.find_elements_by_xpath('//div[@class="reviews"]/div'))):
    # Get the i-th parent (user1 on the first pass)
    driver.find_element_by_xpath('(//div[@class="reviews"]/div)[' + str(i + 1) + ']')

    # Check whether it has elements like name, weight, and height, and add each to the appropriate list.
    # For example, name_row.append(driver.find_element_by_xpath("xpath to name if it exists"))
    # If any element is missing, append "None"
    # Then move on to the second parent (user2) and so on

df = pd.DataFrame({'Names': name_row, 'Weight': weight_row, 'Height': height_row})

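I would like my final result to be:

Name  Weight  Height
Will  50kg    160cm
None  55kg    170cm
Ben   None    180cm

I have looked at other posts but could not find the answer I need.
I tried finding the elements by XPath and putting each name, weight, and height value into its own list, but that never adds a "None" entry to those lists and ends in errors such as the arrays not all being the same length.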


zpf6vheq1#

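Once you have the parent elements, you can search again for the child elements inside each parent.

parents = driver.find_elements_by_xpath('//div[@class="reviews"]/div')

Then loop over the parent elements with a for loop and check whether the other elements exist inside each one.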

for parent in parents:
    # Raises NoSuchElementException if this parent has no such child
    parent.find_element_by_class_name('name')
    # ...and likewise for 'weight' and 'height'

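This is what the documentation says:
Use this when you want to locate an element by class name. With this strategy, the first element with the matching class name attribute will be returned. If no element has a matching class name attribute, a NoSuchElementException will be raised.
So now you can wrap each lookup (name, weight, and so on) in try/except: if the element is found, take its data; if not, write None.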
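
The try/except idea can be tried out without a browser. In the sketch below, `text_or_default` is a hypothetical helper (not part of Selenium), and plain dicts stand in for the parent elements: a missing key raises `KeyError` in much the same way that a missing child raises `NoSuchElementException`. It only illustrates the pattern and is not the answerer's code.

```python
def text_or_default(lookup, missing_exc, default="None"):
    """Return lookup(), or `default` if it raises `missing_exc`."""
    try:
        return lookup()
    except missing_exc:
        return default


# Stand-ins for the three parent elements from the question.
# A dict lookup raises KeyError when the field is absent.
users = [
    {"name": "Will", "weight": "50kg", "height": "160cm"},
    {"weight": "55kg", "height": "170cm"},
    {"name": "Ben", "height": "180cm"},
]

# One entry per parent, so every list ends up the same length.
names = [text_or_default(lambda: u["name"], KeyError) for u in users]
weights = [text_or_default(lambda: u["weight"], KeyError) for u in users]
heights = [text_or_default(lambda: u["height"], KeyError) for u in users]
```

With Selenium, the lookup would presumably be something like `lambda: parent.find_element_by_class_name('name').text`, with `NoSuchElementException` from `selenium.common.exceptions` as the exception to catch.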


bqjvbblv2#

I would suggest something like this:

import csv
import os
import time

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

chromedriver = "path to chromedriver"
driver = webdriver.Chrome(chromedriver)
driver.get('url of a website')

names = []
weights = []
heights = []

# wait for the parent container to be visible
# (the 10-second timeout is an assumed value; adjust as needed)
wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@class='reviews']")))

# give the remaining elements a moment to load
time.sleep(0.5)

# get one element per review: the direct <div> children of the container
# (find_elements_by_xpath is the Selenium 3 API used in this thread;
# on Selenium 4 the equivalent is find_elements(By.XPATH, ...))
reviews = driver.find_elements_by_xpath("//div[@class='reviews']/div")

# iterate over the reviews and look up each inner element relative to its review
for review in reviews:
    # get the name element for the current review; this returns a list of web elements
    name = review.find_elements_by_xpath(".//div[@class='name']")
    # if the element exists the list is non-empty, which is interpreted as Boolean True
    if name:
        # extract the element's text and append it to the list of names
        names.append(name[0].text)
    # otherwise append "None"
    else:
        names.append("None")
    # the same for the other fields
    weight = review.find_elements_by_xpath(".//div[@class='weight']")
    if weight:
        weights.append(weight[0].text)
    else:
        weights.append("None")
    height = review.find_elements_by_xpath(".//div[@class='height']")
    if height:
        heights.append(height[0].text)
    else:
        heights.append("None")
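
To see the "look up each child relative to its parent, fall back to None" pattern without a browser, here is a minimal sketch using Python's standard `xml.etree.ElementTree` on the sample markup from the question (with the attributes written as plain `class`/`id`). The names `MARKUP`, `FIELDS`, and `extract_columns` are made up for this illustration. ElementTree's `find` returns `None` when nothing matches, which plays the same role as the empty list returned by `find_elements_by_xpath`.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, written as well-formed XML.
MARKUP = """
<div class="reviews">
  <div id="user1">
    <div class="name"> Will </div>
    <div class="weight"> 50kg </div>
    <div class="height"> 160cm </div>
  </div>
  <div id="user2">
    <div class="weight"> 55kg </div>
    <div class="height"> 170cm </div>
  </div>
  <div id="user3">
    <div class="name"> Ben </div>
    <div class="height"> 180cm </div>
  </div>
</div>
"""

FIELDS = ("name", "weight", "height")


def extract_columns(markup):
    """Return one list per field, with "None" where a child is missing."""
    root = ET.fromstring(markup.strip())
    columns = {field: [] for field in FIELDS}
    # "./div" selects only the direct children of the reviews container.
    for user in root.findall("./div"):
        for field in FIELDS:
            # The path is relative to the current user, not the whole document.
            child = user.find(f"./div[@class='{field}']")
            value = child.text.strip() if child is not None else "None"
            columns[field].append(value)
    return columns


columns = extract_columns(MARKUP)
```

Because every parent contributes exactly one value per field, all the lists have the same length, so the resulting dict should be safe to hand to `pd.DataFrame`.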
