Selenium: iterating over table rows and printing column text with Python

j8yoct9x  posted on 2022-11-10  in Python

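I have a table (<table>) in which every row (<tr>) of the body (<tbody>) holds a value.
The value I want to print is in a <span> inside a <div>.
Looking at the HTML, I can see the value; for example, "Name" is in row 1 (tr[1]), column 2 (td[2]):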

<tr class="GAT4PNUFG GAT4PNUMG" __gwt_subrow="0" __gwt_row="0">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Name" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Name</span>
                </div>
            </td>

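I want to iterate over every row of the table and print the value in column 2 (td[2]).
I am using Python with Selenium WebDriver.
The full XPath to row 1, column 2 of the table is: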

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody/tr[1]/td[2]/div/span

I was thinking that I could start from the table, with an XPath like this:

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody

Then I could use a for loop and index into tr and td, e.g. tr[i] for the row and td[2] for column 2:

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody/tr[i]/td[2]/div/span

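How can I iterate over the table and print the value of the span tag, which is always in column 2?
I tried storing the start of the table in a variable, hoping to use it to walk the rows and columns. I need some help.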

table = self.driver.find_element(By.XPATH, 'html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody')

Here is the full HTML:

<table cellspacing="0" style="table-layout: fixed; width: 100%;">
    <colgroup>
    <tbody>
        <tr class="GAT4PNUFG GAT4PNUMG" __gwt_subrow="0" __gwt_row="0">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Name" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Name</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUBH GAT4PNUNG">
        </tr>
        <tr class="GAT4PNUEH" __gwt_subrow="0" __gwt_row="1">
            <td class="GAT4PNUEG GAT4PNUFH GAT4PNUHG">
            <td class="GAT4PNUEG GAT4PNUFH">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Address" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Address</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH GAT4PNUBH">
        </tr>
        <tr class="GAT4PNUFG" __gwt_subrow="0" __gwt_row="2">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG">
            <td class="GAT4PNUEG GAT4PNUGG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="DOB" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">DOB</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUBH">
        </tr>
        <tr class="GAT4PNUEH" __gwt_subrow="0" __gwt_row="3">
            ---
        <tr class="GAT4PNUFG" __gwt_subrow="0" __gwt_row="4">       
            ---
    </tbody>
</table>

2ic8powd1#

The developers have added an ID to the table, and I now have it working. It prints every cell value in column 2. The code is:

table_id = self.driver.find_element(By.ID, 'data_configuration_feeds_ct_fields_body0')
rows = table_id.find_elements(By.TAG_NAME, "tr") # get all of the rows in the table
for row in rows:
    # Get the second cell of this row
    col = row.find_elements(By.TAG_NAME, "td")[1]  # note: indexes start at 0, so 1 is column 2
    print(col.text)  # print the text of the cell
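
One thing to watch with the loop above: indexing with [1] raises an IndexError for any row that has fewer than two <td> cells (a header row, for instance). Below is a minimal sketch of a more defensive variant, assuming `table` is the element already found by ID. The function name `column_texts` is hypothetical, and the locator strategy is written as the plain string "xpath", which is the value of Selenium's `By.XPATH`, so that the sketch carries no import of its own.

```python
XPATH = "xpath"  # string value of selenium's By.XPATH constant


def column_texts(table, column=2):
    """Return the text of the given 1-based column for every row that has it.

    `table` is assumed to be a WebElement for the <table> or <tbody>.
    """
    texts = []
    for row in table.find_elements(XPATH, ".//tr"):
        # A relative XPath returns an empty list instead of raising
        # when the row has no cell at that position.
        cells = row.find_elements(XPATH, f"./td[{column}]")
        if cells:
            texts.append(cells[0].text.strip())
    return texts
```

With the table from this answer it could be called as `column_texts(self.driver.find_element(By.ID, 'data_configuration_feeds_ct_fields_body0'))`, which returns the list of texts rather than printing them.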

6ljaweal2#

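The XPath you are currently using is very fragile because it depends on the full document structure and on the relative position of elements. It can easily break in the future.
Instead, locate the rows by their class or another attribute. For example: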

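# The find_elements_by_* helpers were removed in Selenium 4.3; use find_elements(By..., ...) instead.
from selenium.webdriver.common.by import By

for row in driver.find_elements(By.CSS_SELECTOR, "tr.GAT4PNUFG.GAT4PNUMG"):
    cell = row.find_elements(By.TAG_NAME, "td")[1]
    print(cell.text)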
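
Note that in the HTML posted in the question only the first row carries both of those classes, so that selector would match a single row, and the class names look generated and may not be stable. A rough sketch of an alternative that relies on the structure of the table and the `linkhover` class of the span instead is shown here. The helper names are hypothetical, `root` is assumed to be the driver or the table element, and "css selector" is the string value of Selenium's `By.CSS_SELECTOR`.

```python
CSS = "css selector"  # string value of selenium's By.CSS_SELECTOR constant


def column_span_selector(column=2, span_class="linkhover"):
    """Build a CSS selector for the spans in one 1-based table column."""
    return f"tbody > tr > td:nth-child({column}) span.{span_class}"


def span_texts(root, column=2, span_class="linkhover"):
    """Return the text of every matching span below `root`.

    `root` is assumed to be a WebDriver or a WebElement for the table.
    """
    selector = column_span_selector(column, span_class)
    return [span.text for span in root.find_elements(CSS, selector)]
```

Passing the table element rather than the driver as `root` keeps the search from picking up other tables on the page.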

3zwtqj6y3#

This may be a bit late, but here is my code, which works well for me.

def find_in_table(self, name):
        check_table = self.isElementPresent("//table[@class='assessment_list_table_tableStyle__Qw-rz']",
                                            locatorType="xpath")
        while not check_table:
            time.sleep(10)
            check_table = self.isElementPresent("//table[@class='assessment_list_table_tableStyle__Qw-rz']",
                                                locatorType="xpath")

        table_id = self.driver.find_element(By.XPATH, "//table[@class='assessment_list_table_tableStyle__Qw-rz']")
        rows = table_id.find_elements(By.TAG_NAME, "tr")
        for x in range(1, len(rows)):
            col = rows[x].find_elements(By.TAG_NAME, "td")[0]
            s = col.text
            if s == name:
                return x

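1. Check that the table exists.
2. Use find_element to get the table element.
3. Use the table element to find the rows in the table.
4. Loop over the rows and read the text in the first column (index 0).
5. When the text matches the name being searched for, return the row index.
The XPath of the table element can be obtained with the Selenium plugin for IntelliJ. The plugin is very useful for locating elements and is more accurate than the browser extensions.
(isElementPresent is a method of mine that uses Selenium's getElement to check whether an element exists; it returns a boolean.)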
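
The `while not check_table` loop in the code above never gives up if the table does not appear. Here is a small sketch of a polling helper with a timeout that could replace it. `wait_until` is a hypothetical name and is plain Python; Selenium's own `WebDriverWait` serves the same purpose if you would rather use the library facility.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

Inside `find_in_table` the check would then become something like `wait_until(lambda: self.isElementPresent(table_xpath, locatorType="xpath"), timeout=60)`, where `table_xpath` stands for the XPath string used in the answer.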
