How can I effectively click an HtmlCell using Java/HtmlUnit?

6rqinv9w  posted on 2023-01-15 in Java

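If you go to https://taxtest.navajocountyaz.gov/Pages/WebForm1.aspx?p=1&apn=205-27-014, view the page source, and search it for grdCPhist, you will not find it.
However, if you click Taxes, then click CP in row 5, column 7, and view the page source again, you will find grdCPhist: there is now a table with id="grdCPhist".
I want to access that table from Java code using HtmlUnit.
To do that, I wrote the following program: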

import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.html.*;
import com.gargoylesoftware.htmlunit.javascript.*;
import java.io.*;

public class ClickOnCell {

    public static void ClickOnCell () {
        try (final WebClient webClient = new WebClient()) {
            System.getProperties().put("org.apache.commons.logging.simplelog.defaultlog", "fatal");
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);

            webClient.getOptions().setCssEnabled(false);
            webClient.setJavaScriptErrorListener(new SilentJavaScriptErrorListener());
            webClient.setCssErrorHandler(new SilentCssErrorHandler());
            HtmlPage page = webClient.getPage("http://taxtest.navajocountyaz.gov/Pages/WebForm1.aspx?p=1&apn=205-27-014");
            webClient.waitForBackgroundJavaScriptStartingBefore(10000);
            page = (HtmlPage) page.getEnclosingWindow().getEnclosedPage();
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.setJavaScriptErrorListener(new SilentJavaScriptErrorListener());
            HtmlTable grdTaxHistory = (HtmlTable) page.getElementById("grdTaxHistory");
            HtmlTableDataCell cpCell = (HtmlTableDataCell) grdTaxHistory.getCellAt(4,6);
            System.out.println("cpCell.getTextContent() = " + cpCell.getTextContent());
            cpCell.click();
            webClient.waitForBackgroundJavaScriptStartingBefore(1000000000);
            page = (HtmlPage) page.getEnclosingWindow().getEnclosedPage();
            HtmlTable grdCPHistory = (HtmlTable) page.getElementById("grdCPhist");
            System.out.println("grdCPHistory = " + grdCPHistory);
        }

        catch (Exception e) {
            System.out.println("Error: "+ e);
        }

    }

    public static void main(String[] args) {
        File file = new File("validParcelIDs.txt");
        ClickOnCell();
    }

}

I compiled and ran the program with the following two commands:

javac -classpath ".:/opt/htmlunit_2.69.0/*" ClickOnCell.java
java -classpath ".:/opt/htmlunit_2.69.0/*" ClickOnCell

The program compiles cleanly, with no errors or warnings. However, when I run it, I get the following output on the screen:

WARNING: Obsolete content type encountered: 'text/javascript'.
Jan 13, 2023 5:51:29 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 13, 2023 5:51:29 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 13, 2023 5:51:30 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 13, 2023 5:51:30 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 13, 2023 5:51:30 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
cpCell.getTextContent() =                                                                     
                                         CP
                                            
grdCPHistory = null

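What I am unhappy about is that grdCPHistory above is null, which means HtmlUnit cannot find the table with id="grdCPhist", just as if I had never put cpCell.click(); in the code.
How can I change the code above so that I can access the table with id="grdCPhist" from my Java program?
Since StackOverflow does not let me thank you after you have suggested something to try, thank you in advance.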

Answer #1 (wbgh16ku)

Clicking the cell has no effect because the action is triggered by the anchor contained in the cell, not by the cell itself.

<td align="right">                                                                    
    <a onclick="$('#');" id="grdTaxHistory_lnkViewPayments_4" href="javascript:__doPostBack('grdTaxHistory$ctl06$lnkViewPayments','')">$2948.15</a>
</td>

You have to select the anchor and click it.

String url = "http://taxtest.navajocountyaz.gov/Pages/WebForm1.aspx?p=1&apn=205-27-014";

    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);

        webClient.getOptions().setCssEnabled(false);

        // don't disable logging if you are hunting for problems
        webClient.setJavaScriptErrorListener(new SilentJavaScriptErrorListener());

        HtmlPage page = webClient.getPage(url);
        webClient.waitForBackgroundJavaScriptStartingBefore(1_000);
        page = (HtmlPage) page.getEnclosingWindow().getEnclosedPage();

        // System.out.println("-----------------------------------------------------");
        // System.out.println(page.asNormalizedText());
        // System.out.println("-----------------------------------------------------");

        // no need to click - the table is already there, only not visible

        // page.getAnchorByText("Taxes").click();
        // webClient.waitForBackgroundJavaScriptStartingBefore(1_000);
        // page = (HtmlPage) page.getEnclosingWindow().getEnclosedPage();

        // System.out.println("-----------------------------------------------------");
        // System.out.println(page.asNormalizedText());
        // System.out.println("-----------------------------------------------------");

        HtmlTable grdTaxHistory = (HtmlTable) page.getElementById("grdTaxHistory");
        HtmlTableDataCell cpCell = (HtmlTableDataCell) grdTaxHistory.getCellAt(4,6);
        // System.out.println("cpCell.getTextContent() = " + cpCell.getTextContent().trim());

        // the action is triggered by the anchor inside the cell
        // todo this is a hack - make finding the enclosing anchor more robust
        ((HtmlAnchor) cpCell.getFirstChild().getNextSibling()).click();

        webClient.waitForBackgroundJavaScriptStartingBefore(1_000);
        page = (HtmlPage) page.getEnclosingWindow().getEnclosedPage();

        HtmlTable grdCPHistory = (HtmlTable) page.getElementById("grdCPhist");

        System.out.println("-----------------------------------------------------");
        System.out.println(grdCPHistory.asNormalizedText());
        System.out.println("-----------------------------------------------------");
    }
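
The TODO in the snippet above notes that getFirstChild().getNextSibling() is a hack: it relies on the anchor being exactly the second child node of the cell. One possibly more robust variant is to search the cell for its first <a> element with a relative XPath. The sketch below assumes HtmlUnit 2.x (the com.gargoylesoftware.htmlunit packages used in the question); CellAnchors, findAnchor, and clickAnchorIn are hypothetical names introduced for illustration, not part of HtmlUnit.

```java
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlTableCell;
import java.io.IOException;

/** Hypothetical helper (not part of HtmlUnit) for clicking the link inside a table cell. */
final class CellAnchors {

    private CellAnchors() {
    }

    /**
     * Returns the first <a> element nested anywhere inside the cell.
     * The relative XPath ".//a" ignores whitespace text nodes, so the lookup
     * does not depend on the anchor's position among the cell's child nodes.
     */
    static HtmlAnchor findAnchor(HtmlTableCell cell) {
        HtmlAnchor anchor = cell.getFirstByXPath(".//a");
        if (anchor == null) {
            // Fail loudly instead of with a ClassCastException or NullPointerException.
            throw new IllegalStateException("No anchor inside cell: " + cell.asXml());
        }
        return anchor;
    }

    /** Clicks that anchor and returns the page the click produced. */
    static Page clickAnchorIn(HtmlTableCell cell) throws IOException {
        return findAnchor(cell).click();
    }
}
```

With such a helper, the cast-and-click line in the snippet above could become CellAnchors.clickAnchorIn(cpCell); followed by the same wait and page re-fetch as before.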
