Excel VBA scraping a web table: error "Object variable or With block variable not set"

o2gm4chl  posted on 2023-05-30  in Other

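My VBA works fine with code 1, which drives Internet Explorer. But when I try to run the faster code 2, it keeps failing with "Object variable or With block variable not set". Any help or a pointer in the right direction would be appreciated, thanks.
Here are the two pieces of code.
Code 1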

Option Explicit

Public Sub GetTableB()

    Worksheets("Sheet1").Range("A1:O1500").Clear

    Dim ig As Object
    Dim urlc As String
    urlc = "https://www.totalcorner.com/match/today"

    Set ig = CreateObject("InternetExplorer.Application")
    ig.Visible = True
    ig.navigate urlc
    Do While ig.busy: DoEvents: Loop
    Do Until ig.readyState = 4: DoEvents: Loop

    Dim tb As HTMLTable

    Set tb = ig.document.getElementById("content_container")
    'Set tb = ig.document.getElementsByClassName("main_content")

    Dim rowcounter As Integer
    Dim columncounter As Integer
    rowcounter = 4
    columncounter = 2
    Dim tro As HTMLTableRow
    Dim tdc As HTMLTableCell
    Dim thu

    Dim mys As Worksheet
    Set mys = ThisWorkbook.Sheets("Sheet1")
    For Each tro In tb.getElementsByTagName("tr")
        'loop thru table header
        For Each thu In tro.getElementsByTagName("th")
            mys.Cells(rowcounter, columncounter).Value = thu.textContent
            columncounter = columncounter + 1
        Next thu

        'loop thru table cells
        For Each tdc In tro.getElementsByTagName("td")
            mys.Cells(rowcounter, columncounter).Value = tdc.textContent
            columncounter = columncounter + 1
        Next tdc
        columncounter = 1
        rowcounter = rowcounter + 1
    Next tro

    ig.Quit

End Sub

Code 2

Option Explicit
Public Sub GetTableNB()

    Worksheets("Sheet1").Range("A1:O1500").Clear

    Dim html As MSHTML.HTMLDocument, hTable As Object, ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set html = New MSHTML.HTMLDocument         ' requires VBE > Tools > References > Microsoft HTML Object Library
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.totalcorner.com/match/today", False
        .send
        html.body.innerHTML = .responseText
    End With
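    Set hTable = html.getElementById("content_container")
    ' Guard: if the element is missing from the raw response, hTable is Nothing
    ' and the loop below raises error 91 (Object variable or With block variable not set)
    If hTable Is Nothing Then
        MsgBox "content_container not found in the response"
        Exit Sub
    End If
    Dim td As Object, tr As Object, th As Object, r As Long, c As Long
    For Each tr In hTable.getElementsByTagName("tr")
        r = r + 1: c = 1
        ' header cells
        For Each th In tr.getElementsByTagName("th")
            ws.Cells(r, c) = th.innerText
            c = c + 1
        Next th
        ' data cells
        For Each td In tr.getElementsByTagName("td")
            ws.Cells(r, c) = td.innerText
            c = c + 1
        Next td
    Next tr
    ' print the container text once, after the loop
    Debug.Print hTable.innerText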
End Sub

I recently noticed that if I run code 1 (the IE code) first and then run code 2, code 2 works without the error. I am still trying to figure out why.

ie3xauqp1#

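This is more of a comment than an answer:
You should not use Internet Explorer any more. It is outdated, unsupported, and unreliable.
As already written in the comments: MSXML2.XMLHTTP does not load the whole page, because the page's raw HTML contains only some basic content. The real data is loaded dynamically by JavaScript, and MSXML2.XMLHTTP has no JavaScript engine behind it. I cannot explain why it gets the whole page after you have first loaded it through IE; maybe it reads a cached version. I cannot test this, because IE is no longer available on my computer (for good reason).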
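One way to check this for yourself is to look at what MSXML2.XMLHTTP actually returns before parsing it. The following is a minimal diagnostic sketch, not a fix; the Sub name CheckRawHtml is made up for illustration, and the two strings it searches for are taken from the question's code.

```vba
Option Explicit

' Hypothetical helper: prints a few facts about the raw response
' to the Immediate window (Ctrl+G in the VBE).
Public Sub CheckRawHtml()
    Dim http As Object, body As String
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "https://www.totalcorner.com/match/today", False
    http.send
    body = http.responseText

    Debug.Print "HTTP status: "; http.Status
    Debug.Print "Response length: "; Len(body)
    ' Is the container element present in the raw HTML at all?
    Debug.Print "Has content_container: "; (InStr(1, body, "content_container", vbTextCompare) > 0)
    ' Are there any table rows in the raw HTML?
    Debug.Print "Has <tr: "; (InStr(1, body, "<tr", vbTextCompare) > 0)
End Sub
```

If either of the last two lines prints False, the data is not in the raw HTML and getElementById or the row loop in code 2 has nothing to work with.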
If you really want to scrape web pages with VBA, you can use Selenium, for example: Excel VBA using Selenium
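A rough sketch of what that could look like is shown here. It assumes the SeleniumBasic wrapper and a chromedriver matching your Chrome version are installed, which is not something the question mentions; the Sub name is made up, and the element id is the one from the question's code. The page may also need more time to finish its JavaScript than this sketch allows.

```vba
Option Explicit

' Assumption: SeleniumBasic is installed and registers "Selenium.ChromeDriver".
Public Sub GetTableSelenium()
    Dim driver As Object, container As Object, tr As Object, cell As Object
    Dim ws As Worksheet, r As Long, c As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set driver = CreateObject("Selenium.ChromeDriver")
    driver.Get "https://www.totalcorner.com/match/today"

    ' Raises an error if the element does not appear
    Set container = driver.FindElementById("content_container")

    For Each tr In container.FindElementsByTag("tr")
        r = r + 1: c = 1
        ' th and td cells in document order
        For Each cell In tr.FindElementsByCss("th, td")
            ws.Cells(r, c).Value = cell.Text
            c = c + 1
        Next cell
    Next tr

    driver.Quit
End Sub
```

Because a real browser renders the page, the JavaScript-loaded rows are available here, at the cost of being slower than a plain HTTP request.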
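However, for the URL you provided (https://www.totalcorner.com/match/today), there is a much simpler way using Power Query. In Excel, go to the "Data" ribbon, choose "From Web", and enter the URL. Select "Table 0" and click "Load". The data is now in Excel and you can do whatever you like with it.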

To refresh the data, just click "Refresh All" on the Data ribbon. Or do it from VBA as follows:

' Refresh every connection in the workbook
Dim conn As WorkbookConnection
For Each conn In ThisWorkbook.Connections
    conn.Refresh
Next conn
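If later VBA code depends on the refreshed data, it may help to make the refresh synchronous, since connections can refresh in the background by default. This is a sketch under the assumption that the Power Query connection is an OLE DB connection, which is how Excel normally creates them; the Sub name RefreshAndWait is made up for illustration.

```vba
Option Explicit

' Hypothetical helper: refreshes all connections and waits for OLE DB ones to finish.
Public Sub RefreshAndWait()
    Dim conn As WorkbookConnection
    For Each conn In ThisWorkbook.Connections
        If conn.Type = xlConnectionTypeOLEDB Then
            ' Turn off background refresh so Refresh returns only when done
            conn.OLEDBConnection.BackgroundQuery = False
        End If
        conn.Refresh
    Next conn
End Sub
```

With background refresh disabled, code placed after the call to RefreshAndWait sees the updated table rather than the previous data.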
