excel 使用VBA宏从一个URL的多个页面中抓取数据

xtfmy6hx  于 2022-11-26  发布在  其他
关注(0)|答案(1)|浏览(132)

我写了Excel宏从多个页面获取数据(这里约22-25页),但我只能提取第一页。我如何可以获取多个页面?

Public Sub Extract_Data()

    Dim httpReq As Object
    Dim HTMLdoc As Object
    Dim resultsTable As Object
    Dim tRow As Object, tCell As Object
    Dim destCell As Range
    
    With ActiveSheet
        .Cells.ClearContents
        Set destCell = .Range("A1")
    End With

    Set httpReq = CreateObject("MSXML2.XMLHTTP")
    With httpReq
        .Open "GET", "https://www.bseindia.com/markets/debt/TradenSettlement.aspx", False
        .send
        Set HTMLdoc = CreateObject("HTMLfile")
        HTMLdoc.body.innerHTML = .responseText
    End With

    Set resultsTable = HTMLdoc.getElementById("ContentPlaceHolder1_GridViewrcdsFC")
    For Each tRow In resultsTable.Rows
        For Each tCell In tRow.Cells
            destCell.Offset(tRow.RowIndex, tCell.cellIndex).Value = tCell.innerText
        Next
    Next

    MsgBox "Done"
    
End Sub

我尝试以不同的方式编写程序,但仍然没有运气,它只获取第一页(参考-微软HTML对象库,微软互联网控件)

Sub Fetch_Data()
   Dim HTMLDoc As New HTMLDocument
    Dim objTable As Object
    Dim lRow As Long
    Dim lngTable As Long
    Dim lngRow As Long
    Dim lngCol As Long
    Dim ActRw As Long
    Dim objIE As InternetExplorer
    Set objIE = New InternetExplorer
    objIE.Navigate "https://www.bseindia.com/markets/debt/TradenSettlement.aspx"

    Do Until objIE.ReadyState = 4 And Not objIE.Busy
        DoEvents
    Loop
    Application.Wait (Now + TimeValue("0:00:03")) 'wait for java script to load
    HTMLDoc.body.innerHTML = objIE.Document.body.innerHTML
    With HTMLDoc.body
        Set objTable = .getElementsByTagName("table")
        For lngTable = 0 To objTable.Length - 1
            For lngRow = 0 To objTable(lngTable).Rows.Length - 1
                For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
                    ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
                Next lngCol
            Next lngRow
            ActRw = ActRw + objTable(lngTable).Rows.Length + 1
        Next lngTable
    End With
    objIE.Quit
End Sub

想要撷取或抓取单一URL的多个页面
我写的程序工作,但我不知道如何循环它

i = 2

For i = 2 To 50

If i = 2 Then
Url = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$GridViewrcdsFC','Page$2')"
On Error GoTo ErrorHandler
ElseIf i = 3 Then
Url = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$GridViewrcdsFC','Page$3')"
On Error GoTo ErrorHandler

等等,直到40-50页,我写的网址。我只是想帮助改变页$X....X必须改变,我尝试了“页$”& i,但它不工作

lh80um4z

lh80um4z1#

你必须查看当前页面上是否有到其他页面的url链接,找到一个标签并循环所有的网页。你也可以查看每个页面的url并对其进行硬编码。
标签“a”下的URL示例:

Set AElements = HTMLDoc.getElementsByTagName("a")
    For Each AElement In AElements
        If AElement.id = "xxxxxxxxx" Then
            Cells(Cell, 27) = AElement.src     'I write URL in the 27th column
           'AElement.href
        End If
    Next AElement

相关问题