pandas: Unable to extract a table using Beautiful Soup

idfiyjo8 posted on 2023-05-27 in Other

I am trying to extract the table from the following page:
https://bills.parliament.nz/bills-proposed-laws?Tab=All&Period=0&To=2023-05-22&From=2018-01-01&SelectCommittee=a0f103be-0902-4480-9778-93a5229dca71
Here is my code:

import requests
from bs4 import BeautifulSoup
import pandas as pd
a = requests.get('https://bills.parliament.nz/bills-proposed-laws?Tab=All&Period=0&To=2023-05-22&From=2018-01-01&SelectCommittee=a0f103be-0902-4480-9778-93a5229dca71')
a.status_code
soup = BeautifulSoup(a.content, 'html.parser')
table2 = soup.find("div", class_="d-none d-lg-block")
print(table2)
# Output --> 
#[]

I tried to extract the table using bs4.

I also tried XPath, but without success.
Any help would be greatly appreciated!

odopli94 #1

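You can't extract the table directly with requests, because the content is loaded dynamically: the HTML returned by the server does not contain the table rows. However, you can use their API: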

import requests

payload = {  # all parameters
   'id': None,
   'documentPreset': 1,
   'keyword': None,
   'selectCommittee': 'a0f103be-0902-4480-9778-93a5229dca71',
   'itemType': None,
   'itemSubType': None,
   'status': [],
   'documentTypes': [],
   'beforeCommittee': None,
   'billStage': None,
   'billStages': [],
   'billTab': 'All',
   'billId': None,
   'includeBillStages': True,
   'subject': None,
   'person': None,
   'parliament': None,
   'dateFrom': '2018-01-01T00:00:00',
   'dateTo': '2023-05-22T00:00:00',
   'datePeriod': '0',
   'restrictedFrom': None,
   'restrictedTo': None,
   'terminatedReason': None,
   'terminatedReasons': [],
   'column': 4,
   'direction': 1,
   'pageSize': 10,
   'page': 1
}

url = 'https://bills.parliament.nz/api/data/search'

data = requests.post(url, json=payload).json()

Output:

# The first row
>>> data['results'][0]
{'id': '691e5c66-aa17-43f0-208d-08db57320db1',
 'title': 'Taxation Principles Reporting Bill',
 'shortTitle': None,
 'status': 'Active',
 'reference': None,
 'billStage': None,
 'billNumber': '253-1',
 'billTitle': None,
 'partyName': None,
 'memberName': None,
 'shoulder': None,
 'documentType': 'Bill',
 'itemType': 'Government',
 'itemSubType': None,
 'selectCommittee': 'Finance and Expenditure',
 'parliamentNumber': 53,
 'sop': None,
 'attachmentId': None,
 'attachmentName': None,
 'publicationDate': '2023-05-18T00:00:00Z',
 'lastModified': '2023-05-19T17:20:03Z',
 'billCurrentStageId': 3,
 'billCurrentStageCode': 'SelectCommittee',
 'billCurrentStageName': 'Select Committee',
 'billStages': [{'id': 'ea6ed7dd-77fb-4325-a134-e4055ed970f9',
   'documentType': 'BillStage',
   'publicationDate': '2023-05-18T00:00:00Z',
   'billStage': 'Select Committee',
   'billStageId': 3,
   'parentIds': ['691e5c66-aa17-43f0-208d-08db57320db1']},
  {'id': '5ff829ff-5c2a-4763-bb64-49a2894749f5',
   'documentType': 'BillStage',
   'publicationDate': '2023-05-18T00:00:00Z',
   'billStage': 'Introduced',
   'billStageId': 1,
   'parentIds': ['691e5c66-aa17-43f0-208d-08db57320db1']},
  {'id': 'b425eef2-380e-4944-ae5a-dee344fbc4a2',
   'documentType': 'BillStage',
   'publicationDate': '2023-05-19T12:03:00Z',
   'billStage': 'First Reading',
   'billStageId': 2,
   'parentIds': ['691e5c66-aa17-43f0-208d-08db57320db1']}],
 'orderNumber': 0,
 'titleUrl': None,
 'date': None,
 'duration': None,
 'subject': None,
 'people': None,
 'state': None,
 'vimeoUrl': None,
 'thumbnailUrl': None,
 'author': None,
 'iobClosingDate': None,
 'url': '/691e5c66-aa17-43f0-208d-08db57320db1'}

Pandas dataframe:

import pandas as pd

cols = ['title', 'billNumber', 'selectCommittee', 'billCurrentStageName', 'lastModified']
df = pd.json_normalize(data['results'])[cols]

Output:

>>> df
                                               title billNumber          selectCommittee billCurrentStageName          lastModified
0                 Taxation Principles Reporting Bill      253-1  Finance and Expenditure     Select Committee  2023-05-19T17:20:03Z
1  Taxation (Annual Rates for 2023-24, Multinatio...      255-1  Finance and Expenditure     Select Committee  2023-05-23T07:47:35Z
2  New Zealand Superannuation and Retirement Inco...      227-1  Finance and Expenditure     Select Committee  2023-05-11T23:50:56Z
3                    Water Services Legislation Bill      210-1  Finance and Expenditure     Select Committee  2023-05-11T23:41:01Z
4  Water Services Economic Efficiency and Consume...      192-1  Finance and Expenditure     Select Committee  2023-05-12T01:44:38Z
5                                Deposit Takers Bill      162-2  Finance and Expenditure       Second Reading  2023-05-12T01:25:12Z
6  Taxation (Annual Rates for 2022-23, Platform E...      164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
7                  Companies (Levies) Amendment Bill      133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
8                       Water Services Entities Bill      136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
9      Overseas Investment (Forestry) Amendment Bill      134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
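The lastModified values above are ISO 8601 strings, so they sort and compare as text. If real timestamps are needed, a minimal sketch like the following may help. The `sample` frame is a hypothetical stand-in built from two rows of the output above so that the snippet runs without calling the API; in practice the same conversion would be applied to `df`.

```python
import pandas as pd

# Two rows copied from the output above, so the sketch runs offline.
sample = pd.DataFrame({
    'title': ['Taxation Principles Reporting Bill', 'Deposit Takers Bill'],
    'lastModified': ['2023-05-19T17:20:03Z', '2023-05-12T01:25:12Z'],
})

# The trailing "Z" marks UTC; utc=True yields a timezone-aware column.
sample['lastModified'] = pd.to_datetime(sample['lastModified'], utc=True)

# Sort oldest first now that the column holds real timestamps.
sample = sample.sort_values('lastModified', ignore_index=True)
print(sample)
```

After the conversion, the column supports date filtering and the `.dt` accessor (for example `sample['lastModified'].dt.date`).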

To extract the nested data (the billStages records of each bill), do the following:

cols = ['title', 'billNumber', 'selectCommittee', 'billCurrentStageName', 'lastModified']
df = pd.json_normalize(data['results'], 'billStages', meta=cols)

Output:

>>> df
                                      id documentType       publicationDate                 billStage  ...  billNumber          selectCommittee billCurrentStageName          lastModified
0   ea6ed7dd-77fb-4325-a134-e4055ed970f9    BillStage  2023-05-18T00:00:00Z          Select Committee  ...       253-1  Finance and Expenditure     Select Committee  2023-05-19T17:20:03Z
1   5ff829ff-5c2a-4763-bb64-49a2894749f5    BillStage  2023-05-18T00:00:00Z                Introduced  ...       253-1  Finance and Expenditure     Select Committee  2023-05-19T17:20:03Z
2   b425eef2-380e-4944-ae5a-dee344fbc4a2    BillStage  2023-05-19T12:03:00Z             First Reading  ...       253-1  Finance and Expenditure     Select Committee  2023-05-19T17:20:03Z
3   343624e4-673f-457e-9e7b-ab69f6df4fb5    BillStage  2023-05-18T00:00:00Z          Select Committee  ...       255-1  Finance and Expenditure     Select Committee  2023-05-23T07:47:35Z
4   5ad28e13-4f2a-4fe8-9c02-369d0b6223a4    BillStage  2023-05-18T00:00:00Z                Introduced  ...       255-1  Finance and Expenditure     Select Committee  2023-05-23T07:47:35Z
5   84eafb5f-c111-46c0-aaa2-39cac1025226    BillStage  2023-05-19T10:39:00Z             First Reading  ...       255-1  Finance and Expenditure     Select Committee  2023-05-23T07:47:35Z
6   8bb1b8aa-af7c-4728-9a49-bfd536e6b26e    BillStage  2023-02-23T00:00:00Z                Introduced  ...       227-1  Finance and Expenditure     Select Committee  2023-05-11T23:50:56Z
7   49e7bb09-8112-40d9-975e-665f3e14648e    BillStage  2023-03-09T00:00:00Z          Select Committee  ...       227-1  Finance and Expenditure     Select Committee  2023-05-11T23:50:56Z
8   30963226-bac2-44e1-b378-e71e8c9ac594    BillStage  2023-03-09T14:53:00Z             First Reading  ...       227-1  Finance and Expenditure     Select Committee  2023-05-11T23:50:56Z
9   103c7957-b6ac-4acc-8612-6e4f888b58fa    BillStage  2022-12-08T00:00:00Z                Introduced  ...       210-1  Finance and Expenditure     Select Committee  2023-05-11T23:41:01Z
10  264fb913-8035-4509-aaa7-fa4dc9f0b119    BillStage  2022-12-13T00:00:00Z          Select Committee  ...       210-1  Finance and Expenditure     Select Committee  2023-05-11T23:41:01Z
11  945fb76a-24b5-48df-8cb6-6e8e60e6b026    BillStage  2022-12-13T21:09:00Z             First Reading  ...       210-1  Finance and Expenditure     Select Committee  2023-05-11T23:41:01Z
12  26fdc484-1890-4786-abb4-765436183b95    BillStage  2022-12-14T09:00:00Z             First Reading  ...       210-1  Finance and Expenditure     Select Committee  2023-05-11T23:41:01Z
13  1da55620-ff09-4094-989a-879bfd0ea525    BillStage  2022-12-08T00:00:00Z                Introduced  ...       192-1  Finance and Expenditure     Select Committee  2023-05-12T01:44:38Z
14  27fd4203-198a-4e09-93ef-1f3ca4ab5a27    BillStage  2022-12-13T00:00:00Z          Select Committee  ...       192-1  Finance and Expenditure     Select Committee  2023-05-12T01:44:38Z
15  9583449f-00da-401a-8840-31047b188e44    BillStage  2022-12-14T09:26:00Z             First Reading  ...       192-1  Finance and Expenditure     Select Committee  2023-05-12T01:44:38Z
16  dcbcaccc-cfea-4812-96ee-648ca27906fe    BillStage  2022-09-22T00:00:00Z                Introduced  ...       162-2  Finance and Expenditure       Second Reading  2023-05-12T01:25:12Z
17  778aeb50-5919-44b0-a3bf-c9a612099436    BillStage  2022-09-27T00:00:00Z          Select Committee  ...       162-2  Finance and Expenditure       Second Reading  2023-05-12T01:25:12Z
18  020f9bca-f43c-4201-81b1-b8a8816023b6    BillStage  2022-09-27T15:08:00Z             First Reading  ...       162-2  Finance and Expenditure       Second Reading  2023-05-12T01:25:12Z
19  694a67aa-cfa1-4603-a2ff-83e907a49a17    BillStage  2022-09-08T00:00:00Z                Introduced  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
20  5fb4cd2d-49f6-4306-921a-52d34d768afc    BillStage  2022-09-21T00:00:00Z          Select Committee  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
21  0988d7fa-74c5-493d-9c42-0768d5c564da    BillStage  2022-09-22T11:00:00Z             First Reading  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
22  9a9bd0cc-9382-4de9-b75e-21f94d113814    BillStage  2023-03-08T19:28:00Z            Second Reading  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
23  fae3442f-d9d8-4aed-a722-423281132b4c    BillStage  2023-03-14T14:58:00Z  Committee of whole House  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
24  618a26cf-f39c-4d16-8c16-32288895f753    BillStage  2023-03-14T14:58:00Z  Committee of whole House  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
25  7e719538-1e5e-418d-b939-00743ee2e561    BillStage  2023-03-28T15:09:00Z             Third Reading  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
26  340df63c-a2c6-43cd-8b37-41e115e20e3e    BillStage  2023-03-28T15:09:00Z             Third Reading  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
27  7cbd3ecf-0b50-464e-9abd-6d7d01920eec    BillStage  2023-03-31T00:00:00Z              Royal Assent  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
28  d309f103-13a3-4e83-97d8-87012e6d9090    BillStage  2023-03-31T00:00:00Z              Royal Assent  ...       164-3  Finance and Expenditure         Royal Assent  2023-05-11T23:45:17Z
29  59e44c4f-6c24-4965-9110-9edd724c41e6    BillStage  2022-06-02T00:00:00Z                Introduced  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
30  1da272dc-53f7-4ad9-a114-c2617a9bb480    BillStage  2022-06-21T00:00:00Z          Select Committee  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
31  d1627660-d696-4cb9-a166-84138bb436e9    BillStage  2022-06-21T21:11:00Z             First Reading  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
32  d8cb72e1-ce67-45c8-9ce4-1f785f3384e0    BillStage  2022-11-15T21:38:00Z            Second Reading  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
33  07004987-a50d-4cfc-aba1-dca9b988773a    BillStage  2022-11-16T09:00:00Z            Second Reading  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
34  21918038-11f6-495a-99c2-932ee9139bc6    BillStage  2022-11-24T22:13:00Z  Committee of whole House  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
35  e950fead-8a71-40f7-b297-19a2096d8cd3    BillStage  2022-11-24T23:12:00Z             Third Reading  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
36  15f27e7c-921c-4c91-8fea-df2cb0113e1a    BillStage  2022-11-28T00:00:00Z              Royal Assent  ...       133-2  Finance and Expenditure         Royal Assent  2023-05-12T01:08:11Z
37  dad58983-427d-42c4-85e2-bfa8c068e6f8    BillStage  2022-06-02T00:00:00Z                Introduced  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
38  30b0298d-f71e-4669-9ace-c3ca7f3bc946    BillStage  2022-06-09T00:00:00Z          Select Committee  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
39  f7ee57f1-1cbf-4a88-99d8-3238899f312e    BillStage  2022-06-09T14:55:00Z             First Reading  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
40  c8e9887e-ab54-461c-a37c-63b485fcb804    BillStage  2022-11-16T17:10:00Z            Second Reading  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
41  6e2a672f-4459-4943-8402-a130663a5ce6    BillStage  2022-11-22T20:57:00Z  Committee of whole House  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
42  bc8a24c5-7233-4b9a-a5bf-9bc123e3ba3c    BillStage  2022-11-23T15:11:00Z  Committee of whole House  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
43  cc9cd92b-c01c-4cd5-8b00-2e6b01d9146d    BillStage  2022-12-06T16:34:00Z  Committee of whole House  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
44  ae6f4812-3865-4168-a498-7ab0729b28c6    BillStage  2022-12-08T09:00:00Z             Third Reading  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
45  38ce5d53-99ab-4a01-a044-5ec557e6cdd2    BillStage  2022-12-14T00:00:00Z              Royal Assent  ...       136-4  Finance and Expenditure         Royal Assent  2023-05-12T01:48:21Z
46  770e5974-b288-4fef-b1b1-56617ae6715d    BillStage  2022-05-31T00:00:00Z                Introduced  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
47  368e1c89-7976-4112-9dce-e6f2f185aac9    BillStage  2022-06-07T00:00:00Z          Select Committee  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
48  b78fa74f-5a40-4991-bec3-47c7c9afd6c6    BillStage  2022-06-07T20:08:00Z             First Reading  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
49  89bf6fd1-a1a4-4ac9-9416-8f6c331b89a8    BillStage  2022-08-04T16:27:00Z            Second Reading  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
50  7efc5728-0c75-4fd8-9c2a-d4a7680740e7    BillStage  2022-08-09T19:10:00Z            Second Reading  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
51  8bd89266-1c4e-4f2c-a898-47d187455d14    BillStage  2022-08-10T20:00:00Z  Committee of whole House  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
52  544ac2e3-ed49-43fd-87eb-7adce0e58a31    BillStage  2022-08-11T14:58:00Z             Third Reading  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
53  f392dae1-9b67-4abd-a348-44b5df1064ad    BillStage  2022-08-15T00:00:00Z              Royal Assent  ...       134-2  Finance and Expenditure         Royal Assent  2023-05-12T00:01:12Z
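The payload above asks for 'pageSize': 10 and 'page': 1, so only the first ten bills come back. A rough sketch for collecting further pages is shown below. It assumes, without confirmation from the site, that the API returns an empty 'results' list once the page number runs past the last page. `fetch_all_results`, `post_json` and `fake_post` are hypothetical names introduced for illustration; `fake_post` merely imitates the API so that the loop can be run offline.

```python
def fetch_all_results(post_json, payload, max_pages=50):
    """Collect 'results' from successive pages until a page comes back empty.

    post_json: callable that takes a payload dict and returns the decoded JSON.
    max_pages: safety cap so that an unexpected response cannot loop forever.
    """
    rows = []
    for page in range(1, max_pages + 1):
        # Copy the payload and override only the page number.
        body = post_json({**payload, 'page': page})
        batch = body.get('results') or []
        if not batch:  # assumption: an empty page means there is no more data
            break
        rows.extend(batch)
    return rows


# Offline stand-in for the API, used only to demonstrate the loop.
def fake_post(p):
    pages = {1: [{'title': 'A'}, {'title': 'B'}], 2: [{'title': 'C'}]}
    return {'results': pages.get(p['page'], [])}


all_rows = fetch_all_results(fake_post, {'pageSize': 2, 'page': 1})
print([r['title'] for r in all_rows])
```

Against the real endpoint, the callable could be `lambda p: requests.post(url, json=p, timeout=30).json()`, and the returned list can be passed to `pd.json_normalize` in the same way as `data['results']`.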
