selenium: How to scrape data from a map using Python (unable to click on the layer)?

yzxexxkh  posted on 2022-12-29  in  Python

Website
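I am trying to scrape data from a map on the website. Specifically, I need the data from the table shown on the right side of the attached picture.
To reach the table manually, you have to click:
1. the region name (the left column, written in Russian) ->
2. the subregion name ->
3. then a layer on the map,
4. after which the table pops up on the screen.
To automate this I used the selenium library in Python.
I am stuck at step 3: I cannot click the map layer to get to the table.
I would welcome any suggestions on how to proceed.
I really appreciate any help you can provide.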

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

website = 'https://aisgzk.kz/aisgzk/ru/content/maps/'
# Raw string so the backslashes in the Windows path are not treated as escapes
path = r'D:\ML_project\driver\chromedriver'
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(path))
driver.get(website)

# Step 1: click the region button in the left column
region_button = driver.find_element(By.XPATH, '//button[@data-name = "Костанайская"]')
region_button.click()

# Step 2: click the subregion button
subregion_button = driver.find_element(By.XPATH, '//button[@data-name = "Алтынсаринский"]')
subregion_button.click()

driver.quit()

xdyibdwo  #1

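The problem was that I was using the wrong library. Instead of selenium, the requests library should be used.
The YouTube video that helped me is "Always Check for the Hidden API when Web Scraping": https://www.youtube.com/watch?v=DqtlR0y0suo
My final code looks like this: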

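import requests
from urllib3.exceptions import InsecureRequestWarning
from urllib3 import disable_warnings

# Silence the warning that verify=False would otherwise trigger
disable_warnings(InsecureRequestWarning)

# Endpoint the map page calls in the background to build the region tree
url_regions = "https://aisgzk.kz/aisgzk/Index/BuildRegionTree"

# Form-encoded request body, as sent by the browser
payload_regions = "sLang="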
headers_regions = {
    "Accept": "*/*",
    "Accept-Language": "ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7,de;q=0.6",
    "Connection": "keep-alive",
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    "Origin": "https://aisgzk.kz",
    "Referer": "https://aisgzk.kz/aisgzk/ru/content/maps/",
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
    "sec-ch-ua": '"Not?A_Brand";v="8", "Chromium";v="108", "Google',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"'
}

# verify=False skips TLS certificate verification for this request
response_regions = requests.post(url_regions, data=payload_regions, headers=headers_regions, verify=False)

print(response_regions.text)
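
The code above only prints the raw response, and the post does not show what BuildRegionTree actually returns. As a possible next step, here is a minimal sketch that assumes the body is JSON describing a nested tree of regions. The key names `text` and `children`, the helper functions, and the sample data are all hypothetical and would have to be adapted to the real response.

```python
import json


def parse_body(body):
    """Return the decoded JSON value, or None if the body is not valid JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


def iter_nodes(tree, children_key="children"):
    """Yield every dict node of a nested list/dict tree, depth first."""
    if isinstance(tree, list):
        for item in tree:
            yield from iter_nodes(item, children_key)
    elif isinstance(tree, dict):
        yield tree
        yield from iter_nodes(tree.get(children_key, []), children_key)


def find_node(tree, name, name_key="text", children_key="children"):
    """Return the first node whose name_key equals name, or None."""
    for node in iter_nodes(tree, children_key):
        if node.get(name_key) == name:
            return node
    return None


# Hypothetical sample standing in for response_regions.text
sample_body = json.dumps([
    {"text": "Костанайская", "id": 1, "children": [
        {"text": "Алтынсаринский", "id": 2, "children": []},
    ]},
])

tree = parse_body(sample_body)
subregion = find_node(tree, "Алтынсаринский")
print(subregion)
```

With the real response, `parse_body(response_regions.text)` would take the place of the sample. If it returns None, the endpoint is sending something other than JSON (HTML, for example), and the parsing step has to change accordingly.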
