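I'm working on a small project that scrapes some data from a website. Everything seems to work fine, except that I can't scrape the phone number: in some cases I get blank output, and in others I get the full HTML markup with the phone number inside it.
I want to scrape the phone number along with the other data. Everything except the phone number is scraped correctly. Here is the output I get:
Name: Klinik Seeschau AG
Address: Bernrainstrasse 17, 8280 Kreuzlingen
Phone:
Here is my code: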
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time

# initialize the Chrome driver
driver = webdriver.Chrome()

# navigate to the URL
driver.get("https://www.local.ch/en/")


# Searching for "Clinic"
def search_query(query):
    search = driver.find_element("name", "what")
    search.clear()
    time.sleep(3)
    search.send_keys(query)
    time.sleep(3)
    search.send_keys(Keys.RETURN)
    time.sleep(3)


# extract the source code
def source():
    source_code = driver.page_source
    # Sleep for 3 second
    time.sleep(3)
    # parse the source code with BeautifulSoup
    soup = BeautifulSoup(source_code, "html.parser")
    time.sleep(3)


# Extracting the data
def datasearch():
    searchResult = driver.find_element(By.CLASS_NAME, "search-header-results")
    data = searchResult.text
    print(f"there's {data}\n")
    time.sleep(2)


# Get the phone_numbers elements
def data_scrape():
    # data = driver.find_element(By.CLASS_NAME, "col-xs-12.col-md-8")
    # Loop in data end extract phone numbers
    components = driver.find_elements(By.CSS_SELECTOR, ".js-entry-card-container.row.lui-margin-vertical-xs.lui-sm-margin-vertical-m")
    for component in components:
        name = component.find_element(By.CSS_SELECTOR, ".lui-margin-vertical-zero.card-info-title").text
        addre = component.find_element(By.CSS_SELECTOR, ".card-info-address").text
        phone = component.find_element(By.CLASS_NAME, "lui-sm-margin-left-xxs").text
        print(f"Name: {name}\nAddress: {addre}\n Phone: {phone}\n")
        # Sleep for 2 second


search_query("Clinique")
source()
datasearch()
data_scrape()
time.sleep(2)
driver.quit()
1 Answer
Here is one way to get that information, based on your existing code:
Final result:
You can find the Selenium documentation here.
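One possible cause of the blank Phone field is that Selenium's `.text` returns only the text that is visibly rendered, so a hidden or collapsed element yields an empty string. The following is a minimal sketch of a more defensive lookup, not the code from the answer above. It assumes the number is exposed either through a `tel:` link or as hidden text under the class used in the question. The helper name `extract_phone` and the `tel:` selector are hypothetical and have not been checked against the markup of local.ch.

```python
import re

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def extract_phone(component):
    """Return a phone number for one result card, or "" if none is found.

    `component` is expected to be a Selenium WebElement (any object with a
    compatible find_elements() works). Both selectors below are assumptions
    about the page markup.
    """
    # 1) Prefer a tel: link: its href carries the number even when the
    #    visible label is hidden or replaced by an icon.
    for link in component.find_elements(CSS, 'a[href^="tel:"]'):
        href = link.get_attribute("href") or ""
        number = href[len("tel:"):].strip()
        if number:
            return number

    # 2) Fall back to the class from the question, reading textContent
    #    because .text is empty for elements that are not displayed.
    for node in component.find_elements(CSS, ".lui-sm-margin-left-xxs"):
        raw = node.get_attribute("textContent") or ""
        match = re.search(r"\+?\d[\d\s]{6,}\d", raw)
        if match:
            # Collapse runs of whitespace inside the matched number.
            return " ".join(match.group().split())

    # find_elements() returns an empty list instead of raising, so a card
    # without a phone number simply ends up here.
    return ""
```

In the loop from the question, `phone = extract_phone(component)` would take the place of the `find_element(...).text` lookup, and cards without a number would print an empty Phone field instead of raising `NoSuchElementException`.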