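I'm developing a Python application that helps me collect reviews for a specific restaurant. I'm using Selenium 4.1 with Python as the web scraper.
After setting up the Selenium driver in my project folder, I put together the following code based on the Selenium documentation: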
#YELP REVIEW SCRAPER #
#Importing Dependencies
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.by import By
# Setting up driver options
options = webdriver.ChromeOptions()
# Setting up Path to chromedriver executable file
CHROMEDRIVER_PATH ='../Selenium/chromedriver.exe'
# Adding options
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
# Setting up chrome service
service = ChromeService(executable_path=CHROMEDRIVER_PATH)
# Establishing Chrome web driver using the service and options set above
driver = webdriver.Chrome(service=service, options=options)
driver.get('https://www.yelp.com/biz/taste-of-texas-houston')
This successfully opens the restaurant's Yelp page. I want to get the reviews, but when I try to scrape a review using:
driver.find_element(By.CLASS_NAME, ' raw__09f24__T4Ezm')
where "raw__09f24__T4Ezm" is the class name of the span for the first review, I get the error:
InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
(Session info: chrome=96.0.4664.45)
Stacktrace:
Backtrace:
Ordinal0 [0x00BD6903+2517251]
Ordinal0 [0x00B6F8E1+2095329]
Ordinal0 [0x00A72848+1058888]
Ordinal0 [0x00A74F44+1068868]
Ordinal0 [0x00A74E0E+1068558]
Ordinal0 [0x00A75070+1069168]
Ordinal0 [0x00A9D1C2+1233346]
Ordinal0 [0x00A9D63B+1234491]
Ordinal0 [0x00AC7812+1406994]
Ordinal0 [0x00AB650A+1336586]
Ordinal0 [0x00AC5BBF+1399743]
Ordinal0 [0x00AB639B+1336219]
Ordinal0 [0x00A927A7+1189799]
Ordinal0 [0x00A93609+1193481]
GetHandleVerifier [0x00D65904+1577972]
GetHandleVerifier [0x00E10B97+2279047]
GetHandleVerifier [0x00C66D09+534521]
GetHandleVerifier [0x00C65DB9+530601]
Ordinal0 [0x00B74FF9+2117625]
Ordinal0 [0x00B798A8+2136232]
Ordinal0 [0x00B799E2+2136546]
Ordinal0 [0x00B83541+2176321]
BaseThreadInitThunk [0x757C6739+25]
RtlGetFullPathName_UEx [0x773B8AFF+1215]
RtlGetFullPathName_UEx [0x773B8ACD+1165]
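I tried researching this error, but had no luck. Any ideas on how to modify my code so that I can get all the available reviews for this specific restaurant, including each review's date, author, score, and text?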
2 Answers
vm0i2vca1#
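I couldn't get the most recent reviews in order to extract the correct values for the review section, but it can be done like this: find all the reviews you want to scrape, then run an XPath for each value.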
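A minimal sketch of that "find every review, then XPath each value" idea, in case it helps. The names `REVIEW_ITEMS`, `FIELDS`, and `scrape_reviews` are made up for illustration, and every XPath is a placeholder assumption, not Yelp's actual markup, so they would need to be replaced after inspecting the live page.

```python
# "xpath" is the string value behind selenium's By.XPATH constant; it is used
# directly here so the helper works with any driver-like object.
XPATH = "xpath"

# ASSUMPTION: every XPath below is a placeholder, not Yelp's real markup.
REVIEW_ITEMS = "//li[.//p[contains(@class, 'comment')]]"
FIELDS = {
    "author": ".//a[contains(@href, '/user_details')]",
    "date": ".//span[contains(text(), '/')]",
    "rating": ".//div[@role='img'][@aria-label]",
    "text": ".//p[contains(@class, 'comment')]",
}


def scrape_reviews(driver):
    """Return one dict per review container found on the current page."""
    reviews = []
    for item in driver.find_elements(XPATH, REVIEW_ITEMS):
        review = {}
        for name, xpath in FIELDS.items():
            # find_elements returns an empty list instead of raising
            # when nothing matches, so a missing field becomes None.
            matches = item.find_elements(XPATH, xpath)
            if not matches:
                review[name] = None
            elif name == "rating":
                # The score is assumed to live in an attribute, not in text.
                review[name] = matches[0].get_attribute("aria-label")
            else:
                review[name] = matches[0].text
        reviews.append(review)
    return reviews
```

With the `driver` from the question, calling `scrape_reviews(driver)` after `driver.get(...)` should give a list of dicts, one per review. Going through XPath also avoids `By.CLASS_NAME`, which, as far as I can tell, Selenium 4 rewrites into a CSS selector by prefixing a dot, so a class value with a leading space (as in the question) would likely produce an invalid selector.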
omqzjyyz2#
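Use this code to get the restaurant's star rating from the Yelp website.
First find the element that holds the star rating, then read its "aria-label" attribute, and finally store the value of "aria-label" in a variable.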
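The three steps above could be sketched roughly as follows. `RATING_SELECTOR`, `parse_stars`, and `get_star_rating` are hypothetical names, the CSS selector is a placeholder, and the label format ("4.5 star rating") is an assumption about what the page exposes, so check both against the real page.

```python
import re

# "css selector" is the string value behind selenium's By.CSS_SELECTOR.
CSS = "css selector"

# ASSUMPTION: placeholder selector; inspect the page for the real one.
RATING_SELECTOR = "div[role='img'][aria-label*='star rating']"


def parse_stars(label):
    """Pull the leading number out of a label such as '4.5 star rating'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", label or "")
    return float(match.group(1)) if match else None


def get_star_rating(driver):
    """Find the rating element, read its aria-label, return it as a float."""
    element = driver.find_element(CSS, RATING_SELECTOR)   # step 1: find element
    label = element.get_attribute("aria-label")           # step 2: read attribute
    return parse_stars(label)                             # step 3: store the value
```

Keeping the parsing in its own function means the label handling can be checked without opening a browser, and `None` comes back when the label does not start with a number.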