python-3.x: How to append the updated data after clicking a Load More button with Selenium (web scraping)

Posted by yhuiod9q on 2023-03-13 in Python

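After running my code, I do not get the updated data that appears once the Load More button has been clicked. I have tried many times and cannot work out what is wrong with my code. I expect to end up with a Python list containing all the data loaded after the Load More button has run. Instead, the list only contains the entries that were on the page before the button was clicked.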

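import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common import exceptions
from selenium.webdriver.chrome.options import Options

class RunChromeTests():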
    def test(self):
        chrome_options = Options()
        chrome_options.add_argument("--disable-notifications")
        chrome_options.add_argument("-incognito")
        chrome_options.add_argument("--disable-popup-blocking")
        chrome_options.add_argument("--ignore-certificate-errors")
        chrome_options.add_argument("--disable-javascript")

    
        # Download the chrome driver from https://chromedriver.chromium.org/downloads 
        # and find the driver location in your computer
        chrome_path = r"path"
        driver = webdriver.Chrome(chrome_path, options=chrome_options)
        driver.maximize_window()
        driver.implicitly_wait(10)
        
        final = []
    
    
            
        ### Enter your url to scrape but change the page number to {a} 
        driver.get("https://www.burpple.com/search/sg?q=Newly+Opened&type=places")
        content = driver.page_source
        soup = BeautifulSoup(content)
            
        loadmore = driver.find_element_by_id("masonryViewMore-btn")
        j = 0
        final1=[]
       
        try:
            while loadmore.is_displayed():
                loadmore.click()
                time.sleep(2)
                lrec = soup.find_all("span",{"searchVenue-header-name-name headingMedium"})
                #loadmore.is_displayed()
                newlist = lrec[j:]
                print(lrec)
                #print(newlist)
                for rec in newlist:
                    name = rec.text
                    #print(name)
                    final1.append(name)
                print(final1)
                j = len(lrec)+1
                #final1.append(name)
                time.sleep(5)
                #print(j)
                #
        except exceptions.StaleElementReferenceException:
            pass
chromed = RunChromeTests()
chromed.test()

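Output: ['Kotuwa', 'Smoochie Creamery', "Evan's Kitch", 'Plus Coffee Joint', '800° Woodfired Pizza (KINEX)', "Sarah's Loft", 'nicher (Springleaf)', 'First Story Cafe', 'Tucela Gelato', 'Ri Ri Cha', 'Unatoto', 'Royal Palm (Meat & Dine)']
Expected output: ['Kotuwa', 'Smoochie Creamery', "Evan's Kitch", 'Plus Coffee Joint', '800° Woodfired Pizza (KINEX)', "Sarah's Loft", 'nicher (Springleaf)', 'First Story Cafe', 'Tucela Gelato', 'Ri Ri Cha', 'Unatoto', 'Royal Palm (Meat & Dine)', 'Flourish Bakehouse', 'Equate Coffee (Orchard Central)', 'TAG Espresso (Raffles City)', 'Arc-en-ciel Pâtisserie', 'Enjoy Eating House & Bar (Stevens)', 'SAGE by Yasunori Doi (Orchard Plaza)', 'Hellu Coffee', 'Pestle & Mortar Society', ...]
Here "..." means "and more".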

Answer #1, by 3lxsmp7m

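You need to fetch the page source again after every click on the Load More button, and keep checking whether each name is already in the list; if it is not, append it.
Use an infinite loop and check whether the Load More button is present: if it is, click it; otherwise break out of the loop.
Code for the full loop: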
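Why the question's code keeps returning only the first batch can be shown without a browser. In the question, `soup` is built once from `driver.page_source` before the loop, so it is a snapshot that never sees what later clicks add. The sketch below is only an illustration under stated assumptions: it uses the standard-library `html.parser` instead of BeautifulSoup so that it runs without third-party packages, and `make_page`, `extract_names`, and the two page strings are hypothetical stand-ins for the live page before and after a click.

```python
from html.parser import HTMLParser

# Classes used by the venue-name spans in the question's selector.
TARGET_CLASSES = {"searchVenue-header-name-name", "headingMedium"}


class NameExtractor(HTMLParser):
    """Collects the text of <span> elements that carry both target classes."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_target = False

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if tag == "span" and TARGET_CLASSES <= classes:
            self._in_target = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_target = False

    def handle_data(self, data):
        if self._in_target:
            self.names[-1] += data


def extract_names(html):
    """Parse one HTML string and return the venue names found in it."""
    parser = NameExtractor()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.names]


def make_page(names):
    """Hypothetical helper: build a tiny page containing the given names."""
    spans = "".join(
        '<span class="searchVenue-header-name-name headingMedium">%s</span>' % n
        for n in names
    )
    return "<html><body>%s</body></html>" % spans


page_before = make_page(["Kotuwa", "Smoochie Creamery"])
page_after = make_page(["Kotuwa", "Smoochie Creamery", "Flourish Bakehouse"])

# Parsed once, like `soup` in the question: this result is frozen.
stale = extract_names(page_before)
# Parsed again after the "click": only this sees the new entry.
fresh = extract_names(page_after)
```

The parsed result is tied to the string it was built from, so the parse step has to be repeated after every click.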

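final1 = []
driver.get("https://www.burpple.com/search/sg?q=Newly+Opened&type=places")
time.sleep(2)

while True:
    # Re-read the page source on every pass so newly loaded entries get parsed.
    content = driver.page_source
    soup = BeautifulSoup(content, "html.parser")
    for item in soup.select("span.searchVenue-header-name-name.headingMedium"):
        # Skip names that were already collected on an earlier pass.
        if item.text not in final1:
            final1.append(item.text)
    try:
        # Wait up to 10 s for the Load More button to become clickable, then click it.
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "masonryViewMore-btn"))).click()
        time.sleep(2)
    except Exception:
        # The button never became clickable (or the click failed): stop loading.
        break
print("Total number of elements:  {}".format(len(final1)))
print(final1)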

You need to import the following libraries.

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
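The control flow of the answer (re-read the page, collect unseen names, click, stop when the button is gone) can also be sketched independently of Selenium, which makes it easy to try without a browser. Everything here is an assumption made for illustration: `collect_names`, its three callable parameters, the `max_clicks` safety cap, and the `FakePage` stand-in are hypothetical and not part of the original answer.

```python
def collect_names(get_page_source, extract_names, click_load_more, max_clicks=500):
    """Re-read the page after every click and accumulate unseen names in order.

    get_page_source: callable returning the current page content.
    extract_names:   callable turning that content into a list of names.
    click_load_more: callable that clicks the button and returns True,
                     or returns False when the button is no longer available.
    max_clicks:      safety cap so a page that never runs out cannot loop forever.
    """
    collected = []
    for _ in range(max_clicks + 1):
        for name in extract_names(get_page_source()):
            if name not in collected:
                collected.append(name)
        if not click_load_more():
            break
    return collected


class FakePage:
    """Stand-in for a browser: each click reveals the next batch of names."""

    def __init__(self, batches):
        self._batches = batches
        self._shown = 1

    def source(self):
        # The "page" is simply every name revealed so far.
        return [n for batch in self._batches[: self._shown] for n in batch]

    def click(self):
        if self._shown >= len(self._batches):
            return False  # no more batches: the button has disappeared
        self._shown += 1
        return True


page = FakePage([
    ["Kotuwa", "Smoochie Creamery"],
    ["Flourish Bakehouse"],
    ["Hellu Coffee", "Kotuwa"],  # a repeated name is ignored
])
result = collect_names(page.source, lambda names: names, page.click)
```

With Selenium, `get_page_source` could be `lambda: driver.page_source`, and `click_load_more` could wrap the `WebDriverWait(...).until(...).click()` call from the answer in a `try`/`except` that returns `False` on failure.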

Output:

Total number of elements:  161
['Plus Coffee Joint', 'Kotuwa', 'Smoochie Creamery', "Evan's Kitch", '800° Woodfired Pizza (KINEX)', "Sarah's Loft", 'nicher (Springleaf)', 'First Story Cafe', 'Tucela Gelato', 'Ri Ri Cha', 'Unatoto', 'Royal Palm (Meat & Dine)', 'Flourish Bakehouse', 'TAG Espresso (Raffles City)', 'Equate Coffee (Orchard Central)', 'Arc-en-ciel Pâtisserie', 'Enjoy Eating House & Bar (Stevens)', 'SAGE by Yasunori Doi (Orchard Plaza)', 'Hellu Coffee', 'Pestle & Mortar Society', 'Omoté Soho (Velocity)', 'Le Matin Patisserie (ION Orchard)', 'KEK Keng Eng Kee Seafood (Tampines)', 'Mother-In-Law Egg Tart (Havelock)', 'Mono Izakaya', 'Daily Staples', "Ben's Tavern", 'FiftyFive Coffee Bar', 'Butcher’s Block', 'There Was No Coffee', 'Roemah Makan', 'Overscoop (Hougang)', 'Wong Fu Fu', 'Dim Sum Haus (Upper Weld)', 'Long Phung Vietnamese Restaurant (Chinatown)', 'Abundance (Jalan Besar)', 'Beans Factory', 'XiabuXiabu', 'B/W Bagelwich', 'Nakiryu', 'Fiamma (Capella Singapore)', 'Kei Kaisendon (Breadtalk IHQ)', 'cloud', 'Sweedy Patisserie', 'Fatty Patty (The Bedok Marketplace)', 'afterwords', 'NAN YANG DAO (Aljunied)', 'KREME', 'Pilot Kitchen', 'Peacock North Indian Cuisine (NEWest)', 'Café Natsu (Clemenceau)', 'Little Cart Noodle House', 'Anchovies & Peanuts (Golden Mile Food Centre)', 'Otto Pizza', 'Little French Fusion', 'Underdog Inn', 'Sum Dim Sum', 'Seng House', 'Lor Mak Mak (Changi Village Hawker Centre)', 'Guriru', 'The Last Scoop', 'Darkness Dessert', 'Hatsu', 'Dan Lao (Maxwell Food Centre)', 'Chong Qing Xiao Mu Deng Traditional Hot Pot (GR.iD)', 'Refuel J9 (Junction Nine)', 'Fu Xiao Xian', 'Good Chai People', 'Ginger.Lily', 'Paris Van Java', 'Hejio', 'LUNA (Joo Chiat)', 'The Better Scoop (Serangoon)', 'Beccarino Patisserie', 'Yi Qian Private Dining', 'HUSK Nasi Lemak (Bugis Cube)', 'Kong Cafe (Thomson Plaza)', 'Mooi Pâtisserie', 'Wanpo Tea Shop (Lazada One)', 'The Hainan Story Coffee House (NEX)', 'Hong Kong Zhai Dim Sum (Marina Square)', 'Beauty in The Pot (NEX)', 'GaiBang (Paya Lebar Square)', 'Sugaroses Cafe', 'SWIRLGO', 'Victoria Bakery', 'Dirty Cheesecake Bakery & Cafe', 'Kao Ge Yu (East Village)', 'Wonders', 'Parliament Bar', 'Mr. Bucket Chocolaterie (Dempsey)', 'Chapter 1', 'Qi Xiang Hotpot', 'The DEN - Kway Teow Kia & Bar', 'Yamakita - Tempura x Tapas', 'Onigarazu Don (Senja Hawker Centre)', 'Full Circle by J-man', 'Ah Lock Kitchen (Senja Hawker Centre)', "Winner's Fried Chicken", 'Jane Deer', 'Uncle Leong Seafood (Techview)', 'Hub & Spoke Cafe (Changi Airport T2)', 'Nani Bowl', 'Staple', 'Daniu Teochew Seafood Restaurant', '123 Zô Vietnamese BBQ Skewers and Hotpot', 'TwoBakeBoys (CT Hub 2)', 'Canopy (Changi City Point)', 'Capital Kitchen (Clarke Quay)', '8Bar Espresso', 'Boms & Buns', 'Lau Wang Claypot (Bugis+)', 'La Lola Churreria (Cross Street Exchange)', "Dawn's", 'Heng Heng Kueh', 'Fosters Cafe', 'MUGUNG', 'Sakutto Tempura&Oyster', 'Firewood Chicken & Bagel', 'Shikar', 'Ahara', '1-V:U', 'Cocokata', 'MANAM', 'Shake Shake In A Tub (IMM)', '929 Desserts & Bites', "Verandah @ Rael's (111 Somerset)", 'Kei Kaisendon (Tanglin Mall)', 'Lee Lai Jiak', 'Gelatology Lab', 'Matchaya (Jem)', 'Bag Me Up', 'PaPa Gelare by CoffeePlus', 'Chun Noodle Bar (Amoy Street Food Centre)', 'Butter Bean (VivoCity)', 'OMU NOMU Craft Sake & Raw Bar', 'The Dim Sum Place (The Centrepoint)', 'Yuugo Cafe', 'That Wine Place', 'The Whole Kitchen (CBD)', 'Colobaba (Century Square)', 'Boomeranz Nasi Ayam Power By Adimann (Northpoint City)', 'Hitoyoshi Izakaya', 'popotatoe', 'Jin Yu Man Tang Dessert Shop (South Bridge)', 'LILY•S Gourmet Bistro (Tanglin Mall)', 'Surrey Hills Deli', 'Main Street Commissary', 'Seroja', 'Luckmeow (Maxwell)', 'Old Shifu Charcoal Porridge', 'iSTEAKS Reserved', 'Rumours Beach Club', 'Willy Wankerz', 'Cupping Room Coffee Roasters (Robinson Centre)', 'Monte Risaia', 'Shinya Izakaya', 'OSTERIA BBR by Alain Ducasse', '+886 Bistro', 'Dabao Gelato', '99 Thai Taste']
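
With well over a hundred names, the `not in` test against a list rescans the whole list for every candidate on every pass. A possible variant, sketched here as an assumption rather than as part of the original answer, keeps a set alongside the list so that membership checks stay cheap while first-seen order is preserved. The helper name `append_unseen` is hypothetical.

```python
def append_unseen(collected, seen, names):
    """Append names not yet in `seen` to `collected`, keeping first-seen order.

    Returns how many names were new, which a caller might use to notice
    that a click did not load anything further.
    """
    added = 0
    for name in names:
        if name not in seen:
            seen.add(name)
            collected.append(name)
            added += 1
    return added


collected, seen = [], set()
# First pass: both names are new.
first = append_unseen(collected, seen, ["Plus Coffee Joint", "Kotuwa"])
# Second pass re-reads the whole page: only the last name is new.
second = append_unseen(
    collected, seen, ["Plus Coffee Joint", "Kotuwa", "Smoochie Creamery"]
)
```

Like the answer's approach, this deduplicates by name only, so two different venues that share exactly the same name would be merged into one entry.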
