How can I use BeautifulSoup to find every link on a page that contains "/pl/oferta/"?

a9wyjsp7, posted on 2023-04-23 in Other

I am new to Python and I am learning it for scraping purposes. I am using BeautifulSoup to collect only the links that contain the part "/pl/oferta/" from the following page: https://www.otodom.pl/pl/oferty/sprzedaz/mieszkanie/wiele-lokalizacji?distanceRadius=0&page=1&limit=36&locations=%5Bregions-1%2Cregions-11%2Cregions-12%2Cregions-13%2Cregions-2%2Cregions-5%2Cregions-10%2Cregions-9%2Cregions-8%2Cregions-7%2Cregions-4%2Cregions-6%2Cregions-3%2Cregions-14%2Cregions-15%2Cregions-16%5D&by=DEFAULT&direction=DESC&viewType=listing. This is what I have so far:

import logging
import requests
from bs4 import BeautifulSoup

# Set up logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

base_url = 'https://www.otodom.pl/pl/oferty/sprzedaz/mieszkanie/wiele-lokalizacji?distanceRadius=0'
base_url2= '&limit=36&locations=%5Bregions-1%2Cregions-11%2Cregions-12%2Cregions-13%2Cregions-2%2Cregions-5%2Cregions-10%2Cregions-9%2Cregions-8%2Cregions-7%2Cregions-4%2Cregions-6%2Cregions-3%2Cregions-14%2Cregions-15%2Cregions-16%5D&by=DEFAULT&direction=DESC&viewType=listing'
url = base_url + '&page={}' + base_url2

page_num = 1

#https://www.otodom.pl/pl/oferty/sprzedaz/mieszkanie/wiele-lokalizacji?distanceRadius=0&page=1&limit=36&locations=%5Bregions-1%2Cregions-11%2Cregions-12%2Cregions-13%2Cregions-2%2Cregions-5%2Cregions-10%2Cregions-9%2Cregions-8%2Cregions-7%2Cregions-4%2Cregions-6%2Cregions-3%2Cregions-14%2Cregions-15%2Cregions-16%5D&by=DEFAULT&direction=DESC&viewType=listing

#while True:
print('Scraping page', page_num)
response = requests.get(url.format(page_num))
soup = BeautifulSoup(response.text, 'html.parser')
houses_listings = soup.find_all('a', {'class': 'css-1up0y1q e1n6ljqa16', 'href': lambda x: x and '/pl/oferta/' in x})
print(houses_listings)  # was job_listings, which is undefined and raises NameError

Please help me write a function for this.
I would like to get every offer link on this page.

zpgglvta #1

You can access each link that was found by iterating over houses_listings:

for link in houses_listings:
    print(link["href"])
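
Since the question asks for a function, the loop above could be wrapped roughly as follows. This is only a sketch under a few assumptions: the name extract_offer_links, the base_url default, and the sample HTML are made up for illustration, and the filter deliberately drops the class check ('css-1up0y1q e1n6ljqa16'), on the assumption that such generated class names may change between page builds. It takes the HTML as a string, so it can be tried without hitting the site.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_offer_links(html, base_url="https://www.otodom.pl", marker="/pl/oferta/"):
    """Return unique absolute URLs of <a> tags whose href contains `marker`."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute at all
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if marker in href:
            # Relative hrefs are resolved against base_url (an assumed default)
            absolute = urljoin(base_url, href)
            if absolute not in links:  # keep first occurrence, preserve order
                links.append(absolute)
    return links


# Hypothetical sample markup, not taken from the real page
sample_html = """
<a href="/pl/oferta/mieszkanie-1-ID1">Offer 1</a>
<a href="/pl/oferta/mieszkanie-1-ID1">Offer 1 (duplicate)</a>
<a href="/pl/oferty/sprzedaz">Listing index</a>
<a>No href</a>
"""
print(extract_offer_links(sample_html))
```

With the code from the question, it would be called as extract_offer_links(response.text). If the result is empty on the live page, it may be worth checking whether response.text actually contains the links, since the class filter is no longer the limiting factor here.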

Is this what you were looking for?
