Python: how to get the content of a title using beautifulsoup4 and requests

Posted by zu0ti5jz on 2021-09-29 in Java


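I have scraped the names of the medicines from this link: medicine list
Now I want to get the content for each medicine. Each medicine has its own link, for example: medicines example
How can I get each medicine's content using beautifulsoup4 and the requests library?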

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0'
}


def main(url):
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.text, 'lxml')
    # Collect the text of every link whose class attribute ends with "section__item-link".
    titles = [x.text for x in soup.select('a[class$=section__item-link]')]
    # Print a numbered list of medicine names.
    for count, title in enumerate(titles, start=1):
        print("{0}. {1}\n".format(count, title))


main('https://www.klikdokter.com/obat')

python python-requests beautifulsoup

Source: https://stackoverflow.com/questions/68546001/how-to-get-the-content-of-a-title-using-beautifulsoup4-and-requests

1 Answer


zzwlnbp81#

Based on what I can see at https://www.klikdokter.com/obat, you should be able to do the following:

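import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
BASEURL = 'https://www.klikdokter.com/obat'
headers = {'User-Agent': AGENT}

# Fetch the index page that lists the medicines.
response = requests.get(BASEURL, headers=headers)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')

# Each medicine name is an <a> tag carrying this class.
for tag in soup.find_all('a', class_='topics-index--section__item-link'):
    href = tag.get('href')
    if href is not None:
        # urljoin leaves absolute links unchanged and resolves relative ones.
        url = urljoin(BASEURL, href)
        print(url)
        # Fetch the medicine's own page under a separate name so the
        # index response is not overwritten.
        detail = requests.get(url, headers=headers)
        detail.raise_for_status()
        # Do your processing here, e.g. parse detail.text with BeautifulSoup.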

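The processing step left open in the loop above could look something like the sketch below. It takes the HTML text of one fetched medicine page and pulls out a heading and the paragraph text. The function name `extract_content`, the choice of `<h1>` and `<p>` tags, and the `SAMPLE_HTML` string are all assumptions made for illustration; the real markup on klikdokter.com was not inspected and may need different selectors.

```python
from bs4 import BeautifulSoup


def extract_content(html):
    """Return the page heading and paragraph texts from a detail page.

    Assumption: the heading is the first <h1> and the body text lives in
    <p> tags. The real klikdokter.com markup may differ.
    """
    soup = BeautifulSoup(html, 'html.parser')
    heading = soup.find('h1')
    paragraphs = [p.get_text(strip=True) for p in soup.find_all('p')]
    return {
        'title': heading.get_text(strip=True) if heading else None,
        # Skip paragraphs that are empty after stripping whitespace.
        'paragraphs': [text for text in paragraphs if text],
    }


# Hypothetical markup, used only to show the function's behaviour offline.
SAMPLE_HTML = """
<html><body>
  <h1> Example Medicine </h1>
  <p>Used for illustration only.</p>
  <p>   </p>
  <p>Second paragraph.</p>
</body></html>
"""

result = extract_content(SAMPLE_HTML)
print(result['title'])
print(result['paragraphs'])
```

Inside the loop you would pass the text of each medicine page's response to `extract_content` and store or print the returned dictionary. If the pages wrap their content in a specific container, narrowing the search to that element first would avoid picking up navigation or footer paragraphs.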
Posted 2021-09-29
