Scrapy and newspaper3k: get articles from HTML instead of a URL

Posted by qni6mghb on 2022-11-09 in Other


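I am using newspaper3k inside a Scrapy parse method. I want to extract the links, but I don't want to fetch the website a second time.
Is it possible to use this function:

newspaper.build(..)

so that I can then call .articles on the result?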

scrapy

Source: https://stackoverflow.com/questions/68360767/newspaper3k-get-articles-from-html-instead-of-url

1 answer


a5g8bdjr #1

I found this solution:

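import httpx

from newspaper import Article

async def get_article(url):
    # Fetch the page once with an async HTTP client.
    # AsyncClient must be entered with "async with", not a plain "with".
    async with httpx.AsyncClient() as client:
        response = await client.get(url)

    # Hand the already-downloaded HTML to newspaper3k so that it does
    # not fetch the URL a second time.
    article = Article(url)
    article.set_html(response.text)
    article.parse()
    return article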
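
The snippet above covers parsing a single article from HTML that is already in hand. For the other half of the question, collecting links without a second request, here is a minimal sketch that uses only the Python standard library rather than newspaper.build. The names LinkCollector and extract_links are hypothetical helpers, not part of newspaper3k or Scrapy. Inside a Scrapy callback, response.url and response.text would presumably be the two arguments.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the URL of the page.
            self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html):
    """Return the links found in html, de-duplicated, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return list(dict.fromkeys(collector.links))
```

Each returned link can then be scheduled as a normal Scrapy request, and the HTML of that response can be passed to Article.set_html as in the answer, so every page is downloaded only once.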

Answered 2022-11-09
