Scrapy: loop through a txt file of words to format URLs

Posted by 0ejtzxu1 on 2022-11-09 in Other


Suppose I have a spider:

import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'ExampleSpider'
    start_urls = []

    def parse(self, response):
        for res in response.css('div.example'):
            item = {
                # 'examplehere' is a placeholder for the real CSS selector
                'example': res.css('examplehere')
            }
            yield item

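Is there a way to have start_urls = ["examplesite.com/{}/search"] and then loop through my text file of words to format it, for example something like start_urls = ["examplesite.com/{}/search".format(i for i in txtfile.txt)], so that it goes through a URL for every word in my text file? I'm not sure whether this can be done in Scrapy; please let me know the best way to do it.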

scrapy

Source: https://stackoverflow.com/questions/72552686/scrapy-loop-through-txt-file-of-words-for-url-format

1 Answer


t1qtbnec1#

This question has been asked before.
Use the start_requests method:

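import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'ExampleSpider'

    def start_requests(self):
        # The path is relative to the directory the crawl is started from.
        with open('spiders/urlFile.txt', 'r') as f:
            for line in f:
                # Strip the trailing newline before inserting the word into the URL.
                url = f"https://examplesite.com/{line.rstrip()}/search"
                # The request must be yielded, otherwise Scrapy never schedules it.
                # With no callback given, the response is passed to parse().
                yield scrapy.Request(url=url)

    def parse(self, response):
        for res in response.css('div.example'):
            item = {
                # 'examplehere' is a placeholder for the real CSS selector
                'example': res.css('examplehere').get()
            }
            yield item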
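
If the word file may contain blank lines or words with spaces, it can help to build the URLs in a small helper first. The following is only a minimal sketch under assumptions: the helper name build_urls and the URL_TEMPLATE constant are made up for illustration, and percent-encoding each word with urllib.parse.quote is a guess at what the site expects in the URL path.

```python
from urllib.parse import quote

# Hypothetical template, taken from the URL pattern in the question.
URL_TEMPLATE = "https://examplesite.com/{}/search"


def build_urls(lines, template=URL_TEMPLATE):
    """Return one formatted URL per non-empty line (hypothetical helper)."""
    urls = []
    for line in lines:
        word = line.strip()
        if not word:
            # Skip blank lines so no URL with an empty path segment is produced.
            continue
        # Percent-encode the word so spaces and slashes are safe in the path.
        urls.append(template.format(quote(word, safe="")))
    return urls


if __name__ == "__main__":
    # Any iterable of lines works, including an open file object.
    for url in build_urls(["apple\n", "\n", "ice cream\n"]):
        print(url)
```

Inside start_requests, the open file can be passed straight to build_urls(f), and each returned URL yielded as a scrapy.Request.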

Answered on 2022-11-09
