How to match 2 values in a link with Scrapy LinkExtractor

jhkqcmku  posted on 2022-11-09  in  Other

I need to get all the links from Amazon, starting from this one:

https://www.amazon.com/s?k=guess+case&crid=2Q25FH0FOTCA4&sprefix=guess+case%2Caps%2C215&ref=nb_sb_noss

But I only need Guess cases. The links must contain 2 values: "guess" and "phone". For example:

https://www.amazon.com/Guess-Scarlett-Collection-Hard-iPhone/dp/B00QTEP0B0/ref=sr_1_2?crid=2Q25FH0FOTCA4&keywords=guess+case&qid=1650550474&sprefix=guess+case%2Caps%2C215&sr=8-2

https://www.amazon.com/Guess-GUHCP13SPCUMABK-Marble-Collection-iPhone/dp/B09J94ZMZ3/ref=sr_1_3?crid=2Q25FH0FOTCA4&keywords=guess+case&qid=1650550474&sprefix=guess+case%2Caps%2C215&sr=8-3

How can I get these links with the help of the re library?

start_urls = ['https://www.amazon.com/s?k=guess+case&crid=2Q25FH0FOTCA4&sprefix=guess+case%2Caps%2C215&ref=nb_sb_noss/']

      rules = [Rule(LinkExtractor(allow=r'???' , ))...

v440hwme  #1

Just use an if statement...
if "guess" not in url or "phone" not in url:
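Since the question asks for a regex, here is a minimal sketch of one way to require both words with the re library. It assumes the words should appear in the URL path rather than in the query string (every result link from this search carries keywords=guess+case, so checking the whole URL would match "guess" everywhere). The pattern, the name GUESS_AND_PHONE, and the helper has_both are illustrative and not part of the original post.

```python
import re

# Hypothetical pattern: two case-insensitive lookaheads anchored at the start
# of the URL. [^?]* stops at the first "?", so only the part before the query
# string is inspected, and the two words may appear in either order.
GUESS_AND_PHONE = r"(?i)^(?=[^?]*guess)(?=[^?]*phone)"

_pattern = re.compile(GUESS_AND_PHONE)


def has_both(url: str) -> bool:
    """Return True when the URL path contains both 'guess' and 'phone'."""
    return _pattern.search(url) is not None


urls = [
    # Product link from the question: path contains "Guess" and "iPhone".
    "https://www.amazon.com/Guess-Scarlett-Collection-Hard-iPhone/dp/B00QTEP0B0/ref=sr_1_2?crid=2Q25FH0FOTCA4&keywords=guess+case",
    # Search page itself: "guess" appears only in the query string.
    "https://www.amazon.com/s?k=guess+case&crid=2Q25FH0FOTCA4",
]

# Keep only the links that satisfy both conditions.
wanted = [u for u in urls if has_both(u)]
```

LinkExtractor's allow argument accepts a regular expression that the extracted URLs must match, so the same string could probably be passed as allow=GUESS_AND_PHONE in the rule instead of filtering by hand. I have not run it against the live Amazon page.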
