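I want to compare the price of a coconut on two websites. The two shops (websites) are called Laughs and Glomark.
I currently have two files, main.py and comparison.py. I think the problem is in the part that scrapes the Laughs price. The code runs without errors. My actual output and the expected output are shown below the code.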
main.py
from compare_prices import compare_prices
laughs_coconut = 'https://scrape-sm1.github.io/site1/COCONUT%20market1super.html'
glomark_coconut = 'https://glomark.lk/coconut/p/11624'
compare_prices(laughs_coconut,glomark_coconut)
comparison.py
import requests
import json
from bs4 import BeautifulSoup
# Imitate the Mozilla browser.
user_agent = {'User-agent': 'Mozilla/5.0'}

def compare_prices(laughs_coconut,glomark_coconut):
    # Acquire the web pages which contain the product price
    laughs_coconut = requests.get(laughs_coconut)
    glomark_coconut = requests.get(glomark_coconut)

    # LaughsSuper supermarket website provides the price in a span text.
    soup_laughs = BeautifulSoup(laughs_coconut.text, 'html.parser')
    price_laughs = soup_laughs.find('span',{'class': 'price'}).text

    # Glomark supermarket website provides the data in JSON format in an inline script.
    soup_glomark = BeautifulSoup(glomark_coconut.text, 'html.parser')
    script_glomark = soup_glomark.find('script', {'type': 'application/ld+json'}).text
    data_glomark = json.loads(script_glomark)
    price_glomark = data_glomark['offers'][0]['price']

    #TODO: Parse the values as floats, and print them.
    price_laughs = price_laughs.replace("Rs.","")
    price_laughs = float(price_laughs)
    price_glomark = float(price_glomark)
    print('Laughs COCONUT - Item#mr-2058 Rs.: ', price_laughs)
    print('Glomark Coconut Rs.: ', price_glomark)

    # Compare the prices and print the result
    if price_laughs > price_glomark:
        print('Glomark is cheaper Rs.:', price_laughs - price_glomark)
    elif price_laughs < price_glomark:
        print('Laughs is cheaper Rs.:', price_glomark - price_laughs)
    else:
        print('Price is the same')
My code runs without errors and prints the following output:
Laughs COCONUT - Item#mr-2058 Rs.: 0.0
Glomark Coconut Rs.: 110.0
Laughs is cheaper Rs.: 110.0
But the expected output is:
Laughs COCONUT - Item#mr-2058 Rs.: 95.0
Glomark Coconut Rs.: 110.0
Laughs is cheaper Rs.: 15.0
Note: <span class="price">Rs.95.00</span>
This is the element that holds the Laughs coconut price.
2 Answers
#1 qgelzfjb
Because 'span',{'class': 'price'} matches two elements and find() returns only the first one, use findAll() and take the second match instead. Changing the line to
price_laughs = soup_laughs.findAll('span',{'class': 'price'})[1].text
solves the problem.
#2 vltsax25
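Try changing your strategy for selecting the element: there is an id you can use to select a more specific container element. As for the other site, you could also use its API to get the price.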
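As a minimal sketch of the selection issue both answers describe, the snippet below runs against a small inline HTML sample rather than the live page. The sample markup, the `product-price` id, the `Rs.0.00` value of the first span, and the `to_float` helper are assumptions made for illustration. The thread only establishes that two `span.price` elements match, that the first one parses to 0.0, and that the wanted one contains Rs.95.00. `find_all` is the current spelling of `findAll`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Laughs page may be structured differently.
# It only mimics the situation from the thread, where two span.price
# elements exist and the second one holds the wanted price.
html = """
<div id="cart-summary"><span class="price">Rs.0.00</span></div>
<div id="product-price"><span class="price">Rs.95.00</span></div>
"""

soup = BeautifulSoup(html, "html.parser")


def to_float(text):
    """Strip the 'Rs.' prefix and convert the remainder to a float."""
    return float(text.replace("Rs.", "").strip())


# find() returns only the first match, which is the wrong element here.
first = soup.find("span", {"class": "price"}).text

# find_all() returns every match, so index 1 is the second span.
second = soup.find_all("span", {"class": "price"})[1].text

# Scoping the search to a container id avoids depending on match order.
scoped = soup.select_one("#product-price span.price").text

print(to_float(first))   # 0.0
print(to_float(second))  # 95.0
print(to_float(scoped))  # 95.0
```

Indexing with `[1]` breaks if the page adds or reorders price spans, which is why scoping by id tends to be more robust. The real id has to be read from the page's markup first.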