I'm trying to scrape some data from a website into Excel. I thought I had it figured out, but now my results show up as:
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">PLATE 1X1 ROUND</span>
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">FLAT TILE 1X1, ROUND</span>
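I think I only need the text part that is unique to each element. I'm following a course, but it doesn't cover this kind of problem. I have a feeling the fix is very simple, but I just can't figure it out. I hope someone can help.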
import pandas
import requests
from bs4 import BeautifulSoup

webpage = requests.get("https://www.lego.com/nl-nl/pick-and-build/pick-a-brick?page=1&perPage=400&filters.i0.key=variants.attributes.colourId&filters.i0.values.i0=323")
content = webpage.content
result = BeautifulSoup(content, 'html.parser')

products = result.find_all("span", {"class": "ElementLeaf_elementTitle__xda82"})

names = []
for item in products:
    names.append(item)

data = list(zip(names))
d = pandas.DataFrame(data, columns = ['Name'])

try:
    d.to_excel("C:\\Users\\minib\\PycharmProjects\\pythonProject\\scraper\\lego.xlsx")
except:
    print("\nSomething went wrong! Please check your code.")
else:
    print("\nWeb data successfully written to Excel.")
finally:
    print("\nQuitting the program. Bye!")
1 Answer
To get the item data into a DataFrame, you can use the following example:
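The answer's original code is not reproduced here, so what follows is only a minimal sketch of one way to keep just the text of each element. It assumes the markup looks like the two `<span>` lines quoted in the question and parses that sample directly instead of fetching the page. The helper name `extract_names` is hypothetical, and selecting on the `data-test` attribute rather than the generated class name is my own choice, not something the question confirms is more reliable. If the site builds its product list with JavaScript, the response from `requests.get` may not contain these spans at all.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Sample markup copied from the question; the live page may differ.
html = """
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">PLATE 1X1 ROUND</span>
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">FLAT TILE 1X1, ROUND</span>
"""


def extract_names(markup):
    """Return the visible text of every item-title span (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    spans = soup.find_all("span", attrs={"data-test": "pab-item-title"})
    # get_text() returns only the text inside the tag, without the markup;
    # appending the Tag object itself is what put "<span ...>" in the output.
    return [span.get_text(strip=True) for span in spans]


names = extract_names(html)
df = pd.DataFrame({"Name": names})
print(df)
```

In the question's script, the equivalent change would be to append `item.get_text(strip=True)` instead of `item` inside the loop. The resulting DataFrame can then be written with `df.to_excel("lego.xlsx", index=False)`, which needs an Excel writer such as `openpyxl` to be installed.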