pandas: trouble web scraping to Excel

Posted by rt4zxlrg on 2023-03-06 in Other

I'm trying to scrape some data from a website into Excel. I thought I had it figured out, but my results now show up like this:

<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">PLATE 1X1 ROUND</span>

<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">FLAT TILE 1X1, ROUND</span>

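I think I only need the text part of each element. I'm following a course, but it doesn't cover this kind of problem. I have a feeling the fix is very simple, but I just can't figure it out. I hope someone can help.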

import pandas
import requests
from bs4 import BeautifulSoup

webpage = requests.get("https://www.lego.com/nl-nl/pick-and-build/pick-a-brick?page=1&perPage=400&filters.i0.key=variants.attributes.colourId&filters.i0.values.i0=323")
content = webpage.content
result = BeautifulSoup(content, 'html.parser')
products = result.find_all("span", {"class": "ElementLeaf_elementTitle__xda82"})
names = []
for item in products:
    names.append(item)
data = list(zip(names))
d = pandas.DataFrame(data, columns=['Name'])
try:
    d.to_excel("C:\\Users\\minib\\PycharmProjects\\pythonProject\\scraper\\lego.xlsx")
except:
    print("\nSomething went wrong! Please check your code.")
else:
    print("\nWeb data successfully written to Excel.")
finally:
    print("\nQuitting the program. Bye!")

gr8qqesn #1
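
The markup ends up in the spreadsheet because `find_all` returns `Tag` objects, and the loop in the question appends the tags themselves rather than their text. Here is a minimal sketch of the difference, using the two `<span>` elements from the question as an inline HTML string (an assumption made so that no network request is needed):

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; the live page is not fetched here.
html = """
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">PLATE 1X1 ROUND</span>
<span class="ElementLeaf_elementTitle__xda82" data-test="pab-item-title">FLAT TILE 1X1, ROUND</span>
"""

soup = BeautifulSoup(html, "html.parser")
spans = soup.find_all("span", {"class": "ElementLeaf_elementTitle__xda82"})

# Converting a Tag to a string keeps the surrounding <span ...> markup.
as_tags = [str(span) for span in spans]

# get_text(strip=True) returns only the text inside each element.
names = [span.get_text(strip=True) for span in spans]

print(names)  # ['PLATE 1X1 ROUND', 'FLAT TILE 1X1, ROUND']
```

In the original script, this probably comes down to changing `names.append(item)` to `names.append(item.get_text(strip=True))`.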

To get the item data into a DataFrame, you can use the following example:

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.lego.com/nl-nl/pick-and-build/pick-a-brick?page=1&perPage=400&filters.i0.key=variants.attributes.colourId&filters.i0.values.i0=323"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = []
for item in soup.select('[data-test="pab-item"]'):
    title = item.select_one('[data-test="pab-item-title"]').text
    id_ = item.select_one('[data-test="element-item-id"]').text
    id_ = id_.split(maxsplit=1)[-1]
    price = item.select_one('[data-test="pab-item-price"]').text
    all_data.append({
        'Title': title,
        'ID': id_,
        'Price': price
    })

df = pd.DataFrame(all_data)

# print sample data to screen
print(df.head())

# save data to excel
# df.to_excel(...)

Prints:

                       Title             ID  Price
0            PLATE 1X1 ROUND   6382504/6141  €0,05
1       FLAT TILE 1X1, ROUND  6322818/35381  €0,04
2  MINI UPPER PART, NO. 5670  6352891/76382  €0,97
3              FLAT TILE 1X1   6251846/3070  €0,06
4        1/4 CIRCLE TILE 1X1  6199886/25269  €0,04
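
The `Price` values above are strings such as `€0,05`, so they would most likely land in Excel as text rather than numbers. If numeric prices are wanted, the rows can be cleaned before building the DataFrame. The sketch below is only an illustration: `parse_price` and `split_id` are hypothetical helper names, not part of the answer's code, and it assumes that every price has a leading euro sign, a decimal comma and no thousands separator, and that every ID contains exactly one slash.

```python
def parse_price(text):
    """Convert a price string such as '€0,05' to the float 0.05."""
    # Assumption: euro sign, decimal comma, no thousands separator.
    cleaned = text.replace("€", "").strip().replace(",", ".")
    return float(cleaned)


def split_id(text):
    """Split an ID such as '6382504/6141' at the slash into two strings."""
    first, _, second = text.partition("/")
    return first, second


# Two rows taken from the sample output above.
rows = [
    {"Title": "PLATE 1X1 ROUND", "ID": "6382504/6141", "Price": "€0,05"},
    {"Title": "FLAT TILE 1X1, ROUND", "ID": "6322818/35381", "Price": "€0,04"},
]

for row in rows:
    row["Price"] = parse_price(row["Price"])
    # The column names 'ID part 1' and 'ID part 2' are placeholders.
    row["ID part 1"], row["ID part 2"] = split_id(row["ID"])

print(rows[0]["Price"], rows[0]["ID part 1"], rows[0]["ID part 2"])
```

The cleaned rows can then be passed to `pd.DataFrame(rows)`. When saving, `to_excel` accepts `index=False` to leave the 0, 1, 2, ... index column out of the sheet. Writing `.xlsx` files with pandas normally requires the `openpyxl` package to be installed.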
