Python web scraping: Xbox games

Posted by 9lowa7mx on 2023-06-28 in Python

I'm very new to web scraping, and I'm trying to scrape the Xbox website to get each game's ID, name, description, genre, and tags.
Website link: XBOX
My code so far:

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.xbox.com/en-us/games/all-games?cat=all")
soup = BeautifulSoup(response.content, "html.parser")
games = []
for game in soup.find_all("div", class_="gameDivsWrapper x-hidden-focus"):
    print(game)

I'm stuck: I don't know where to look for the items listed above.

python

Source: https://stackoverflow.com/questions/76569290/python-web-scraping-xbox-games

2 answers

mepcadol1#

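Based on the code you provided, it looks like you are already retrieving the game divs correctly. Here is a snippet that retrieves the game ID, game name, game description, game genre, and game tags.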

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.xbox.com/en-us/games/all-games?cat=all")
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.content, "html.parser")


def text_or_empty(element):
    # find() returns None when nothing matches, so guard before reading the text
    return element.get_text(strip=True) if element else ""


games = []

for game in soup.find_all("div", class_="gameDivsWrapper x-hidden-focus"):
    # .get() returns None instead of raising KeyError when the attribute is absent
    game_id = game.get("data-bi-id")
    game_name = text_or_empty(game.find("a", class_="c-heading-5 link-emphasis"))
    game_desc = text_or_empty(game.find("div", class_="x-game-description"))

    # Extracting game genre and tags
    genre_tags = game.find_all("span", class_="c-meta-text x-game-genre")
    game_genre = genre_tags[0].get_text(strip=True) if genre_tags else ""

    tag_elements = game.find_all("span", class_="c-meta-text x-game-tags")
    game_tags = [tag.get_text(strip=True) for tag in tag_elements]

    game_info = {
        "game_id": game_id,
        "game_name": game_name,
        "game_desc": game_desc,
        "game_genre": game_genre,
        "game_tags": game_tags
    }
    games.append(game_info)

# Printing the scraped game information
for game in games:
    print(game)

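The extracted information is then stored in a dictionary (game_info) and appended to the games list. The specific classes and structure of the Xbox website may change over time, so it is always a good idea to double-check the HTML structure and class names of the elements you are trying to scrape.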
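
One way to do that double-check is to count how many elements each selector matches before relying on it. The following is only a minimal sketch: the helper name count_matches and the sample markup are hypothetical and are not taken from the Xbox site.

```python
from bs4 import BeautifulSoup


def count_matches(html, selectors):
    """Return a dict mapping each CSS selector to the number of elements it matches."""
    soup = BeautifulSoup(html, "html.parser")
    return {selector: len(soup.select(selector)) for selector in selectors}


# Hypothetical sample markup, used only to demonstrate the check.
sample_html = """
<div class="gameDivsWrapper">
  <div class="gameDiv" data-bigid="9PP97VC2BL8H"></div>
</div>
"""

report = count_matches(
    sample_html,
    ["div.gameDivsWrapper", "div.gameDiv", "div.x-game-description"],
)
for selector, count in report.items():
    print(f"{selector}: {count} match(es)")
```

Running the same check on response.content from the real page would show which class names actually exist in the downloaded HTML. A count of zero usually means either that the class name is wrong or that the content is added by JavaScript after the initial page load, in which case requests alone will not see it.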

Posted 2023-06-28

ewm0tg9j2#

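The other answer was generated by ChatGPT.
I'll give you some personal advice based on my experience. If you're willing to use Node.js, there are many packages for scraping, and there is also an Xbox Live API package that can help you authenticate and access the API more formally, which means you wouldn't need to scrape at all. However, I don't know how you would get game information through it, so your mileage may vary.
If you don't want to go that route, I have other suggestions:
Make sure you are selecting each game correctly. I'm not familiar with Python, but essentially you need to select the items with #ContentBlockList_1 > div.thecatalog > div.gameList > div.gameDivsWrapper > div
This will include some non-games, but it lets you loop through the divs and extract the embedded information. The div itself does not provide everything you are looking for. For example: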

<div class="m-product-placement-item f-size-medium context-game gameDiv" itemscope="" itemtype="http://schema.org/Product" data-bigid="9PP97VC2BL8H" data-releasedate="2023-06-05T23:00:00.0000000Z" data-msproduct="false" data-multiplayer="true" data-rating="MATURE 17+" data-ratingsystem="ESRB" data-listprice="69.99">

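This element lacks the name and description, so within this div you have to select and parse additional elements; the name and description can be found under the class "pop-info". Hope this helps. Trying the ChatGPT answer may have a similar success rate.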
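
In Python, the selection described above could be sketched with BeautifulSoup's select(). This is a minimal sketch and not a verified scraper: the sample HTML is a trimmed copy of the div shown above placed inside a hypothetical wrapper, the selector is shortened to its last two steps, and the function name extract_game_attributes is made up for illustration. It reads only the data-* attributes, since the layout under "pop-info" is not shown here.

```python
from bs4 import BeautifulSoup

# Trimmed from the example div above; the wrapper and the second div are
# assumptions added so the demo has a non-game element to skip.
sample_html = """
<div class="gameDivsWrapper">
  <div class="m-product-placement-item f-size-medium context-game gameDiv"
       data-bigid="9PP97VC2BL8H"
       data-releasedate="2023-06-05T23:00:00.0000000Z"
       data-multiplayer="true"
       data-rating="MATURE 17+"
       data-listprice="69.99"></div>
  <div class="not-a-game"></div>
</div>
"""


def extract_game_attributes(html):
    """Collect the data-* attributes of every child div that carries a data-bigid."""
    soup = BeautifulSoup(html, "html.parser")
    games = []
    for div in soup.select("div.gameDivsWrapper > div"):
        # The selector also matches non-game divs, so skip those without an ID.
        if not div.has_attr("data-bigid"):
            continue
        games.append({
            "game_id": div["data-bigid"],
            "release_date": div.get("data-releasedate"),
            "multiplayer": div.get("data-multiplayer") == "true",
            "rating": div.get("data-rating"),
            "list_price": div.get("data-listprice"),
        })
    return games


games = extract_game_attributes(sample_html)
print(games)
```

Against the live site, the same function would be called with response.content, provided the game list is present in the HTML that requests downloads.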

Posted 2023-06-28
