JSON web scraping: "Exception in thread Thread-1 (periodic)" error - Python

js4nwp54 posted on 2023-07-01 in Python

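I need to look up an element in a JSON file produced by web scraping. I tried looking up elements such as "time_pretty", "ip_address", and so on, and those all print fine. But for some elements, such as "organization", it prints maybe half of the list and then gives me this error.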

Exception in thread Thread-1 (periodic):
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python310\lib\threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "d:\Visual Studio Code\TechCellar\scraper3.py", line 33, in periodic
    organization = i["organization"]
KeyError: 'organization'

This is the main part of my code. For confidentiality reasons, I won't include the parameters :).
Libraries

import requests
import json
from threading import Thread
from timed_count import timed_count

Code

def periodic():
    for count in timed_count(60):
        response = requests.get(api_url, params=params)

        # Execute code here exactly every 60 seconds
        if (response.status_code == 200):
            data = response.json()

            arr = []
            for element in data:
                time = element["dates"][0]["items"]
                for i in time:
                    time2 = i["ip_address"]
                    organization = i["organization"]
                    # arr.append(organization)
                    print(organization)

            data_file = open("visitors_list", "w")
            json_formatted_str = json.dump(data, data_file, indent=4)

            data_file.close()

        else:
            # status_code is an int, so format it rather than concatenating it to a str
            print(f"Error connecting to Clicky.com API.\nError: {response.status_code}")

thread = Thread(target=periodic)
thread.start()

Here is the JSON file: https://i.stack.imgur.com/sUpiu.png https://i.stack.imgur.com/RaEkR.png
And here is the output: https://i.stack.imgur.com/YdTYx.png https://i.stack.imgur.com/9ajKk.png

yqhsw0fo #1

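It's hard to say for sure without looking at the JSON file. I would check the index of the element in the items array where the key lookup fails, and make sure that element actually contains the organization key. If the keys are not consistent between elements, you will need better error handling to prevent this from happening.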

for element in data:
    time = element["dates"][0]["items"]
    for index, i in enumerate(time):
        time2 = i["ip_address"]
        organization = i["organization"]
        print(f"{index}: {organization}")
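
If it turns out that some items simply have no organization key, one possible way to handle it is dict.get with a fallback value. This is only a minimal sketch, not the asker's actual code: the function name extract_organizations, the "(unknown)" placeholder, and the sample data are assumptions made up for illustration, and the real API response may be shaped differently.

```python
def extract_organizations(data, default="(unknown)"):
    """Collect (ip_address, organization) pairs, tolerating missing keys."""
    rows = []
    for element in data:
        # Same path as in the question: first entry of "dates", then its "items".
        items = element["dates"][0]["items"]
        for index, item in enumerate(items):
            if "organization" not in item:
                # Report which item is incomplete instead of crashing the thread.
                print(f"item {index} has no 'organization' key")
            # dict.get returns the fallback instead of raising KeyError.
            organization = item.get("organization", default)
            rows.append((item.get("ip_address"), organization))
    return rows


# Hypothetical sample, shaped like the structure the question's loop expects.
sample = [{"dates": [{"items": [
    {"ip_address": "203.0.113.1", "organization": "Example Org"},
    {"ip_address": "203.0.113.2"},  # no "organization" key
]}]}]

rows = extract_organizations(sample)
for ip_address, organization in rows:
    print(ip_address, organization)
```

Wrapping the lookup in try/except KeyError would work too; dict.get just keeps the loop shorter when a placeholder value is acceptable.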
