How do I extract a dataset from a complex JSON file?

j2cgzkjk  posted on 2023-05-19 in Other

Using Python 3, I want to convert JSON data in the format below into a simple table containing ['domain']['axes']['t']['values'] and ['ranges']['global-radiation']['values']. How can I do this programmatically without using Pandas?

{
  "type" : "Coverage",
  "title" : {
    "en" : "Grid Feature"
  },
  "domain" : {
    "type" : "Domain",
    "domainType" : "Grid",
    "axes" : {
      "t" : {
        "values" : [ "2023-05-08T08:00:00.000Z", "2023-05-08T09:00:00.000Z", "2023-05-08T10:00:00.000Z", "2023-05-08T11:00:00.000Z", "2023-05-08T12:00:00.000Z", "2023-05-08T13:00:00.000Z", "2023-05-08T14:00:00.000Z", "2023-05-08T15:00:00.000Z", "2023-05-08T16:00:00.000Z", "2023-05-08T17:00:00.000Z", "2023-05-08T18:00:00.000Z", "2023-05-08T19:00:00.000Z", "2023-05-08T20:00:00.000Z", "2023-05-08T21:00:00.000Z", "2023-05-08T22:00:00.000Z", "2023-05-08T23:00:00.000Z", "2023-05-09T00:00:00.000Z", "2023-05-09T01:00:00.000Z", "2023-05-09T02:00:00.000Z", "2023-05-09T03:00:00.000Z", "2023-05-09T04:00:00.000Z", "2023-05-09T05:00:00.000Z", "2023-05-09T06:00:00.000Z", "2023-05-09T07:00:00.000Z", "2023-05-09T08:00:00.000Z" ]
      },
      "x" : {
        "values" : [ 12.26646929541765 ],
        "bounds" : [ 12.26646929541765, 12.26646929541765 ]
      },
      "y" : {
        "values" : [ 55.49876291703976 ],
        "bounds" : [ 55.49876291703976, 55.49876291703976 ]
      }
    },
    "referencing" : [ {
      "coordinates" : [ "x", "y" ],
      "system" : {
        "type" : "GeographicCRS",
        "id" : "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
      }
    }, {
      "coordinates" : [ "t" ],
      "system" : {
        "type" : "TemporalRS",
        "calendar" : "Gregorian"
      }
    } ]
  },
  "parameters" : {
    "global-radiation" : {
      "type" : "Parameter",
      "description" : {
        "en" : "Global radiation"
      },
      "observedProperty" : {
        "label" : {
          "en" : "https://apps.ecmwf.int/codes/grib/param-db/?id=300117"
        }
      }
    }
  },
  "ranges" : {
    "global-radiation" : {
      "type" : "NdArray",
      "dataType" : "float",
      "axisNames" : [ "t", "y", "x" ],
      "shape" : [ 25, 1, 1 ],
      "values" : [ 4739083.5, 7158156.0, 9916988.0, 1.2867561E7, 1.5854004E7, 1.8688858E7, 2.1224932E7, 2.3335228E7, 2.4934776E7, 2.598796E7, 2.6518532E7, 2.6639176E7, 2.6638888E7, 2.663874E7, 2.6638976E7, 2.6638976E7, 2.6638976E7, 2.6638976E7, 2.6638976E7, 2.6638976E7, 2.670284E7, 2.7124774E7, 2.8051116E7, 2.9527746E7, 3.1528238E7 ]
    }
  }
}

I fetched the dataset from a URL and wrote some code that doesn't really work. What is wrong or missing?

data = requests.get(url)
binary = data.content
output = json.loads(binary)

print(output['domain']['axes']['t']['values'][1])
dates = output['domain']['axes']['t']['values']
print(output['ranges']['global-radiation']['values'][1])
globrad = output['ranges']['global-radiation']['values']

print('Records:')
for d in dates:
    print(d['domain']['axes']['t']['values'], d['ranges']['global-radiation']['values'])
    #print(output['ranges']['global-radiation']['values'][d])

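I tried to use a for statement to extract rows of the form "2023-05-08T08:00:00.000Z", 4739083.5, but either I get all the datetimes followed by all the global radiation values, or I get an error saying that d should not be a str. I thought this would be simple, but I'm stuck.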


d7v8vwbk  #1

You have two lists. You can iterate over the first one with enumerate() and use the index to access the corresponding value in the second one:

print('Records:')
for idx, d in enumerate(output['domain']['axes']['t']['values']):
    print(f"{d}, {output['ranges']['global-radiation']['values'][idx]}")

The result looks like this:

Records:
2023-05-08T08:00:00.000Z, 4739083.5
2023-05-08T09:00:00.000Z, 7158156.0
2023-05-08T10:00:00.000Z, 9916988.0
2023-05-08T11:00:00.000Z, 12867561.0
2023-05-08T12:00:00.000Z, 15854004.0
2023-05-08T13:00:00.000Z, 18688858.0
2023-05-08T14:00:00.000Z, 21224932.0
2023-05-08T15:00:00.000Z, 23335228.0
2023-05-08T16:00:00.000Z, 24934776.0
2023-05-08T17:00:00.000Z, 25987960.0
2023-05-08T18:00:00.000Z, 26518532.0
2023-05-08T19:00:00.000Z, 26639176.0
2023-05-08T20:00:00.000Z, 26638888.0
2023-05-08T21:00:00.000Z, 26638740.0
2023-05-08T22:00:00.000Z, 26638976.0
2023-05-08T23:00:00.000Z, 26638976.0
2023-05-09T00:00:00.000Z, 26638976.0
2023-05-09T01:00:00.000Z, 26638976.0
2023-05-09T02:00:00.000Z, 26638976.0
2023-05-09T03:00:00.000Z, 26638976.0
2023-05-09T04:00:00.000Z, 26702840.0
2023-05-09T05:00:00.000Z, 27124774.0
2023-05-09T06:00:00.000Z, 28051116.0
2023-05-09T07:00:00.000Z, 29527746.0
2023-05-09T08:00:00.000Z, 31528238.0
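
For reference, the loop in the question fails because each d in dates is already a timestamp string, so d['domain'] raises a TypeError. Since the goal is a simple table without Pandas, pairing the two lists with zip() and writing them with the standard csv module is one possible approach. The sketch below is only an illustration: the function names coverage_to_rows and rows_to_csv and the column header are made up for this example, and the one-to-one pairing assumes the x and y axes each hold a single value, as in the posted data (shape [25, 1, 1]).

```python
import csv
import io
import json


def coverage_to_rows(coverage, parameter="global-radiation"):
    """Pair each timestamp with the value at the same position.

    Hypothetical helper; assumes one value per time step (single x/y point).
    """
    times = coverage["domain"]["axes"]["t"]["values"]
    values = coverage["ranges"][parameter]["values"]
    if len(times) != len(values):
        raise ValueError("time axis and value list differ in length")
    return list(zip(times, values))


def rows_to_csv(rows, header=("time", "global_radiation")):
    """Render the rows as CSV text using only the standard library."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Shortened sample with the same structure as the JSON in the question.
sample = json.loads("""
{
  "domain": {"axes": {"t": {"values": ["2023-05-08T08:00:00.000Z",
                                       "2023-05-08T09:00:00.000Z"]}}},
  "ranges": {"global-radiation": {"values": [4739083.5, 7158156.0]}}
}
""")

rows = coverage_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

With the real response, output from json.loads() in the question can be passed in place of sample. The length check is there so that a mismatch between the time axis and the values is reported instead of being silently truncated by zip().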
