pandas 无法将展平的JSON数据上载到MySQL

olmpazwi  于 2022-12-16  发布在  Mysql
关注(0)|答案(1)|浏览(110)

我有这个JSON文件,因为JSON数据是嵌套的,我扁平化它,但现在当我尝试上传到SQL服务器它说的错误.

[1 rows x 21 columns]
Traceback (most recent call last):
  File "c:\Users\Anjan\OneDrive\Desktop\BIG_DATA\dataframe.py", line 50, in <module>
    df = pd.read_json(final_df)
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\io\json\_json.py", line 733, in read_json
    json_reader = JsonReader(
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\io\json\_json.py", line 818, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\io\json\_json.py", line 858, in _get_data_from_filepath
    self.handles = get_handle(
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\io\common.py", line 704, in get_handle
    if _is_binary_mode(path_or_buf, mode) and "b" not in mode:
  File "C:\Users\Anjan\AppData\Roaming\Python\Python310\site-packages\pandas\io\common.py", line 1163, in _is_binary_mode
    return isinstance(handle, _get_binary_io_classes()) or "b" in getattr(
TypeError: argument of type 'method' is not iterable

下面是文件中的JSON数据。

{
  "reporting_entity_name": "medicare",
  "reporting_entity_type": "medicare",
  "plan_name": "medicaid",
  "plan_id_type": "hios",
  "plan_id": "1111111111",
  "plan_market_type": "individual",
  "last_updated_on": "2020-08-27",
  "version": "1.0.0",
  "in_network": [
    {
      "negotiation_arrangement": "ffs",
      "name": "Knee Replacement",
      "billing_code_type": "CPT",
      "billing_code_type_version": "2020",
      "billing_code": "27447",
      "description": "Arthroplasty, knee condyle and plateau, medial and lateral compartments",
      "negotiated_rates": [
        {
          "provider_groups": [
            {
              "npi": [0],
              "tin": {
                "type": "ein",
                "value": "11-1111111"
              }
            }
          ],
          "negotiated_prices": [
            {
              "negotiated_type": "negotiated",
              "negotiated_rate": 123.45,
              "expiration_date": "2022-01-01",
              "billing_class": "institutional"
            }
          ]
        }
      ]
    }
  ]
}

我已经展平了JSON数据,并尝试使用Python代码将其上传到SQL。

import json
import pandas as pd
from sqlalchemy import create_engine
from functools import reduce
with open('new_ravi_test.json', 'r') as f:
    data = json.loads(f.read())

df_main = pd.json_normalize(
    data=data,
    meta=["reporting_entity_name", "reporting_entity_type", "plan_name", "plan_id_type",
          "plan_id", "plan_market_type", "last_updated_on", "version"],
    record_path=["in_network"]
).drop(columns="negotiated_rates")

df_provider = pd.json_normalize(
    data=data,
    meta=["reporting_entity_name", "reporting_entity_type", "plan_name", "plan_id_type",
          "plan_id", "plan_market_type", "last_updated_on", "version"],
    record_path=["in_network", "negotiated_rates", "provider_groups"]
)

df_prices = pd.json_normalize(
    data=data,
    meta=["reporting_entity_name", "reporting_entity_type", "plan_name", "plan_id_type",
          "plan_id", "plan_market_type", "last_updated_on", "version"],
    record_path=["in_network", "negotiated_rates", "negotiated_prices"]
)

dfs = [df_main, df_provider, df_prices]
final_df = reduce(lambda left, right: pd.merge(
    left,
    right,
    on=["reporting_entity_name", "reporting_entity_type", "plan_name", "plan_id_type",
        "plan_id", "plan_market_type", "last_updated_on", "version"]
), dfs).explode("npi")

print(final_df)
engine = create_engine('mysql+pymysql://root:@localhost/json')

df = pd.read_json(final_df)

df.to_sql("test_file_01", con=engine, if_exists='replace', index=False)

下面是JSON表的外观

negotiation_arrangement name    billing_code_type   billing_code_type_version   billing_code    description reporting_entity_name   reporting_entity_type   plan_name   plan_id_type    plan_id plan_market_type    last_updated_on version npi tin.type    tin.value   negotiated_type negotiated_rate expiration_date billing_class
0   ffs Knee Replacement    CPT 2020    27447   Arthroplasty, knee condyle and plateau, medial and lateral compartments medicare    medicare    medicaid    hios    1111111111  individual  2020-08-27  1.0.0   0   ein 11-1111111  negotiated  123.45  2022-01-01  institutional

任何人都可以请提供的代码块,我可以用它来上传JSON数据到SQL。

cx6n0qe3

cx6n0qe31#

您的问题似乎与此行有关:

df = pd.read_json(final_df)

read_json方法从一个源获取JSON数据。它可以是一个包含路径的字符串、一个数据缓冲区或一个打开的文件句柄。在我看来,final_df不是这些,而是一个dataframe。我希望这就是为什么您会看到错误的原因。

相关问题