excel JSON的问题:使用主对象中嵌入的多个JSON对象将JSON转换为CSV

hm2xizp9  于 2023-11-20  发布在  其他
关注(0)|答案(2)|浏览(103)

我在将JSON对象转换为CSV时遇到了一些格式问题
我的json文件是一个json对象的列表,它们都共享相同的主键名称,除了Dict 2,有些可能没有,Dict 1可能是也可能不是字典类型,也可能包括其他字典。输出将是所有这些对象的csv文件,尽管它们的参数
Json对象是:

[{
        "Name": "NameOfObject1",
        "Number": "objectNumber1",
        "Dict1": {
            "Key1": "true",
            "Key2": "false",
            "Key3": {
                     "key1InK3": "Hello",
                     "key2InK3": "There",
}
        },
        "BooleanValue": false,
    },
{
        "Name": "NameOfObject2",
        "Number": "objectNumber2",
        "Dict1": {
            "Key1": "true",
            "Key2": "false",
            "Key3": "true"
        },
        "BooleanValue": false,
        "Dict2":{
            "otherKey1" : "this ",
            "otherKey2" : "is ",
            "otherKey3" : "a ",
            "otherKey4" : "test ",
            "otherKey5" : "haha ",
            "otherKey6" : "i ",
            "otherKey7" : "like ",
            "otherKey8" : "strawberry "
        }
    },
{
        "Name": "NameOfObject3",
        "Number": "objectNumber",
        "Dict1": "this is a string",
        "BooleanValue": false
    }
]

字符串
我的输出看起来像这样:

Name           Number        Dic1             Dic1 Values Dic1 Values Of dictionaries Inside Boolean Value Dic2          Dic2 Values   
         
NameOfObject1 objectNumber1                                                                 FALSE                                     
NameOfObject1 objectNumber1 Key1             TRUE                                           FALSE                                     
NameOfObject1 objectNumber1 Key2                                                            FALSE                                     
NameOfObject1 objectNumber1 Key2             key1InK2                                       FALSE                                     
NameOfObject1 objectNumber1 Key2             key1InK2    Hello                              FALSE                                     
NameOfObject1 objectNumber1 Key2             key2InK2                                       FALSE                                     
NameOfObject1 objectNumber1 Key2             key2InK2    There                              FALSE                                     
NameOfObject2 objectNumber2                              TRUE                                                                         
NameOfObject1 objectNumber1 Key3                                                            FALSE                                     
NameOfObject1 objectNumber1 Key3             key1InK3                                       FALSE                                     
NameOfObject1 objectNumber1 Key3             key1InK3    Good                               FALSE                                     
NameOfObject1 objectNumber1 Key3             key2InK3                                       FALSE                                     
NameOfObject1 objectNumber1 Key3             key2InK3    Morining                           FALSE                                     
NameOfObject2 objectNumber2 Key1             TRUE                                           TRUE                                      
NameOfObject2 objectNumber2 Key2             FALSE                                          TRUE                                      
NameOfObject2 objectNumber2 Key3             TRUE                                           TRUE                                      
NameOfObject2 objectNumber2                                                                 TRUE          otherKey1     This          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey2     is            
NameOfObject2 objectNumber2                                                                 TRUE          otherKey3     a             
NameOfObject2 objectNumber2                                                                 TRUE          otherKey4     test          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey5     haha          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey6     i             
NameOfObject2 objectNumber2                                                                 TRUE          otherKey7     like          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey8     strawberry    
NameOfObject3 objectNumber3 This is a String                                                FALSE


我希望它看起来像这样:注意最后2列是如何上移的,对象2的最后3行是如何被删除的。还注意对象2中Dic 1值为空的所有行也是如何被删除的

Name           Number        Dic1             Dic1 Values Dic1 Values Of dictionaries Inside Boolean Value Dic2          Dic2 Values   

NameOfObject1 objectNumber1                                                                 FALSE                                     
NameOfObject1 objectNumber1 Key1             TRUE                                           FALSE                                     
NameOfObject1 objectNumber1 Key2             key1InK2    Hello                              FALSE                                     
NameOfObject1 objectNumber1 Key2             key2InK2    There                              FALSE                                     
NameOfObject2 objectNumber2                              TRUE                                                                         
NameOfObject1 objectNumber1 Key3             key1InK3    Good                               FALSE                                     
NameOfObject1 objectNumber1 Key3             key2InK3    Morining                           FALSE                                     
NameOfObject2 objectNumber2                  TRUE                                           TRUE                                      
NameOfObject2 objectNumber2 Key1             TRUE                                           TRUE          otherKey1     this          
NameOfObject2 objectNumber2 Key2             FALSE                                          TRUE          otherKey2     is            
NameOfObject2 objectNumber2 Key3             TRUE                                           TRUE          otherKey3     a             
NameOfObject2 objectNumber2                                                                 TRUE          otherKey4     test          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey5     haha          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey6     i             
NameOfObject2 objectNumber2                                                                 TRUE          otherKey7     like          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey8     strawberry    
NameOfObject2 objectNumber2                                                                 TRUE          otherKey      i             
NameOfObject2 objectNumber2                                                                 TRUE          otherKey7     like          
NameOfObject2 objectNumber2                                                                 TRUE          otherKey8     strawberry    
NameOfObject3 objectNumber3 This is a String                                                FALSE


我不知道如何做到这一点,我不明白背后的逻辑。我不想重写整个代码。代码太长,无法在这里张贴,但这里是它背后的逻辑

ListOfLines = []
Loop Through Dictionary keys file
    if key is dictionary:
          append a new line and recursively call the function
     else: (key is not dictionary)
          add the keys and values to a line
Formats all the lines and cleans up the list then writes to the output file


你可以看到if语句,新创建了一行,所以字典输出总是在下一行。当只有一个字典时,这很好用,但是当有两个字典时,我得到了上面显示的输出,这不坏,但也不好。我不想改变代码,除非绝对必要。
我的思路是简单地使用python将单元格组向上移动几行,但我不知道如何做到这一点。即使我知道,我也不能简单地将其向上移动3行,因为其他文件在Dict 1中可能有5-7个键。
总之,我的问题是:我如何使用Python移动一组单元格,基于这种逻辑,我想不出一种方法来放入代码中?对此有任何帮助吗?提前感谢

z2acfund

z2acfund1#

你可以用pd.json_normalize(..., max_level=0)来试试:

import pandas as pd
import json

data = json.loads(jdata)  # jdata is your JSON as string
df = pd.json_normalize(data, max_level=0)

# Extract dict columns
dcols = [k for k, v in data.items() if isinstance(v, dict)]

# Explode dict as DataFrame
dfs = []
for col in dcols:
    dd = (pd.json_normalize(df.pop(col)).squeeze()
            .rename_axis(col).rename(f'{col} Values')
            .reset_index())
    dfs.append(dd)

# Create the final dataframe
out = df.merge(pd.concat(dfs, axis=1), how='cross')

字符串
输出量:

>>> out
           Name        Number  BooleanValue Dict1 Dict1 Values      Dict2 Dict2 Values
0  NameOfObject  objectNumber         False  Key1         true  otherKey1        this 
1  NameOfObject  objectNumber         False  Key2        false  otherKey2          is 
2  NameOfObject  objectNumber         False  Key3         true  otherKey3           a 
3  NameOfObject  objectNumber         False   NaN          NaN  otherKey4        test 
4  NameOfObject  objectNumber         False   NaN          NaN  otherKey5        haha 
5  NameOfObject  objectNumber         False   NaN          NaN  otherKey6           i 
6  NameOfObject  objectNumber         False   NaN          NaN  otherKey7        like 
7  NameOfObject  objectNumber         False   NaN          NaN  otherKey8  strawberry 

>>> df
           Name        Number                                              Dict1  BooleanValue                                              Dict2
0  NameOfObject  objectNumber  {'Key1': 'true', 'Key2': 'false', 'Key3': 'true'}         False  {'otherKey1': 'this ', 'otherKey2': 'is ', 'ot...


输入:

jdata = """
{
    "Name": "NameOfObject",
    "Number": "objectNumber",
    "Dict1": {
        "Key1": "true",
        "Key2": "false",
        "Key3": "true"
    },
    "BooleanValue": false,
    "Dict2":{
        "otherKey1" : "this ",
         "otherKey2" : "is ",
         "otherKey3" : "a ",
         "otherKey4" : "test ",
         "otherKey5" : "haha ",
         "otherKey6" : "i ",
         "otherKey7" : "like ",
         "otherKey8" : "strawberry "
    }
}
"""

zte4gxcn

zte4gxcn2#

我建立在Corralien的答案上,我让它按照我想要的方式工作

import pandas as pd
import json
with open('file.json','r') as f:
    fileStuff = json.load(f)
for obj in fileStuff:
   jdata=json.dump(obj)
   data = json.loads(jdata)  # jdata is your JSON as string
   df = pd.json_normalize(data, max_level=0)

   # Extract dict columns
   dcols = [k for k, v in data.items() if isinstance(v, dict)]

   # Explode dict as DataFrame
   dfs = []
   for col in dcols:
       dd = (pd.json_normalize(df.pop(col)).squeeze()
            .rename_axis(col).rename(f'{col} Values')
            .reset_index())
       dfs.append(dd)

   # Create the final dataframe
   if len(dcols)>0: #concat throws an exception when it a basic json object
      df = df.merge(pd.concat(dfs, axis=1), how='cross')
   df.to_csv('out.csv', 'a',index=False)

字符串
之后你会得到一个包含所有json对象的csv文件,这个文件需要一些清理,但它实际上只是处理基本的pandas框架,这取决于你的情况

相关问题