Converting nested JSON to CSV without hard-coding column values

Posted by fgw7neuy on 2022-12-06 in Other

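I'm new to Python and have some data files that I want to convert from JSON to CSV. The problem is that my code raises an error I can't resolve, and the data varies from file to file. I'd like a single script that can be applied to multiple files just by changing the file location. I don't want to hard-code the company name and company type, but I don't know how to avoid it. The data is structured as follows: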

{
    "company_name": "Google",
    "company_type": "Public",
    "employees": [{
        "staff": [{
            "name": "John Doe",
            "type": "FTE",
            "id": "1111111111",
            "region": "Northeast"
        }, {
            "name": "Jane Doe",
            "type": "FTE",
            "id": "222222222",
            "region": "Northwest"
        }],
        "setup": [{
            "description": "Onsite",
            "location": "New York City"
        }, {
            "description": "Hybrid",
            "location": "Seattle"
        }],
        "role": [{
            "description": "Business Analyst",
            "salary": "70000"
        }, {
            "description": "Manager",
            "salary": "90000"
        }]
    }, {
        "contractors": [{
            "name": "Jessica Smith",
            "type": "PTE",
            "id": "333333333",
            "region": "Southeast"
        }],
        "setup": [{
            "description": "Remote",
            "location": "Miami"
        }],
        "role": [{
            "description": "Project Manager",
            "salary": "80000"
        }]
    }]
}

The code I have so far is:

import json
import csv
import ijson

file = open("C:/Users/User1/sample_file.json","w")
file_writer = csv.writer(file)
file_writer.writerow(("Company Name","Company Type","Name","Type","ID","Region","Description","Location","Description","Salary"))

with open("C:/Users/User1/sample_file.json","rb") as f:

  company_name = "Google"
  company_type = "Public"
  for record in ijson.items(f,"employees.item"):
    name = record['staff'][0]['name']
    type = record['staff'][0]['type']
    id = record['staff'][0]['id']
    region = record['staff'][0]['region']
    description = record['setup'][0]['description']
    location = record['setup'][0]['location']
    description = record['role'][0]['description']
    salary = record['role'][0]['salary']
    file_writer.writerow((comapny_name, company_type, name, type, id, region, description, location, description, salary))

file.close()

Any help is much appreciated.

de90aj5v #1

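Assuming all of the files share the same general structure, csv.DictWriter should do the job. Iterate over the employees section, build one dictionary per employee, and call writer.writerow() once all of that employee's data has been collected.
For example: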

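import csv
import json

# Path to the input JSON file (the one from the question); change it per file.
filename = "C:/Users/User1/sample_file.json"

# Load the whole document; the with-block closes the file afterwards.
with open(filename) as jsonfile:
    data = json.load(jsonfile)

columns = ["company name","company type","name","type","id","region","description","location","salary"]

def convert(data, headers):
    # newline="" keeps the csv module from writing blank lines between rows on Windows.
    with open("employees.csv", "w", newline="") as csvfile:
        # extrasaction="ignore" skips keys that are not in headers;
        # restval=None leaves missing columns empty.
        writer = csv.DictWriter(csvfile, fieldnames=headers, extrasaction="ignore", restval=None)
        writer.writeheader()
        for emp_type in data["employees"]:
            lst = []  # one dict per person in this group
            for v in emp_type.values():
                # The i-th entry of every section (staff, setup, role) belongs to the same person.
                for i, x in enumerate(v):
                    if len(lst) <= i:
                        # Company fields are read from the data rather than hard-coded.
                        lst.append({"company name": data["company_name"],
                                "company type": data["company_type"]})
                    lst[i].update(x)
            for item in lst:
                writer.writerow(item)

convert(data, columns)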

Output

company name,company type,name,type,id,region,description,location,salary
Google,Public,John Doe,FTE,1111111111,Northeast,Business Analyst,New York City,70000
Google,Public,Jane Doe,FTE,222222222,Northwest,Manager,Seattle,90000
Google,Public,Jessica Smith,PTE,333333333,Southeast,Project Manager,Miami,80000
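
Note that in the output above the description column holds the value from role, because setup and role both contain a description key and the later one overwrites the earlier one in update(). If both values are needed, and the column list should not be hard-coded either, something along these lines might work. This is only a sketch under a few assumptions: the function names flatten_employees, collect_headers and json_to_csv are made up for illustration, every file is assumed to have a top-level "employees" list shaped like the sample, and a key that collides with an existing one is simply renamed to section_key.

```python
import csv
import json


def flatten_employees(data):
    """Return one dict per person, renaming keys that would otherwise collide."""
    # Copy every top-level scalar (company_name, company_type, ...) into each row,
    # so none of these values has to be hard-coded.
    shared = {k: v for k, v in data.items() if not isinstance(v, (list, dict))}
    rows = []
    for group in data["employees"]:
        group_rows = []
        for section, entries in group.items():
            # Assumption: the i-th entry of each section describes the same person.
            for i, entry in enumerate(entries):
                if len(group_rows) <= i:
                    group_rows.append(dict(shared))
                row = group_rows[i]
                for key, value in entry.items():
                    # On a name clash (e.g. "description"), prefix with the section name.
                    column = key if key not in row else f"{section}_{key}"
                    row[column] = value
        rows.extend(group_rows)
    return rows


def collect_headers(rows):
    """Collect column names in order of first appearance across all rows."""
    headers = []
    for row in rows:
        for key in row:
            if key not in headers:
                headers.append(key)
    return headers


def json_to_csv(json_path, csv_path):
    """Convert one JSON file to one CSV file; only the two paths change per file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    rows = flatten_employees(data)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=collect_headers(rows), restval="")
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Example call (paths are placeholders):
# json_to_csv("C:/Users/User1/sample_file.json", "employees.csv")
```

With the sample data from the question, this should give the columns company_name, company_type, name, type, id, region, description, location, role_description and salary, where description comes from setup and role_description from role. To process several files, call json_to_csv once per input path.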
