How do I read specified Excel worksheets into separate data frames?

nfs0ujit  posted on 2021-09-29  in Java

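I'm new to Python and need some help. I have a function that takes a DataFrame and returns an ordered dictionary used to create a YAML file, and another function that takes that ordered dictionary and performs the YAML dump. This works for a single dictionary, but I want to add another function that can write multiple input dictionaries to a single YAML file.
Take these two DataFrames as an example, each converted to an ordered dictionary: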

df1:
      Name        Occupation  Years  JobSatisfaction
0     Lucy Black  Teacher     10     Yes
1     John Doe    Gardener    3      Yes

df2:
      Name        Age  Salary  CompanyName
0     Lucy Black  31   $38000  LAUSD
1     John Doe    23   $17000  Beautiful Lawn

Now, how can I read both ordered dictionaries at once and produce YAML output like this:


# In this case, key_index='Name' for both df1 and df2:

-df1:
   Lucy Black:
      -Occupation:Teacher
      -Years:10
      -JobSatisfaction: Yes
   John Doe:
      -Occupation:Gardener
      -Years:3
      -JobSatisfaction: Yes
-df2:
   Lucy Black:
      -Age: 31
      -Salary: $38000
      -CompanyName: LAUSD
   John Doe:
      -Age: 23
      -Salary: $17000
      -CompanyName: Beautiful Lawn

Here is some skeleton code in the format I would like:

def yaml_output(arguments):
    """
    considerations/notes/checks:
        - we want the output yaml to have one outermost dictionary per input
          excel sheet (so in our case, this will be two total: df1, df2)
        - confirm the first two entries for each dictionary match the example
          yaml file contained in this directory
    """
    # function body goes here (placeholder)
    return
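
The skeleton's docstring refers to one outermost dictionary per input Excel sheet. A minimal sketch of reading chosen sheets into separate data frames might look like the following. It assumes pandas is installed together with an Excel engine such as openpyxl; the helper name `read_sheets` and any workbook path passed to it are hypothetical and not part of the original code.

```python
import pandas as pd


def read_sheets(path, sheet_names):
    """Return {sheet_name: DataFrame} for the requested sheets.

    `read_sheets` is a hypothetical helper, not part of the original code.
    """
    # Passing a list to sheet_name makes pandas return a dict of DataFrames
    # keyed by sheet name.
    frames = pd.read_excel(path, sheet_name=list(sheet_names))
    # Re-key in the caller's order so later output is predictable.
    return {name: frames[name] for name in sheet_names}
```

Each value in the returned dictionary is an ordinary DataFrame, so it could be passed to a function such as create_dict with the appropriate key_index.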

So far, my code looks like this:

from collections import OrderedDict

import yaml


def create_dict(df, key_index='uniqueID'):
    """
    Function create_dict to take input pandas df and return dictionary which will 
    then be used to create final YAML file

    args:
        df: pandas 2-dim labeled data structure
        key_index: the column that we want set as the index

    input: pandas df
    returns: dictionary

    """
    # Get the unordered dictionary
    unordered_dict = df.set_index(key_index).T.to_dict()

    # Order the dictionary
    ordered_dict = OrderedDict((k,unordered_dict.get(k)) for k in df[key_index])

    return ordered_dict

def dump_ordered(dictionary):
    """
    Serialize the ordered dictionary into a YAML stream 

    args:
        dictionary: ordered collection of data values that were 
        converted from pandas dataframes

    input: Ordered dictionary
    return: ordered yaml 
    """
    yaml.add_representer(
        OrderedDict,
        lambda dumper, data: dumper.represent_mapping(
            'tag:yaml.org,2002:map', data.items()))

    return yaml.dump(dictionary)

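Ultimately, I would like to pass the new function a variable number of dictionaries along with a name for each (perhaps as a list of tuples?), so that all of them end up in a single output YAML file. Let me know if you need further clarification. Thank you very much.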
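
One possible shape for such a function, offered only as a sketch: it accepts a list of (name, dictionary) tuples and nests each dictionary under its name before doing a single dump. The name `dump_many` is hypothetical, and the `sort_keys` argument assumes PyYAML 5.1 or later.

```python
from collections import OrderedDict

import yaml


def _represent_ordered(dumper, data):
    # Emit an OrderedDict as a plain YAML mapping, keeping insertion order.
    return dumper.represent_mapping("tag:yaml.org,2002:map", data.items())


yaml.add_representer(OrderedDict, _represent_ordered)


def dump_many(named_dicts):
    """Combine (name, dictionary) pairs into one YAML document.

    `dump_many` is a hypothetical name; `named_dicts` is a list of tuples
    such as [("df1", dict1), ("df2", dict2)].
    """
    combined = OrderedDict()
    for name, dictionary in named_dicts:
        if name in combined:
            raise ValueError(f"duplicate name: {name}")
        combined[name] = dictionary
    # sort_keys=False keeps plain inner dicts in insertion order
    # (assumes PyYAML >= 5.1).
    return yaml.dump(combined, default_flow_style=False, sort_keys=False)
```

The returned string can be written to a file in the usual way. Note that this produces nested mappings (df1, then each name, then each field) rather than the dash-prefixed entries shown in the example output, so the exact layout may need adjusting.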
