Creating a dictionary from a CSV file in Python, where a specific column is the key and the rows are the values

Posted by cbwuti44 on 2023-04-04 in Python

I have a CSV file formatted as follows:

jersey number,   position,   name,   birth_date,    birth_city,   years_in_nba,   team
23           ,     SF    ,   Lebron,  12/30/84 ,     Akron    ,      19      ,   Lakers
30           ,     PG    ,   Curry,   03/14/88 ,     Akron    ,      13      ,   Warriors
34           ,     PF    ,   Giannis, 08/26/89 ,     Athens    ,      8      ,   Bucks

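My goal is to group the rows by position, so that each distinct position is a key and its value is a list of the player records that share that position, as shown below: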

{
"SF": [{..player1..}, {..player2..}],
"PF": [{..player1..}, {..player2..}],
"SG": [{..player1..}, {..player2..}],
"PG": [{..player1..}, {..player2..}],
}

Here is what I have so far:

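from csv import DictReader

positions = {}

def players_position():
    # filename is assumed to be defined elsewhere as the path to the CSV file
    with open(filename, 'r') as file_obj:
        dict_reader = DictReader(file_obj, delimiter=",")
        for row in dict_reader:
            # each assignment replaces whatever was stored for this position
            positions[row["position"]] = row
    return positions

print(players_position())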

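With the code as it stands, the current value always overwrites the previous one. I don't want that. I want a list that I can keep appending player records to, as shown above.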

Answer 1 (ct2axkht)

Try this:

from csv import DictReader

positions = {}

def players_position():
    with open(filename, 'r') as file_obj:
        dict_reader = DictReader(file_obj, delimiter=",")
        for row in dict_reader:
            position = row["position"]
            if position in positions:
                positions[position].append(row)
            else:
                positions[position] = [row]
    return positions

print(players_position())

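The function checks whether position is already in positions. If it is, we simply append the current row to the list of players already stored for that position. If it is not, we create a new list containing the current row and assign it to positions[position].
Note that this is based on your code.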
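
The check-then-append pattern described above can also be written with collections.defaultdict, which creates the empty list on first access. The following is only a sketch: the group_by_position name, the unpadded SAMPLE data and the extra "Example" row are made up for illustration and are not part of the original file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, unpadded sample data standing in for the real file.
# The last row is invented so that two players share a position.
SAMPLE = """jersey number,position,name,birth_date,birth_city,years_in_nba,team
23,SF,Lebron,12/30/84,Akron,19,Lakers
30,PG,Curry,03/14/88,Akron,13,Warriors
34,PF,Giannis,08/26/89,Athens,8,Bucks
7,SF,Example,01/01/90,Nowhere,5,Lakers
"""

def group_by_position(file_obj):
    """Group CSV rows into {position: [row, ...]}."""
    positions = defaultdict(list)  # a missing key starts out as an empty list
    for row in csv.DictReader(file_obj):
        positions[row["position"]].append(row)
    return dict(positions)  # convert back to a plain dict for the caller

grouped = group_by_position(io.StringIO(SAMPLE))
print(sorted(grouped))
print([player["name"] for player in grouped["SF"]])
```

Passing an open file object instead of io.StringIO should work the same way, provided the header in the file is exactly "position" with no surrounding spaces.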

Answer 2 (nlejzf6q)

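Here you go. I initially tried to build on what you had, but because splitting the raw CSV data leaves a lot of whitespace padding around the fields, I did some extra work to clean up the keys and values.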

import csv

CSV = """
# jersey number,   position,   name,   birth_date,    birth_city,   years_in_nba,   team
# 23           ,     SF    ,   Lebron,  12/30/84 ,     Akron    ,      19      ,   Lakers
# 30           ,     PG    ,   Curry,   03/14/88 ,     Akron    ,      13      ,   Warriors
# 34           ,     PF    ,   Giannis, 08/26/89 ,     Athens    ,      8      ,   Bucks
"""

def main():
    players = [player for player in csv.DictReader(CSV.splitlines())]
    # gives me:

    """
    {None: ['# jersey number', '   position', '   name', '   birth_date', '    birth_city', '   years_in_nba', '   team']}
    {None: ['# 23           ', '     SF    ', '   Lebron', '  12/30/84 ', '     Akron    ', '      19      ', '   Lakers']}
    {None: ['# 30           ', '     PG    ', '   Curry', '   03/14/88 ', '     Akron    ', '      13      ', '   Warriors']}
    {None: ['# 34           ', '     PF    ', '   Giannis', ' 08/26/89 ', '     Athens    ', '      8      ', '   Bucks']}
    """

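    # CSV starts with a blank line, so DictReader finds no field names and stores each row's cells as a list under the None key;
    # the first such row is the header, so pop it and strip the extra whitespace from each column name: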
    keys = [k.strip() for k in [v for v in players.pop(0).values()][0]]

    # then let's gather the players and strip the values so they don't have unnecessary spaces
    players = [[p.strip() for p in list(p.values())[0]] for p in [v for v in players]]

    # and combine (zip) the keys and values to make a dictionary
    players_list = [dict(zip(keys, p)) for p in players]

    # this gives me:
    """
    {'# jersey number': '# 23', 'position': 'SF', 'name': 'Lebron', 'birth_date': '12/30/84', 'birth_city': 'Akron', 'years_in_nba': '19', 'team': 'Lakers'}
    {'# jersey number': '# 30', 'position': 'PG', 'name': 'Curry', 'birth_date': '03/14/88', 'birth_city': 'Akron', 'years_in_nba': '13', 'team': 'Warriors'}
    {'# jersey number': '# 34', 'position': 'PF', 'name': 'Giannis', 'birth_date': '08/26/89', 'birth_city': 'Athens', 'years_in_nba': '8', 'team': 'Bucks'}
    """

    # last, create a dictionary of lists grouping by position using a dict comprehension
    # you could do this in a for loop, but we're writing python, after all...
    players_by_position = {
        position: [
        player for player in players_list if player['position'] == position
      ] for position in set([player['position'] for player in players_list])
    }
    
    return players_by_position

    """
    {
      'SF': [
        {
          '# jersey number': '# 23', 
          'position': 'SF', 
          'name': 'Lebron', 
          'birth_date': '12/30/84', 
          'birth_city': 'Akron', 
          'years_in_nba': '19', 
          'team': 'Lakers'
        }
      ], 
      'PG': [
        {
          '# jersey number': '# 30', 
          'position': 'PG', 
          'name': 'Curry', 
          'birth_date': '03/14/88', 
          'birth_city': 'Akron', 
          'years_in_nba': '13', 
          'team': 'Warriors'
        }
      ], 
      'PF': [
        {
          '# jersey number': '# 34', 
          'position': 'PF', 
          'name': 'Giannis', 
          'birth_date': '08/26/89', 
          'birth_city': 'Athens', 
          'years_in_nba': '8', 
          'team': 'Bucks'
        }
      ]
    }
    """

if __name__ == "__main__":
    print(main())
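
For comparison, the stripping and the grouping above could be done in a single pass with csv.reader. This is a sketch rather than a replacement for the answer: the players_by_position helper and the PADDED constant are names chosen here for illustration, and the data is the question's sample without the leading "# " markers.

```python
import csv
import io

# Sample rows copied from the question, padding included.
PADDED = """jersey number,   position,   name,   birth_date,    birth_city,   years_in_nba,   team
23           ,     SF    ,   Lebron,  12/30/84 ,     Akron    ,      19      ,   Lakers
30           ,     PG    ,   Curry,   03/14/88 ,     Akron    ,      13      ,   Warriors
34           ,     PF    ,   Giannis, 08/26/89 ,     Athens    ,      8      ,   Bucks
"""

def players_by_position(file_obj):
    """Strip padding from every cell and group the rows by position."""
    reader = csv.reader(file_obj)
    header = [name.strip() for name in next(reader)]  # first row holds column names
    grouped = {}
    for raw in reader:
        if not raw:  # csv.reader yields [] for blank lines
            continue
        player = dict(zip(header, (value.strip() for value in raw)))
        # setdefault returns the existing list or stores and returns a new one
        grouped.setdefault(player["position"], []).append(player)
    return grouped

result = players_by_position(io.StringIO(PADDED))
print(result["PG"][0]["name"])
```

The values stay as strings here; converting fields such as years_in_nba to int would be a separate step.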
