pandas: calculating an index for each project

cnwbcb6i  posted on 2023-05-21  in Other

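I am working on a Python project for my portfolio that assigns projects to students. For this, I import data from the file "student_Record.csv". The file contains each student's first name, last name, score, and 3 project preferences. Before assigning projects to students, I want to assign an index to each project. The index is calculated as Index(P) = 50 * (number of students with P as their first preference) + 25 * (number of students with P as their second preference) + 1 * (number of students with P as their third preference)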
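As a quick sanity check of the formula, here is a minimal sketch. The helper name `project_index` and the counts passed to it are made up for illustration and do not come from the data file.

```python
def project_index(first: int, second: int, third: int) -> int:
    """Index(P) = 50 * first + 25 * second + 1 * third, as defined above."""
    return 50 * first + 25 * second + 1 * third


# Hypothetical project: 3 students rank it first, 2 second, 4 third.
example = project_index(3, 2, 4)
print(example)
```

With these made-up counts the three terms are 150, 50 and 4.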
The data in the file looks like the linked Sample data.
This is what I have done so far:

import pandas as pd
df = pd.read_csv('sample_data.csv')

project_preferences = {}
for index, row in df.iterrows():
    # Get the student's preferences
    preferences = [row['Pref_1'], row['Pref_2'], row['Pref_3']]
    
    # Assign preference weights to projects
    for preference in preferences:
        project = int(preference)
        
        # Update the project's preference count
        if project not in project_preferences:
            project_preferences[project] = [0, 0, 0]
        project_preferences[project][preferences.index(preference)] += 1

# Calculate Wayne index for each project
wayne_index = {}
for project, preferences in project_preferences.items():
    wayne_index[project] = 50 * preferences[0] + 25 * preferences[1] + preferences[2]

# Print the Wayne index for each project
for project, index in wayne_index.items():
    print(f"Wayne index for Project {project}: {index}")

Am I doing this correctly?

yr9zkbsy1#

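Here is my take on the problem. Key tip: avoid iterating over the rows of a DataFrame whenever possible.
1. Collect all projects by taking the union of the sets of values in the three preference columns. Note that this will not tell you which projects nobody chose, so it would be better to do this from a projects table/df.
2. For each project, count how often it appears in each column (this uses a try-except because looking up a project that never appears in a column's value_counts raises a KeyError).
3. Calculate the value and update the dictionary.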

import pandas as pd

df = pd.read_csv('sample_data.csv')

# Column names follow the question ('Pref_1', 'Pref_2', 'Pref_3').
# Every project that appears in at least one preference column
projects = set(df['Pref_1']).union(df['Pref_2']).union(df['Pref_3'])

wayne_index = {}
for project in projects:
    # value_counts()[project] raises a KeyError if the project never appears in that column
    try:
        first_pref_count = df['Pref_1'].value_counts()[project]
    except KeyError:
        first_pref_count = 0
    try:
        second_pref_count = df['Pref_2'].value_counts()[project]
    except KeyError:
        second_pref_count = 0
    try:
        third_pref_count = df['Pref_3'].value_counts()[project]
    except KeyError:
        third_pref_count = 0
    wayne_index[project] = 50 * first_pref_count + 25 * second_pref_count + third_pref_count

print("wayne index by project (project:index): ", wayne_index)
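The tip about avoiding row iteration can be taken one step further by dropping the per-project loop as well. The following is only a sketch under assumptions: the column names `Pref_1`, `Pref_2`, `Pref_3` are taken from the question, and the small DataFrame is made-up sample data standing in for the CSV file, whose real contents are not shown here.

```python
import pandas as pd

# Hypothetical sample data; replace with pd.read_csv(...) for the real file.
df = pd.DataFrame({
    "Pref_1": [1, 2, 1, 3],
    "Pref_2": [2, 1, 3, 1],
    "Pref_3": [3, 3, 2, 4],
})

# Weights from the question: 50 for first, 25 for second, 1 for third preference.
weights = {"Pref_1": 50, "Pref_2": 25, "Pref_3": 1}

# Count each project per column; a project missing from a column gets a count of 0.
counts = df[list(weights)].apply(pd.Series.value_counts).fillna(0).astype(int)

# Multiply each column of counts by its weight, then sum across the columns.
wayne_index = (counts * pd.Series(weights)).sum(axis=1)
print(wayne_index.to_dict())
```

As with the set union above, only projects that appear in at least one preference column show up in the result. If a full list of projects is available, `counts.reindex(all_projects, fill_value=0)` (where `all_projects` is a hypothetical list) would add the unchosen ones with an index of 0.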

qco9c6ql2#

This solved my problem:

import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('student_record.csv', skiprows=1)

# Initialize the wayne_index dictionary
wayne_index = {}

# Calculate the wayne_index for each project number
for number in range(1, 101):
    count_Pref_1 = df['Pref_1'].value_counts().get(number, 0)
    count_Pref_2 = df['Pref_2'].value_counts().get(number, 0)
    count_Pref_3 = df['Pref_3'].value_counts().get(number, 0)
    # Store the result under the project number instead of overwriting the dictionary
    wayne_index[number] = (50 * count_Pref_1) + (25 * count_Pref_2) + count_Pref_3
    print(wayne_index[number])
