Convert negative values to 1 and subtract the difference from the positive values of the same variable in pandas

w46czmvw posted on 2023-11-15 in Other

I have a DataFrame:

df_in = pd.DataFrame([["A",-2],["B",23],["A",-4],["A",14],["B",12],["A",34],["B",-4],["C",-1],["A",-5],["B",21],["C",4],["B",-6]], columns=['var', 'val'])

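I want to convert all negative values to 1. For each negative value, take its difference from 1 and sum those differences per var. Divide that sum by the number of positive values of the same var, and subtract the result from each of that var's positive values. For example, var A has 3 negative values and 2 positive values. Converting the negatives to 1, the difference between -2 and 1 is 3, between -4 and 1 is 5, and between -5 and 1 is 6, so the sum is 3 + 5 + 6 = 14. A has 2 positive values, so 14 divided by 2 is 7. Now subtract 7 from A's positive values (14, 34). Repeat the same for the other variables, or do it with a groupby.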
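The arithmetic for var A can be checked by hand with a few lines of plain Python. This is only a sketch of the worked example above; the names `a_vals`, `gaps`, `per_positive` and `adjusted` are made up for illustration, and the values are copied from the A rows of `df_in`.

```python
# Values of var A, copied from df_in in the question (in row order).
a_vals = [-2, -4, 14, 34, -5]

negatives = [v for v in a_vals if v < 0]
positives = [v for v in a_vals if v > 0]

# Distance of each negative value from 1.
gaps = [1 - v for v in negatives]          # [3, 5, 6]
total_gap = sum(gaps)                      # 14
per_positive = total_gap / len(positives)  # 14 / 2 = 7.0

# Negatives become 1, positives lose their share of the total gap.
adjusted = [1 if v < 0 else v - per_positive for v in a_vals]
print(adjusted)  # [1, 1, 7.0, 27.0, 1]
```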
Expected output:

df_out = pd.DataFrame([["A",1],["B",19],["A",1],["A",7],["B",8],["A",27],["B",1],["C",1],["A",1],["B",17],["C",2],["B",1]], columns=['var', 'val'])
var val
A   1
B   19
A   1
A   7
B   8
A   27
B   1
C   1
A   1
B   17
C   2
B   1

How can I do this?

Answer 1 (nkhmeac6)

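import pandas as pd

def modify(avg, num):
    # negative values become 1; positive values are reduced by the group average
    if num < 0:
        return 1
    if num > 0:
        return num - avg
    # the question does not say what to do with 0, so it is left unchanged here
    return num

# collect the adjusted groups, then concatenate once at the end
parts = []

for _, group in df_in.groupby("var"):
    # work on a copy so the assignment below does not touch a slice of df_in
    cur_df = group.copy()
    negatives = cur_df.loc[cur_df["val"] < 0, "val"]
    # calculate -ve count
    neg_count = negatives.count()
    # calculate -ve sum
    neg_sum = negatives.sum()
    # calculate +ve count
    pos_count = (cur_df["val"] > 0).sum()
    # calculate avg; a group without positive values has nothing to reduce
    avg = (neg_count - neg_sum) / pos_count if pos_count else 0
    cur_df["val"] = cur_df["val"].apply(lambda x: modify(avg, x))
    parts.append(cur_df)

# restore the original row order
df_new = pd.concat(parts).sort_index()
print(df_new)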
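A short note on why `neg_count - neg_sum` gives the total described in the question: for the n negative values x_1, ..., x_n of one var, summing their distances from 1 is the same as the count minus their sum. The symbols below are only illustrative; `neg_count`, `neg_sum` and `pos_count` are the variables from the code above.

```latex
\sum_{i=1}^{n} (1 - x_i) = n - \sum_{i=1}^{n} x_i
\quad\Longrightarrow\quad
\text{avg} = \frac{\texttt{neg\_count} - \texttt{neg\_sum}}{\texttt{pos\_count}}
```

For var A this is (3 - (-11)) / 2 = 14 / 2 = 7, matching the worked example in the question.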


Answer 2 (kt06eoxx)

Use:

import numpy as np
import pandas as pd

# True for positive values
m = df_in['val'].gt(0)
# for the non-positive rows, compute 1 - val and sum per var
neg = df_in.loc[~m, 'val'].rsub(1).groupby(df_in['var']).sum()
# count the positive values per var
pos = df_in.loc[m, 'var'].value_counts()
# amount to subtract from each positive value of a var
diff = neg.div(pos)
# subtract the mapped amount from positive values, otherwise set 1;
# fillna(0) leaves positives unchanged in a var that has no negative values
df_in['val'] = np.where(m, df_in['val'].sub(df_in['var'].map(diff).fillna(0)), 1)
print (df_in)
   var   val
0    A   1.0
1    B  19.0
2    A   1.0
3    A   7.0
4    B   8.0
5    A  27.0
6    B   1.0
7    C   1.0
8    A   1.0
9    B  17.0
10   C   2.0
11   B   1.0
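For comparison, the same result can probably be reached with `groupby(...).transform`, which keeps every intermediate aligned with the original rows. This is a sketch rather than either answerer's method: `redistribute` is a hypothetical helper name, it returns a new DataFrame instead of modifying `df_in`, and it assumes that zeros should stay unchanged, which the question does not specify.

```python
import pandas as pd

def redistribute(df, key="var", value="val"):
    """Hypothetical helper: set negatives to 1 and spread their total
    distance from 1 evenly over the positives of the same group."""
    out = df.copy()
    is_pos = out[value] > 0
    is_neg = out[value] < 0
    # Distance of each negative value from 1 (0 for all other rows).
    gap = (1 - out[value]).where(is_neg, 0)
    # Per-row group totals, aligned with the original index.
    gap_sum = gap.groupby(out[key]).transform("sum")
    pos_count = is_pos.astype(int).groupby(out[key]).transform("sum")
    # Share per positive value; NaN where a group has no positives,
    # but those rows are never selected by is_pos below.
    share = gap_sum / pos_count.where(pos_count > 0)
    out[value] = out[value].astype(float)
    out.loc[is_pos, value] = out.loc[is_pos, value] - share[is_pos]
    out.loc[is_neg, value] = 1.0
    return out

df_in = pd.DataFrame(
    [["A", -2], ["B", 23], ["A", -4], ["A", 14], ["B", 12], ["A", 34],
     ["B", -4], ["C", -1], ["A", -5], ["B", 21], ["C", 4], ["B", -6]],
    columns=["var", "val"],
)
result = redistribute(df_in)
print(result)
```

Because the helper works on a copy, `df_in` keeps its original values and the function can be rerun safely.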
