pandas: rescale data with rules

Posted by 3pvhb19x on 2023-01-24 in Other


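I want to rescale my data using specific rules. My data spans a finite range from negative to positive values. I want to rescale it with the following rules:

For -finite to 0: rescale to -1 to 0
For 0 to +finite: rescale to 0 to 1

Right now, my data is being rescaled incorrectly: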

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    'reviewId': ['01', '02', '03', '04', '05'],
    'score': [-1, -5, 0, 3, 38]})
scaler = MinMaxScaler(feature_range=(-1, 1))
df['polarity'] = scaler.fit_transform(df[['score']])

print(df)

This returns:

  reviewId  score  polarity
0       01     -1 -0.813953
1       02     -5 -1.000000
2       03      0 -0.767442
3       04      3 -0.627907
4       05     38  1.000000

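Notice how the positive score ("3") is scaled to a negative polarity. I tried MaxAbsScaler, but then the scaling depends on whichever side, positive or negative, has the larger absolute value. I still want the overall "polarity" data to lie in the range -1 to 1 while using the full range on both the negative and the positive side. How should I handle this?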
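To make the MaxAbsScaler behaviour described above concrete, here is a minimal sketch that reproduces its computation with plain pandas (dividing each value by the largest absolute value in the column) instead of calling the scaler itself. The column name `polarity_maxabs` is made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'reviewId': ['01', '02', '03', '04', '05'],
    'score': [-1, -5, 0, 3, 38]})

# Largest absolute value in the column; 38 for this data.
max_abs = df['score'].abs().max()

# Signs are preserved, but the negative side only reaches -5/38,
# so it never gets close to -1.
df['polarity_maxabs'] = df['score'] / max_abs

print(df)
```

With this data the most negative score ends up at roughly -0.13, which is why a single shared divisor does not satisfy the two rules in the question.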

pandas

Source: https://stackoverflow.com/questions/75205508/rescale-data-with-rules

1 Answer


gmol16391#

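This happens because sklearn.preprocessing.MinMaxScaler scales the whole column at once, using a single range for both intervals. One way to avoid or fix this is to use boolean indexing to separate the two intervals (negative and non-negative) and scale each one on its own.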

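# Boolean masks: m1 selects negative scores, m2 selects zero and positive scores.
m1 = df["score"].lt(0)
m2 = df["score"].ge(0)

# One scaler per interval, each with its own target range.
negScaler = MinMaxScaler(feature_range=(-1, 0))
posScaler = MinMaxScaler(feature_range=(0, 1))

# Scale each subset separately, then stack the two pieces back together.
df = pd.concat([df[m1].assign(polarity = negScaler.fit_transform(df.loc[m1, ["score"]])),
                df[m2].assign(polarity = posScaler.fit_transform(df.loc[m2, ["score"]]))])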

Output:

print(df)

  reviewId  score  polarity
0       01     -1  0.000000
1       02     -5 -1.000000
2       03      0  0.000000
3       04      3  0.078947
4       05     38  1.000000
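Note that in the output above the score -1 maps to a polarity of 0.000000, because MinMaxScaler fits each subset to its own minimum and maximum, and -1 is the maximum of the negative subset. If 0 should stay fixed at 0 and only the extremes should reach -1 and 1, one possible alternative (not part of the original answer) is to divide each side by its own extreme. The sketch below uses only pandas and assumes the column contains at least one negative and one positive score; the names `neg_extreme` and `pos_extreme` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'reviewId': ['01', '02', '03', '04', '05'],
    'score': [-1, -5, 0, 3, 38]})

score = df['score'].astype(float)

# Assumption: there is at least one negative and one positive score,
# so neither divisor is zero.
neg_extreme = abs(score.min())  # magnitude of the most negative score (5 here)
pos_extreme = score.max()       # largest positive score (38 here)

# Series.where keeps the first expression where the condition holds
# (negative scores) and takes the second expression everywhere else.
df['polarity'] = (score / neg_extreme).where(score < 0, score / pos_extreme)

print(df)
```

With this data the polarities come out as -0.2, -1.0, 0.0, about 0.078947, and 1.0, so the sign of each score is preserved and 0 stays at 0. Unlike the `pd.concat` approach, this also keeps the original row order without re-sorting.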

Answered 2023-01-24
