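I have a way of scoring each retailer. Every retailer should get a score, which will later be clustered, but I need to score each retailer according to its tagged target. There are two targets:
balanced: a composite score based on multiple criteria, which I show in the code.
nmv: targets retailers based on how high their nmv is.
Here is the code and what I tried: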
targets = ['balanced','nmv']
day_of_month = date.today().day
df['Score'] = 0
if day_of_month > 10: #If today is greater than the 10th day, do the dynamic targeting. Else, do the first 10 days plan
    for index, row in df.iterrows():
        target = row['target']
        if target == 'balanced':
            conditions = [
                (df['retailer_id'].isin(droppers['retailer_id'])), # Dropped From MP
                (df['months_sr'] > 0.4) | (df['historical_sr'] > 0.4) & (df['orders_this_month_total'] >= 1),
                (df['wallet_amount'] > 0) & (df['orders_this_month_total'] > 0), #Has Wallet Amount and still made no orders this month
                (df['orders_this_month_total'] == 1), # Ordered Once this month,
                ( (df[['nmv_this_month_total','nmv_one_month_ago_total','nmv_two_months_ago_total','nmv_three_months_ago_total']].fillna(0).pct_change(axis = 1).mean(axis = 1) ) > 0), # His nmv is making progress
                (df['skus_pct_change_q_cut'].isin(['med','high','extreme'])), # His orders are more likely to contain more than 3 SKUs
                (df['orders_one_month_ago_total'] >= 1) & (df['orders_this_month_total'] <= 1), # Ordered once this month or not at all and ordered last month once or more.
                (df[['orders_one_month_ago_total','orders_two_months_ago_total','orders_three_months_ago_total']].sum(axis = 1) > 0) & (df['orders_this_month_total'] >= 1), # Ordered At least in one of the previous three months and made one order this month
                (df[['orders_one_month_ago_total','orders_two_months_ago_total','orders_three_months_ago_total']].sum(axis = 1) > 0) & (df['orders_this_month_total'] <= 1), # Ordered At least in one of the previous three months and made none orders this month
                (df['sessions_this_month'] > 0) & (df['visits_this_month'] == 0), # Opens the app and we did not pay him a visit.
                (df['visits_this_month'] == 0) & (df['peak_week'] == wom) & ((df['months_sr'] >= 0.4) & (df['months_sr'] <= 1)) & (df['orders_this_month_total'] < 4), # This week is his peak week and he made less than 4 orders
                (df['peak_week'] < wom) & (df['orders_this_month_total'] == 0), # Missed their critical week
                (df['wallet_amount'] > 0),
                True
            ]
            results = list(range(len(conditions) - 1, -1, -1)) # define results for balanced target
        elif target == 'nmv':
            conditions = [
                (df['retailer_id'].isin(droppers['retailer_id'])), # Dropped From MP
                (df['visits_this_month'] == 0) & (df['peak_week'] == wom) & ((df['months_sr'] >= 0.4) & (df['months_sr'] <= 1)) & (df['orders_this_month_total'] == 0), # This week is his peak week
                (df['visits_this_month'] == 0) & (df['historical_sr'] >= 0.4) & (df['orders_this_month_total'] == 0), # Overall Strike Rate is greater than 40%
                (df['nmv_q_cut_total'].isin(['high','extreme'])),
                (df['nmv_q_cut_total'].isin(['high','extreme'])) & ( (df['wallet_amount'] > 0) | (df['n_offers'] > 0) ),
                (df['months_nmv'].median() >= df['polygon_average_nmv']),
                (df['orders_one_month_ago'] > 0),
                (df['months_sessions_q_cut'] > 0),
                True
            ]
            results = list(range(len(conditions) - 1, -1, -1)) # define results for activation target
        df.loc[index, 'Score'] = np.select(conditions, results)
    df['Score'] = df['Score'].astype(int)
else:
    conditions = [
        (df['retailer_id'].isin(droppers['retailer_id'])), # Dropped From MP
        (df['visits_this_month'] == 0) & (df['peak_week'] == wom) & ((df['months_sr'] >= 0.4) & (df['months_sr'] <= 1)), # This week is his peak week
        (df['historical_sr'] >= 0.4), # Overall Strike Rate is greater than 40%
        (df['orders_one_month_ago'].isin([1,2,3,4])) & (df['nmv_one_month_ago'] >= 1500),
        (df['orders_one_month_ago'].isin([1,2,3,4])),
        (df['orders_two_months_ago'].isin([1,2,3,4])),
        (df['orders_three_months_ago'].isin([1,2,3,4])),
        (df['last_visit_date'].dt.year == 2022) & (df['last_order_date'].dt.year == 2022), # Last Order Date And last Visit Date is in 2022
        (df['last_visit_date'].dt.year == 2023) & (df['last_order_date'].dt.year == 2023),
        True
    ]
    results = list(range(len(conditions) - 1, -1, -1))
    df['Score'] = np.select(conditions, results)
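As you can see, I give each retailer a score. This worked before, and I thought that if I iterated over the rows of the DataFrame and assigned a score, I would get the final score for each retailer under its specific target. However, judging by the error, it returns a list (I think):
ValueError: Must have equal len keys and value when setting with an iterable
Can you tell me the correct way to use np.select on individual rows?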
1 Answer
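As far as I can tell, your score is just the reversed index of the applicable condition. I would set Score to None as the default and then apply the rules one after another, from least important to most important, setting Score separately for each rule. I think the problem is that you are iterating over the whole DataFrame row by row, which seems unnecessary. Since no sample data was provided, I can only give a dummy example of what I would do.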
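A minimal sketch of that approach, assuming made-up data: the column values, the three rules and the scores attached to them are placeholders for illustration, not taken from the question.

```python
import pandas as pd

# Dummy data (assumed for illustration only).
df = pd.DataFrame({
    "retailer_id": [1, 2, 3, 4],
    "wallet_amount": [0, 50, 0, 20],
    "orders_this_month_total": [0, 0, 2, 1],
})

# Default score; initialise with 0 instead if unmatched rows should score 0.
df["Score"] = None

# Apply the rules from least to most important. Each assignment is
# vectorised over the whole column, so no row iteration is needed, and a
# later (higher) score overwrites an earlier (lower) one.
df.loc[df["wallet_amount"] > 0, "Score"] = 1
df.loc[df["orders_this_month_total"] == 1, "Score"] = 2
df.loc[
    (df["wallet_amount"] > 0) & (df["orders_this_month_total"] > 0),
    "Score",
] = 3

print(df)
```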
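Keeping this order ensures that higher scores overwrite lower ones. Entries that no rule matches stay None, but they can easily be made 0 by initializing Score with 0 instead.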
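If you would rather keep np.select, note where the ValueError comes from: np.select returns one value per row of the frame it is given, while df.loc[index, 'Score'] addresses a single cell. One possible variant is to call np.select once per target on the matching subset of rows. The sketch below assumes hypothetical data and placeholder rules; the helper names score_balanced and score_nmv are invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data; columns and rules are placeholders.
df = pd.DataFrame({
    "target": ["balanced", "nmv", "balanced", "nmv"],
    "wallet_amount": [10, 0, 0, 5],
    "orders_this_month_total": [1, 0, 0, 3],
    "nmv_q_cut_total": ["low", "high", "low", "low"],
})


def score_balanced(sub):
    """Score the rows of `sub` with placeholder 'balanced' rules."""
    conditions = [
        (sub["wallet_amount"] > 0) & (sub["orders_this_month_total"] > 0),
        sub["orders_this_month_total"] == 1,
    ]
    # The first matching condition wins and gets the highest score.
    results = list(range(len(conditions), 0, -1))
    return np.select(conditions, results, default=0)


def score_nmv(sub):
    """Score the rows of `sub` with placeholder 'nmv' rules."""
    conditions = [
        sub["nmv_q_cut_total"].isin(["high", "extreme"]),
        sub["orders_this_month_total"] > 0,
    ]
    results = list(range(len(conditions), 0, -1))
    return np.select(conditions, results, default=0)


scorers = {"balanced": score_balanced, "nmv": score_nmv}

df["Score"] = 0
for target, scorer in scorers.items():
    # Boolean mask of the rows with this target; the scorer returns
    # exactly one value per masked row, so the lengths match.
    mask = df["target"] == target
    df.loc[mask, "Score"] = scorer(df.loc[mask])

print(df)
```

The loop runs once per target rather than once per row, so every condition is evaluated only on the rows it applies to.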
To help you better, it would be useful if you could provide sample data.