Combinatorial optimization in Python with pandas

myzjeezk posted on 2023-05-27 in Python

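I'm planning a golf trip with 6 people and trying to optimize the pairings. I start from the list of all possible groupings and then randomly sample 5 rounds. My goal is to build a cross matrix with rows [a,b,c,d,e,f] and columns [a,b,c,d,e,f], and then find the combination of groupings that minimizes the number of 1s.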

import pandas as pd
from itertools import combinations
players = ['a','b','c','d','e','f']
# keep only the triples that contain 'a': every split into two threesomes
# would otherwise show up twice (abc|def and def|abc)
z = pd.DataFrame([c for c in combinations(players,3) if 'a' in c])
for i in z.index:
    players = ['a','b','c','d','e','f']
    players.remove(z.loc[i,0])
    players.remove(z.loc[i,1])
    players.remove(z.loc[i,2])
    z.loc[i,3] = players[0]
    z.loc[i,4] = players[1]
    z.loc[i,5] = players[2] #just to fill out the rest of the matrix
z = 
   0  1  2  3  4  5
0  a  b  c  d  e  f
1  a  b  d  c  e  f
2  a  b  e  c  d  f
3  a  b  f  c  d  e
4  a  c  d  b  e  f
5  a  c  e  b  d  f
6  a  c  f  b  d  e
7  a  d  e  b  c  f
8  a  d  f  b  c  e
9  a  e  f  b  c  d

v = z.sample(5)

opt = pd.DataFrame([], index = ['a','b','c','d','e','f'],columns = ['a','b','c','d','e','f'])
g = pd.concat([v[1].value_counts(),v[2].value_counts()]).sort_index().groupby(level = 0).sum() #pairings count for A
g
Out[116]: 
b    3
c    1
d    1
e    3
f    2

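Any ideas/functions that could help? I can use value_counts() on columns 1 and 2 to get the first column of the matrix, since column 0 is always 'a', but I'm not sure how to fill in the rest. TIA!
The solution should look like this: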

   a  b  c  d  e  f
a  0  0  0  0  0  0
b  3  0  0  0  0  0
c  1  1  0  0  0  0
d  1  3  3  0  0  0
e  3  1  3  1  0  0
f  2  2  2  2  2  0

for this sample:

   0  1  2  3  4  5
3  a  b  f  c  d  e
2  a  b  e  c  d  f
5  a  c  e  b  d  f
1  a  b  d  c  e  f
9  a  e  f  b  c  d
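
One way to fill the whole matrix in a single pass, sketched under the assumption that each sampled row lists the first threesome in columns 0-2 and the second in columns 3-5 (as in the table above), is to count every pair inside each group with itertools.combinations. The names `rounds` and `cross_matrix` are illustrative and do not come from the original post.

```python
from itertools import combinations

import pandas as pd

players = ['a', 'b', 'c', 'd', 'e', 'f']

# The five sampled rounds from the example (rows 3, 2, 5, 1, 9 of z).
rounds = pd.DataFrame(
    [list('abfcde'), list('abecdf'), list('acebdf'),
     list('abdcef'), list('aefbcd')],
    index=[3, 2, 5, 1, 9],
)

def cross_matrix(rounds, players, group_size=3):
    """Count how often each pair of players shares a group (hypothetical helper)."""
    counts = pd.DataFrame(0, index=players, columns=players)
    for row in rounds.values.tolist():
        # split the row into consecutive groups of `group_size` players
        for start in range(0, len(row), group_size):
            group = row[start:start + group_size]
            for x, y in combinations(group, 2):
                counts.loc[x, y] += 1
                counts.loc[y, x] += 1
    return counts

cross = cross_matrix(rounds, players)
print(cross)
```

The result is symmetric, so its lower triangle corresponds to the matrix sketched in the question, and `cross.loc['a']` should match the value_counts result computed for player a.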
Answer #1 (ubbxdtey)

I ended up with this solution:

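import pandas as pd

# Assumption: group1/group2 hold each round's two threesomes as strings
# (e.g. 'abc' and 'def'), built here from the columns of z in the question.
z['group1'] = z[0] + z[1] + z[2]
z['group2'] = z[3] + z[4] + z[5]

p = ['a','b','c','d','e','f']
s = 5              # rounds drawn per sample, as in the question
opt_score = 11     # best (lowest) score found so far
opt_v = None       # the rounds that produced the best score
opt_cross = None   # the cross matrix for those rounds
for k in range(50000):
    v = z.sample(s)
    v = v[['group1','group2']]
    all_strings = []
    for a in p:
        # concatenate the group that contains player a in every sampled round
        letter_string = ''
        for i in v.index:
            if a in v.loc[i,'group1']:
                letter_string += v.loc[i,'group1']
            else:
                letter_string += v.loc[i,'group2']
        # drop the player itself, leaving one letter per partner per round
        letter_string = letter_string.replace(a,'')
        all_strings.append(letter_string)
    g = pd.DataFrame(all_strings, columns = ['C'])
    # one character per cell; the split leaves an empty first and last column
    g2 = g['C'].str.split('',expand = True)
    g2 = g2.T
    g2.drop([0,s*2+1], axis = 0, inplace = True)
    g2.columns = p
    opt = pd.DataFrame(0, index = p,columns = p)
    for a in g2.columns:
        # partner counts for player a, as a one-column frame named after a
        n = g2[a].value_counts().to_frame(name = a)
        n.loc[a] = 0
        opt = pd.concat([opt,n]).sort_index().groupby(level = 0).sum()
    # keep the sample if fewer pairs meet exactly once and only the diagonal is zero
    if (opt[opt == 1.].count().sum()/2 < opt_score) & (opt[opt == 0.].count().sum() == 6):
    #if (opt.loc['g','h'] == 3.) & (opt.loc['a','b'] >= 1.) & (opt.loc['a','e'] >= 1.) & (opt.loc['c','d'] >= 2.):
        opt_score = opt[opt == 1.].count().sum()/2
        opt_v = v[['group1','group2']]
        opt_cross = opt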

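Essentially this builds, for each player, a string of every group that player appears in, strips out the player's own letter, and runs value_counts on what is left. Super messy, but it got me the solution I was looking for. The key piece is sort_index().groupby(level = 0).sum(). The score starts at 11 and is the number of pairs that play together exactly once; a sample replaces the current best only if it lowers that score and leaves no zeros off the diagonal.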
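
Since there are only 10 distinct ways to split 6 players into two threesomes, 5 rounds can be chosen in C(10,5) = 252 ways, so the random sampling could be replaced by a full enumeration. The following is a minimal sketch of that alternative, not the code from the answer. The names `splits`, `pair_counts` and `score` are made up for illustration, and the scoring rule (every pair meets at least once, then as few pairs as possible meet exactly once) is one reading of the condition in the loop above.

```python
from collections import Counter
from itertools import combinations

players = ['a', 'b', 'c', 'd', 'e', 'f']

# Every split into two threesomes, identified by the group containing 'a'.
splits = []
for rest in combinations(players[1:], 2):
    group1 = ('a',) + rest
    group2 = tuple(p for p in players if p not in group1)
    splits.append((group1, group2))

all_pairs = list(combinations(players, 2))  # the 15 possible pairs

def pair_counts(schedule):
    """Map each pair to the number of rounds in which it shares a group."""
    counts = Counter()
    for group1, group2 in schedule:
        counts.update(combinations(group1, 2))
        counts.update(combinations(group2, 2))
    return counts

def score(schedule):
    """Number of pairs meeting exactly once, or None if some pair never meets."""
    counts = pair_counts(schedule)
    if any(counts[pair] == 0 for pair in all_pairs):
        return None
    return sum(1 for pair in all_pairs if counts[pair] == 1)

best_score, best_schedule = None, None
for schedule in combinations(splits, 5):  # all 252 candidate schedules
    current = score(schedule)
    if current is not None and (best_score is None or current < best_score):
        best_score, best_schedule = current, schedule

print(best_score)
for group1, group2 in best_schedule or []:
    print(''.join(group1), ''.join(group2))
```

Because every schedule is visited exactly once, the result does not depend on a random seed or on the number of iterations. For larger groups the number of schedules grows quickly, and sampling as in the answer may be the more practical choice.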
