numpy: How to solve a custom optimization problem with matrix factorization and a custom loss in PyTorch?

bq3bfh9z · posted 2023-04-30 in Other

I am trying to replicate the paper Recommendation for Effective Standardized Exam Preparation, in which the authors use a special term called the correctness probability function, described as:
For the correctness probability function p, we use the matrix factorization model introduced in this paper. Using the users' question-response sequences, we find the factorization X = L * R_Transpose that minimizes the binary cross-entropy (BCE) loss with Frobenius-norm regularization, where L = (L_uj) represents student u's understanding of hidden concept j and R = (R_qj) represents hidden concept j's contribution to question q. The entry X = (X_uq) represents student u's understanding of question q, and the probability of a correct answer, p(q|I_u) = P_uq, is computed from X_uq based on a variation of the M2PL latent trait model from IRT, given in this paper.
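Written out, the optimization problem that passage describes should be the following (my reconstruction from the description, using the same symbols; it is not quoted from the paper):

minimize over L, R:   BCE(Y, φ(L·Rᵀ)) + (μ/2) · (‖L‖_F² + ‖R‖_F²)

where Y holds the observed 0/1 responses and φ is the modified sigmoid defined below.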
They then go on to define the exact procedure, shown as an image in the original post:

Now, they have a modified version of the sigmoid called phi, which uses three extra parameters: phi_a = 0.25, phi_b = 0.5, phi_c = 10. My implementation of phi is as follows:

import torch

# Constants for the phi() function, given in the paper
phi_a, phi_b, phi_c = 0.25, 0.5, 10

# Modified sigmoid function
def phi(x, phi_a=phi_a, phi_b=phi_b, phi_c=phi_c):
    '''
    Putting the value of x into phi gives the probability of answering
    that question correctly. According to the paper, the custom sigmoid is:
    φ(x) = φ_a + (1 - φ_a) / (1 + e^(-φ_c * (x - φ_b)))
    with (φ_a, φ_b, φ_c) = (0.25, 0.5, 10).
    '''
    return phi_a + (1 - phi_a) / (1 + torch.exp(-phi_c * (x - phi_b)))
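A quick sanity check on this implementation; the expected values follow directly from the formula, so any deviation would point at a bug:

# phi maps the real line into (phi_a, 1): phi(x) -> 0.25 as x -> -inf,
# phi(x) -> 1 as x -> +inf, and at the midpoint phi_b it returns
# phi_a + (1 - phi_a) / 2 = 0.625.
x = torch.tensor([-10.0, 0.5, 10.0])
print(phi(x))  # approximately [0.25, 0.625, 1.0]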

They also have a custom Optimization Problem; let's call it the custom loss, which I defined as follows:

# Custom loss function
def custom_loss(Y_true, Y_pred, L, R, mu):
    '''
    In the paper it is given as:
    BCE loss + (mu/2) * (Frobenius norm of L squared + Frobenius norm of R squared)
    Since Frobenius_norm(L) = √(Σ (L_ij)^2), the authors probably squared it
    just to eliminate the square root, leaving Σ (L_ij)^2.
    '''
    bce_loss = torch.nn.BCELoss()(Y_pred, Y_true)  # BCELoss(input, target): prediction first, float target second
    frobenius_norm_L = torch.sum(L ** 2)  # squared Frobenius norm of L
    frobenius_norm_R = torch.sum(R ** 2)  # squared Frobenius norm of R
    reg_loss = (mu / 2) * (frobenius_norm_L + frobenius_norm_R)
    return bce_loss + reg_loss
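A quick shape-and-type check for custom_loss. All tensors here are dummy data I made up; note that BCELoss requires the target to be a float tensor with values in [0, 1]:

Y_true = torch.randint(0, 2, (4, 6)).float()  # dummy 0/1 responses, cast to float
L_demo = torch.rand(4, 3)                     # dummy student-concept matrix
R_demo = torch.rand(6, 3)                     # dummy question-concept matrix
Y_pred = phi(torch.matmul(L_demo, R_demo.T))  # predictions lie in (0.25, 1)
print(custom_loss(Y_true, Y_pred, L_demo, R_demo, mu=0.1))  # a scalar tensor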

There are also some conditions, given in an image in the original post:

  1. 0 <= L_uj <= 1: all entries of the Student-Latent-Topic matrix L lie in [0, 1].
  2. 0 <= R_qj <= 1: all entries of the Latent-Topic-Question matrix R lie in [0, 1].
  3. The elements R_qj in each row of R must sum to 1. That is, if every question is represented by n_topics values, those values add up to exactly 1 for each question (like a softmax; see the alternative sketch after this list).
  4. The paper solves this matrix factorization problem with stochastic gradient descent (SGD).
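As an aside on condition 3: instead of renormalizing R after every SGD step, R can be parameterized through a softmax over unconstrained logits, which satisfies conditions 2 and 3 by construction. This is my own alternative sketch, not something the paper prescribes:

import torch

n_questions, n_concepts = 700, 15

# The unconstrained logits are what SGD actually updates; R is derived from them.
R_logits = torch.randn(n_questions, n_concepts, requires_grad=True)

def make_R(R_logits):
    # Each row of R is non-negative and sums to 1 by construction
    # (conditions 2 and 3), so no projection step is needed for R.
    return torch.softmax(R_logits, dim=1)

R = make_R(R_logits)  # recompute this inside the training loop after each step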
Then, somehow (with a lot of help, and by hook or by crook), I got to the point where I arrived at a solution of sorts:
import numpy as np
import torch

# Dummy numbers
n_students = 50
n_questions = 700
n_concepts = 15

# Hyperparameters 
n_epochs = 10
learning_rate = 0.01
mu = 0.1  # regularization 

# Given constants for the phi() function
phi_a, phi_b, phi_c = 0.25, 0.5, 10 

# ---------------------------------------------------
# Generate random student responses (1 if correct, 0 otherwise);
# BCELoss requires float targets, so build Y as a float tensor
Y = torch.tensor(np.random.randint(0, 2, (n_students, n_questions)), dtype=torch.float32)
# Initialize L and R matrices randomly, as leaf tensors the optimizer can update
L = torch.rand(n_students, n_concepts, requires_grad=True)
R = torch.rand(n_questions, n_concepts, requires_grad=True)

optimizer = torch.optim.SGD([L, R], lr=learning_rate) # SGD as given

# -------------------------------------------------------------

# Training loop
for epoch in range(n_epochs):
    optimizer.zero_grad()
    X = torch.matmul(L, R.T)  # X_uq = Σ_j L_uj * R_qj
    Y_pred = phi(X)           # predicted probability of a correct answer
    loss = custom_loss(Y, Y_pred, L, R, mu)
    loss.backward()
    optimizer.step()

    # Enforce the 3 conditions (a projection step after each update).
    # Clamp first so every entry is non-negative, THEN normalize the rows
    # of R; the other order can break the sum-to-1 property.
    with torch.no_grad():
        L.clamp_(min=0, max=1)              # 0 ≤ L[u, j] ≤ 1
        R.clamp_(min=0, max=1)              # 0 ≤ R[q, j] ≤ 1
        R.div_(R.sum(dim=1, keepdim=True))  # rows of R sum to 1

# Calculate the optimized understanding matrix X_opt (n_students x n_questions)
with torch.no_grad():
    X_opt = torch.matmul(L, R.T)

# --- Evaluate  --------------------------------

# Calculate the probability score for a student u and question q using the modified sigmoid function
u, q = 0, 4
probability_score = phi(X_opt[u, q]) # Given in the formula
print(f"Probability score for student {u} and question {q}: {probability_score.item()}")

Now the problem is that I don't know whether this is the correct solution, or whether it gives the results it is supposed to give.

PS: If there is another way to solve this, please do let me know. I just want to replicate the paper, with any library.


guicsvcw1#

Everything you have done seems reasonable.
The missing piece is synthetic data generation:
randomize L and R, then obtain

X = torch.matmul(L, torch.transpose(R, 0, 1))
Y_train = phi(X)

then generate Y_validation the same way.
Use Y_train for the training loop and Y_validation to evaluate the final performance.
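One reading of this suggestion, fleshed out. Sampling binary responses from the probabilities with torch.bernoulli is my own addition; the answer itself uses phi(X) directly as soft targets, which BCELoss also accepts. The dimension variables come from the question's script:

torch.manual_seed(0)

# Ground-truth factors that training should (approximately) recover
L_true = torch.rand(n_students, n_concepts)
R_true = torch.rand(n_questions, n_concepts)
R_true = R_true / R_true.sum(dim=1, keepdim=True)  # rows sum to 1

P = phi(torch.matmul(L_true, R_true.T))  # true correctness probabilities

Y_train = torch.bernoulli(P)       # one sampled set of 0/1 responses to train on
Y_validation = torch.bernoulli(P)  # an independent sample to evaluate on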
