What is the correct way to apply KNN in an item-based recommender system?

Posted by 5tmbdcev on 2021-06-30 in Java

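Suppose I am building an application for products in which every user can rate products. I want to build a recommender system that recommends products to the active user based on the ratings users have given to other products, using item-based collaborative filtering with KNN. When we compute the similarity between products and select the top k items, which of these approaches is correct?
1- Compute the similarity between all products, then take the top k for each product the user rated highly. (The dataset is a matrix holding the rating each user gave to each item.)
2- Apply KNN to the products the user liked, one at a time: compute the similarity between that product and each product the user has not rated, take the top k similar (not yet rated) products for each highly rated product, and show them to the user. (Each time KNN is applied, the dataset is a matrix containing the users' ratings for the liked item plus all the other items the user has not rated yet.)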
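For comparison, approach 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `ItemKnnSketch`, its methods, the `ratings[item][user]` layout with 0 meaning "not rated", and the `likeThreshold` parameter are all hypothetical names, not part of the application described here. If both approaches compute similarity over the same full rating rows, the similarity between two products does not depend on the active user, so caching the matrix once should yield the same neighbours as recomputing them per liked product.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of approach 1: compute item-item similarities once,
// then pick the top-k unrated neighbours of every item the user liked.
final class ItemKnnSketch {

    // Cosine similarity between two rating rows; a zero vector yields 0.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int u = 0; u < a.length; u++) {
            dot += a[u] * b[u];
            normA += a[u] * a[u];
            normB += b[u] * b[u];
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Symmetric item-item similarity matrix (assumed layout: ratings[item][user]).
    static double[][] similarityMatrix(double[][] ratings) {
        int n = ratings.length;
        double[][] sim = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                sim[i][j] = sim[j][i] = cosine(ratings[i], ratings[j]);
            }
        }
        return sim;
    }

    // For each liked item, add its k most similar items that the user has not rated.
    static List<Integer> recommend(double[][] ratings, double[][] sim,
                                   int user, int k, double likeThreshold) {
        Set<Integer> result = new LinkedHashSet<>();   // keeps order, drops duplicates
        for (int i = 0; i < ratings.length; i++) {
            if (ratings[i][user] < likeThreshold) {
                continue;                              // not rated, or not liked
            }
            final int liked = i;
            List<Integer> candidates = new ArrayList<>();
            for (int j = 0; j < ratings.length; j++) {
                if (ratings[j][user] == 0.0) {         // assumption: 0 means "not rated"
                    candidates.add(j);
                }
            }
            candidates.sort(Comparator.comparingDouble((Integer j) -> sim[liked][j]).reversed());
            result.addAll(candidates.subList(0, Math.min(k, candidates.size())));
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        double[][] ratings = {{5, 4, 0}, {0, 5, 0}, {0, 0, 4}, {1, 0, 5}};
        double[][] sim = similarityMatrix(ratings);
        System.out.println(recommend(ratings, sim, 0, 1, 3.0));
    }
}
```

The practical difference in this sketch is cost: the matrix is built once and reused for every user, while the per-user filtering happens only at recommendation time.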
I used the second approach in my application, as follows:
Preparing the dataset:

// data is a matrix holding the rating each user gave each product: data[product][user]; 0.0 means "not rated".
// Needs java.util.ArrayList; ArrayStructure, products, determineK and readPosts are defined elsewhere in the class.
    private void prepareDataSet(Double[][] data, int userIndex) {
        ArrayList<String> productsID = new ArrayList<String>();
        for (int i = 0; i < data.length; i++) {   // iterate over the rows (one row per product)
            // Only build a dataset for products the active user liked (rating of 3 or more).
            if (data[i][userIndex] >= 3) {
                ArrayStructure dataSet = new ArrayStructure();
                int index = 0;   // row counter, reset for every liked product
                // Row 0 of the dataset is the liked ("main") product.
                for (int col = 0; col < data[i].length; col++) {
                    dataSet.add(index, col, data[i][col]);
                }
                index++;   // advance once per row, not once per column
                // Append every product the active user has not rated yet.
                for (int j = 0; j < data.length; j++) {
                    if (data[j][userIndex] == 0.0) {   // a rating of zero means the user did not rate the product
                        for (int col = 0; col < data[j].length; col++) {
                            dataSet.add(index, col, data[j][col]);
                        }
                        // One ID per appended row; "products" holds the ID of every product in the system.
                        productsID.add(products.get(j));
                        index++;
                    }
                }
                Double[][] dataset = dataSet.toArray();
                int k = determineK(dataset.length - 1);   // the sample size (number of candidate rows)
                K_Nearest_Algorithm(dataset, k, productsID);
                productsID.clear();
            }
        }

        readPosts();   // after all calculations
    }

The KNN algorithm:

// Our algorithm: takes the dataset, the number of neighbours k, and the IDs of the candidate rows.
// dataset[0] is the main (liked) item; dataset[i] for i >= 1 corresponds to productsID.get(i - 1).
// Needs java.util.ArrayList and java.util.Collections.
    private void K_Nearest_Algorithm(Double[][] dataset, int k, ArrayList<String> productsID) {
        // item_record holds a candidate product's ID and its similarity to the main item
        ArrayList<item_record> item_similarity_values = new ArrayList<item_record>();
        for (int i = 1; i < dataset.length; i++) {   // start at 1 to skip the main item in row 0
            // Cosine similarity between the two rating vectors, computed once per candidate.
            double similarity = calcCosineSimilarity(dataset[0], dataset[i]);
            item_similarity_values.add(new item_record(productsID.get(i - 1), similarity));
        }

        // Sort once, after all similarities are known. Descending order puts the most
        // similar items first (this relies on item_record implementing Comparable).
        Collections.sort(item_similarity_values, Collections.reverseOrder());

        // Add the top k items to the list of items recommended to the active user, skipping duplicates.
        int limit = Math.min(k, item_similarity_values.size());   // guard against k exceeding the candidate count
        for (int item = 0; item < limit; item++) {
            if (!recommended_items.contains(item_similarity_values.get(item).getId())) {
                recommended_items.add(item_similarity_values.get(item).getId());
            }
        }
    }
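
The KNN method above calls `calcCosineSimilarity`, which the post does not show. One possible implementation might look like the sketch below. The class name `CosineSketch`, the handling of `null` entries as 0, and the choice to return 0 for an all-zero vector are assumptions made for illustration, not the author's code.

```java
// Hypothetical helper; the original post does not include calcCosineSimilarity.
final class CosineSketch {

    // Cosine similarity of two equal-length rating vectors:
    // dot(a, b) / (|a| * |b|). Null entries are treated as 0 ("not rated").
    static double calcCosineSimilarity(Double[] a, Double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double x = (a[i] == null) ? 0.0 : a[i];
            double y = (b[i] == null) ? 0.0 : b[i];
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;   // assumption: an item nobody rated is similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Because ratings are non-negative, this value falls between 0 and 1, so sorting in descending order places the closest neighbours first.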
