Collaborative filtering spark python

Posted by pqwbnv8z on 2021-05-29 in Spark

I only want to save 10 rows of a DataFrame to JSON, but it saves everything instead of just 10 rows.

userRecs = model.recommendForAllUsers(10)

This shows 10, and then I save it:

userRecs.coalesce(1).write.mode('overwrite').json("gs://imdbcc1/ML/userrecs")

But it gives me 200,000 records. I only want to save 10.

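from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS

# `ratings` is assumed to be an existing DataFrame with user_id, tconst, and rating columns
(training, test) = ratings.randomSplit([0.8, 0.2])
als = ALS(maxIter=10, regParam=1, userCol="user_id", itemCol="tconst", ratingCol="rating", coldStartStrategy="drop")
model = als.fit(training)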

# Evaluate the model by computing the RMSE on the test data

predictions = model.transform(test)
evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating", predictionCol="prediction")
rmse = evaluator.evaluate(predictions)
print("Root-mean-square error = " + str(rmse))

# Generate top 10 movie recommendations for each user

userRecs = model.recommendForAllUsers(10)
userRecs.coalesce(1).write.mode('overwrite').json("gs://imdbcc1/ML/userrecs")

JSON apache-spark pyspark apache-spark-sql collaborative-filtering

Source: https://stackoverflow.com/questions/62271101/collaborative-filtering-spark-python

1 Answer

kse8i1jr


# Generate top 10 movie recommendations for each user

userRecs = model.recommendForAllUsers(10)

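recommendForAllUsers(10) generates the top 10 movie recommendations for every user. The 10 is the number of recommendations per user, not the number of rows, so the result still has one row per user, and each row carries its own top 10.
To save only 10 users (each with their top 10 movie recommendations), apply limit(10) before coalesce, like this: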

userRecs.limit(10).coalesce(1).write.mode('overwrite').json("gs://imdbcc1/ML/userrecs")
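The difference between "10 recommendations per user" and "10 rows" can be modelled without Spark. The sketch below is plain Python, not Spark code: the toy sizes, the `predicted_rating` stand-in, and the `recommend_for_all_users` helper are all hypothetical, and only the field names (`user_id`, `tconst`, `rating`, `recommendations`) follow the columns used in the question.

```python
import json

# Hypothetical toy sizes, chosen only for illustration.
NUM_USERS, NUM_ITEMS, TOP_N = 25, 40, 10


def predicted_rating(user_id, item_index):
    # Deterministic stand-in for a model prediction (not a real ALS score).
    return ((user_id * 7 + item_index * 13) % 50) / 10.0


def recommend_for_all_users(top_n):
    """Return one row per user, each holding that user's top_n items."""
    rows = []
    for user_id in range(NUM_USERS):
        scored = [
            {"tconst": f"tt{item:07d}", "rating": predicted_rating(user_id, item)}
            for item in range(NUM_ITEMS)
        ]
        scored.sort(key=lambda rec: rec["rating"], reverse=True)
        rows.append({"user_id": user_id, "recommendations": scored[:top_n]})
    return rows


user_recs = recommend_for_all_users(TOP_N)
limited = user_recs[:10]  # plays the role of limit(10)

# One JSON object per row, one row per line.
json_lines = [json.dumps(row) for row in limited]

print(len(user_recs))                      # one row per user, regardless of TOP_N
print(len(limited))                        # rows kept after the limit
print(len(limited[0]["recommendations"]))  # recommendations inside a single row
```

In Spark itself, limit(10) without a preceding orderBy does not guarantee which 10 users are kept, so sorting by user_id first is worth considering if the choice of users matters.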

Answered on 2021-05-29
