pandas ValueError: negative dimensions are not allowed when using pandas pivot

oprakyz7  posted on 2023-02-27  in Other

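I am trying to build a collaborative recommender system. I am using the full MovieLens dataset: https://grouplens.org/datasets/movielens/latest/. I want to build a CSR matrix with user IDs as columns, movie IDs as rows, and ratings as values. Here is the code: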

import pandas as pd
import numpy as np

movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')
movies.drop(['genres'], axis=1, inplace=True)
ratings.drop(['timestamp'], axis=1, inplace=True)

user_movie_matrix = ratings.pivot(index='movieId', columns='userId', values='rating')

This is what I get:

Traceback (most recent call last):
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\main.py", line 13, in <module>
    user_movie_matrix = ratings.pivot(index='movieId', columns='userId', values='rating')
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\frame.py", line 8567, in pivot
    return pivot(self, index=index, columns=columns, values=values)
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\reshape\pivot.py", line 540, in pivot
    return indexed.unstack(columns_listlike)  # type: ignore[arg-type]
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\series.py", line 4455, in unstack
    return unstack(self, level, fill_value)
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\reshape\reshape.py", line 489, in unstack
    unstacker = _Unstacker(
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\reshape\reshape.py", line 137, in __init__
    self._make_selectors()
  File "C:\Users\Dmitr\PycharmProjects\RecomBot\venv\lib\site-packages\pandas\core\reshape\reshape.py", line 185, in _make_selectors
    mask = np.zeros(np.prod(self.full_shape), dtype=bool)
ValueError: negative dimensions are not allowed

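I am using Python 3.9, pandas 1.5.3, and PyCharm.
I found that this is caused by differing DataFrame shapes, but I do not understand why there are no NaN values, or how to fix it.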

pgvzfuti  #1

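Try using the pandas.pivot_table method instead of pivot(); it lets you specify how missing values are handled. In the following code, the fill_value parameter fills missing values with 0: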

user_movie_matrix = ratings.pivot_table(index='movieId', columns='userId', values='rating', fill_value=0)
user_movie_matrix = user_movie_matrix.merge(movies[['movieId', 'title']], on='movieId')

user_movie_matrix.set_index('title', inplace=True)
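
Since the stated goal is a CSR matrix, the dense pivot can be skipped altogether. The following is a minimal sketch, not part of the original answer, assuming SciPy is installed. The small ratings frame is made-up data standing in for ratings.csv, and the names movie_codes, user_codes, and user_movie_sparse are illustrative.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Tiny made-up frame standing in for ratings.csv (illustrative data only).
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 3, 3],
    "movieId": [10, 30, 10, 20, 30],
    "rating":  [4.0, 5.0, 3.5, 2.0, 4.5],
})

# factorize maps each ID to a 0-based position; sort=True orders the unique IDs ascending.
movie_codes, movie_ids = pd.factorize(ratings["movieId"], sort=True)
user_codes, user_ids = pd.factorize(ratings["userId"], sort=True)

# Rows are movies, columns are users; only the observed ratings are stored.
user_movie_sparse = csr_matrix(
    (ratings["rating"].to_numpy(), (movie_codes, user_codes)),
    shape=(len(movie_ids), len(user_ids)),
)

print(user_movie_sparse.shape)  # number of distinct movies x number of distinct users
print(user_movie_sparse.nnz)    # number of stored ratings
```

Cells with no rating are implicitly 0, which matches the effect of fill_value=0 without allocating the full movies-by-users table. movie_ids and user_ids translate row and column positions back to the original IDs. Note that csr_matrix sums duplicate (movie, user) pairs, so this assumes each user rated each movie at most once.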

3pvhb19x  #2

Well, I found that this approach works correctly if you use Google Colaboratory. Why it does not work in PyCharm is still a mystery to me :')
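
One possible explanation for the platform difference, which the thread does not confirm: NumPy releases before 2.0 use a 32-bit default integer on Windows, so np.prod(self.full_shape) in the traceback can wrap around to a negative number once movies times users exceeds 2**31 - 1. On Linux, which Colab runs on, the default integer is 64-bit. The sketch below forces both integer widths explicitly so the effect shows on any platform; the shape is an illustrative assumption, not a count taken from the dataset.

```python
import numpy as np

# Illustrative shape (an assumption, not measured): rows = movies, columns = users.
full_shape = np.array([60_000, 280_000], dtype=np.int32)

# 32-bit product: silently wraps around because the true value does not fit in int32.
cells_int32 = np.prod(full_shape, dtype=np.int32)

# 64-bit product: the true number of cells in the dense table.
cells_int64 = np.prod(full_shape, dtype=np.int64)

print(cells_int32)  # negative after wraparound
print(cells_int64)  # 16800000000
```

A negative cell count passed to np.zeros would produce exactly the "negative dimensions are not allowed" message. Even with 64-bit integers, a dense table of this size needs many gigabytes of memory, so a sparse representation is likely the more practical route.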