How do I set the output of pandas value_counts() to a dtype other than int64?

Posted by jaql4c8m on 2023-10-21 in Python


I am using the value_counts function to analyze data in a large pandas DataFrame (n = 2483903973). The basic layout of my df is:

df = pd.DataFrame({'Value': [0, 1, 1, 0], 'Class': pd.Series([1, 2, 1, 1])})

Value    int32
Class      int8
dtype: object

Both the values and the classes repeat. I then run the following code to count how many times each Value:Class combination occurs.

df = df.value_counts().rename_axis(['Value','Class']).reset_index(name='counts')

   Value  Class  counts
0      0      1       2
1      1      1       1
2      1      2       1

Value     int64
Class     int64
counts    int64
dtype: object

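My problem is that the output of value_counts() has dtype int64, which takes up too much memory. I am looking for a way to set the output dtype to something smaller; honestly, something like int16 would work. I looked at the method's documentation but did not see a parameter to pass for this. Any help would be appreciated.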
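To put a number on the memory concern, the per-column footprint can be measured directly. This is a minimal sketch on the four-row example frame, not on the real data; the names `counts`, `bytes_int64` and `bytes_int16` are illustrative and not from the original post.

```python
import pandas as pd

# Small stand-in for the real data, same layout as the example frame.
df = pd.DataFrame({'Value': [0, 1, 1, 0], 'Class': [1, 2, 1, 1]})

counts = df.value_counts().rename_axis(['Value', 'Class']).reset_index(name='counts')

# Bytes used by the counts column as int64 versus int16 (index excluded).
bytes_int64 = counts['counts'].astype('int64').memory_usage(index=False)
bytes_int16 = counts['counts'].astype('int16').memory_usage(index=False)

print(bytes_int64, bytes_int16)
```

Each int64 entry takes 8 bytes and each int16 entry takes 2, so the saving scales with the number of distinct Value:Class combinations rather than with the number of input rows.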

python-3.x

Source: https://stackoverflow.com/questions/77311263/how-to-set-the-output-of-python-pandas-value-counts-to-a-datatype-other-than

1 Answer


htrmnn0y1#

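You can try this (it may not work for large integers, since int16 can only hold values from -32768 to 32767, and the cast is applied to every column, not just counts):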

import pandas as pd
df = pd.DataFrame({'Value': [0, 1, 1, 0], 'Class': pd.Series([1, 2, 1, 1])})
df = df.value_counts().rename_axis(['Value','Class']).reset_index(name='counts').astype('int16')

print(df)

   Value  Class  counts
0      0      1       2
1      1      1       1
2      1      2       1

print(df.dtypes)

Value     int16
Class     int16
counts    int16
dtype: object
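Because a fixed int16 cast can be too small for large counts, one alternative is to restore the key columns to their original dtypes and let pandas pick the narrowest dtype that actually fits the counts. This is a sketch under assumptions: it uses the int32/int8 dtypes listed in the question, and the choice of `pd.to_numeric` with `downcast='unsigned'` is a suggestion here, not part of the original answer.

```python
import pandas as pd

# Example frame with the narrow dtypes described in the question.
df = pd.DataFrame({'Value': [0, 1, 1, 0], 'Class': [1, 2, 1, 1]})
df = df.astype({'Value': 'int32', 'Class': 'int8'})

counts = df.value_counts().rename_axis(['Value', 'Class']).reset_index(name='counts')

# The key values came from int32/int8 columns, so casting back cannot overflow.
counts = counts.astype({'Value': 'int32', 'Class': 'int8'})

# Counts are never negative; choose the smallest unsigned dtype that holds them all.
counts['counts'] = pd.to_numeric(counts['counts'], downcast='unsigned')

print(counts.dtypes)
```

With 2483903973 input rows no single count can exceed that number, which still fits in uint32, so on the real data the counts column should need at most 4 bytes per entry.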

2023-10-21
