Find mixed types in Pandas columns

Posted by bttbmeg0 on 2023-02-02 in Other

When parsing data files, I often get a warning like this:

WARNING:py.warnings:/usr/local/python3/miniconda/lib/python3.4/site-
packages/pandas-0.16.0_12_gdcc7431-py3.4-linux-x86_64.egg/pandas
/io/parsers.py:1164: DtypeWarning: Columns (0,2,14,20) have mixed types. 
Specify dtype option on import or set low_memory=False.
          data = self._reader.read(nrows)

But when the data is large (I have 50k rows), how can I find where in the data the dtype changes?

pandas

Source: https://stackoverflow.com/questions/29376026/find-mixed-types-in-pandas-columns

4 answers

u2nhd7ah1#

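I'm not entirely sure what you're after, but it's easy to find the rows that contain elements whose type differs from the type in the first row. For example: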

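>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": np.arange(500), "B": np.arange(500.0)})
>>> df.loc[321, "A"] = "Fred"
>>> df.loc[325, "B"] = True
>>> # applymap is deprecated since pandas 2.1; use df.map(type) there
>>> weird = (df.applymap(type) != df.iloc[0].apply(type)).any(axis=1)
>>> df[weird]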
        A     B
321  Fred   321
325   325  True
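
Comparing against the first row assumes the first row holds the "right" type. A minimal sketch of a variant that compares each cell against the most common type in its column instead is below. The small frame and the `odd_rows` name are made up for illustration, and type names are compared as plain strings to keep the comparison simple.

```python
import pandas as pd

# Small stand-in for the 500-row frame above; dtype=object keeps each
# cell's original Python type instead of coercing the column.
df = pd.DataFrame(
    {"A": [0, 1, "Fred", 3], "B": [0.0, 1.0, 2.0, True]},
    dtype=object,
)

odd_rows = {}
for col in df.columns:
    # Name of each cell's Python type, e.g. "int", "str", "float".
    type_names = df[col].map(lambda v: type(v).__name__)
    # The type that occurs most often in this column.
    most_common = type_names.value_counts().idxmax()
    mask = (type_names != most_common).to_numpy()
    if mask.any():
        odd_rows[col] = df.index[mask].tolist()

print(odd_rows)  # {'A': [2], 'B': [3]}
```

If two types occur equally often in a column, which one counts as "most common" is arbitrary, so treat the result as a hint rather than a verdict.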

mgdq6dx12#

In addition to DSM's answer, for a DataFrame with many columns it also helps to find which columns change type, like this:

for col in df.columns:
    weird = (df[[col]].applymap(type) != df[[col]].iloc[0].apply(type)).any(axis=1)
    if len(df[weird]) > 0:
        print(col)

lvjbypge3#

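This approach uses pandas.api.types.infer_dtype to find columns with mixed dtypes. It was tested with Pandas 1 under Python 3.8.
Note that this answer uses assignment expressions in several places, and those are only available in Python 3.8 or newer. It can, however, be trivially modified to not use them.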

if mixed_dtypes := {c: dtype for c in df.columns if (dtype := pd.api.types.infer_dtype(df[c])).startswith("mixed")}:
    raise TypeError(f"Dataframe has one or more mixed dtypes: {mixed_dtypes}")

However, this approach does not find the rows where the dtype changes.
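
One way to close that gap, as a rough sketch: use infer_dtype only to pick the suspicious columns, then walk each of them and record the index labels where the Python type differs from the previous row. The helper name `find_type_changes` and the sample frame are hypothetical, not part of pandas or of the original answer.

```python
import pandas as pd


def find_type_changes(df):
    """Hypothetical helper: map each mixed column to the index labels
    at which the cell type differs from the row before it."""
    changes = {}
    for col in df.columns:
        # Skip columns that infer_dtype does not report as mixed.
        if not pd.api.types.infer_dtype(df[col]).startswith("mixed"):
            continue
        names = [type(v).__name__ for v in df[col]]
        # Pair every row (from the second on) with its predecessor's type.
        changes[col] = [
            idx
            for idx, prev, cur in zip(df.index[1:], names, names[1:])
            if prev != cur
        ]
    return changes


df = pd.DataFrame({"A": [1, 2, "Fred", 4], "B": [1.0, 2.0, 3.0, 4.0]})
changes = find_type_changes(df)
print(changes)  # {'A': [2, 3]}
```

Row 2 is where the column switches from int to str, and row 3 is where it switches back, so both show up. Column B is a plain float column and is skipped.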

fwzugrvs4#

Create sample data with a column that has two data types:

import seaborn
iris = seaborn.load_dataset("iris")
# Change one row to another type
iris.loc[0,"sepal_length"] = iris.loc[0,"sepal_length"].astype(str)

Print the column name and the types used whenever a column uses more than one type:

for col in iris.columns:
    unique_types = iris[col].apply(type).unique()
    if len(unique_types) > 1:
        print(col, unique_types)

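To fix the column types, you can:

Use df[col] = df[col].astype(str) to change the data type.
Or, if the DataFrame is read from a csv file, pass the dtype parameter as a dictionary mapping column names to types.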
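
As a sketch of the second option: the CSV text below is invented for illustration and is read from an in-memory buffer instead of a real file. Reading the suspect column as str gives every cell the same type. The pd.to_numeric step is an extra that is not part of the answer above; it is one way to then see which rows hold the non-numeric text.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "id,value\n1,10\n2,abc\n3,30\n"

# Read "value" as text so every cell in the column has the same type.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "int64", "value": str})

# Cells that do not parse as numbers become NaN with errors="coerce",
# which points at the rows that caused the mixed types.
as_number = pd.to_numeric(df["value"], errors="coerce")
bad_rows = df[as_number.isna()]
print(bad_rows)
```

Keep in mind that genuinely empty cells also come out as NaN here, so with real data you may want to separate those from the unparseable ones.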
