pandas: Getting "TypeError: type of out argument not recognized: <class 'str'>" when using a class function with a Pandera decorator

Posted by 0yycz8jy on 2022-12-21 in Other

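I am trying to use the decorators from the Python package "Pandera", but I am having trouble getting them to work with classes.
First, I create the schemas for Pandera: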

import pandera as pa
from pandera import Column, Check
import yaml
in_ = pa.DataFrameSchema(
    {
        "Name": Column(object, nullable=True),
        "Height": Column(object, nullable=True),
    })

with open("./in_.yml", "w") as file:
    yaml.dump(in_, file)

out_ = pa.DataFrameSchema(
    {
        "Name": Column(object, nullable=True),
        "Height": Column(object, nullable=True),
    })
with open("./out_.yml", "w") as file:
    yaml.dump(out_, file)

Next, I create the file test.py, which contains this class:

from pandera import check_io
import pandas as pd

class TransformClass():

    with open("./in_.yml", "r") as file:
        in_ = file.read()
    with open("./out_.yml", "r") as file:
        out_ = file.read()

    @staticmethod
    @check_io(df=in_, out=out_)
    def func(df: pd.DataFrame) -> pd.DataFrame:
        return df

Finally, I import the class:

import numpy as np
import pandas as pd
from test import TransformClass
data = {'Name': [np.nan, 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
df = pd.DataFrame(data)
TransformClass.func(df)

I get:

File C:\Anaconda3\envs\py310\lib\site-packages\pandera\decorators.py:464, in check_io.<locals>._wrapper(fn, instance, args, kwargs)
    462     out_schemas = []
    463 else:
--> 464     raise TypeError(
    465         f"type of out argument not recognized: {type(out)}"
    466     )
    468 wrapped_fn = fn
    469 for input_getter, input_schema in inputs.items():
    470     # pylint: disable=no-value-for-parameter

TypeError: type of out argument not recognized: <class 'str'>

Any help would be greatly appreciated.

Answer #1 (mw3dktmi)

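The check_io decorator expects arguments of type pandera.DataFrameSchema. However, out_ is the output of file.read(), so what is actually being passed is a str.
The Pandera documentation explains which types the check_io decorator accepts.
One solution is to pass the output of the file.read() line to a Pandera constructor, possibly after some conversion: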

out_ = yaml.safe_load(file.read())

Answer #2 (unftdfkk)

Thanks to @grbeazley, here is the full solution:

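import pandera as pa
from pandera import Column, Check
import yaml

in_ = pa.DataFrameSchema(
    {
        "Name": Column(object, nullable=True),
        "Height": Column(object, nullable=True),
    })
# Serialize the schema to a YAML string with to_yaml() before dumping it
with open("in_.yml", "w") as file:
    yaml.dump(in_.to_yaml(), file)

# Read the YAML string back from the file
with open("./in_.yml", "r") as file:
    in_ = yaml.safe_load(file.read())

# Rebuild a DataFrameSchema from the YAML string
_ = pa.DataFrameSchema.from_yaml(in_)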
