pandas: How to fix the Python AttributeError "'float' object has no attribute 'split'"?

bqf10yzr, posted on 2023-04-18 in Python

When I run the code below, it fails with an AttributeError: 'float' object has no attribute 'split'.
I would like to know why this error occurs.

def text_processing(df):

    # === Lower case ===
    # First step is to transform comments into lower case
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))

    return df

df = text_processing(df)

Full traceback of the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
    df = text_processing(df)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
  File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'

332nm8kg1#

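I had the same problem ('float' object has no attribute 'split'), and this is how I handled it:
df['column_name'] = df['column_name'].astype(str)  # astype returns a new Series, so assign it back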


3df52oht2#

The error points to this line:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
                                    if x not in stop_words))

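Here split is being used as a method of Python's built-in str class. The error indicates that one or more values in df['content'] are of type float. This may be because there are null values, i.e. NaN, or non-null float values.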
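To confirm that diagnosis before changing anything, one option is to count the Python types present in the column. The following is a minimal sketch on made-up data (the sample DataFrame is hypothetical, not the asker's):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a missing value and a real float mixed in with strings.
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5, "More Text"]})

# Count how many values of each Python type the column holds.
type_counts = df["content"].map(type).value_counts()
print(type_counts)

# Select the rows whose value is not a string; these are the ones that break .split().
non_strings = df[~df["content"].map(lambda v: isinstance(v, str))]
print(non_strings)
```

Note that NaN itself is a Python float, which is why a column of text with missing cells triggers this error.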
One workaround is to apply str to x before calling split, which stringifies the floats:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
                                    if x not in stop_words))

Alternatively, and possibly a better solution, be explicit and use a named function with a try/except clause:

def converter(x):
    try:
        # No str() here: a float such as NaN must raise AttributeError to reach the except branch
        return ' '.join([x.lower() for x in x.split() if x not in stop_words])
    except AttributeError:
        return None  # or some other value

df['content'] = df['content'].apply(converter)

Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:

df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))
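Putting the pieces of this answer together, here is a self-contained sketch. The `stop_words` set and the sample rows are made up for illustration, and the `isinstance` check is a substitute for the try/except, not the answerer's code. It also lowercases before filtering, on the assumption that the stop words are stored in lowercase:

```python
import numpy as np
import pandas as pd

# Hypothetical stop-word set and sample data, for illustration only.
stop_words = {"the", "a", "is"}
df = pd.DataFrame({"content": ["The Cat is here", np.nan, "a Dog"]})

def converter(x):
    # Return None for non-string values (NaN, real floats) instead of calling .split() on them.
    if not isinstance(x, str):
        return None
    # Lowercase first, then drop stop words, so that "The" is filtered as well.
    return " ".join(w for w in x.lower().split() if w not in stop_words)

df["content"] = [converter(x) for x in df["content"]]
print(df["content"].tolist())
```

Lowercasing before the stop-word check is a deliberate change from the original one-liner, which tests each word against `stop_words` before lowercasing it and would therefore keep a capitalized "The".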

nx7onnlm3#

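split() is a Python method that only works on strings. It looks like your column "content" contains not only strings but also other values, such as floats, on which .split() cannot be called.
Try converting each value to a string with str(x).split(), or convert the whole column to strings first, which is likely to be more efficient. You can do that as follows:

df['column_name'] = df['column_name'].astype(str)  # astype returns a new Series, so assign it back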
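One caveat with converting to strings: a missing value becomes the literal text "nan", which then passes through split() as if it were a word. A small sketch on hypothetical data, with `fillna("")` shown as one possible way around it:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
s = pd.Series(["Hello World", np.nan], dtype=object)

# astype(str) stringifies everything, so the missing value becomes the text "nan".
as_text = s.astype(str)

# Filling missing values with an empty string first keeps "nan" out of the tokens.
filled = s.fillna("").astype(str)

print(as_text.tolist())
print(filled.tolist())
```

Whether an empty string or a dropped row is the right replacement depends on what the later analysis expects.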

w6lpcovy4#

I had the same problem ('float' object has no attribute 'split'), and this is how I handled it.
The initial code...:

df = pd.read_excel("Forbes Athlete List 2012-2019.xlsx")
df.Pay = df.Pay.apply(lambda x: float(x.split(" ")[0].split("$")[1]))

...raised the error: 'float' object has no attribute 'split'
So I changed the code to this:

df.Pay = df.Pay.apply(lambda x: float(x.split(" ")[0].split("$")[1] if type(x) == str else str(x)))

Here is a second example of the same thing.
The initial code that raised the error:

df.Endorsements = df.Endorsements.apply(lambda x: float(x.split(" ")[0].split("$")[1]))

The modified code runs fine:

df.Endorsements = df.Endorsements.apply(lambda x: float(x.split(" ")[0].split("$")[1] if type(x) == str else str(x)))

So anyone with this problem can try adding the 'if type(x) == str else str(x)' part to their code, which may solve it.
Cheers!
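The same guard can also be written as a named function, which may be easier to read than the one-line lambda. This is only a sketch: the "$<number> M" cell format and the sample values are assumptions, and `parse_amount` is a hypothetical helper name:

```python
import numpy as np
import pandas as pd

# Hypothetical values in the assumed "$<number> M" format, plus a missing cell.
pay = pd.Series(["$59.4 M", np.nan, "$30 M"], dtype=object)

def parse_amount(x):
    # Strings: take the part before the first space and strip the leading "$".
    if isinstance(x, str):
        return float(x.split(" ")[0].split("$")[1])
    # Non-strings (NaN or values that are already numeric) pass through as floats.
    return float(x)

parsed = pay.apply(parse_amount)
print(parsed.tolist())
```

Missing cells stay NaN in the result, so they can still be handled later with the usual pandas tools such as dropna or fillna.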
