Error message when cleaning an Optimus DataFrame

Posted by 3vpjnl9f on 2021-05-27 in Spark

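I recently learned about Optimus and am trying to use it to clean tweets.
I imported my tweets from a CSV file into a DataFrame and then loaded the data into Optimus.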

I came across the following code for cleaning tweets and ran it in Jupyter:

clean_tweets = df.cols.remove_accents("tweet") \
              .cols.remove_special_chars("tweet")

I get the following error message:

ValueError                                Traceback (most recent call last)
  <ipython-input-32-156da3e90955> in <module> 
----> 1 clean_tweets = df.cols.remove_accents("tweet") \ 
      2 .cols.remove_special_chars("tweet")

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\helpers\decorators.py in wrapper(*args, **kwargs) 
        47 def wrapper(*args, **kwargs): 
        48 start_time = timeit.default_timer() 
   ---> 49 f = func(*args, **kwargs) 
        50 _time = round(timeit.default_timer() - start_time, 2) 
        51 if log_time:

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\dataframe\columns.py in remove_accents(input_cols, output_cols) 
        954 return with_out_accents 
        955 
    --> 956 df = apply(input_cols, _remove_accents, "string", output_cols=output_cols, meta=Actions.REMOVE_ACCENTS.value) 
        957 return df 
        958

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\dataframe\columns.py in apply(input_cols, func, func_return_type, args, func_type, when, filter_col_by_dtypes, output_cols, skip_output_cols_processing, meta) 
        240 
        241 for input_col, output_col in zip(input_cols, output_cols): 
    --> 242 df = df.withColumn(output_col, expr(when)) 
        243 df = df.preserve_meta(self, meta, output_col) 
        244

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\dataframe\columns.py in expr(_when) 
        232 
        233 def expr(_when): 
    --> 234 main_query = audf(input_col, func, func_return_type, args, func_type) 
        235 if when is not None: 
        236 # Use the data type to filter the query

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\audf.py in abstract_udf(col, func, func_return_type, attrs, func_type) 
        30 types = ["column_exp", "udf", "pandas_udf"] 
        31 if func_type not in types: 
   ---> 32 RaiseIt.value_error(func_type, types) 
        33 
        34 # It handle if func param is a plain expression or a function returning and expression

    ~\AppData\Roaming\Python\Python36\site-packages\optimus\helpers\raiseit.py in value_error(var, data_values) 
        76 type=divisor.join(map( 
        77 lambda x: "'" + x + "'", 
   ---> 78 data_values)), var_type=one_list_to_val(var))) 
        79 
        80 @staticmethod

    ValueError: 'func_type' must be 'column_exp', 'udf', 'pandas_udf', received 'None'
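For reference, the check that fails can be sketched from the three `audf.py` lines visible in the traceback. This is a minimal stand-in, not Optimus's actual code: `check_func_type` and `ALLOWED_FUNC_TYPES` are hypothetical names, and the message format is copied from the error above. It only illustrates that the error appears whenever `func_type` reaches the check as `None`, regardless of the tweet data.

```python
# Hypothetical stand-in for the validation shown in optimus/audf.py (lines 30-32
# of the traceback). Names here are illustrative, not part of the Optimus API.
ALLOWED_FUNC_TYPES = ["column_exp", "udf", "pandas_udf"]


def check_func_type(func_type):
    """Raise ValueError unless func_type is one of the allowed strings."""
    if func_type not in ALLOWED_FUNC_TYPES:
        allowed = ", ".join("'" + t + "'" for t in ALLOWED_FUNC_TYPES)
        raise ValueError(
            "'func_type' must be {}, received '{}'".format(allowed, func_type)
        )
    return func_type


# Passing None (the value reported in the traceback) reproduces the message.
try:
    check_func_type(None)
    message = ""
except ValueError as exc:
    message = str(exc)

print(message)
```

Since the value `None` never comes from the call in the notebook, this suggests the problem lies in how `remove_accents` hands `func_type` to `apply` inside the installed Optimus version, rather than in the column name or the data.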

How can I fix this error? Judging by all the documentation and pages I have looked through, this step should not raise any error.
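As a possible stopgap while the Optimus call fails, the two cleaning steps can be approximated in plain Python with only the standard library. This is a sketch under assumptions, not a reimplementation of Optimus: the function names below are hypothetical, and "special characters" is taken here to mean anything other than ASCII letters, digits, and spaces, which may differ from what `remove_special_chars` removes.

```python
import re
import unicodedata


def remove_accents(text):
    """Strip diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def remove_special_chars(text):
    """Drop everything except ASCII letters, digits, and spaces (an assumption)."""
    return re.sub(r"[^A-Za-z0-9 ]+", "", text)


def clean_tweet(text):
    """Apply both steps in the same order as the chained Optimus call."""
    return remove_special_chars(remove_accents(text))


print(clean_tweet("Café déjà vu!!! #spark @user"))  # prints: Cafe deja vu spark user
```

A function like `clean_tweet` could in principle be applied to the `tweet` column through a regular Spark UDF, although that would bypass Optimus entirely and does not explain the original error.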
