pandas: serializing a DataFrame to CSV does not preserve it, and plotting fails

j7dteeu8  posted on 2022-12-31 in Other

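My code calls an API that returns JSON, which is converted to a pandas DataFrame before being returned to my application code.
Since the API response should be immutable for the same parameters, I want to cache the DataFrame on my file system so that I can reuse the responses of earlier calls. To do that, I serialize it to CSV, as the pandas documentation suggests.
I use finplot to plot the fetched data. The problem is that calling the plotting function fails whenever the DataFrame is loaded from the file system, but always succeeds when the DataFrame comes straight from the API. This suggests that serializing and deserializing with the pandas CSV methods changes some aspect of my DataFrame, but I can't tell which one. The full code snippet is below: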

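import os

import finplot as fplt
import pandas as pd

# symbol, symbol_path and call_api are defined elsewhere in the script.

def plot(df):
    ax = fplt.create_plot("MY SYMBOL", rows=1)
    # Candlestick data in the column order finplot expects: open, close, high, low.
    dfn = df[['Open','Close','High','Low']]
    fplt.candlestick_ochl(dfn)
    # Overlay the two moving averages on the same axis.
    fplt.plot(df['MA100'], ax=ax, legend='ma-100', color='#927', width=3)
    fplt.plot(df['MA50'], ax=ax, legend='ma-50', color='#188bc2', width=3)
    fplt.show()

def get_time_series_data(symbol):
    if os.path.exists(symbol_path):
        # Cache hit: load the previously saved response from disk.
        df = pd.read_csv(symbol_path)
        return df
    else:
        # Cache miss: fetch from the API and save the result for next time.
        df = call_api(symbol)
        df.to_csv(symbol_path)
        return df

df = get_time_series_data(symbol)
plot(df)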

When the code takes the load-from-file-system path, the plotting call fails with:

Traceback (most recent call last):
  File "$USER_PATH\Desktop\stock_backtest\yahoo_finance_main.py", line 89, in <module>
    plot(df)
  File "$USER_PATH\Desktop\stock_backtest\yahoo_finance_main.py", line 38, in plot
    fplt.show()
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1841, in show
    refresh()
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1835, in refresh
    _repaint_candles()
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 2425, in _repaint_candles
    _end_visual_update(item)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 2389, in _end_visual_update
    item.repaint()
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1089, in repaint
    self.paint(self.painter)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1094, in paint
    self.update_dirty_picture(self.viewRect())
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1103, in update_dirty_picture
    self._generate_picture(visibleRect)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1110, in _generate_picture
    self.generate_picture(self.cachedRect)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1155, in generate_picture
    df,origlen = self.datasrc.rows(5, left, right, yscale=self.ax.vb.yscale)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 453, in rows
    return self._rows(df, colcnt, yscale=yscale, lod=lod), origlen
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 460, in _rows
    dfr = df.iloc[:,colidxs]
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexing.py", line 1067, in __getitem__
    return self._getitem_tuple(key)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexing.py", line 1566, in _getitem_tuple
    tup = self._validate_tuple_indexer(tup)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexing.py", line 873, in _validate_tuple_indexer
    self._validate_key(k, i)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexing.py", line 1484, in _validate_key
    raise IndexError("positional indexers are out-of-bounds")
IndexError: positional indexers are out-of-bounds

I tried changing the read_csv call to use column 0 as the index:

df = pd.read_csv(symbol_path, index_col=0)

This still fails, but now with a different error message:

Traceback (most recent call last):
  File "$USER_PATH\Desktop\stock_backtest\yahoo_finance_main.py", line 89, in <module>
    plot(df)
  File "$USER_PATH\Desktop\stock_backtest\yahoo_finance_main.py", line 35, in plot
    fplt.candlestick_ochl(dfn)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 1462, in candlestick_ochl
    datasrc = _create_datasrc(ax, datasrc)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 2144, in _create_datasrc
    datasrc = do_create(iargs)
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 2140, in do_create
    return PandasDataSource(args[0])
  File "$USER_PATH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\finplot\__init__.py", line 248, in __init__
    if type(df.index) == pd.DatetimeIndex or df.index[-1]>1e7 or '.RangeIndex' not in str(type(df.index)):
TypeError: '>' not supported between instances of 'str' and 'float'

What is the problem, and how can I fix it?

rkue9o1l  #1

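The problem is that when pandas reads the CSV, the date index comes back as strings by default, and the plotting library fails because it cannot handle a DataFrame with a string index.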
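To see what the CSV round trip changes, it can help to compare the index type and columns of each variant. The sketch below uses a small made-up DataFrame as a stand-in for the API result (the values and the "Date" index name are assumptions, not the asker's data), and writes to an in-memory string instead of a file:

```python
import io

import pandas as pd

# Hypothetical stand-in for the DataFrame returned by the API.
original = pd.DataFrame(
    {"Open": [10.0, 11.0], "Close": [10.5, 10.8]},
    index=pd.DatetimeIndex(["2022-12-29", "2022-12-30"], name="Date"),
)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = original.to_csv()

# Default read: the dates become an ordinary column and the index is 0..n-1.
default_read = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the dates are the index again, but as plain strings.
indexed_read = pd.read_csv(io.StringIO(csv_text), index_col=0)

for label, frame in [
    ("original", original),
    ("default read", default_read),
    ("index_col=0", indexed_read),
]:
    print(label, type(frame.index).__name__, list(frame.columns))
```

The default read appears to match the first traceback (the date index is gone, so the frame passed to finplot has one column fewer than it looks for), and the index_col=0 read matches the second (a string index being compared with a float).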
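The solution is to tell pandas to parse the index column as dates while reading. This is not as simple as setting the 'parse_dates' flag to true, because in that case pandas parsed the values as Python datetime objects rather than pandas Timestamps, which also fails.
What worked for me was to specify an explicit date parser: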

pd.read_csv(symbol_path, index_col=0, date_parser=pd.Timestamp)
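Note that the date_parser argument has been deprecated since pandas 2.0, so on newer versions a possible alternative is to read the index as strings and convert it afterwards with pd.to_datetime. The sketch below assumes the cached file stores timezone-aware timestamps with differing UTC offsets, which is one situation where plain date parsing may not yield a DatetimeIndex; the CSV content is made up for illustration:

```python
import io

import pandas as pd

# Hypothetical cached CSV whose first column holds timestamps with two
# different UTC offsets (for example, before and after a daylight-saving change).
csv_text = (
    "Date,Open,Close,High,Low\n"
    "2022-03-11 00:00:00-05:00,10.0,10.5,11.0,9.8\n"
    "2022-03-14 00:00:00-04:00,11.0,10.8,11.2,10.6\n"
)

# Read the first column as a plain string index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Normalize every timestamp to UTC so the result is a single DatetimeIndex.
df.index = pd.to_datetime(df.index, utc=True)

is_datetime_index = isinstance(df.index, pd.DatetimeIndex)
print(type(df.index).__name__, df.index.tz)
```

Converting to UTC shifts the displayed wall-clock times, so if the original exchange-local times matter, the index can be converted back afterwards with tz_convert.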
