python-3.x Pandas dataframe : TypeError: unorderable types: str() >= datetime.date()

Posted by flmtquvp on 2023-01-10 in Python


I am reading a .csv file into a Pandas DataFrame (CorpActionsDf). Its head looks like this:

                       date  factor_value reference             factor
unique_id                                                             
BBG.XAMS.ASML.S  24/04/2015          0.70    Annual       Regular Cash
BBG.XAMS.ASML.S  25/04/2014          0.61    Annual       Regular Cash
BBG.XAMS.ASML.S  26/04/2013          0.53    Annual       Regular Cash
BBG.XAMS.ASML.S  26/11/2012          9.18      None  Return of Capital
BBG.XAMS.ASML.S  27/04/2012          0.46    Annual       Regular Cash

I then try to filter the DataFrame so that I keep only the data between two dates.

startDate=02-01-2008
endDate=20-02-2008

But I get the following error:

TypeError: <class 'datetime.date'> type object 2008-01-02

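I have another process that uses startDate and endDate to filter information, but for some reason I cannot get the filtering to work this time. My code is as follows: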

def getCorpActionsData(rawStaticDataPath,startDate,endDate):
    pattern = 'CorporateActions'+ '.csv'
    staticPath = rawStaticDataPath

    with open(staticPath+pattern,'rt') as f:
        CorpActionsDf = pd.read_csv(f,engine='c',header=None,usecols=[0,1,2,3,4],parse_dates=[1],
                                    dayfirst=True,index_col=[1],names=['unique_id', 'date','factor_value','reference','factor'])
        print(CorpActionsDf.head())

        CorpActionsDf = CorpActionsDf[(CorpActionsDf.index >= startDate) & (CorpActionsDf.index <= endDate)]

I set parse_dates to column 1, so I am not sure what I am doing wrong.

python-3.x

Source: https://stackoverflow.com/questions/36813653/pandas-dataframe-typeerror-unorderable-types-str-datetime-date

2 Answers


#1 oxf4rvwz

UPDATE:

I guess your index is of string (object) type, because the condition (CorpActionsDf.index >= startDate) gives you the str() >= datetime.date() error message.
What is the output of CorpActionsDf.index.dtype?
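
To check this, it may help to look at the index dtype directly. The following is only a minimal sketch: the frame `df` and its values are made up for illustration and are not the asker's data. It assumes the index holds day-first date strings and converts them explicitly.

```python
import pandas as pd

# Hypothetical frame whose index holds day-first date strings, not datetimes.
df = pd.DataFrame(
    {"factor_value": [0.70, 0.61, 0.53]},
    index=["24/04/2015", "25/04/2014", "26/04/2013"],
)
# Prints a string/object dtype, which cannot be ordered against dates.
print(df.index.dtype)

# Convert the labels explicitly; dayfirst=True reads 24/04/2015 as 24 April.
df.index = pd.to_datetime(df.index, dayfirst=True)
# Prints a datetime64 dtype, so comparisons with timestamps are now valid.
print(df.index.dtype)
```

If the second dtype is a datetime64 type, comparisons against pd.Timestamp bounds should no longer raise the unorderable-types error.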

OLD answer:

Make sure that startDate and endDate have the correct data type:

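# dayfirst=True so that '02-01-2008' is read as 2 January 2008, as in the question
startDate=pd.to_datetime('02-01-2008', dayfirst=True)
endDate=pd.to_datetime('20-02-2008', dayfirst=True)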
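
Note that day-first strings such as '02-01-2008' are ambiguous, so how they are parsed matters. A small sketch of the difference, and of wrapping a datetime.date in pd.Timestamp; the variable names start_date and end_date are illustrative only:

```python
import datetime
import pandas as pd

# Without dayfirst, an ambiguous string is read month-first.
print(pd.to_datetime("02-01-2008"))                 # 2008-02-01 00:00:00
print(pd.to_datetime("02-01-2008", dayfirst=True))  # 2008-01-02 00:00:00

# A datetime.date bound can be wrapped in pd.Timestamp before comparing.
start_date = pd.Timestamp(datetime.date(2008, 1, 2))
end_date = pd.to_datetime("20-02-2008", dayfirst=True)
```

Both bounds are then pandas Timestamps, which compare cleanly against a DatetimeIndex.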


#2 vcirk6k6

You can try converting the strings with to_datetime first and then use those values to index:

import pandas as pd
import io

temp=u"""
BBG.XAMS.ASML.S,24/04/2015,0.70,Annual,Regular Cash
BBG.XAMS.ASML.S,25/04/2014,0.61,Annual,Regular Cash
BBG.XAMS.ASML.S,26/04/2013,0.53,Annual,Regular Cash
BBG.XAMS.ASML.S,26/11/2012,9.18,None,Return of Capital
BBG.XAMS.ASML.S,27/04/2012,0.46,Annual,Regular Cash
"""
# after testing, replace io.StringIO(temp) with the filename
CorpActionsDf = pd.read_csv(io.StringIO(temp), 
                 header=None,
                 usecols=[0,1,2,3,4],
                 parse_dates=[1],
                 dayfirst=True,
                 index_col=[1],
                 names=['unique_id', 'date','factor_value','reference','factor'])
print(CorpActionsDf)
                  unique_id  factor_value reference             factor
date                                                                  
2015-04-24  BBG.XAMS.ASML.S          0.70    Annual       Regular Cash
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital
2012-04-27  BBG.XAMS.ASML.S          0.46    Annual       Regular Cash    
startDate=pd.to_datetime('2014-04-25')
endDate=pd.to_datetime('2012-11-26')

print(CorpActionsDf[startDate:endDate])
                  unique_id  factor_value reference             factor
date                                                                  
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital

Interestingly, if strings are used, the last row is omitted:

print(CorpActionsDf['2014-04-25':'2012-11-26'])
                  unique_id  factor_value reference        factor
date                                                             
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual  Regular Cash
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual  Regular Cash

EDIT:
You have to call sort_index for the selection to work correctly:

print(CorpActionsDf)
                  unique_id  factor_value reference             factor
date                                                                  
2015-04-24  BBG.XAMS.ASML.S          0.70    Annual       Regular Cash
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital
2012-04-27  BBG.XAMS.ASML.S          0.46    Annual       Regular Cash

CorpActionsDf = CorpActionsDf.sort_index()
print(CorpActionsDf)
                  unique_id  factor_value reference             factor
date                                                                  
2012-04-27  BBG.XAMS.ASML.S          0.46    Annual       Regular Cash
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
2015-04-24  BBG.XAMS.ASML.S          0.70    Annual       Regular Cash

print(CorpActionsDf['2012-11-2':'2014-04-25'])
                  unique_id  factor_value reference             factor
date                                                                  
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
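
The sorting step can be wrapped up so it is not forgotten. A minimal sketch, assuming a made-up frame and a hypothetical helper named slice_by_date (neither is part of the original answer):

```python
import pandas as pd

# Made-up data with an unsorted DatetimeIndex.
df = pd.DataFrame(
    {"factor_value": [0.70, 0.46, 0.53]},
    index=pd.to_datetime(["24/04/2015", "27/04/2012", "26/04/2013"], dayfirst=True),
)

def slice_by_date(frame, start, end):
    """Hypothetical helper: sort the index if needed, then slice by label."""
    if not frame.index.is_monotonic_increasing:
        frame = frame.sort_index()
    # Label slicing with .loc includes both end points.
    return frame.loc[start:end]

subset = slice_by_date(df, "2012-11-02", "2014-04-25")
print(subset)
```

Checking is_monotonic_increasing first avoids re-sorting a frame that is already in order.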

Another solution with truncate:

print(CorpActionsDf.truncate(before='2012-11-2', after='2014-04-25'))
                  unique_id  factor_value reference             factor
date                                                                  
2012-11-26  BBG.XAMS.ASML.S          9.18      None  Return of Capital
2013-04-26  BBG.XAMS.ASML.S          0.53    Annual       Regular Cash
2014-04-25  BBG.XAMS.ASML.S          0.61    Annual       Regular Cash
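
As an alternative to label slicing, the comparison from the question can be kept as a boolean mask, which does not depend on the index being sorted. This is only a sketch under assumptions: filter_by_date is a hypothetical helper, the data is made up, and the bounds are assumed to be day-first strings like those in the question.

```python
import pandas as pd

def filter_by_date(frame, start, end, dayfirst=True):
    """Hypothetical helper: keep rows whose index lies in [start, end]."""
    start = pd.to_datetime(start, dayfirst=dayfirst)
    end = pd.to_datetime(end, dayfirst=dayfirst)
    # Convert the index too, in case it still holds strings.
    index = pd.to_datetime(frame.index, dayfirst=dayfirst)
    mask = (index >= start) & (index <= end)
    return frame[mask]

# Made-up frame with an unsorted string index.
df = pd.DataFrame(
    {"factor_value": [0.70, 9.18, 0.53]},
    index=["24/04/2015", "26/11/2012", "26/04/2013"],
)
result = filter_by_date(df, "02-11-2012", "25-04-2014")
print(result)
```

Because every operand is converted to a datetime type before the comparison, the str versus datetime.date mismatch from the question cannot occur here.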
