Python pandas: filtering and comparing dates

Posted by 2hh7jdfx on 2023-01-29 in Python


I have a SQL table containing the data below, which I read into pandas.

df = pandas.read_sql('Database count details', con=engine,
                     index_col='id', parse_dates=['newest_date_available'])

Output:

id       code   newest_date_available
9793708  3514   2015-12-24
9792282  2399   2015-12-25
9797602  7452   2015-12-25
9804367  9736   2016-01-20
9804438  9870   2016-01-20

The next line of code gets the date one week ago:

date_before = datetime.date.today() - datetime.timedelta(days=7) # Which is 2016-01-20

What I am trying to do is compare date_before with df and print all rows whose date is earlier than date_before:

if df['newest_date_available'] < date_before:
    print(df)  # intended: print all matching rows

Obviously, this raises an error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How should I do this?


Source: https://stackoverflow.com/questions/36104500/pandas-filtering-and-comparing-dates

3 Answers


Answer 1 (y0u0uwnf)

I would build a boolean mask and use it to select the rows, like this:

a = df[df['newest_date_available'] < date_before]

If date_before = datetime.date(2016, 1, 19), this returns:

id  code newest_date_available
0  9793708  3514            2015-12-24
1  9792282  2399            2015-12-25
2  9797602  7452            2015-12-25
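As a minimal, self-contained sketch of why the mask works where the if statement fails: the comparison produces one boolean per row, which if cannot reduce to a single True/False, but which can be used to index the DataFrame. The frame below is rebuilt by hand from the question's sample rows instead of being read from SQL, and the cutoff is written as a pd.Timestamp rather than a datetime.date (an assumption made here so the example does not depend on version-specific coercion).

```python
import pandas as pd

# Sample rows copied from the question; built by hand instead of read_sql.
df = pd.DataFrame(
    {
        "id": [9793708, 9792282, 9797602, 9804367, 9804438],
        "code": [3514, 2399, 7452, 9736, 9870],
        "newest_date_available": pd.to_datetime(
            ["2015-12-24", "2015-12-25", "2015-12-25", "2016-01-20", "2016-01-20"]
        ),
    }
)

date_before = pd.Timestamp(2016, 1, 19)

# The comparison yields a boolean Series: one value per row.
mask = df["newest_date_available"] < date_before

# Boolean indexing keeps only the rows where the mask is True.
older = df[mask]
print(older)

# If a single yes/no answer is really wanted, reduce the mask explicitly.
if mask.any():
    print(f"{mask.sum()} rows are older than {date_before.date()}")
```

Calling mask.any() or mask.all() is what the error message is hinting at: it turns the per-row result into the single truth value that if expects.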


Answer 2 (xienkqul)

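Using datetime.date(2019, 1, 10) works because pandas coerces the date to a datetime under the hood, but this will no longer be the case in future versions of pandas.
From version 0.24 onward, it emits a warning:
FutureWarning: Comparing Series of datetimes with 'datetime.date'. Currently, the 'datetime.date' is coerced to a datetime. In the future pandas will not coerce, and a TypeError will be raised.
A better solution is pd.Timestamp, which the official documentation describes as the pandas replacement for Python's datetime.datetime object.
To give an example based on the OP's initial dataset, this is how you would use it: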

import pandas as pd

# Compare against a pd.Timestamp instead of a datetime.date
cond1 = df.newest_date_available < pd.Timestamp(2016, 1, 10)
df.loc[cond1]
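For the rolling one-week cutoff from the question, a datetime.date can be wrapped in pd.Timestamp before the comparison. This is only a sketch: the frame is rebuilt by hand from the question's sample rows, and today is fixed to 2016-01-27 (an assumption, chosen so the cutoff matches the 2016-01-20 mentioned in the question and the result is reproducible).

```python
import datetime

import pandas as pd

# Sample rows from the question, with id as the index as in the read_sql call.
df = pd.DataFrame(
    {
        "code": [3514, 2399, 7452, 9736, 9870],
        "newest_date_available": pd.to_datetime(
            ["2015-12-24", "2015-12-25", "2015-12-25", "2016-01-20", "2016-01-20"]
        ),
    },
    index=pd.Index([9793708, 9792282, 9797602, 9804367, 9804438], name="id"),
)

# Fixed "today" for reproducibility; real code would use datetime.date.today().
today = datetime.date(2016, 1, 27)

# Convert the plain date to a Timestamp (midnight) before comparing.
date_before = pd.Timestamp(today - datetime.timedelta(days=7))

cond1 = df["newest_date_available"] < date_before
result = df.loc[cond1]
print(result)
```

Because the comparison is strict, the two rows dated exactly 2016-01-20 are excluded; use <= if the cutoff day itself should be included.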


Answer 3 (zpgglvta)

A bit late, but I think this is worth mentioning. If you are looking for a solution that dynamically computes the date one week ago, this may help:

In [3]: df = pd.DataFrame({'alpha': list('ABCDE'), 'num': range(5), 'date': pd.date_range('2022-06-30', '2022-07-04')})

In [4]: df
Out[4]: 
  alpha  num       date
0     A    0 2022-06-30
1     B    1 2022-07-01
2     C    2 2022-07-02
3     D    3 2022-07-03
4     E    4 2022-07-04

In [5]: df.query('date < "%s"' % (pd.Timestamp.now().normalize() - pd.Timedelta(7, 'd')))
Out[5]: 
  alpha  num       date
0     A    0 2022-06-30
1     B    1 2022-07-01

Explanation:

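I created a new df with more recent dates. Today is 2022-07-09 (pd.Timestamp.now().normalize()), and seven days ago was 2022-07-02 (pd.Timestamp.now().normalize() - pd.Timedelta(7, 'd')). The string-formatting operator % inserts that cutoff into the query() expression, so only the observations whose date column is earlier than 2022-07-02 are returned.
normalize() is important here because it resets the time to midnight. Without it, query() would also return the observations dated 2022-07-02, because: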

# Timestamp('2022-07-09 17:53:03.078172') > Timestamp('2022-07-09 00:00:00')
In [6]: pd.Timestamp.now() > pd.Timestamp.now().normalize()
Out[6]: True
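As an alternative to formatting the cutoff into the query string, query() can refer to a local Python variable by prefixing its name with @. The sketch below reuses the same frame; the name cutoff is introduced here for illustration, and the current date is pinned to 2022-07-09 (an assumption, so the result does not depend on when the code is run).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "alpha": list("ABCDE"),
        "num": range(5),
        "date": pd.date_range("2022-06-30", "2022-07-04"),
    }
)

# Pinned "now" for reproducibility; real code would use
# pd.Timestamp.now().normalize() - pd.Timedelta(7, "d").
cutoff = pd.Timestamp("2022-07-09").normalize() - pd.Timedelta(7, "d")

# @cutoff is resolved from the local Python scope, so no string formatting is needed.
older = df.query("date < @cutoff")
print(older)
```

Passing the Timestamp through @ avoids a round trip through its string representation, which keeps the comparison typed end to end.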
