How do I get the shifted index values of a DataFrame in Pandas?

qfe3c7zg  posted on 2022-12-16  in Other

Consider this simple example:

import pandas as pd

date = pd.date_range('1/1/2011', periods=5, freq='H')

df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
                         'B']}, index = date)
df
Out[278]: 
                    cat
2011-01-01 00:00:00   A
2011-01-01 01:00:00   A
2011-01-01 02:00:00   A
2011-01-01 03:00:00   B
2011-01-01 04:00:00   B

I want to create a variable that holds the lagged/lead value of the index, something like this:

df['index_shifted']=df.index.shift(1)

For example, at time 2011-01-01 01:00:00, I expect the variable index_shifted to be 2011-01-01 00:00:00.
How can I do that? Thanks!

n1bvdmb6  #1

I think you need Index.shift with -1:

df['index_shifted']= df.index.shift(-1)
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00

For me it works without freq, but passing it may be necessary with real data:

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
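
It may help to see what Index.shift does in the outputs above: it moves every timestamp by whole periods of the frequency instead of moving values between rows. The sketch below rebuilds the question's data and compares the result with plain Timedelta arithmetic. It uses the lowercase alias "h" instead of the 'H' used above, because recent pandas releases deprecate the uppercase spelling; that choice is mine and not part of the original answer.

```python
import pandas as pd

date = pd.date_range("2011-01-01", periods=5, freq="h")
df = pd.DataFrame({"cat": ["A", "A", "A", "B", "B"]}, index=date)

# Index.shift(-1) subtracts one period (here one hour) from every timestamp.
shifted = df.index.shift(-1)

# The same result written as explicit arithmetic on the index.
same_as_offset = df.index - pd.Timedelta(hours=1)

# Assigning an Index to a column is positional, so row i receives shifted[i].
df["index_shifted"] = shifted
print(df)
```

Because the values are computed rather than taken from neighbouring rows, the first row receives 2010-12-31 23:00:00, a timestamp that does not occur in the original index.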

Edit:
If the freq of the DatetimeIndex is None, you need to pass freq to shift:

import pandas as pd

date = pd.date_range('1/1/2011', periods=5, freq='H').union(pd.date_range('5/1/2011', periods=5, freq='H'))

df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
                         'B','A', 'A', 'A', 'B',
                         'B']}, index = date)

print (df.index)
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00', '2011-05-01 00:00:00',
               '2011-05-01 01:00:00', '2011-05-01 02:00:00',
               '2011-05-01 03:00:00', '2011-05-01 04:00:00'],
              dtype='datetime64[ns]', freq=None)

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
2011-05-01 00:00:00   A 2011-04-30 23:00:00
2011-05-01 01:00:00   A 2011-05-01 00:00:00
2011-05-01 02:00:00   A 2011-05-01 01:00:00
2011-05-01 03:00:00   B 2011-05-01 02:00:00
2011-05-01 04:00:00   B 2011-05-01 03:00:00
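
The freq=None case can be sketched as a fallback. As far as I know, calling shift without freq on such an index raises pandas' NullFrequencyError, which is a subclass of ValueError. The sketch catches that and retries with an explicit frequency. The lowercase "h" alias and the Timedelta alternative are my own additions, not part of the answer above.

```python
import pandas as pd

# Two separate hourly ranges joined together: the result has no single frequency.
date = pd.date_range("2011-01-01", periods=5, freq="h").union(
    pd.date_range("2011-05-01", periods=5, freq="h")
)
df = pd.DataFrame({"cat": list("AAABBAAABB")}, index=date)

try:
    shifted = df.index.shift(-1)  # expected to fail because df.index.freq is None
except ValueError:
    # NullFrequencyError subclasses ValueError; retry with an explicit frequency.
    shifted = df.index.shift(-1, freq="h")

# Plain arithmetic never consults the freq attribute, so it works either way.
shifted_alt = df.index - pd.Timedelta(hours=1)

df["index_shifted"] = shifted
print(df)
```

Both variants shift by a fixed one hour, so across the gap the row 2011-05-01 00:00:00 gets 2011-04-30 23:00:00, not the last timestamp of the January block.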

i34xakig  #2

What is wrong with df['index_shifted']=df.index.shift(-1)?
(Genuine question, not sure whether I am missing something.)

0s0u357o  #3

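This is an old question, but if your timestamps have gaps, or you don't want to specify a frequency, and you are not dealing with time zones, the following will work: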

df['index_shifted'] = pd.Series(df.index).shift(-1).values
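
One thing to watch with this approach: Series.shift moves values by row position, so its sign convention is the opposite of Index.shift. With shift(-1) each row receives the timestamp of the following row, and shift(1) gives the preceding one, which is what the question asks for. A minimal sketch on a made-up index with a gap (the column names next_index and prev_index are only illustrative):

```python
import pandas as pd

# Hypothetical index with a four-hour gap before the last row.
idx = pd.DatetimeIndex(["2011-01-01 00:00", "2011-01-01 01:00", "2011-01-01 05:00"])
df = pd.DataFrame({"cat": ["A", "A", "B"]}, index=idx)

positions = pd.Series(df.index)

# Timestamp of the following row; the last row becomes NaT.
df["next_index"] = positions.shift(-1).values

# Timestamp of the preceding row; the first row becomes NaT.
df["prev_index"] = positions.shift(1).values

print(df)
```

The shifted values always come from the index itself, so gaps are handled naturally; the row at the edge gets NaT because it has no neighbour on that side.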

If you are dealing with time zones, you can use the following:

df['index_shifted'] = pd.to_datetime(pd.Series(df.index).shift(-1).values, utc=True).tz_convert('America/New_York')
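
A possible alternative that avoids the round trip through UTC is to shift a Series labelled by the index itself, so that the assignment aligns on labels and keeps the time zone. This is a sketch under the assumption that the index values are unique (label alignment does not work with duplicate labels); the sample data and the "h" alias are mine.

```python
import pandas as pd

idx = pd.date_range("2011-01-01", periods=3, freq="h", tz="America/New_York")
df = pd.DataFrame({"cat": ["A", "A", "B"]}, index=idx)

# to_series() uses the index as both labels and values, so the shifted
# Series lines up with df row by row and stays tz-aware.
df["index_shifted"] = df.index.to_series().shift(-1)

print(df.dtypes)
```

As with the .values variant, the shift is positional, so shift(-1) yields the next row's timestamp and shift(1) the previous one.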
