python Pandas月的续集还没拍完

lnlaulya  于 2023-01-01  发布在  Python
关注(0)|答案(2)|浏览(97)

我的sql表

SDATETIME   FE014BPV    FE011BPV
0   2022-05-28   5.770000   13.735000
1   2022-05-30  16.469999   42.263000
2   2022-05-31  56.480000  133.871994
3   2022-06-01  49.779999  133.561996
4   2022-06-02  45.450001  132.679001
..         ...        ...         ...
93  2022-09-08   0.000000    0.050000
94  2022-09-09   0.000000    0.058000
95  2022-09-10   0.000000    0.051000
96  2022-09-11   0.000000    0.050000
97  2022-09-12   0.000000    0.038000

我的代码:

import pandas as pd
import pyodbc

monthSQL = pd.read_sql_query('SELECT SDATETIME,max(FE014BPV) as flow,max(FE011BPV) as steam  FROM [SCADA].[dbo].[TOTALIZER] GROUP BY SDATETIME ORDER BY SDATETIME ASC', conn)

monthdata = monthSQL.groupby(monthSQL['SDATETIME'].dt.strftime("%b-%Y"), sort=True).sum()
print(monthdata)

产生此错误输出

flow        steam
SDATETIME                          
Aug-2022   1800.970001  2580.276996
Jul-2022   1994.300014  2710.619986
Jun-2022   3682.329998  7633.660018
May-2022   1215.950003  3098.273025
Sep-2022      0.000000     1.705000

我想输出一些东西如下

SDATETIME         flow        steam
May-2022   1215.950003  3098.273025
Jun-2022   3682.329998  7633.660018
Jul-2022   1994.300014  2710.619986
Aug-2022   1800.970001  2580.276996
Sep-2022      0.000000     1.705000

此外,还需要过去12个月数据的总和

mkshixfv

mkshixfv1#

您正在按字母顺序对***日期名称***进行排序-您需要指定要排序的列。您可以看到这一点,因为它是(日期的起始字母):

SDATETIME                          
Aug-2022 # A goes before J, M, S in the alphabet
Jul-2022 # J goes after A, but before M and S in the alphabet
Jun-2022 # J goes after A, but before M and S in the alphabet
May-2022 # M goes after A, J but before S in the alphabet
Sep-2022 # S goes after A, J, M in the alphabet
sczxawaw

sczxawaw2#

输出是正确的,只是顺序不符合您的期望。请尝试以下操作:

# This keep the SDATETIME as datetime, not string
monthdata = monthSQL.groupby([pd.Grouper(key="SDATETIME", freq="MS")]).sum()

# Add rolling sum of the last 12 months
pd.concat(
    [
        monthdata,
        monthdata.add_suffix("_LAST12").rolling("366D").sum(),
    ],
    axis=1,
)

关于rolling(...)操作:很容易认为rolling(12)提供的是过去12个月的滚动总和。实际上,它返回的是过去12行的滚动总和。这一点很重要,因为如果数据中存在间隙,12行可能涵盖12个月以上的时间。rolling("366D")确保只计算过去366天内的行,这是任何12个月期间的最大长度。
我们不能使用rolling("12M"),因为月份没有固定的持续时间。一个月有28到31天。

相关问题