pandas: Performing value_counts() on multiple columns in a DataFrame shifts the count column to the right after each iteration

Posted by mnemlml8 on 2022-12-16 in Other


Given this subset of a DataFrame:

>>> df[['Source','Destination','Attack Name']].head()

               Source                     Destination    Attack Name
0              10.x.x.116                 10.x.x.71      RDP Enforcement Violation
1              43.x.x.233                 152.x.x.148    Scanner Enforcement Violation
2  hn.kd.dhcp (61.x.x.192)                152.x.x.148    NaN
3             104.x.x.241                 152.x.x.116    Scanner Enforcement Violation
4              117.x.x.61                 152.x.x.52     NaN

I want to count, for each destination, the number of attacks coming from the top 10 sources.
I tried something like this:

import pandas as pd

outReport = 'test.xlsx'
df = pd.read_csv("IPSLogs2.csv")

def statsPerAttacker():
    topSrc = df['Source'].value_counts()[:10]
    mastaSR = pd.Series()
    for ip in topSrc.to_dict():
        df_statsPerAttacker = df[df['Source']==ip][['Source', 'Destination', 'Attack Name']].value_counts().to_frame()
        mastaSR = pd.concat([mastaSR, df_statsPerAttacker], axis=1)

    with pd.ExcelWriter(outReport, engine='openpyxl') as writer:
        mastaSR.to_excel(writer, startcol=2, startrow=2, header=False)

if __name__ == '__main__':
    statsPerAttacker()

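I do get a result, but the last column (the count) moves one position to the right after each source IP iteration (see screenshot):
https://postimg.cc/8FBnq07g
What am I doing wrong? Thanks.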

pandas

Source: https://stackoverflow.com/questions/74768772/performing-value-counts-on-multiple-columns-in-dataframe-shifts-count-column-t

1 answer


332nm8kg1#

The problem was probably caused by my unfamiliarity with Series and DataFrame objects. I solved it with a different approach:

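def statsPerAttacker():
    # The 10 most frequent source IPs
    topSrc = df['Source'].value_counts()[:10]
    # Count attack names per (Source, Destination) pair in a single pass
    stats = df.groupby(['Source','Destination'])['Attack Name'].value_counts()
    ips = topSrc.index.tolist()

    # Keep only the rows whose first index level (Source) is a top source
    statsPerSourceIP = stats.loc[ips]
    # mode='a' appends to an existing workbook, so outReport must already exist
    with pd.ExcelWriter(outReport, engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
        statsPerSourceIP.to_excel(writer, sheet_name='StatisticsByCriticality', startcol=2, startrow=35, header=False)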
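
As for why the original loop shifted the counts, a likely explanation is the axis=1 in pd.concat. Each per-source result has its own (Source, Destination, Attack Name) index, and those indexes do not overlap between sources, so concatenating along axis=1 adds a new column for every source and fills the rest with NaN. Stacking along axis=0 keeps a single count column. Below is a minimal sketch of that behaviour, not the original data: the sample DataFrame and the names cols, top_sources, per_source, wide, tall and single_pass are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for IPSLogs2.csv
df = pd.DataFrame({
    "Source": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.2", "10.0.0.2", "10.0.0.3"],
    "Destination": ["152.0.0.1", "152.0.0.1", "152.0.0.1", "152.0.0.2", "152.0.0.2", "152.0.0.3"],
    "Attack Name": ["RDP", "RDP", "Scanner", "Scanner", "Scanner", "RDP"],
})

cols = ["Source", "Destination", "Attack Name"]

# Top 2 sources here (the question uses the top 10)
top_sources = df["Source"].value_counts().index[:2]

# One Series of counts per source, indexed by (Source, Destination, Attack Name)
per_source = [df.loc[df["Source"] == ip, cols].value_counts() for ip in top_sources]

# axis=1 aligns on the index: the indexes are disjoint, so every source
# gets its own column and the counts appear to move right
wide = pd.concat(per_source, axis=1, keys=top_sources)

# axis=0 stacks the results into a single column of counts
tall = pd.concat(per_source, axis=0)

# Alternative without a loop: filter to the top sources, then count once
single_pass = df.loc[df["Source"].isin(top_sources), cols].value_counts()

print(wide)
print(tall)
```

In this sketch wide has one column per source with exactly one non-NaN value in each row, while tall and single_pass hold the same counts in a single column, which is the layout the question was after.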

2022-12-16
