Given this subset of a DataFrame:
>>> df[['Source','Destination','Attack Name']].head()
                    Source  Destination                    Attack Name
0               10.x.x.116    10.x.x.71      RDP Enforcement Violation
1               43.x.x.233  152.x.x.148  Scanner Enforcement Violation
2  hn.kd.dhcp (61.x.x.192)  152.x.x.148                            NaN
3              104.x.x.241  152.x.x.116  Scanner Enforcement Violation
4               117.x.x.61   152.x.x.52                            NaN
I want to count the number of attacks against each destination coming from the top 10 sources.
I tried something like this:
import pandas as pd

outReport = 'test.xlsx'
df = pd.read_csv("IPSLogs2.csv")

def statsPerAttacker():
    topSrc = df['Source'].value_counts()[:10]
    mastaSR = pd.Series()
    for ip in topSrc.to_dict():
        df_statsPerAttacker = df[df['Source']==ip][['Source', 'Destination', 'Attack Name']].value_counts().to_frame()
        mastaSR = pd.concat([mastaSR, df_statsPerAttacker], axis=1)
    with pd.ExcelWriter(outReport, engine='openpyxl') as writer:
        mastaSR.to_excel(writer, startcol=2, startrow=2, header=False)

if __name__ == '__main__':
    statsPerAttacker()
I do get results, but the last column shifts one position to the right after each source IP iteration (see screenshot):
https://postimg.cc/8FBnq07g
What am I doing wrong? Thanks
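A likely cause of the shifting column: `pd.concat(..., axis=1)` places each input side by side and aligns rows on the index. Each per-source frame produced by `value_counts()` has its own (Source, Destination, Attack Name) index entries, so none of the rows overlap and every iteration adds a new column that is filled only for that source. Concatenating along the default `axis=0` would stack the frames instead. The loop can also be avoided entirely. Below is a minimal sketch, not the method from the answer that follows; the sample data, the `stats_per_attacker` signature, the `top_n` parameter and the `Count` column name are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for IPSLogs2.csv (values are made up).
df = pd.DataFrame({
    "Source": ["a", "a", "b", "a", "b", "c"],
    "Destination": ["x", "x", "x", "y", "x", "y"],
    "Attack Name": ["RDP", "RDP", "Scanner", None, "Scanner", "RDP"],
})

def stats_per_attacker(frame, top_n=10):
    # Pick the top_n most frequent source addresses.
    top_sources = frame["Source"].value_counts().head(top_n).index
    subset = frame[frame["Source"].isin(top_sources)]
    # A single groupby produces one count column for all sources at once.
    # dropna=False keeps rows whose Attack Name is missing (an assumption;
    # the original value_counts() call drops them by default).
    counts = (
        subset.groupby(["Source", "Destination", "Attack Name"], dropna=False)
        .size()
        .reset_index(name="Count")
        .sort_values(["Source", "Count"], ascending=[True, False], ignore_index=True)
    )
    return counts

report = stats_per_attacker(df, top_n=2)
print(report)
```

The resulting frame has one row per (Source, Destination, Attack Name) combination and a single `Count` column, so it could be written out with something like `report.to_excel(outReport, index=False)`, provided openpyxl is installed.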
1 answer
332nm8kg1#
The problem was probably caused by my lack of understanding of Series and DataFrame objects. I solved my problem using a different approach: