How to add new DataFrame columns (strings) from a Series of JSON-like values

Posted by iyfjxgzm on 2022-12-30 in Other

I want to parse and process the data in my DataFrame.
I tried join, assign, and so on. With the following code I managed to expand the "allowed" column:

allowed_expanded = df1.allowed.apply(lambda x:pd.Series(x))
allowed_expanded.columns = ['{}.{}'.format('allowed',i) for i in allowed_expanded]

The result is:

# allowed_expanded

                                             allowed.0                  allowed.1   allowed.2
0           {'IPProtocol': 'tcp', 'ports': ['53']}                        NaN         NaN
1   {'IPProtocol': 'tcp', 'ports': ['22', '3389']}                    NaN         NaN
2                               {'IPProtocol': 'icmp'}     {'IPProtocol': 'sctp'}         NaN
3                            {'IPProtocol': 'all'}                        NaN         NaN

But this is not what I want.
What should I do?
My data currently looks like this:

# print(df)
          network                                            allowed
0           vpc-1           [{'IPProtocol': 'tcp', 'ports': ['53']}]
1           vpc-1   [{'IPProtocol': 'tcp', 'ports': ['22', '3389']}]
2           vpc-1   [{'IPProtocol': 'icmp'}, {'IPProtocol': 'sctp'}]
3           vpc-1                            [{'IPProtocol': 'all'}]
...

And this is what I want:

# print(df)
          network           allowed.IPProtocol    allowed.ports
0           vpc-1                          tcp               53
1           vpc-1                          tcp         22, 3389
2           vpc-1                   icmp, sctp                -
3           vpc-1                          all                -
...
Answer 1 (jjjwad0x)

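import pandas as pd

def func(row):
    # Collect the protocol and the ports of every rule dict in one cell.
    IPProtocol = []
    ports = []
    for item in row:
        IPProtocol.append(item.get('IPProtocol'))  # None when the key is missing
        ports.append(item.get('ports'))
    return pd.Series([IPProtocol, ports])

# Note: both new columns hold Python lists, not joined strings.
df[['allowed.IPProtocol', 'allowed.ports']] = df['allowed'].apply(func)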

Hope this helps!
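
The function above leaves Python lists in the new columns (with None where a rule has no ports), while the question asks for comma-separated strings. A minimal sketch of one way to build the strings directly, assuming every rule dict uses only the keys shown in the question. The helper name summarize and the "-" placeholder are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "network": ["vpc-1"] * 4,
    "allowed": [
        [{"IPProtocol": "tcp", "ports": ["53"]}],
        [{"IPProtocol": "tcp", "ports": ["22", "3389"]}],
        [{"IPProtocol": "icmp"}, {"IPProtocol": "sctp"}],
        [{"IPProtocol": "all"}],
    ],
})

def summarize(rules):
    """Hypothetical helper: collapse one list of rule dicts into two strings."""
    protocols = [rule["IPProtocol"] for rule in rules if "IPProtocol" in rule]
    # Flatten the ports of all rules; rules without 'ports' contribute nothing.
    ports = [port for rule in rules for port in rule.get("ports", [])]
    return pd.Series({
        "allowed.IPProtocol": ", ".join(protocols),
        "allowed.ports": ", ".join(ports) if ports else "-",
    })

# apply() with a Series-returning function yields a DataFrame,
# which is joined back to the network column on the row index.
result = df[["network"]].join(df["allowed"].apply(summarize))
print(result)
```

Returning a labelled Series from the helper lets apply produce the final column names in one step, so no separate renaming is needed.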

Answer 2 (krcsximq)

Could you try this:

# Join the protocol of every rule in the cell into one string.
df['allowed.IPProtocol'] = df['allowed'].apply(lambda x: ', '.join(i['IPProtocol'] for i in x))
# Join the ports of every rule; a rule without ports contributes the literal string 'nan'.
df['allowed.ports'] = df['allowed'].apply(lambda x: ', '.join(', '.join(i['ports']) if 'ports' in i else 'nan' for i in x))

Output:

|    | network   | allowed                                          | allowed.IPProtocol   | allowed.ports   |
|---:|:----------|:-------------------------------------------------|:---------------------|:----------------|
|  0 | vpc-1     | [{'IPProtocol': 'tcp', 'ports': ['53']}]         | tcp                  | 53              |
|  1 | vpc-1     | [{'IPProtocol': 'tcp', 'ports': ['22', '3389']}] | tcp                  | 22, 3389        |
|  2 | vpc-1     | [{'IPProtocol': 'icmp'}, {'IPProtocol': 'sctp'}] | icmp, sctp           | nan, nan        |
|  3 | vpc-1     | [{'IPProtocol': 'all'}]                          | all                  | nan             |
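
In this output, rows without ports show 'nan' once per rule, while the question asked for a single "-". As an alternative, here is a rough sketch that avoids per-row lambdas by exploding the lists first and then grouping back by the original row label. It assumes every "allowed" list is non-empty and every rule has an IPProtocol key, as in the sample data. The intermediate names (exploded, rules, protocols, ports) are made up for illustration:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "network": ["vpc-1"] * 4,
    "allowed": [
        [{"IPProtocol": "tcp", "ports": ["53"]}],
        [{"IPProtocol": "tcp", "ports": ["22", "3389"]}],
        [{"IPProtocol": "icmp"}, {"IPProtocol": "sctp"}],
        [{"IPProtocol": "all"}],
    ],
})

# One row per rule dict; the original row label is repeated in the index.
exploded = df["allowed"].explode()

# Turn each dict into columns; a missing 'ports' key becomes NaN.
rules = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Join the protocols that belong to the same original row.
protocols = rules["IPProtocol"].groupby(level=0).agg(", ".join)

# Drop rules without ports, flatten the port lists, then join per original row.
ports = rules["ports"].dropna().explode().groupby(level=0).agg(", ".join)

result = df[["network"]].assign(**{
    "allowed.IPProtocol": protocols,
    # Rows that had no ports at all get a single "-" placeholder.
    "allowed.ports": ports.reindex(df.index, fill_value="-"),
})
print(result)
```

Compared with the lambda version, this keeps each step inspectable (exploded, rules), which can help when the dicts contain more keys than the two shown here.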
