pandas: Convert a column of integers and integer+string values to all integers, using multiplication based on the string contents

Posted by ki0zmccv on 2023-01-01 in Other


How can I convert this column, which holds mostly integers plus a few strings, to all integers?
The column looks like this:

x1
___
128455551
92571902
123125
985166
np.NaN
2241
1.50000MMM
2.5255MMM
1.2255MMMM
np.NaN
...

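I want it to look like the column below: for rows ending in MMM, the characters are dropped, the number is multiplied by one billion (10^9), and the result is converted to an integer.
For rows ending in MMMM, the characters are dropped, and the number is multiplied by one trillion (10^12) and converted to an integer.
Basically, each M means a factor of 1,000. The DataFrame has other columns, so I can't drop the np.NaN rows.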

x1
___
128455551
92571902
123125
985166
np.NaN
2241
1500000000
2525500000
1225500000000
np.NaN
...

I tried this:

df['x1'] =np.where(df.x1.astype(str).str.contains('MMM'), (df.x1.str.replace('MMM', '').astype(float) * 10**9).astype(int), df.x1)

It works when I use only 2 rows, but on the whole DataFrame I get this error: IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer.
How can I fix this?
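
The error can be reproduced in a small sketch, assuming the column holds real NaN values next to the integers and strings. np.where evaluates both branches over the entire column before choosing between them, so the NaN rows also reach .astype(int), and a plain int64 cannot represent a missing value. The sample Series below is hypothetical and only mirrors the shape of the data in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: an int, a real NaN, and an "M"-suffixed string.
s = pd.Series([128455551, np.nan, "1.50000MMM"], dtype=object)

# .str methods return NaN for non-string entries, so after stripping the
# suffix only the string row carries a number.
scaled = s.str.replace("MMM", "", regex=False).astype(float) * 10**9

try:
    scaled.astype(int)      # plain int64 has no representation for NaN
    failed = False
except ValueError:          # IntCastingNaNError is a ValueError subclass
    failed = True

# The nullable "Int64" dtype (capital I) keeps missing values as <NA>.
nullable = scaled.round().astype("Int64")
```

Casting to the nullable "Int64" dtype instead of int is one way to keep the missing rows while still getting integers.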

pandas

Source: https://stackoverflow.com/questions/74951025/covert-a-column-of-integers-and-interger-strings-into-all-integers-using-multi

3 Answers


vx6bjr1n1#

A possible solution:

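def f(x):
    if isinstance(x, str):
        # Each 'M' stands for a factor of 1,000.
        ms = x.count('M')
        # round() absorbs floating-point error so the Int64 cast below is safe.
        return round(float(x.replace('M' * ms, '')) * 10**(3 * ms))
    else:
        # Integers and NaN pass through unchanged.
        return x

# 'Int64' is the nullable integer dtype, so NaN becomes <NA>.
df['x1'] = df['x1'].map(f).astype('Int64')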

Output:

x1
0      128455551
1       92571902
2         123125
3         985166
4           <NA>
5           2241
6     1500000000
7     2525500000
8  1225500000000
9           <NA>
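
If exact arithmetic matters, one variation is to parse the numeric part with decimal.Decimal instead of float, so no floating-point rounding is involved at all. This is only a sketch under assumptions: the helper name expand_m_suffix and the sample Series are made up for illustration, and a string that is not a number (such as a literal 'np.NaN') would still raise.

```python
from decimal import Decimal

import numpy as np
import pandas as pd


def expand_m_suffix(value):
    """Return an int for 'M'-suffixed strings; pass other values through."""
    if isinstance(value, str):
        ms = value.count("M")
        # Decimal parses "2.5255" exactly, so the scaled product is exact too.
        return int(Decimal(value.rstrip("M")) * 1000**ms)
    return value


# Hypothetical sample with an int, a real NaN, and suffixed strings.
s = pd.Series([2241, np.nan, "1.50000MMM", "2.5255MMM", "1.2255MMMM"],
              dtype=object)
result = s.map(expand_m_suffix).astype("Int64")
```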

Answered 2023-01-01

wbgh16ku2#

You can also try the following solution:

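import numpy as np

# Split each value into its numeric part (column 0) and its run of M's (column 1).
(df.x1.str.extract(r'([^M]+)(M+)?').replace({np.nan : None})
 # Each M is a factor of 1,000, so the multiplier is 10 ** (3 * number of M's).
 .assign(power = lambda x: 10 ** (3 * x.loc[:, 1].str.count('M').fillna(0)))
 # Treat the literal string 'np.NaN' as missing, then scale the numeric part.
 .pipe(lambda d: d.loc[:, 0].replace({'np.NaN' : None}).astype(float).mul(d.power)))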

0    1.284556e+08
1    9.257190e+07
2    1.231250e+05
3    9.851660e+05
4             NaN
5    2.241000e+03
6    1.500000e+09
7    2.525500e+09
8    1.225500e+12
9             NaN
dtype: float64
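
The result above is float64, while the question asks for integers. A final cast could look like the sketch below; the Series named scaled is a stand-in built from a few of the values shown above, not the actual pipeline output.

```python
import numpy as np
import pandas as pd

# Stand-in for the float64 result (a subset of the values shown above).
scaled = pd.Series([128455551.0, np.nan, 2241.0, 1.5e9, 2.5255e9, 1.2255e12])

# round() removes any fractional residue before the cast; the nullable
# "Int64" dtype stores the NaN rows as <NA> instead of raising.
as_int = scaled.round().astype("Int64")
```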

Answered 2023-01-01

t1rydlwq3#

For string values that contain M, you can multiply the cleaned value by 1000 once for each M (following your rule that "basically, each M means 1,000"):

import numpy as np

# Assumes x1 holds strings (plus NaN); each M multiplies the number by 1,000.
df['x1'] = np.where(df.x1.str.contains('M'),
                    (df.x1.str.replace('M', '').astype(float)
                     * pow(1000, df.x1.str.count('M'))).astype('Int64'), df.x1)

print(df)

x1
0      128455551
1       92571902
2         123125
3         985166
4            NaN
5           2241
6     1500000000
7     2525500000
8  1225500000000
9            NaN
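
When the column mixes real Python integers, NaN, and strings, the .str methods return NaN for the non-string rows, so a fully vectorized variant may need to cast to strings first. The following is a minimal sketch under that assumption; the sample DataFrame is hypothetical, and pd.to_numeric with errors="coerce" turns anything unparseable into NaN rather than raising.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type column: Python ints, real NaN, suffixed strings.
df = pd.DataFrame(
    {"x1": [128455551, 985166, np.nan, 2241,
            "1.50000MMM", "2.5255MMM", "1.2255MMMM", np.nan]},
    dtype=object,
)

# Cast to strings so the .str methods see every row.
text = df["x1"].astype(str)

# Number of M's per row; rows without a suffix (or missing) count as 0.
m_count = text.str.count("M").fillna(0)

# Strip the suffix and parse the rest; missing entries end up as NaN.
base = pd.to_numeric(text.str.rstrip("M"), errors="coerce")

# Each M is a factor of 1,000; round before casting to nullable Int64.
df["x1"] = (base * 1000.0 ** m_count).round().astype("Int64")
```

Because every row goes through the same arithmetic, no np.where branch is needed, and the missing rows survive as <NA>.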

Answered 2023-01-01
