Python dataframe returns empty strings after replacing dots in the original content

Posted by aiqt4smr on 2023-02-10 in Python


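The original DataFrame contains numbers with dots, for example: 3.200.000. In this case the dot, rather than the comma, is the thousands separator. I tried to remove the thousands separators with the following code: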

import re

pattern_shareholding_numbers = re.compile(r'[\d.]*\d+')

shareholding_percentage_df = df[(~df["Jumlah Lembar Saham"].str.startswith("Saham") & (df["Jabatan"] == "-"))]
shareholding_percentage_df = df[(~df["Jumlah Lembar Saham"].str.startswith("Jumlah Lembar Saham") & (df["Jabatan"] == "-"))]
shareholding_percentage_df.reset_index(drop=True, inplace=True)
shareholding_percentage_list = df["Jumlah Lembar Saham"].to_list()
shareholding_percentage_string = ' '.join(shareholding_percentage_list)
matches = pattern_shareholding_numbers.findall(shareholding_percentage_string)

matches_dot_removed = []
for dot in matches:
    dot_removed = []
    for e in dot:
        e = e.replace('.', '')
        e = e.replace('.', '')
        dot_removed.append(e)
    matches_dot_removed.append(dot_removed)

shareholding_percentage_float = str(matches_dot_removed).rstrip('')
print(shareholding_percentage_float)

The code above does remove the thousands separators, but it now returns something like this:

[['3', '', '2', '0', '0', '', '0', '0', '0'], ['2', '', '9', '0', '0', '', '0', '0', '0'], ['2', '', '9', '0', '0', '', '0', '0', '0'], ['1', '', '0', '0', '0', '', '0', '0', '0']]

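I am trying to find a way to get rid of the empty strings and squeeze the digits together, so that the result looks something like this: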

['3200000'], ['2900000'], ['2900000'], ['1000000']

regex

Source: https://stackoverflow.com/questions/75394095/python-dataframe-returns-empty-spacings-after-replacing-dots-from-the-original-c

3 Answers


gijlo24d1#

You can convert the column's data type to string before replacing the dots. The DataFrame's astype() method does this:

df['column_name'] = df['column_name'].astype(str)

# regex=False treats '.' as a literal dot rather than the regex "any character"
df['column_name'] = df['column_name'].str.replace('.', '', regex=False)

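Once the column has been converted to strings, you can perform string operations on it without problems. When you are done, you can convert it back to the original data type if needed.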
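A minimal sketch of this approach, assuming pandas is installed. Only the column name "Jumlah Lembar Saham" comes from the question; the sample values and the new "shares" column are made up for illustration. Passing regex=False matters because pandas versions before 2.0 treated the pattern as a regular expression by default, where '.' matches any character.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({"Jumlah Lembar Saham": ["3.200.000", "2.900.000", "1.000.000"]})

# Make sure the column holds strings, then remove the literal dots.
cleaned = (
    df["Jumlah Lembar Saham"]
    .astype(str)
    .str.replace(".", "", regex=False)
)

# Convert to integers once the separators are gone.
df["shares"] = cleaned.astype(int)
print(df["shares"].tolist())
```

This keeps the work inside pandas, so there is no need to join the column into one long string and run a regex over it.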

Answered 2023-02-10

nuypyhwy2#

If the numbers are definitely integers, then:

numberstring = '320.000.000'
numbers = numberstring.split('.')
dot_removed = ''.join(numbers)
print(dot_removed)
# 320000000
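If the end goal is an actual integer rather than a digit string, the same idea can be wrapped in a small helper. This is only a sketch: parse_dotted_int is a hypothetical name, and it assumes every dot in the input is a thousands separator, never a decimal point.

```python
def parse_dotted_int(text: str) -> int:
    """Turn a string such as '3.200.000' into the integer 3200000.

    Assumes every dot is a thousands separator.
    """
    return int(text.replace(".", ""))


# Illustrative values in the format shown in the question.
values = [parse_dotted_int(s) for s in ["3.200.000", "2.900.000", "1.000.000"]]
print(values)
```

int() raises ValueError on anything that is not purely digits after the dots are removed, which makes unexpected input visible instead of silently passing through.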

Answered 2023-02-10

but5z9lq3#

You can loop over the list of number strings with the replace call you already have.

num_list = ['3.200.000.', '2.200.000.', '2.900.000.', '4.300.000.']

Then

numeric = []
for num in num_list:
    numeric.append(num.replace('.', ''))

print(numeric)

will output

['3200000', '2200000', '2900000', '4300000']
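The sketch below suggests why the question's code produced lists of single characters with empty strings: its inner loop iterates over each matched string, and iterating over a string yields one character at a time. It reuses the regex from the question; the input text is illustrative and the variable names are assumptions.

```python
import re

pattern = re.compile(r"[\d.]*\d+")
text = "3.200.000 2.900.000 2.900.000 1.000.000"  # illustrative input
matches = pattern.findall(text)

# Iterating over a string yields single characters, so each dot becomes
# its own element and is replaced by an empty string.
per_character = [ch.replace(".", "") for ch in matches[0]]

# Calling replace once per matched string keeps the digits together.
numbers = [m.replace(".", "") for m in matches]
print(per_character)
print(numbers)
```

Dropping the inner loop is therefore enough; no extra step is needed to join the digits back together.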

Answered 2023-02-10
