regex: Clean up irregularities in dictionary values using regex

Posted by b1payxdu on 2022-11-18 in Other


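I need to create a dictionary from a text file containing the coordinates of named polygons. The output should be a dictionary in which each polygon name is a key and the corresponding x and y coordinates are the values. Most entries in the file follow a standard layout like this: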

Name of polygon
(12.345, 1.2567)
(5.6789, 2.9876)
(9.0345, 3.7654)
(3.4556, 2.3445)

Name of next polygon
(x, y values)

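However, some entries are irregular: for example, all the values may be on a single line, or there may be extra characters between the parentheses. I need to loop over these values and split out the values contained in each pair of parentheses.
So far I have built the dictionary in a first pass over the file and tried to use a regular expression to split the values based on the contents of the parentheses: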

import re

with open(fpath, 'r') as infile:
    d = {}

    # split the data into keys and values
    for group in infile.read().split('\n\n'):
        entry = group.split('\n')
        key, *val = entry
        d[key] = val

    # try to split each value on its parentheses (this call raises the error)
    for value in d.values():
        value = re.split("*[\(.+$\)]*", str(value))

print(d)

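I hoped this would clean up the values and create a separate value for each set of coordinates contained in parentheses, but I get the following error: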
re.error: nothing to repeat at position 0
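
For context, the error comes from the pattern starting with `*`: a quantifier has to follow something it can repeat. The sketch below reproduces this with `re.compile` and contrasts it with one possible pattern for a single parenthesised group. The helper name `compiles` and the replacement pattern are illustrative assumptions, not part of the original post.

```python
import re

def compiles(pattern):
    """Return True if `pattern` is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# The pattern from the question: it begins with a quantifier, so there is
# nothing for "*" to repeat at position 0.
bad = r"*[\(.+$\)]*"

# One possible alternative (an assumption): a literal "(", any run of
# characters that are not parentheses, then a literal ")".
good = r"\([^()]*\)"

print(compiles(bad))
print(compiles(good))
print(re.findall(good, "(1, 2) x (3, 4)"))
```

Note that inside square brackets characters such as `.`, `+` and `$` are matched literally, so the bracketed part of the original pattern is a character class rather than "anything between parentheses".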

regex

Source: https://stackoverflow.com/questions/74275870/clean-up-irregularities-in-dictionary-values-using-regex

1 answer


roejwanj1#

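I think I have found a solution to the problem. I needed to account for multiple values per key in the loop, and to use re.findall() instead of re.split(). My final loop therefore looks like this: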

for key, *value in d.items():
    # [^()]* stops at the first ")", so each coordinate pair is a separate match
    d[key] = re.findall(r"\([^()]*\)", str(value))
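
As a rough end-to-end sketch of the same idea, the snippet below parses the text in one pass and converts each pair to floats. It assumes that entries are separated by a blank line, that the first line of each entry is the polygon name, and that the "extra characters" sit between the parenthesised pairs rather than inside them. The names `PAIR`, `parse_polygons` and the sample data are hypothetical.

```python
import re

# Matches "(number, number)" and captures the two numbers; any text between
# successive parenthesised pairs is simply skipped by findall.
PAIR = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")

def parse_polygons(text):
    """Map each polygon name to a list of (x, y) float tuples."""
    polygons = {}
    for group in text.strip().split("\n\n"):
        # First line is the name; everything after it holds the coordinates.
        name, _, rest = group.partition("\n")
        polygons[name.strip()] = [(float(x), float(y)) for x, y in PAIR.findall(rest)]
    return polygons

# Made-up sample: one regular entry and one with all values on a single line.
sample = """Square
(0.0, 0.0)
(1.0, 0.0)
(1.0, 1.0)

Triangle
(0.0, 0.0) (2.5, 0.0), (1.25, 3.0)"""

print(parse_polygons(sample))
```

Because `re.findall` scans the whole entry, it does not matter whether the pairs are on separate lines or on one line, which is why it suits this task better than `re.split`.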

2022-11-18
