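I am using re.VERBOSE to extract some information from HTM text. Because the HKD and MOP prime rate (HKDMOPrate) does not always appear in the input, I made it an optional group. However, the code does not work as expected. Here is my code: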
```python
import re

def get_primerate_result(text):
    # Raw string so that \s, \. and similar escapes reach the regex engine unchanged.
    pattern = r"""
    (effect\sfrom\s)
    (?P<Date>[A-Za-z0-9\,\s]+)
    (\s\()
    (.*USD\sprime\srate)
    (.*to\s)
    (?P<USDrate>[0-9\.\%]+)
    (\sp\.a\.)
    ((.*HKD\sand\sMOP\sprime\srate)
    (.*to\s)
    (?P<HKDMOPrate>[0-9\.\%]+)
    (\sp\.a\.))?
    """
    # One dict of named groups per match found in the text.
    dict_result = [i.groupdict() for i in re.finditer(pattern, text, re.VERBOSE)]
    return dict_result
```
Here are two sample inputs.

Text 1:

```
'Dear Customers,\nWith the Federal Reserve System raising its federal funds rate by 0.25%, our bank is here to announce that with effect from July 28, 2023 (Friday), our USD prime rate will be increased from 8.25% p.a. to 8.50% p.a., our HKD and MOP prime rate will be increased from 6.00% p.a. to 6.125% p.a.\nBank of China Limited Macau Branch\nBank of China (Macau) Limited\nJuly 27, 2023\nPlease click to check:\n\nPrime Rate\n'
```

Text 2:

```
'Dear Customers,\nWith the Federal Reserve System raising its federal funds rate by 0.25%, our bank is here to announce that with effect from March 24, 2023 (Friday), our USD prime rate will be increased from 7.75% p.a. to 8.00% p.a.\nBank of China Limited Macau Branch\nBank of China (Macau) Limited\nMarch 24, 2023\nPlease click to check\n\nPrime Rate\n'
```
Here is the output I want:

```
result 1: [{'Date': 'July 28, 2023', 'USDrate': '8.50%', 'HKDMOPrate': '6.125%'}]
result 2: [{'Date': 'March 24, 2023', 'USDrate': '8.00%', 'HKDMOPrate': None}]
```

Actual output:

```
result 1: [{'Date': 'July 28, 2023', 'USDrate': '6.125%', 'HKDMOPrate': None}]
result 2: [{'Date': 'March 24, 2023', 'USDrate': '8.00%', 'HKDMOPrate': None}]
```
1 Answer
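As InSync suggested, the problem goes away once the `.*` before `to\s` is made lazy: `(.*?to\s)`. A greedy `.*` runs to the end of the line and then backtracks only as far as the last `to `, which in Text 1 is the one in front of `6.125%`. That is why `USDrate` picked up the HKD/MOP value and the optional group had nothing left to match.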
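For reference, here is a minimal sketch of the function with that change applied. It assumes the pattern from the question is otherwise kept as is. Both `.*` before `to\s` are made lazy here, although only the first one is needed for the two samples. The strings `text1` and `text2` are trimmed excerpts of the sample inputs, not the full messages.

```python
import re

def get_primerate_result(text):
    # Same pattern as in the question; only the `.*` before `to\s` changed to `.*?`.
    pattern = r"""
    (effect\sfrom\s)
    (?P<Date>[A-Za-z0-9\,\s]+)
    (\s\()
    (.*USD\sprime\srate)
    (.*?to\s)                      # lazy: stop at the first 'to ' after the USD rate text
    (?P<USDrate>[0-9\.\%]+)
    (\sp\.a\.)
    ((.*HKD\sand\sMOP\sprime\srate)
    (.*?to\s)                      # lazy as well, for consistency
    (?P<HKDMOPrate>[0-9\.\%]+)
    (\sp\.a\.))?
    """
    return [m.groupdict() for m in re.finditer(pattern, text, re.VERBOSE)]

# Trimmed excerpts of the two sample inputs from the question.
text1 = (
    "our bank is here to announce that with effect from July 28, 2023 (Friday), "
    "our USD prime rate will be increased from 8.25% p.a. to 8.50% p.a., "
    "our HKD and MOP prime rate will be increased from 6.00% p.a. to 6.125% p.a.\n"
    "Bank of China Limited Macau Branch\n"
)
text2 = (
    "our bank is here to announce that with effect from March 24, 2023 (Friday), "
    "our USD prime rate will be increased from 7.75% p.a. to 8.00% p.a.\n"
    "Bank of China Limited Macau Branch\n"
)

print(get_primerate_result(text1))
# [{'Date': 'July 28, 2023', 'USDrate': '8.50%', 'HKDMOPrate': '6.125%'}]
print(get_primerate_result(text2))
# [{'Date': 'March 24, 2023', 'USDrate': '8.00%', 'HKDMOPrate': None}]
```

With the lazy quantifier, `USDrate` is taken from the first `to ` that follows "USD prime rate", so the rest of the sentence is still available for the optional HKD/MOP group.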