regex: Non-greedy regular expression returns the wrong result

lc8prwob  posted on 2023-04-07  in Other

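I am trying to clean up an abstract by extracting the text that appears between a period and a colon that is followed by an uppercase character. To do this, I am using the regular expression:
re.findall(r"\.\s(.*?):\s?[A-Z]", text) on the text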

text = 'Background: Flavonoids constitute one of the best-characterized groups of plant secondary metabolites with enormous pharmaceutical potential. A flavone type of plant flavonoid, cirsilineol, has been reported to exhibit proapoptotic effects against malignant human cells. Objectives: The present study was designed to investigate the antiproliferative effects of cirsilineol against human gastric cancer cells. Materials and Methods: Cell viability was assessed by 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) and colony formation assays. Apoptosis was detected by acridine orange/ethidium bromide (AO/EB) and annexin V/propidium iodide (PI) assay. Protein expression was examined by western blotting analysis. Results: The results showed cirsilineol inhibits the proliferation of human gastric cancer cells. The IC50 of cirsilineol against human gastric cancer cells (BGC-823, SGC-7901, and MGC-803) ranged from 8 to 10 mu M. Nonetheless, cirsilineol exhibited comparatively lower antiproliferative effects against normal GES-1 cells. The IC50 of cirsilineol against normal GES-1 cells was found to be 120 mu M. Colony formation assay showed that cirsilineol suppressed the colony formation of BGC-823 and MGC-803 cells in a dose-dependent manner. Acridine orange and ethidium bromide (AO/EB) staining showed that cirsilineol induced apoptosis in BGC-823 and MGC-803 cells. The percentage of apoptosis increased from 7.4% in control to 40.5% in BGC-823 cells and from 6.56% in control to 33.53% in MGC-803 cells at 8 mu M cirsilineol. Western blotting showed cirsilineol caused an increase in Bax and cleaved caspase-3 and a decrease in Bcl-2 expression in both BGC-823 and MGC-803 cells. Conclusion: Together, the results are indicative of the proapoptotic and antitumor potential of cirsilineol against gastric cancer cells, suggestive of its possible therapeutic significance in future.'

However, the first extracted match is:

'A flavone type of plant flavonoid, cirsilineol, has been reported to exhibit proapoptotic effects against malignant human cells. Objectives',

whereas it should be 'Objectives'.
What am I missing?

mnemlml8  1#

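The lazy modifier means: if the match can stop here, stop and look no further. It does not affect where the match starts. The engine still begins at the first period followed by whitespace that it finds, and the lazy group then grows until it reaches a colon followed by an uppercase letter, crossing any periods along the way.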
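To make that concrete, here is a minimal sketch comparing the lazy pattern from the question with a greedy variant. The `sample` string is a made-up mini-text, not the abstract from the question, and the greedy pattern is added only for contrast.

```python
import re

# Hypothetical mini-text, not the abstract from the question.
sample = "One. Two. Label: Text. More: End"

# Lazy: each match still starts at the first ". " the scanner reaches,
# then ends at the nearest following ":" + uppercase letter.
lazy = re.findall(r"\.\s(.*?):\s?[A-Z]", sample)

# Greedy: same starting point, but the group runs to the last possible ":".
greedy = re.findall(r"\.\s(.*):\s?[A-Z]", sample)

print(lazy)    # ['Two. Label', 'More']
print(greedy)  # ['Two. Label: Text. More']
```

Both results begin with "Two", right after the first period. Laziness only changes where the match ends.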
To get what you describe, you need to exclude . from the captured text. So in this case your regular expression would be:

\.\s([^.]*?):\s?[A-Z]

This way, no periods are allowed in your match apart from the one at the start.
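As a quick check, the sketch below runs the original pattern and the corrected one side by side. The `sample` string is a shortened, invented stand-in with the same "Label: text." layout as the abstract in the question.

```python
import re

# Hypothetical shortened abstract with the same section-label layout.
sample = (
    "Background: Plants make flavonoids. Some are drugs. "
    "Objectives: We test one. "
    "Materials and Methods: Cells were used. "
    "Results: It worked."
)

# Original pattern: the lazy group may cross a period.
original = re.findall(r"\.\s(.*?):\s?[A-Z]", sample)

# Corrected pattern: [^.] keeps the group inside a single sentence.
fixed = re.findall(r"\.\s([^.]*?):\s?[A-Z]", sample)

print(original)  # ['Some are drugs. Objectives', 'Materials and Methods', 'Results']
print(fixed)     # ['Objectives', 'Materials and Methods', 'Results']
```

Note that 'Background' is not returned by either pattern, because no period precedes it at the start of the text.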
You could also use

(?<=\.\s)[^.]+(?=:\s?[A-Z])

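With this version, the match itself contains only the text between a period and a colon followed by an uppercase letter, without the period, the colon, or the uppercase letter. That helps if you need to use the pattern in another language.
In Python, both approaches work!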
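A minimal sketch of the lookaround version in Python follows. The `sample` string is again an invented stand-in for the abstract. Because the pattern has no capture group, `re.findall` returns the whole match, which here is exactly the label.

```python
import re

# Hypothetical shortened abstract, not the text from the question.
sample = (
    "Background: Plants make flavonoids. Some are drugs. "
    "Objectives: We test one. "
    "Materials and Methods: Cells were used. "
    "Results: It worked."
)

# (?<=\.\s)      asserts that a period and whitespace come just before the match
# [^.]+          matches the label itself, never crossing a period
# (?=:\s?[A-Z])  asserts that a colon and an uppercase letter follow
labels = re.findall(r"(?<=\.\s)[^.]+(?=:\s?[A-Z])", sample)

# Capture-group version, shown for comparison.
grouped = re.findall(r"\.\s([^.]*?):\s?[A-Z]", sample)

print(labels)             # ['Objectives', 'Materials and Methods', 'Results']
print(labels == grouped)  # True
```

One difference to keep in mind: `[^.]+` is greedy here, so if a single sentence held two colons each followed by an uppercase letter, the match would extend to the last of them. Using `[^.]+?` would stop at the first.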
