regex: Using a regular expression in Python to exclude a specific string

rkkpypqq, posted on 2023-04-07 in Python

I want to apply a regular expression to the query below so that any string appearing between a comma and the word "AS" is removed.

Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10

Expected output:

Select customer_name, customer_type, volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10

I tried the following, but it does not give the desired output:

result = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)
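For reference, a minimal sketch of what this attempt produces on a shortened version of the query (the shortened `text` and the variable name `attempt` are illustrative assumptions). The pattern consumes the leading comma and the alias as well, so `volume` disappears from the select list instead of being kept.

```python
import re

# Shortened, hypothetical version of the query from the question.
text = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table"

# The attempted pattern matches ", COUNT(*) AS volume" as a whole,
# so replacing it with "" also removes the comma and the alias.
attempt = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)
print(attempt)
```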

fumotvh3 1#

You can use a capture group and refer to that group in the replacement.

(,\s*)[^,]*\sAS\b\s*

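Explanation

  • (,\s*): capture group 1, matching a comma and optional whitespace characters
  • [^,]*: matches zero or more characters other than a comma
  • \sAS\b\s*: matches a whitespace character, then AS followed by a word boundary and optional whitespace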
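The same breakdown can be written into the pattern itself with `re.VERBOSE`. This is only a sketch: the name `PATTERN` and the short sample string are assumptions for illustration, not part of the answer.

```python
import re

# The answer's pattern, spelled out in verbose mode (whitespace and
# "#" comments inside the pattern are ignored by the regex engine).
PATTERN = re.compile(
    r"""
    (,\s*)    # group 1: a comma plus any whitespace after it (kept)
    [^,]*     # the expression to drop: any run of non-comma characters
    \sAS\b    # one whitespace character, then the whole word AS
    \s*       # any whitespace after AS
    """,
    re.VERBOSE,
)

# Hypothetical short sample to inspect what is matched and what is kept.
m = PATTERN.search("Select a, COUNT(*) AS volume FROM t")
print(repr(m.group(0)))  # the full text that re.sub would replace
print(repr(m.group(1)))  # the part that r"\1" puts back
```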


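import re

# Group 1 keeps the comma and the whitespace after it;
# everything up to and including "AS" is dropped.
pattern = r"(,\s*)[^,]*\sAS\b\s*"
# Each "\\n" is a literal backslash followed by n, as in the question's one-line query.
s = ("Select customer_name, customer_type, COUNT(*) AS volume\\nFROM table\\nGROUP BY customer_name, customer_type\\nORDER BY volume DESC\\nLIMIT 10\n")

# Replace each match with the contents of group 1 only.
print(re.sub(pattern, r"\1", s))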

Output

Select customer_name, customer_type, volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10
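If the queries may also spell the keyword in lowercase, one possible variation is to compile the same pattern with `re.IGNORECASE` and wrap it in a helper. This is a sketch under assumptions: the flag, the name `ALIAS_EXPR`, and the function `strip_alias_expressions` are not from the answer.

```python
import re

# Same pattern as in the answer; re.IGNORECASE is an added assumption
# so that "as" and "AS" are both recognised.
ALIAS_EXPR = re.compile(r"(,\s*)[^,]*\sAS\b\s*", re.IGNORECASE)


def strip_alias_expressions(sql: str) -> str:
    """Drop the expression between a comma and AS, keeping the alias."""
    return ALIAS_EXPR.sub(r"\1", sql)


print(strip_alias_expressions("select a, count(*) as volume from t"))
```

Like the original pattern, this relies on the expression before AS containing no comma of its own.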

1hdlvixo 2#

I would use:

import re

text = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10"
# Replace ", <term> AS " with ", " so that only the alias remains.
result = re.sub(r',\s*\S+\s+AS\b\s*', ', ', text)
print(result)

This prints:

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

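The regex pattern used here says to match:

  • , a comma
  • \s* optional whitespace
  • \S+ a run of non-whitespace characters (the term to remove)
  • \s+ one or more whitespace characters
  • AS the literal "AS"
  • \b a word boundary
  • \s* more optional whitespace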
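Because `\S+` stops at whitespace, this pattern only handles expressions written without spaces, such as `COUNT(*)`. The sketch below illustrates that on a made-up query. The sample `text` and the lazy `[^,]*?` variant are assumptions for comparison, not part of the answer.

```python
import re

# Hypothetical query whose aliased expression contains a space.
text = "Select name, COUNT(DISTINCT id) AS n FROM t"

# Pattern from this answer: \S+ cannot span the space inside the
# expression, so nothing matches and the text comes back unchanged.
non_space = re.sub(r",\s*\S+\s+AS\b\s*", ", ", text)

# Variant (an assumption): lazily allow any non-comma characters before AS.
non_comma = re.sub(r",\s*[^,]*?\s+AS\b\s*", ", ", text)

print(non_space)
print(non_comma)
```

The variant trades one restriction for another: it tolerates spaces in the expression but, like any `[^,]`-based pattern, not commas inside it.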
