regex: Excluding specific characters from a match in a Python regular expression exercise

tktrz96b  posted on 2023-06-25 in Python

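I have the following exercise. I only need to fill in the list of regular expressions; I may not touch any other part of the code. I am stuck on number 5 (neither a's nor b's, but "" allowed) and number 7 (exactly two words, regardless of white spaces).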

import re

# List of strings used for testing.
slist = [ "aaabbb", "aaaaaa", "abbaba", "aaa", "gErbil ottEr",
    "tango samba rumba", " hello world ", " Hello World " ]

# List of regular expressions to be completed by the student.
relist = [
    r"",  # 1. Only a's followed by only b's, including ""
    r"",  # 2. Only a's, including ""
    r"",  # 3. Only a's and b's, in any order, including "" 
    r"",  # 4. Exactly three a's
    r"",  # 5. Neither a's nor b's, but "" allowed
    r"",  # 6. An even number of a's (and nothing else)
    r"",  # 7. Exactly two words, regardless of white spaces
    r"",  # 8. Contains a word that ends in "ba"
    r""   # 9. Contains a word that starts with a capital
]

for s in slist:
    print( s, ':', sep='', end=' ' )
    for i in range( len( relist ) ):
        m = re.search( relist[i], s )
        if m:
            print( i+1, end=' ' )
    print()

This is what I came up with:

import re

# List of strings used for testing.
slist = [ "aaabbb", "aaaaaa", "abbaba", "aaa", "gErbil ottEr",
    "tango samba rumba", " hello world ", " Hello World " ]

# List of regular expressions to be completed by the student.
relist = [
    r"(^a+)(b*$)"           ,  # 1. Only a's followed by only b's, including ""
    r"(^a)(a+$)"            ,  # 2. Only a's, including ""
    r"^[a|b]+[a|b]$"        ,  # 3. Only a's and b's, in any order, including "" 
    r"^a{3}$"               ,  # 4. Exactly three a's
    r"[^a|^b]"              ,  # 5. Neither a's nor b's, but "" allowed
    r"^(a{2})+$"            ,  # 6. An even number of a's (and nothing else)
    r"^\s*\w+\s+\w+$"       ,  # 7. Exactly two words, regardless of white spaces
    r".*ba\b"               ,  # 8. Contains a word that ends in "ba"
    r".*\b\s+[A-Z]"            # 9. Contains a word that starts with a capital
]

for s in slist:
    print( s, ':', sep='', end=' ' )
    for i in range( len( relist ) ):
        m = re.search( relist[i], s )
        if m:
            print( i+1, end=' ' )
    print()

This does not give the expected result, which is:

aaabbb: 1 3   
aaaaaa: 1 2 3 6    
abbaba: 3 8    
aaa: 1 2 3 4     
gErbil ottEr: 7    
tango samba rumba: 8     
 hello world : 5 7     
 Hello World : 5 7 9

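I have been searching for hours, and although there are plenty of regex questions out there, none of them got me to a solution. The first problem is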

r"[^a|^b]"              ,  # 5. Neither a's nor b's, but "" allowed

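I just don't know how to exclude a character. If I test against abc, the program happily matches the c and so accepts abc. Is there a way to say: if a particular letter is present, there is no match, end of story? The second problem is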

r"^\s*\w+\s+\w+$"       ,  # 7. Exactly two words, regardless of white spaces

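Here I don't even know what I am looking for. This pattern recognizes gErbil ottEr as two words, but despite the \s* it fails to recognize " Hello World ", which starts with a space. I just don't see where the problem is.
Note that I cannot change the code apart from these bits, so any other function of the re module is not a solution. I have also come across negative lookahead, but my course has not (yet?) touched on it. I have reread the chapter several times, so I don't think I am supposed to use that either. I am supposed to use: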

  • special sequences (\b \B \d \D \n \r \s \S \t \w \W / \" ' ^ $ .)
  • repetition (* + ? {p,q} {p,} {p})
  • also explained: b[aio]ll, (ab)+, (a|b), [a-dghy-z]

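I know there may be simpler or different ways to solve this, but the exercise is about learning the constructs above before moving on to the next part of the chapter. Using only the expressions listed above, how do I solve items 5 and 7?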

wr98u20j  #1

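Number 7 is the only trick question, because the white space it throws in is irrelevant. In fact, it has nothing to do with white space: what counts is that the string contains exactly two runs of word characters (\w+), with any amount of non-word characters (\W*) before, between and after them.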
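To see why the attempt from the question misses the padded strings, the two patterns for item 7 can be compared side by side. This is only a sketch: the names `attempt`, `answer` and `no_boundary` are labels chosen here for illustration, not part of the exercise.

```python
import re

# Pattern 7 as written in the question and as written in this answer.
attempt = r"^\s*\w+\s+\w+$"
answer = r"^\W*(\w+\b\W*){2}$"
# Hypothetical variant without \b, included only to show what \b is for.
no_boundary = r"^\W*(\w+\W*){2}$"

tests = ["gErbil ottEr", "tango samba rumba", " hello world ", " Hello World "]

for s in tests:
    # bool(...) turns the match object (or None) into True/False.
    print(repr(s), bool(re.search(attempt, s)), bool(re.search(answer, s)))

# Without \b, a single word can be split into two \w+ runs ("aa" + "a").
print(bool(re.search(no_boundary, "aaa")), bool(re.search(answer, "aaa")))
```

The attempt allows white space before the first word but nothing after the second one, so the trailing space in " hello world " keeps `$` from matching. In the answer's pattern, `\W*` absorbs whatever surrounds the words, and `\b` forces each `\w+` to run to the end of its word so that one word cannot be counted twice.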

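The items marked including "" mean that the empty string has to be accepted as well, which is why those patterns wrap their body in an optional group, (...)?.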
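As a small check of that point, the sketch below tests the empty string against item 1 with and without the optional group. The names `strict` and `optional` are made up for this illustration.

```python
import re

strict = r"^a+b*$"       # needs at least one a, so "" is rejected
optional = r"^(a+b*)?$"  # the whole body may be absent, so "" is accepted

samples = ["", "aaabbb", "aaa", "abbaba"]

for s in samples:
    print(repr(s), bool(re.search(strict, s)), bool(re.search(optional, s)))
```

For items 2, 3 and 5 the same effect can also be had by writing `*` in place of `+`; for example `^a*$` accepts the same strings as `^(a+)?$`.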

import re

# List of strings used for testing.
slist = [ "aaabbb", "aaaaaa", "abbaba", "aaa", "gErbil ottEr",
    "tango samba rumba", " hello world ", " Hello World " ]

# List of regular expressions to be completed by the student.
relist = [
    r"^(a+b*)?$"              ,  # 1. Only a's followed by only b's, including ""
    r"^(a+)?$"                ,  # 2. Only a's, including ""
    r"^([ab]+)?$"             ,  # 3. Only a's and b's, in any order, including "" 
    r"^a{3}$"                 ,  # 4. Exactly three a's
    r"^([^ab]+)?$"            ,  # 5. Neither a's nor b's, but "" allowed
    r"^(aa)+$"                ,  # 6. An even number of a's (and nothing else)
    r"^\W*(\w+\b\W*){2}$"     ,  # 7. Exactly two words, regardless of white spaces
    r"ba\b"                   ,  # 8. Contains a word that ends in "ba"
    r"\b[A-Z]"                   # 9. Contains a word that starts with a capital
]

for s in slist:
    print( s, ':', sep='', end=' ' )
    for i in range( len( relist ) ):
        m = re.search( relist[i], s )
        if m:
            print( i+1, end=' ' )
    print()

Output

aaabbb: 1 3   
aaaaaa: 1 2 3 6    
abbaba: 3 8    
aaa: 1 2 3 4     
gErbil ottEr: 7    
tango samba rumba: 8     
 hello world : 5 7     
 Hello World : 5 7 9
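
For item 5, the question asked how to make a single forbidden letter rule out the whole string. One way to look at it, sketched below with illustrative variable names, is that `re.search` succeeds as soon as any part of the string fits the pattern, so an unanchored character class only proves that some allowed character exists somewhere. Anchoring with `^` and `$` forces every character to come from the class.

```python
import re

# Attempt from the question: inside [...] the | and the second ^ are literal
# characters, so this means "one character that is not a, b, | or ^".
unanchored = r"[^a|^b]"
# Pattern from the answer: every character, start to end, is neither a nor b.
anchored = r"^([^ab]+)?$"

samples = ["abc", "aaa", "", " hello world "]

for s in samples:
    print(repr(s), bool(re.search(unanchored, s)), bool(re.search(anchored, s)))
```

With the unanchored class, "abc" is accepted because the c alone satisfies the pattern, and "" is rejected because the class needs one character to match. The anchored version flips both results, which is what item 5 asks for.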
