I want to match all strings that look like this:
ph <- c ("ioL421.63", #6 chars.2 chars
"jur421.73.0o4435", #6 chars.2 chars.6 chars
"koL421.2p.9i4675.k23", #6 chars.2 chars.6 chars.3 chars
"6775po.78.678959.p2p.913", #6 chars.2 chars.6 chars.3 chars.3 chars
"193485.k2.l3.34.67", #6 chars.2 chars.2 chars.2 chars.2 chars
"ioL421.6", #6 chars.1 chars
"jur421.3.0o4", #6 chars.1 chars.3 chars
"koL421.2.9i5.k2390", #6 chars.1 chars.3 chars.5 chars
"6775po.8.678.p2p91.674e", #6 chars.1 chars.3 chars.5 chars.4 chars
#***** Then only with these lengths ******
"842f45", #6 chars
"234567890123567hk", #17 chars
"234567890123567hkiq", #19 chars
"234567890123567hkiq5" #20 chars
)
The following are invalid strings:
invalid_ph <- c("23289jh", # 7 chars
"2382h", #5 chars
"2934567890123567h8", # 18 chars
"234567890123q3",
"234567890123567hkiq57878787",
"ZX3.235.9845.3843924.39403",
"sjkfuju2rwrrlnmld828384230403208402834fs",
"TY5648.235.123456",
"ABC3.235.9845",
"361 234 4356",
"a1.02.b3.00",
"01.01.01",
"23289jhd",
"01",
"01.02",
"01.01.01",
"aa.bb",
"ac.21",
"aa.01-02",
"123.2.10.834.18934",
"a1."
)
ph <- append(ph, invalid_ph)
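This is my first time using regular expressions. I came up with the patterns below and would like to know how to consolidate them and how to correct any that do not produce the right output.
I use the stringr package to extract the strings:
library(stringr)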
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{6}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1}([.])[a-zA-Z0-9]{3})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{6}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1}([.])[a-zA-Z0-9]{3})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{3}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1}([.])[a-zA-Z0-9]{3}([.])[a-zA-Z0-9]{5})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{3}([.])[a-zA-Z0-9]{3}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1}([.])[a-zA-Z0-9]{3}([.])[a-zA-Z0-9]{5}([.])[a-zA-Z0-9]{4})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}|[a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{1})$")
str_extract(ph, "^([a-zA-Z0-9]{6}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{2}([.])[a-zA-Z0-9]{2})$")
2 answers
vbopmzt1 #1
Use a function to create all the valid patterns:
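The answer's own code is not preserved on this page. What follows is only a minimal sketch of the idea, assuming base R `grepl` instead of stringr and assuming that the only valid layouts are the ones listed in the question. The helper names `make_pattern`, `prefixes`, and `is_valid` are hypothetical.

```r
# Hypothetical helper: turn a vector of segment lengths, e.g. c(6, 2, 6),
# into an anchored pattern: ^[A-Za-z0-9]{6}\.[A-Za-z0-9]{2}\.[A-Za-z0-9]{6}$
make_pattern <- function(lens) {
  segs <- sprintf("[A-Za-z0-9]{%d}", as.integer(lens))
  paste0("^", paste(segs, collapse = "\\."), "$")
}

# Hypothetical helper: every leading part of a full layout, starting at
# `from` segments. For c(6, 2, 6, 3, 3) this gives 6.2, 6.2.6, 6.2.6.3, ...
prefixes <- function(lens, from = 2) {
  lapply(seq(from, length(lens)), function(n) lens[seq_len(n)])
}

# Assumption: these are exactly the layouts shown in the question.
layouts <- c(
  prefixes(c(6, 2, 6, 3, 3)),   # 6.2 up to 6.2.6.3.3
  prefixes(c(6, 1, 3, 5, 4)),   # 6.1 up to 6.1.3.5.4
  list(c(6, 2, 2, 2, 2)),       # the single 6.2.2.2.2 form
  list(6, 17, 19, 20)           # undotted strings of fixed length
)
patterns <- vapply(layouts, make_pattern, character(1))

# A string is valid if it matches at least one generated pattern.
is_valid <- function(x) {
  Reduce(`|`, lapply(patterns, grepl, x = x))
}

is_valid(c("jur421.3.0o4", "23289jh"))
# → TRUE FALSE
```

Keeping the layouts as plain vectors of lengths means a new format can be added by appending one entry, without editing any regex by hand.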
nkcskrwz #2
Split it into 2 regex patterns. Note that the test is only as good as the list of non-matching strings:
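The two patterns from this answer are not preserved on this page. One possible split, offered as a sketch rather than the answerer's actual code, is a pattern for the dotted forms and a pattern for the undotted fixed-length forms. It assumes base R `grepl` with `perl = TRUE`; the names `an`, `dotted`, `plain`, and `is_valid` are hypothetical.

```r
an <- "[A-Za-z0-9]"  # one alphanumeric character

# Pattern 1: dotted forms. After the 6-character head comes one of
#   .2 [.6 [.3 [.3]]]   or   .2.2.2.2   or   .1 [.3 [.5 [.4]]]
# where [ ] marks an optional tail.
dotted <- paste0(
  "^", an, "{6}\\.(?:",
  an, "{2}(?:\\.", an, "{6}(?:\\.", an, "{3}(?:\\.", an, "{3})?)?)?",
  "|", an, "{2}(?:\\.", an, "{2}){3}",
  "|", an, "{1}(?:\\.", an, "{3}(?:\\.", an, "{5}(?:\\.", an, "{4})?)?)?",
  ")$"
)

# Pattern 2: undotted strings of exactly 6, 17, 19 or 20 characters.
plain <- paste0("^(?:", an, "{6}|", an, "{17}|", an, "{19,20})$")

is_valid <- function(x) {
  grepl(dotted, x, perl = TRUE) | grepl(plain, x, perl = TRUE)
}

is_valid(c("koL421.2.9i5.k2390", "TY5648.235.123456"))
# → TRUE FALSE
```

The nested optional groups are what enforce the ordering: a `.3` segment can only appear after the `.6` segment, so a string such as `TY5648.235.123456` is rejected even though each piece is alphanumeric.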
Test strings:
Created on 2023-06-06 with reprex v2.0.2