I'm trying to find time patterns that contain any form of "am" or "pm", and I want to replace the whole pattern with "--".
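My idea is to find strings containing "am" or "pm", which may or may not have a dot (".") before, between, or after the letters, and then match them together with any numeric pattern in front of them, going back until I reach a whitespace character.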
Here is the raw data, t0:
t0 <- c("29th October 2022 5-6pm", "12-1pm 02/11/22", "10:25 bike rack at bexley college erith", "November 2nd 2022, apm shop ", " between 7pm Thursday 27th October to Saturday 29th October 9am", "04/09/2022 at 4 a.m.", "4/09/2022 at 4.a.m.", "04/09/2022 at 4.a.m" , "28.10.22 between 1.30pm and midnight", " Sunday 30th October 2022 between 11am and 3pm", "30th October, approx 6pm", "03/11/2022", "02/11/22 at campus", "Between 15:15 and 21:10", "03/11/2022 7pm", " Between 5:30pm and 6:30pm on 31/10/2022", "10am-2pm 31 oct 2022", "31/10/22 5.15am", " Tuesday 25th October 2022. 10:30pm", "30/10/2022 6pm")
I then create two variables, t1 and t2, to store the search result and the gsub result respectively. This is what I get:
library("stringr")
t1 <- t0[str_detect(t0, "\\s[\\s|0-9|\\.|:]+a\\.?m\\.?|p\\.?m\\.?")]
t2 <- t1 %>% gsub("\\s[\\s|0-9|\\.|:]+a\\.?m\\.?|p\\.?m\\.?","--", .)
> t1
[1] "29th October 2022 5-6pm" "12-1pm 02/11/22"
[3] "November 2nd 2022, apm shop " " between 7pm Thursday 27th October to Saturday 29th October 9am"
[5] "04/09/2022 at 4 a.m." "4/09/2022 at 4.a.m."
[7] "04/09/2022 at 4.a.m" "28.10.22 between 1.30pm and midnight"
[9] " Sunday 30th October 2022 between 11am and 3pm" "30th October, approx 6pm"
[11] "03/11/2022 7pm" " Between 5:30pm and 6:30pm on 31/10/2022"
[13] "10am-2pm 31 oct 2022" "31/10/22 5.15am"
[15] " Tuesday 25th October 2022. 10:30pm" "30/10/2022 6pm"
> t2
[1] "29th October 2022 5-6--" "12-1-- 02/11/22"
[3] "November 2nd 2022, a-- shop " " between 7-- Thursday 27th October to Saturday 29th October--"
[5] "04/09/2022 at 4 a.m." "4/09/2022 at--"
[7] "04/09/2022 at--" "28.10.22 between 1.30-- and midnight"
[9] " Sunday 30th October 2022 between-- and 3--" "30th October, approx 6--"
[11] "03/11/2022 7--" " Between 5:30-- and 6:30-- on 31/10/2022"
[13] "10am-2-- 31 oct 2022" "31/10/22--"
[15] " Tuesday 25th October 2022. 10:30--" "30/10/2022 6--"
whereas the expected result is:
> t2
[1] "29th October 2022--" "-- 02/11/22"
[3] " between-- Thursday 27th October to Saturday 29th October--" "04/09/2022 at--"
[5] "4/09/2022 at--" "04/09/2022 at--"
[7] "28.10.22 between-- and midnight" " Sunday 30th October 2022 between-- and--"
[9] "30th October, approx--" "03/11/2022--"
[11] " Between-- and-- on 31/10/2022" "----- 31 oct 2022"
[13] "31/10/22--" " Tuesday 25th October 2022.--"
[15] "30/10/2022--"
How should I correct the regex pattern?
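For reference, here is a minimal sketch of one pattern that follows the description above. It is independent of the answer that follows, and the names pat, hits and res are hypothetical. It assumes a time always starts with a digit, may contain dots, colons or hyphens, and may have at most one space before the am/pm marker. On this sample data it appears to reproduce the expected output, but it has not been checked against other inputs.

```r
# Same sample data as t0 in the question.
t0 <- c("29th October 2022 5-6pm", "12-1pm 02/11/22", "10:25 bike rack at bexley college erith", "November 2nd 2022, apm shop ", " between 7pm Thursday 27th October to Saturday 29th October 9am", "04/09/2022 at 4 a.m.", "4/09/2022 at 4.a.m.", "04/09/2022 at 4.a.m" , "28.10.22 between 1.30pm and midnight", " Sunday 30th October 2022 between 11am and 3pm", "30th October, approx 6pm", "03/11/2022", "02/11/22 at campus", "Between 15:15 and 21:10", "03/11/2022 7pm", " Between 5:30pm and 6:30pm on 31/10/2022", "10am-2pm 31 oct 2022", "31/10/22 5.15am", " Tuesday 25th October 2022. 10:30pm", "30/10/2022 6pm")

# Hypothetical pattern (an assumption, not taken from the original post):
#   \s*        optional leading whitespace, swallowed by the replacement
#   \b\d       a digit that starts a number
#   [\d.:-]*   further digits, dots, colons or hyphens ("1.30", "5:30", "5-6")
#   \s?        at most one space before the marker ("4 a.m.")
#   [ap]\.?m   "am" or "pm" with an optional dot in between
#   \b\.?      end of the marker, plus an optional trailing dot
pat <- "\\s*\\b\\d[\\d.:-]*\\s?[ap]\\.?m\\b\\.?"

# Keep only the strings that contain a time, then replace every match.
hits <- t0[grepl(pat, t0, perl = TRUE)]
res <- gsub(pat, "--", hits, perl = TRUE)
print(res)
```

Requiring a leading digit is what keeps "apm shop" out of the result. In the original pattern, the top-level "|" splits the expression into two alternatives, so a bare "pm" matches on its own without the preceding digits, and the "|" characters inside the square brackets are treated as literal pipe characters rather than as alternation.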
1 Answer
The only difference between this and your stated "expected result" is [12],