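Consider a = paste(1:10, collapse=", "), which results in
a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
I would like to replace every n-th (say, every 4th) occurrence of "," with something else (say "\n"). The desired output would be:
"1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
I am looking for code that uses gsub (or something equivalent) and some form of regular expression to achieve this.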
sf6xfgos1#
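You can replace ((?:\d+, ){3}\d), with \1\n. The pattern captures everything up to (but not including) the fourth comma in group 1, and the replacement \1\n puts back the group 1 text followed by a newline, which gives the expected result. Note that the final \d matches a single digit only, so the fourth number in each group must have one digit; use \d+ there if it can be longer.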
gsub("((?:\\d+, ){3}\\d),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
Prints:
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
To generalize the above solution to any text, we can change \d to [^,] (with + on the last item as well):
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "a, bb, ccc, dddd, 500, 600, 700, 800, 900, 1000")
Output:
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
[1] "a, bb, ccc, dddd\n 500, 600, 700, 800\n 900, 1000"
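The hard-coded {3} above can be derived from n. The following is a minimal sketch, not part of the original answer: the helper name replace_every_nth_comma is hypothetical, it uses [^,]* instead of [^,]+ so it does not require a space after each comma, and it passes perl = TRUE so the non-capturing group is handled by PCRE.

```r
# Hypothetical helper (an assumption for illustration, not from the answer):
# replace every n-th comma in x with `replacement`.
replace_every_nth_comma <- function(x, n, replacement) {
  stopifnot(n >= 1)
  # Group 1: (n - 1) "field," chunks plus one more field; then the n-th comma.
  pattern <- sprintf("((?:[^,]*,){%d}[^,]*),", n - 1)
  # Caveat: backslashes inside `replacement` would be interpreted by gsub.
  gsub(pattern, paste0("\\1", replacement), x, perl = TRUE)
}

a <- paste(1:10, collapse = ", ")
replace_every_nth_comma(a, 4, "\n")
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
replace_every_nth_comma(a, 2, "\n")
#> [1] "1, 2\n 3, 4\n 5, 6\n 7, 8\n 9, 10"
```

Building the pattern with sprintf keeps the regex identical in spirit to the one above while making n a parameter.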
oknwwptz2#
regmatches as another alternative:
a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
fn <- ","
rp <- "\n"
n <- 4
regmatches(a, gregexpr(fn, a)) <- list(c(rep(fn, n-1), rp))
a
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
As a function:
a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
replN <- function(x, fn, rp, n) {
  regmatches(x, gregexpr(fn, x)) <- list(c(rep(fn, n-1), rp))
  x
}
replN(a, ",", "\n", 4)
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
You can even extend it so that it is vectorised over the replacement argument:
a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
replN <- function(x, fn, rp, n) {
  sel <- rep(fn, n*length(rp))
  sel[seq_along(rp)*n] <- rp
  regmatches(x, gregexpr(fn, x)) <- list(sel)
  x
}
replN(a, fn=",", rp=c("1st","2nd"), n=4)
#[1] "1, 2, 3, 41st 5, 6, 7, 82nd 9, 10"
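The approach above relies on the replacement values being recycled across all matches, which the regmatches documentation describes ("generously recycles values as needed"). A small sketch on a shorter, made-up string may make that easier to see; the variable names x and y are only for illustration.

```r
x <- "a,b,c,d,e"

# gregexpr() finds every comma; regmatches() extracts the matched substrings.
regmatches(x, gregexpr(",", x))
#> [[1]]
#> [1] "," "," "," ","

# Assigning a shorter vector: c(",", ";") is recycled over the four matches,
# so every 2nd comma becomes ";".
y <- x
regmatches(y, gregexpr(",", y)) <- list(c(",", ";"))
y
#> [1] "a,b;c,d;e"
```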
gblwokeq3#
Using both regex and gsub.
a = paste(1:10, collapse=", ")
x <- gsub("([^,]*,[^,]*,[^,]*,[^,]*),", '\\1\n', a)
x
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
w46czmvw4#
Regex is the best option; nevertheless, here is another approach that does not build a regex for the n-th occurrence:
> str_vec <- strsplit(a, " ")[[1]]
> where <- seq_along(str_vec) %% 4 == 0
> str_vec[where] <- sub(",", "\n", str_vec[where])
> paste(str_vec, collapse=" ")
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
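The version above splits on spaces, so it assumes exactly one comma per space-separated token. A variant that splits on the comma itself and avoids regular expressions entirely (fixed = TRUE) could look like the sketch below; the helper name replace_nth_fixed is hypothetical, and trailing separators are not handled because strsplit drops a trailing empty piece.

```r
# Hypothetical helper (illustration only): replace every n-th `sep` in x
# with `replacement`, using fixed-string splitting instead of a regex.
replace_nth_fixed <- function(x, sep, replacement, n) {
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  # One separator after each piece except the last.
  seps <- rep(sep, length(parts) - 1)
  # Every n-th separator is swapped for the replacement.
  seps[seq_along(seps) %% n == 0] <- replacement
  paste0(parts, c(seps, ""), collapse = "")
}

a <- paste(1:10, collapse = ", ")
replace_nth_fixed(a, ",", "\n", 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
```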
8iwquhpp5#
This one can replace a string rather than a single character. I made a function that you can use easily :) A demo here to understand the regex
> a = paste(1:10, collapse=", ")
> a
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 2nd occurrence
> gsub("(.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 3rd occurrence
> gsub("(.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> # if you want the 4th occurrence
> gsub("(.*?,.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> # if you want the last occurrence
> gsub("(.*,.*),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
>
> replace.occurence <- function(x, pattern, replacement, which.occu) {
+   if( which.occu == "last" ) {
+     gsub(paste0("(.*", pattern, ".*)", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   } else {
+     gsub(paste0("(.*?", paste0(rep(paste0(pattern, ".*?"), which.occu - 1), collapse = ""), ")", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   }
+ }
>
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 2)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 3)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 4)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = "last")
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
>
> replace.occurence(a, pattern = ", 3, 4,", replacement = ", 4, 3,", which.occu = 1)
[1] "1, 2, 4, 3, 5, 6, 7, 8, 9, 10"
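To see what the nested paste0 calls in the else branch produce, the pattern construction can be pulled out on its own. This is only a sketch: build_pattern is a hypothetical name, and the body is copied from the function above. Because the trailing (.*) consumes the rest of the string, the resulting pattern matches at most once, so this approach replaces only the single n-th occurrence rather than every n-th one.

```r
# Hypothetical helper: same pattern construction as the else branch above.
build_pattern <- function(pattern, which.occu) {
  paste0("(.*?",
         paste0(rep(paste0(pattern, ".*?"), which.occu - 1), collapse = ""),
         ")", pattern, "(.*)")
}

build_pattern(",", 1)
#> [1] "(.*?),(.*)"
build_pattern(",", 3)
#> [1] "(.*?,.*?,.*?),(.*)"

# Applying the pattern for the 2nd occurrence (perl = TRUE is an assumption
# made here so that the lazy quantifiers are handled by PCRE).
a <- paste(1:10, collapse = ", ")
sub(build_pattern(",", 2), "\\1\n\\2", a, perl = TRUE)
#> [1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
```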