R: Filter rows that match a string within the current and a following row

i5desfxk posted on 2023-06-03 in Other

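I read a continuous text file into R as a data frame (df) using readr::read_tsv("file.txt", show_col_types = F). The resulting df has 1 column and 80,398 rows. I want to filter and keep the rows that contain "Run = \\d{1,3}", but only if, after another 8 rows, they are followed by a row matching ".*Final Intermolecular Energy =".
Can anyone shed some light on this?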

r

Source: https://stackoverflow.com/questions/76369145/filter-rows-that-match-string-within-the-current-and-next-rows

1 Answer

k97glaaz1#

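One option is to call filter() with two conditions, one for "Run = \d{1,3}" and one for "Final Intermolecular...", and to use the lead() function from the dplyr package to make sure that "Final Intermolecular..." sits 9 rows after "Run = \d{1,3}" (that is, with 8 rows in between, hence n = 9), e.g.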

library(tidyverse)

df <- data.frame(x = c(0, "Run = 123", 1, 2, 3, 4, 5, 6, 7, 8, "xxyyzz Final Intermolecular Energy     = 123", 0,
                       0, "Run = 345", 1, 2, 3, 4, 5, 6, 7, 8, "not final energy", 0))
df
#>                                               x
#> 1                                             0
#> 2                                     Run = 123
#> 3                                             1
#> 4                                             2
#> 5                                             3
#> 6                                             4
#> 7                                             5
#> 8                                             6
#> 9                                             7
#> 10                                            8
#> 11 xxyyzz Final Intermolecular Energy     = 123
#> 12                                            0
#> 13                                            0
#> 14                                    Run = 345
#> 15                                            1
#> 16                                            2
#> 17                                            3
#> 18                                            4
#> 19                                            5
#> 20                                            6
#> 21                                            7
#> 22                                            8
#> 23                             not final energy
#> 24                                            0

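# Keep a "Run" row only if the row 9 positions later (8 rows in between)
# contains the energy line. lead() returns NA past the end of the column,
# and filter() drops rows whose condition is NA.
df %>%
  filter(str_detect(x, "Run = \\d{1,3}") & 
           str_detect(lead(x, n = 9), ".*Final Intermolecular Energy     ="))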
#>           x
#> 1 Run = 123

# "Run = 345" is not returned because the row 9 positions later fails the second criterion

Created on 2023-05-31 with reprex v2.0.2
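
For comparison, the same "match here and match 9 rows ahead" idea can be sketched in base R without dplyr. This is only an illustration and not part of the answer above: the names offset, is_run, is_energy, energy_ahead and hits are made up for the example, and the pattern "\\s*=" is an assumption that the number of spaces before "=" may vary in the real file.

```r
# Toy data mirroring the example above; c() coerces everything to character
x <- c(0, "Run = 123", 1:8, "xxyyzz Final Intermolecular Energy     = 123", 0,
       0, "Run = 345", 1:8, "not final energy", 0)

# 8 rows in between means the energy line is 9 rows after the "Run" line
offset <- 9

is_run    <- grepl("Run = \\d{1,3}", x)
# Assumption: allow any amount of whitespace before "="
is_energy <- grepl("Final Intermolecular Energy\\s*=", x)

# energy_ahead[i] is is_energy[i + offset]; positions with no row
# `offset` steps ahead are padded with FALSE
energy_ahead <- c(tail(is_energy, -offset), rep(FALSE, offset))[seq_along(x)]

hits <- x[is_run & energy_ahead]
print(hits)
```

For a real file, x could presumably be filled with readLines("file.txt") instead of the toy vector, or taken from the single column of the data frame.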
