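I have a data frame in R that contains coverage information for some sequencing samples. The columns hold a lot of text, and I only want to extract the coverage numbers from it.
Here is the code: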
df <- data.frame(sampleA = c("There is a 91.24% of reference with a coverageData >= 1X", "There is a 90.89% of reference with a coverageData >= 2X", "There is a 90.46% of reference with a coverageData >= 3X"),
sampleB = c("There is a 91.22% of reference with a coverageData >= 1X", "There is a 90.99% of reference with a coverageData >= 2X", "There is a 90.77% of reference with a coverageData >= 3X")
)
This is what the data frame looks like:
sampleA
1 There is a 91.24% of reference with a coverageData >= 1X
2 There is a 90.89% of reference with a coverageData >= 2X
3 There is a 90.46% of reference with a coverageData >= 3X
sampleB
1 There is a 91.22% of reference with a coverageData >= 1X
2 There is a 90.99% of reference with a coverageData >= 2X
3 There is a 90.77% of reference with a coverageData >= 3X
I would like to get the following output:
sampleA sampleB
1 91.24 91.22
2 90.89 90.99
3 90.46 90.77
I have seen that mutate_all can be used for this, but I am not sure about the syntax.
2 Answers
Answer 1 (jjhzyzn0)
We can use readr::parse_number() inside dplyr::across(). The data is taken from the OP.
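The code for this answer did not survive in the thread, so the following is only a minimal sketch of what the sentence above describes, not the answerer's original listing. It assumes the dplyr and readr packages are installed; the name result is introduced here purely for illustration. parse_number() keeps the first number it finds in each string and drops the surrounding text, which is why it returns the percentage rather than the trailing 1X, 2X, 3X.

```r
library(dplyr)
library(readr)

# Data from the OP
df <- data.frame(
  sampleA = c("There is a 91.24% of reference with a coverageData >= 1X",
              "There is a 90.89% of reference with a coverageData >= 2X",
              "There is a 90.46% of reference with a coverageData >= 3X"),
  sampleB = c("There is a 91.22% of reference with a coverageData >= 1X",
              "There is a 90.99% of reference with a coverageData >= 2X",
              "There is a 90.77% of reference with a coverageData >= 3X")
)

# Apply parse_number() to every column; each column becomes numeric.
# `result` is a hypothetical name, not taken from the original answer.
result <- df %>%
  mutate(across(everything(), parse_number))

print(result)
```

Since mutate_all() is superseded in current dplyr, across(everything(), ...) is the usual replacement for the syntax the question asks about.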
Created on 2023-02-21 by the reprex package (v2.0.1)
Answer 2 (cgfeq70w)
Use sub with % as a marker, so that the percentage value is targeted rather than the other numbers in the string.
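The original code for this answer is also missing from the thread. One way the idea could look in base R is sketched below; the regular expression and the helper name extract_pct are assumptions made for illustration, not the answerer's exact pattern. The pattern skips any leading non-digit text, captures the run of digits and dots that sits directly before the % sign, and discards the rest, so the 1X, 2X, 3X values are never picked up.

```r
# Data from the OP
df <- data.frame(
  sampleA = c("There is a 91.24% of reference with a coverageData >= 1X",
              "There is a 90.89% of reference with a coverageData >= 2X",
              "There is a 90.46% of reference with a coverageData >= 3X"),
  sampleB = c("There is a 91.22% of reference with a coverageData >= 1X",
              "There is a 90.99% of reference with a coverageData >= 2X",
              "There is a 90.77% of reference with a coverageData >= 3X")
)

# Hypothetical helper: keep only the number immediately followed by "%".
# This assumes no digits appear before the percentage in the string.
extract_pct <- function(x) {
  as.numeric(sub("^[^0-9]*([0-9.]+)%.*$", "\\1", x))
}

# Assigning into result[] keeps the data frame structure and column names.
result <- df
result[] <- lapply(df, extract_pct)

print(result)
```

This variant needs no extra packages, at the cost of a pattern that has to be adjusted if the text in the columns changes shape.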