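Given a data frame in R in which different columns can serve as the dependent variable, I am trying to write a function that takes the data frame "df", a list or vector of dependent variables "vars", the time variable "time", and the status variable "status", and returns the survival results from "survfit" together with the Kaplan-Meier curves from ggsurvplot.
The goal is to avoid copying and pasting the same code over and over.
Take the following data as an example: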
library(ggplot2)
library(survival)
library(survminer) # provides ggsurvplot()
library("dplyr")
df <- lung %>%
  transmute(time,
            status, # censoring status 1=censored, 2=dead
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(ph.ecog),
            `Meal Cal` = as.numeric(meal.cal))
# help(lung)
# Turn status into (0=censored, 1=dead)
df$status <- ifelse(df$status == 2, 1, 0)
I can of course run a survival analysis like this:
fit <- survfit(Surv(time, status) ~ ECOG, data = df)
ggsurvplot(fit,
           pval = TRUE, pval.coord = c(750, 0.3),
           conf.int = FALSE,
           surv.median.line = "hv",
           legend = c(0.8, 0.6),
           legend.title = "",
           risk.table = "absolute",
           risk.table.y.text = FALSE,
           xlab = "Time (days)", ylab = "Survival",
           palette = "jco",
           title = "Overall Survival", font.title = c(16, "bold", "black")
)
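However, if I want to do the same for Sex, I have to copy and paste everything again. So I would like to create a function in R that takes the data frame "df", the list of dependent variables "vars", the time variable "time", and the status variable "status" as input, and returns the survival results from "survfit" and the Kaplan-Meier curves from "ggsurvplot", like this: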
vars <- c("ECOG", "Sex")
surv_plot_func <- function(df, vars, time, status) {
  results_list <- lapply(vars, function(var, time, status) {
    # Fit a survival model
    fit <- survfit(Surv(as.numeric(df[[time]]), as.logical(df[[status]])) ~ as.factor(df[[var]]), data = df)
    # Plot the Kaplan-Meier curve using ggsurvplot
    ggsurv <- ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
                         risk.table = TRUE, legend.title = "",
                         surv.median.line = "hv", xlab = "Time", ylab = "Survival Probability")
    # Return the fit and ggsurv as a list
    list(fit = fit, ggsurv = ggsurv)
  })
  # Return the list of results
  results_list
}
res_list <- surv_plot_func(df, vars, "time", "status")
However, this did not work. Is there a way to do it?
1 Answer
The following code works for me.
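The answer's code is not reproduced here, so what follows is only a minimal sketch of one possible fix, not necessarily what the answerer wrote. It rests on two observations. First, in the question the anonymous function declares `time` and `status` as its own parameters, but `lapply()` supplies only `var`, so both are missing when the body uses them; dropping them lets the inner function see the outer function's arguments. Second, `ggsurvplot()` often cannot recover the data from a fit created inside a function, so the sketch builds the formula from the column names, fits with survminer's `surv_fit()`, and passes `data = df` explicitly. The sketch assumes the column names are syntactic R names (so `Meal Cal` would need backticks), and it rebuilds a reduced `df` in base R only to stay self-contained.

```r
library(survival)
library(survminer)

# Reduced version of the question's data, rebuilt in base R (assumption:
# only the columns needed for this sketch are kept).
df <- data.frame(
  time   = lung$time,
  status = ifelse(lung$status == 2, 1, 0),  # 0 = censored, 1 = dead
  Sex    = factor(lung$sex, labels = c("Male", "Female")),
  ECOG   = factor(lung$ph.ecog)
)

surv_plot_func <- function(df, vars, time, status) {
  results_list <- lapply(vars, function(var) {
    # Build e.g. Surv(time, status) ~ ECOG from the column names
    form <- as.formula(sprintf("Surv(%s, %s) ~ %s", time, status, var))
    # surv_fit() is survminer's wrapper around survfit()
    fit <- surv_fit(form, data = df)
    # Pass the data explicitly so ggsurvplot() does not have to look it up
    ggsurv <- ggsurvplot(fit, data = df,
                         pval = TRUE, conf.int = TRUE,
                         risk.table = TRUE, legend.title = "",
                         surv.median.line = "hv",
                         xlab = "Time", ylab = "Survival Probability")
    list(fit = fit, ggsurv = ggsurv)
  })
  # Name each result after its grouping variable for easy lookup
  names(results_list) <- vars
  results_list
}

vars <- c("ECOG", "Sex")
res_list <- surv_plot_func(df, vars, "time", "status")
```

Each element can then be inspected by name, for example `res_list$Sex$fit` for the fitted curves and `res_list$Sex$ggsurv` to draw the plot.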