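I have a tree database that contains the number of cells in different growth phases (enlarging, thickening and maturing) for different trees over several years. Dates are expressed as DOY (day of year: January 1 is DOY 1, January 2 is DOY 2, and so on). I have simplified it into a reproducible example like this: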
library(dplyr)  # provides %>% and mutate()

df <- data.frame("Year" = c(2012, 2012, 2012, 2012, 2012, 2012, 2012,
                            2012, 2012, 2012, 2013, 2013, 2013,
                            2013, 2013, 2013, 2013, 2013, 2013, 2013),
                 "Tree" = c(15, 15, 15, 15, 15, 22, 22, 22, 22, 22, 41, 41,
                            41, 41, 41, 53, 53, 53, 53, 53),
                 "DOY" = c(65, 97, 125, 177, 214, 65, 97, 125, 177, 214,
                           61, 99, 118, 166, 221, 61, 99, 118, 166, 221),
                 "Enlarging" = c(0, 2, 4, 5, 0, 0, 3, 6, 3, 0, 0, 5, 4, 4, 0, 0, 4, 7, 5, 0),
                 "Thickening" = c(0, 0, 2, 4, 0, 0, 0, 4, 3, 0, 0, 0, 3, 2, 0, 0, 2, 4, 2, 0),
                 "Maturing" = c(0, 0, 3, 7, 0, 0, 0, 3, 4, 0, 0, 3, 6, 8, 0, 0, 0, 4, 7, 0))

# Year and Tree are grouping variables; DOY and the cell counts are numeric
df <- df %>%
  mutate(Year = as.factor(Year),
         Tree = as.factor(Tree),
         DOY = as.numeric(DOY),
         Enlarging = as.numeric(Enlarging),
         Thickening = as.numeric(Thickening),
         Maturing = as.numeric(Maturing))
print(df)
   Year Tree DOY Enlarging Thickening Maturing
1  2012   15  65         0          0        0
2  2012   15  97         2          0        0
3  2012   15 125         4          2        3
4  2012   15 177         5          4        7
5  2012   15 214         0          0        0
6  2012   22  65         0          0        0
7  2012   22  97         3          0        0
8  2012   22 125         6          4        3
9  2012   22 177         3          3        4
10 2012   22 214         0          0        0
11 2013   41  61         0          0        0
12 2013   41  99         5          0        3
13 2013   41 118         4          3        6
14 2013   41 166         4          2        8
15 2013   41 221         0          0        0
16 2013   53  61         0          0        0
17 2013   53  99         4          2        0
18 2013   53 118         7          4        4
19 2013   53 166         5          2        7
20 2013   53 221         0          0        0
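I want to fit a separate logistic regression between the number of cells in each growth phase and DOY (for enlarging, for example: Enlarging ~ DOY) for every tree in every year. I have tried several things, such as grouping by year and tree and fitting the logistic regression for each growth phase, one at a time: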
df_enlarging <- df %>%
  select(Tree, Year, Enlarging) %>%
  group_by(Tree, Year) %>%
  mutate(the_glm = glm(Enlarging ~ DOY, family = "binomial", data = df),
         Fitted = predict(the_glm, type = "response"))
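I also tried pivoting my data to long format and nesting it (so that I could fit the logistic regression to all three growth phases of each tree in each year at once), and then doing the same thing, like this: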
library(tidyr)  # provides pivot_longer()

df_long <- df %>%
  pivot_longer(Enlarging:Maturing,
               names_to = 'Growth_Phase',
               values_to = 'Count') %>%
  ungroup()
df_nested <- df_long %>%
  nest_by(Year, Tree, as.factor(Growth_Phase)) #tried converting growth_phase to factor also
df_glm <- df_nested %>%
  rowwise() %>%
  mutate(the_glm = list(glm(Count ~ DOY, family = "binomial", data = data)),
         Fitted = list(predict(the_glm, type = "response")))
None of this works. In both cases I get the same error:
Problem while computing `the_glm = glm(Enlarging ~ DOY, family = "binomial", data = data)`.
Caused by error: ! y values must be 0 <= y <= 1
Does anyone know what I can do to fix this? Thank you very much.
1 Answer
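First, per joran: yes, this is easiest if you convert your data to 1/0.
I did this by removing the rows with a count of 0, then "uncounting" the remaining rows so that every counted cell becomes its own row with a response of 1, and then adding the 0 rows back. After that it is a simple grouping, followed by mapping the model over the groups with purrr and using broom to tidy the results and predict on the dataset.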
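The answer's own code is not reproduced here, so the following is only a minimal sketch of the steps described above, under some assumptions: the names Response, ones, zeros, df_binary, models, coefs, preds and predictions are made up for illustration, and the example data from the question is rebuilt in compact form so the block runs on its own. Treating each counted cell as one "success" row, and each zero-count date as a single "failure" row, is the modelling choice described in the answer and may not suit every dataset.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Same example data as in the question, written compactly.
df <- data.frame(
  Year = factor(rep(c(2012, 2013), each = 10)),
  Tree = factor(rep(c(15, 22, 41, 53), each = 5)),
  DOY  = c(rep(c(65, 97, 125, 177, 214), 2), rep(c(61, 99, 118, 166, 221), 2)),
  Enlarging  = c(0, 2, 4, 5, 0, 0, 3, 6, 3, 0, 0, 5, 4, 4, 0, 0, 4, 7, 5, 0),
  Thickening = c(0, 0, 2, 4, 0, 0, 0, 4, 3, 0, 0, 0, 3, 2, 0, 0, 2, 4, 2, 0),
  Maturing   = c(0, 0, 3, 7, 0, 0, 0, 3, 4, 0, 0, 3, 6, 8, 0, 0, 0, 4, 7, 0)
)

# One row per Year / Tree / DOY / growth phase.
df_long <- df %>%
  pivot_longer(Enlarging:Maturing,
               names_to = "Growth_Phase",
               values_to = "Count")

# Non-zero counts: repeat each row Count times and mark it as 1.
# uncount() drops the Count column by default.
ones <- df_long %>%
  filter(Count > 0) %>%
  uncount(Count) %>%
  mutate(Response = 1)

# Zero counts: keep a single row per date and mark it as 0.
zeros <- df_long %>%
  filter(Count == 0) %>%
  select(-Count) %>%
  mutate(Response = 0)

df_binary <- bind_rows(ones, zeros)

# One logistic regression per Year / Tree / Growth_Phase combination.
models <- df_binary %>%
  nest(data = c(DOY, Response)) %>%
  mutate(
    the_glm = map(data, ~ glm(Response ~ DOY, family = binomial, data = .x)),
    coefs   = map(the_glm, tidy),
    preds   = map(the_glm, ~ augment(.x, type.predict = "response"))
  )

# Fitted probabilities (.fitted) next to the data they were fitted on.
predictions <- models %>%
  select(Year, Tree, Growth_Phase, preds) %>%
  unnest(preds)
```

Because the response is now strictly 0 or 1, the binomial family no longer rejects it. With so few dates per tree, glm() may still emit warnings about fitted probabilities close to 0 or 1, so the coefficients in coefs are worth inspecting before relying on them.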