R: Compute a new column in a data frame based on existing columns

Posted by 63lcw9qa on 2023-03-05 in Other

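I have this data frame with 5 columns, where stock is the current stock. I want to add a new column, stock_over_time, calculated as stock_over_time = stock - sales + purchase and carried forward from week to week.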

library(tibble)

df <- tibble(article = rep("article one", 5),
             week = c(1, 2, 3, 4, 5),
             sales = 10,
             purchase = c(5, 0, 5, 5, 0),
             stock = 50)

# A tibble: 5 x 5
  article      week sales purchase stock
  <chr>       <dbl> <dbl>    <dbl> <dbl>
1 article one     1    10        5    50
2 article one     2    10        0    50
3 article one     3    10        5    50
4 article one     4    10        5    50
5 article one     5    10        0    50

My final data frame should look like this:

# A tibble: 5 x 6
  article      week sales purchase stock stock_over_time
  <chr>       <dbl> <dbl>    <dbl> <dbl>           <dbl>
1 article one     1    10        5    50              NA
2 article one     2    10        0    50              45
3 article one     3    10        5    50              35
4 article one     4    10        5    50              30
5 article one     5    10        0    50              25

...where stock_over_time is calculated as:

50 - 10 + 5 = 45
45 - 10 + 0 = 35
35 - 10 + 5 = 30
30 - 10 + 5 = 25

How can I do this?

xiozqbni #1

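You can use cumsum() to accumulate the weekly net change (purchase - sales), add it to stock, and then shift the result down one row with lag():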

library(dplyr)

df %>% 
  mutate(stock_over_time = lag(stock + cumsum(purchase - sales)))

# A tibble: 5 x 6
  article      week sales purchase stock stock_over_time
  <chr>       <dbl> <dbl>    <dbl> <dbl>           <dbl>
1 article one     1    10        5    50              NA
2 article one     2    10        0    50              45
3 article one     3    10        5    50              35
4 article one     4    10        5    50              30
5 article one     5    10        0    50              25
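
To see why the one-liner above gives the expected values, it may help to split it into its intermediate steps. This is only a sketch on the question's data; the column names net_change, running_change and end_of_week are made up here for illustration and are not part of the original answer.

```r
library(dplyr)
library(tibble)

# Same data as in the question
df <- tibble(
  article  = rep("article one", 5),
  week     = 1:5,
  sales    = 10,
  purchase = c(5, 0, 5, 5, 0),
  stock    = 50
)

steps <- df %>%
  mutate(
    net_change      = purchase - sales,        # change in stock during each week
    running_change  = cumsum(net_change),      # total change up to and including this week
    end_of_week     = stock + running_change,  # stock at the end of each week
    stock_over_time = lag(end_of_week)         # shifted down one row; first row becomes NA
  )

steps$stock_over_time
```

If the data held several articles, adding group_by(article) before mutate() should make cumsum() and lag() restart for each article, assuming every article has its own starting stock.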

z18hc3ub #2

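We can do this with an iterative approach, where each value is built from the previous row; it should also work for more complicated cases. Note that the first row keeps the starting stock (50) instead of NA.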

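df$stock_over_time <- df$stock
# Each row: previous row's stock, minus that week's sales, plus its purchases.
# seq_len(...)[-1] skips the loop entirely when df has fewer than two rows.
for (i in seq_len(nrow(df))[-1]) {
  df$stock_over_time[i] <- df$stock_over_time[i - 1] -
    df$sales[i - 1] + df$purchase[i - 1]
}
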
df
# A tibble: 5 x 6
#  article      week sales purchase stock stock_over_time
#  <chr>       <dbl> <dbl>    <dbl> <dbl>           <dbl>
#1 article one     1    10        5    50              50
#2 article one     2    10        0    50              45
#3 article one     3    10        5    50              35
#4 article one     4    10        5    50              30
#5 article one     5    10        0    50              25
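
As an illustration of what a "more complicated case" might look like, here is a minimal base R sketch using Reduce() with accumulate = TRUE. The first call reproduces the loop's result; the second uses a hypothetical rule, not taken from the question, that stock can never drop below zero, which a plain cumsum() cannot express. The helper names step_plain and step_floored and the starting stock of 12 are assumptions made up for this example.

```r
sales    <- rep(10, 5)
purchase <- c(5, 0, 5, 5, 0)
stock    <- 50

# Week-to-week change; the last week's change is not needed for 5 rows of output
net_change <- head(purchase - sales, -1)

# Plain recurrence: next = previous + change
step_plain <- function(prev, change) prev + change
stock_plain <- Reduce(step_plain, net_change, accumulate = TRUE, init = stock)

# Hypothetical variant: stock is floored at zero after every step
step_floored <- function(prev, change) max(prev + change, 0)
stock_floored <- Reduce(step_floored, net_change, accumulate = TRUE, init = 12)

stock_plain
stock_floored
```

Because the rule lives in a small step function, it can be swapped out without touching the rest of the code.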

Another option is accumulate() from purrr. The last week's net change is dropped with head(..., -1), because each row holds the stock before that week's sales and purchases:

library(purrr)
library(dplyr)
df %>% 
    mutate(stock_over_time = accumulate(head(purchase - sales, -1), 
            ~ .x + .y, .init = first(stock)))
# A tibble: 5 x 6
#  article      week sales purchase stock stock_over_time
#  <chr>       <dbl> <dbl>    <dbl> <dbl>           <dbl>
#1 article one     1    10        5    50              50
#2 article one     2    10        0    50              45
#3 article one     3    10        5    50              35
#4 article one     4    10        5    50              30
#5 article one     5    10        0    50              25

This can also be written as

df %>% 
    mutate(stock_over_time = accumulate(c(first(stock), 
         head(purchase - sales, -1)), ~ .x + .y))

h79rfbju #3

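How about the following? Note that it works row by row from each row's own stock value, so it does not carry the balance forward from week to week.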

df$stock_over_time <- df$stock - df$sales + df$purchase

If any of the df columns needed for the calculation contain NA values, I would run this first:

df[is.na(df)] <- 0
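
The line above replaces NA in every column of df. A narrower alternative, sketched below, touches only the numeric columns used in the calculation and leaves article alone. The data frame df_na and its NA positions are invented for this example, and treating a missing value as 0 is an assumption that may not suit every data set.

```r
library(dplyr)
library(tibble)

# Hypothetical variant of the question's data with some missing values
df_na <- tibble(
  article  = rep("article one", 5),
  week     = 1:5,
  sales    = c(10, NA, 10, 10, 10),
  purchase = c(5, 0, NA, 5, 0),
  stock    = 50
)

# Replace NA with 0 only in the columns the calculation needs
df_clean <- df_na %>%
  mutate(across(c(sales, purchase, stock), ~ coalesce(.x, 0)))

df_clean
```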
