R: create a sequence from random numbers

cbeh67ev  posted on 2023-03-15 in Other

My dataset has a column Col1 that contains random numbers.

ID  Date        Col1     Score
  13  2002-08-01  18221    60.75
  13  2010-08-12  18448    65.33
  11  2009-11-06  -65145   61.13
  11  2009-11-06  -65145   59.91
  12  2011-05-10   93910   62.10
  14  2009-05-29   13000   70.28
  15  2008-12-03   19423   39.72

My goal is to turn these seemingly random numbers into an ordered sequence (SequenceNum), like this.

ID  Date        Col1     Score    SequenceNum
  13  2002-08-01  18221    60.75    1 
  13  2010-08-12  18448    65.33    2
  11  2009-11-06  -65145   61.13    2
  11  2009-11-06  -65145   59.91    1
  12  2011-05-10   93910   62.10    1
  14  2008-12-03   19423   39.72    2
  14  2009-05-29   13000   70.28    1

The rules are as follows.
For each ID:

Rule 1) If the Col1 values are different, then create a sequence
based on the ascending order of the values in Col1. For example, for ID 13
the values in Col1 are 18221 and 18448, so the expected result is

  ID  Date        Col1     Score    SequenceNum
  13  2002-08-01  18221    60.75    1 
  13  2010-08-12  18448    65.33    2

Rule 2) If the Col1 values are the same, then use the values in column
`Score` and create a sequence based on the ascending order of the values in
column `Score`. For ID 11, the values in Col1 are the same (-65145), so the
sequence number is based on the ascending order of the values in column
`Score`:

  11  2009-11-06  -65145   61.13    2
  11  2009-11-06  -65145   59.91    1

Thanks in advance for any help with this.

p5fdfcr1  1#

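library(dplyr)  # .by needs dplyr >= 1.1.0
df %>%
  # order(order(...)) converts sort positions to ranks (plain order() only matches for groups of <= 2 rows)
  mutate(seq_num = order(order(Col1, Score)), .by = ID)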

  ID       Date   Col1 Score seq_num
1 13 2002-08-01  18221 60.75       1
2 13 2010-08-12  18448 65.33       2
3 11 2009-11-06 -65145 61.13       2
4 11 2009-11-06 -65145 59.91       1
5 12 2011-05-10  93910 62.10       1
6 14 2009-05-29  13000 70.28       1
7 15 2008-12-03  19423 39.72       1

The second dataset differs from the first one:

df %>%
  # order(order(...)) yields per-ID ranks by Col1, then Score
  mutate(seq_num = order(order(Col1, Score)), .by = ID)

  ID       Date   Col1 Score seq_num
1 13 2002-08-01  18221 60.75       1
2 13 2010-08-12  18448 65.33       2
3 11 2009-11-06 -65145 61.13       2
4 11 2009-11-06 -65145 59.91       1
5 12 2011-05-10  93910 62.10       1
6 14 2008-12-03  19423 39.72       2
7 14 2009-05-29  13000 70.28       1
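
One detail worth checking when an ID has three or more rows: `order()` returns the positions that would sort a vector, not the rank of each element. The two happen to coincide for groups of one or two rows, as in the sample data, but they can differ for larger groups. A minimal base R sketch, using made-up values for a hypothetical three-row group:

```r
# Made-up values for one hypothetical ID with three rows
x <- c(30, 10, 20)

positions <- order(x)         # indices that would sort x: 2 3 1
ranks     <- order(order(x))  # rank of each element:      3 1 2

print(positions)
print(ranks)
```

Wrapping the call in a second `order()` is one way to get a per-row sequence number; `rank()` on a single column is another, though it does not take a tie-breaking column directly.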

lsmepo6l  2#

We can use

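library(dplyr)
df1 %>%
  mutate(rn = row_number()) %>%                     # remember the original row order
  arrange(ID, Col1, Score) %>%                      # sort within ID by Col1, then Score
  mutate(SequenceNum = row_number(), .by = ID) %>%  # number the sorted rows per ID
  arrange(rn) %>%                                   # restore the original row order
  select(-rn)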
Output:

  ID       Date   Col1 Score SequenceNum
1 13 2002-08-01  18221 60.75           1
2 13 2010-08-12  18448 65.33           2
3 11 2009-11-06 -65145 61.13           2
4 11 2009-11-06 -65145 59.91           1
5 12 2011-05-10  93910 62.10           1
6 14 2009-05-29  13000 70.28           1
7 14 2008-12-03  19423 39.72           2
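
For anyone who wants to reproduce the result without dplyr, here is a minimal base R sketch. It rebuilds the second sample table from the question by hand (the column types are an assumption, since the question does not show how the data is stored) and uses `ave()` together with `order(order(...))` to rank rows within each ID by Col1, then Score. The helper name `rank_in_group` is made up for this example.

```r
# Second sample table from the question, typed in by hand (assumed column types)
df1 <- data.frame(
  ID    = c(13, 13, 11, 11, 12, 14, 14),
  Date  = as.Date(c("2002-08-01", "2010-08-12", "2009-11-06", "2009-11-06",
                    "2011-05-10", "2008-12-03", "2009-05-29")),
  Col1  = c(18221, 18448, -65145, -65145, 93910, 19423, 13000),
  Score = c(60.75, 65.33, 61.13, 59.91, 62.10, 39.72, 70.28)
)

# For the row indices i of one ID, rank the rows by Col1, breaking ties with Score.
# order() gives the sort positions; order(order()) converts them into ranks.
rank_in_group <- function(i) order(order(df1$Col1[i], df1$Score[i]))

# ave() applies the function to each ID's row indices and keeps the original row order
df1$SequenceNum <- ave(seq_len(nrow(df1)), df1$ID, FUN = rank_in_group)

print(df1)
```

On this table the new column comes out as 1, 2, 2, 1, 1, 2, 1, which matches the expected SequenceNum values in the question.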
