R: How to get age in months from a date of birth? [duplicate]

Posted by pjngdqdw on 2023-02-01 in Other
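    • This question already has answers here:

Number of months between two dates (11 answers)
Closed 2 days ago.
I have a column with participants' dates of birth, and I need to measure their age in *months*. Is there an automatic way to account for the fact that months have different numbers of days, and that years do too (and so on)? What I mean is: I know I could arbitrarily set 30 or 31 days for a month, or 365 or 366 days for a year, but I would like to know whether R can work with the actual calendar, the way Sys.Date() does, so that the result is more precise. This is something I have not seen in other questions. [edit]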

  • Data:
head(data)
  ID      BIRTH YEAR
1  A 23/04/2009 2009
2  B 24/03/2010 2010
3  C 28/12/2009 2009
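  • I need the participants' age in months as of a specific date, for example from birth until 31 August 2020 (note: dates are written in the Brazilian day/month/year format).

I have seen many interesting posts, such as this one, but they do not solve what I need, so I hope this is not a duplicate. I have also seen suggestions such as difftime("23/04/2009", "31/08/2020", units = 'weeks'), but it does not work for months.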

  • Full data:
> dput(data)
structure(list(ID = c("A", "B", "C", "D", "E", "F", "G", "H", 
"I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U"
), BIRTH = c("23/04/2009", "24/03/2010", "28/12/2009", "19/08/2009", 
"02/12/2009", "12/05/2010", "26/02/2010", "07/10/2009", "22/04/2010", 
"01/04/2010", "31/03/2010", "27/01/2010", "23/09/2009", "28/09/2009", 
"28/10/2009", "21/06/2009", "28/10/2009", "19/08/2009", "10/09/2009", 
"13/07/2009", "22/09/2009"), YEAR = c("2009", "2010", "2009", 
"2009", "2009", "2010", "2010", "2009", "2010", "2010", "2010", 
"2010", "2009", "2009", "2009", "2009", "2009", "2009", "2009", 
"2009", "2009")), row.names = c(NA, -21L), class = "data.frame")
Answer #1 (i2byvkas)

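First make sure both dates are in Date format (one option is the dmy function from lubridate). With Date objects we can do arithmetic, that is, subtract one date from the other. The trick is to wrap the whole thing in as.numeric to get a plain numeric value. To convert days to months, divide by 365/12, the average number of days in a month: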

library(lubridate)
library(dplyr)
data %>% 
  # Parentheses matter: "/ 365/12" would divide by 365 and then by 12 again
  mutate(Age_in_Months = as.numeric(dmy("31/08/2020") - dmy(BIRTH)) / (365 / 12))
   ID      BIRTH YEAR Age_in_Months
1   A 23/04/2009 2009      136.3726
2   B 24/03/2010 2010      125.3589
3   C 28/12/2009 2009      128.1863
4   D 19/08/2009 2009      132.4932
5   E 02/12/2009 2009      129.0411
6   F 12/05/2010 2010      123.7479
7   G 26/02/2010 2010      126.2137
8   H 07/10/2009 2009      130.8822
9   I 22/04/2010 2010      124.4055
10  J 01/04/2010 2010      125.0959
11  K 31/03/2010 2010      125.1288
12  L 27/01/2010 2010      127.2000
13  M 23/09/2009 2009      131.3425
14  N 28/09/2009 2009      131.1781
15  O 28/10/2009 2009      130.1918
16  P 21/06/2009 2009      134.4329
17  Q 28/10/2009 2009      130.1918
18  R 19/08/2009 2009      132.4932
19  S 10/09/2009 2009      131.7699
20  T 13/07/2009 2009      133.7096
21  U 22/09/2009 2009      131.3753
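
Dividing elapsed days by 365/12 gives an average-month approximation. If what is wanted instead is the number of whole calendar months completed by the reference date, a base R sketch could look like the following. The function name completed_months is hypothetical (it is not from the answer), and the convention that a month only counts once the birth day of the month has been reached is an assumption.

```r
# Hypothetical helper (not part of the original answer): whole calendar
# months completed between a birth date and a reference date.
completed_months <- function(birth, ref) {
  b <- as.POSIXlt(birth)
  r <- as.POSIXlt(ref)
  # Month difference from the year and month fields alone
  months <- (r$year - b$year) * 12 + (r$mon - b$mon)
  # Assumed convention: subtract one if the birth day of the month
  # has not yet been reached in the reference month
  months - as.integer(r$mday < b$mday)
}

birth <- as.Date(c("23/04/2009", "24/03/2010", "28/12/2009"), format = "%d/%m/%Y")
ref <- as.Date("2020-08-31")
completed_months(birth, ref)
```

Under this convention a birth on the 31st does not complete a month in a 30-day month until the following month, so a different rule may be preferable for end-of-month birthdays.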
Answer #2 (ovfsdjhp)

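Using the functions from the answer linked here, you can compute the age difference in whole months, as shown in Dif_months1. For a more precise result you can use interval from lubridate; see Dif_months2: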

library(lubridate)
library(dplyr)  # needed for %>% and mutate() used below
data <- structure(list(ID = c("A", "B", "C", "D", "E", "F", "G", "H", 
                      "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U"
), BIRTH = as.Date(c("23/04/2009", "24/03/2010", "28/12/2009", "19/08/2009", 
             "02/12/2009", "12/05/2010", "26/02/2010", "07/10/2009", "22/04/2010", 
             "01/04/2010", "31/03/2010", "27/01/2010", "23/09/2009", "28/09/2009", 
             "28/10/2009", "21/06/2009", "28/10/2009", "19/08/2009", "10/09/2009", 
             "13/07/2009", "22/09/2009"), format="%d/%m/%Y"), YEAR = c("2009", "2010", "2009", 
                                                   "2009", "2009", "2010", "2010", "2009", "2010", "2010", "2010", 
                                                   "2010", "2009", "2009", "2009", "2009", "2009", "2009", "2009", 
                                                   "2009", "2009")), row.names = c(NA, -21L), class = "data.frame")

# turn a date into a 'monthnumber' relative to an origin
monnb <- function(d) { lt <- as.POSIXlt(as.Date(d, origin="1900-01-01")); lt$year*12 + lt$mon } 
# compute a month difference as a difference between two monnb's
mondf <- function(d1, d2) { monnb(d2) - monnb(d1) }

data %>% mutate(Dif_months1 = mondf(BIRTH, Sys.Date()), 
                Dif_months2 = interval(BIRTH, Sys.Date()) %/% days(1) / (365/12))

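Note that I parsed the original dates with format="%d/%m/%Y". Because the code uses Sys.Date(), the values depend on the day it is run.
Output: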

   ID      BIRTH YEAR Dif_months1 Dif_months2
1   A 2009-04-23 2009         165    165.3370
2   B 2010-03-24 2010         154    154.3233
3   C 2009-12-28 2009         157    157.1507
4   D 2009-08-19 2009         161    161.4575
5   E 2009-12-02 2009         157    158.0055
6   F 2010-05-12 2010         152    152.7123
7   G 2010-02-26 2010         155    155.1781
8   H 2009-10-07 2009         159    159.8466
9   I 2010-04-22 2010         153    153.3699
10  J 2010-04-01 2010         153    154.0603
11  K 2010-03-31 2010         154    154.0932
12  L 2010-01-27 2010         156    156.1644
13  M 2009-09-23 2009         160    160.3068
14  N 2009-09-28 2009         160    160.1425
15  O 2009-10-28 2009         159    159.1562
16  P 2009-06-21 2009         163    163.3973
17  Q 2009-10-28 2009         159    159.1562
18  R 2009-08-19 2009         161    161.4575
19  S 2009-09-10 2009         160    160.7342
20  T 2009-07-13 2009         162    162.6740
21  U 2009-09-22 2009         160    160.3397
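
The two columns can differ by up to a month because mondf compares only the year and month fields and ignores the day of the month, whereas the second column divides elapsed days by an average month length. A minimal base R sketch of that difference follows; the two dates are chosen purely for illustration (an assumption, so the result does not depend on Sys.Date()).

```r
# Same helpers as in the answer above
monnb <- function(d) {
  lt <- as.POSIXlt(as.Date(d, origin = "1900-01-01"))
  lt$year * 12 + lt$mon
}
mondf <- function(d1, d2) monnb(d2) - monnb(d1)

# Illustrative dates (assumed): one day apart, but in different months
d1 <- as.Date("2023-01-31")
d2 <- as.Date("2023-02-01")

by_calendar <- mondf(d1, d2)                     # month boundaries crossed
by_days     <- as.numeric(d2 - d1) / (365 / 12)  # elapsed days / average month

by_calendar  # 1
by_days      # about 0.033
```

Which of the two is appropriate depends on whether "age in months" means calendar months crossed or elapsed time expressed in months.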
Answer #3 (kninwzqo)

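As you noted, difftime() has no "months" unit, because no unambiguous answer is possible: you could divide by 30 or by 31, or use the number of days in a year with or without accounting for leap years, and so on.
Otherwise, your problem is a two-liner. Assume your data is in a data.frame D: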

D <- within(D, bd <- as.Date(BIRTH, "%d/%m/%Y")) 
D <- within(D, dm <- as.numeric(difftime(as.Date("2020-08-31"), bd))/30)

Here we first parse BIRTH into a Date, then use difftime and scale by an (arbitrarily chosen) month of 30 days.

> D
   ID      BIRTH YEAR         bd      dm
1   A 23/04/2009 2009 2009-04-23 138.267
2   B 24/03/2010 2010 2010-03-24 127.100
3   C 28/12/2009 2009 2009-12-28 129.967
4   D 19/08/2009 2009 2009-08-19 134.333
5   E 02/12/2009 2009 2009-12-02 130.833
6   F 12/05/2010 2010 2010-05-12 125.467
7   G 26/02/2010 2010 2010-02-26 127.967
8   H 07/10/2009 2009 2009-10-07 132.700
9   I 22/04/2010 2010 2010-04-22 126.133
10  J 01/04/2010 2010 2010-04-01 126.833
11  K 31/03/2010 2010 2010-03-31 126.867
12  L 27/01/2010 2010 2010-01-27 128.967
13  M 23/09/2009 2009 2009-09-23 133.167
14  N 28/09/2009 2009 2009-09-28 133.000
15  O 28/10/2009 2009 2009-10-28 132.000
16  P 21/06/2009 2009 2009-06-21 136.300
17  Q 28/10/2009 2009 2009-10-28 132.000
18  R 19/08/2009 2009 2009-08-19 134.333
19  S 10/09/2009 2009 2009-09-10 133.600
20  T 13/07/2009 2009 2009-07-13 135.567
21  U 22/09/2009 2009 2009-09-22 133.200
>
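
Since difftime() offers no month unit, some days-per-month divisor has to be chosen. The sketch below compares three common choices for the first birth date in the data; the variable names are illustrative and not part of the answer.

```r
bd  <- as.Date("23/04/2009", format = "%d/%m/%Y")
ref <- as.Date("2020-08-31")

# Elapsed time in days; units = "days" is set explicitly rather than
# relying on the default "auto"
days_elapsed <- as.numeric(difftime(ref, bd, units = "days"))

# Three possible conversions from days to months
months_est <- c(
  fixed_30    = days_elapsed / 30,            # 30-day months, as in the two-liner
  mean_365    = days_elapsed / (365 / 12),    # average month, ignoring leap years
  mean_365_25 = days_elapsed / (365.25 / 12)  # average month, allowing for leap years
)
round(months_est, 3)
```

The estimates differ by about two months over an eleven-year span, which is why the divisor should be stated alongside any reported age in months.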

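Of course, you could also overwrite BIRTH, or work with plain vectors. I like keeping the data together. If you prefer, you can make the reference date an argument of a helper function.
The two-liner needs no additional packages and, if you like, it also works as a one-liner, with the reference date still being a parameter supplied from outside the input data: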

D <- within(D, dm <- as.numeric(difftime(as.Date("2020-08-31"),
        as.Date(BIRTH, "%d/%m/%Y"))/30))

(I split it across two lines for display, but it is really one line.)
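
The suggestion of passing the reference date to a helper function could be sketched as follows. The name age_in_months and its days_per_month parameter are hypothetical, and the default of 30 simply mirrors the arbitrary choice made above.

```r
# Hypothetical helper: age in months at a given reference date,
# using a fixed number of days per month (an arbitrary convention)
age_in_months <- function(birth, ref, format = "%d/%m/%Y", days_per_month = 30) {
  bd <- as.Date(birth, format = format)
  as.numeric(difftime(as.Date(ref), bd, units = "days")) / days_per_month
}

# Small example with the first three rows of the data
D <- data.frame(ID = c("A", "B", "C"),
                BIRTH = c("23/04/2009", "24/03/2010", "28/12/2009"))
D$dm <- age_in_months(D$BIRTH, "2020-08-31")
D
```

Keeping the divisor as an argument makes it easy to switch to 365/12 or 365.25/12 without touching the rest of the code.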
