我需要在mongodb集合中的过去文档的帮助下插入日期之间的文档

yyhrrdl8  于 2023-08-04  发布在  Go
关注(0)|答案(1)|浏览(117)

我在mongodb集合中有一些从2013年7月13日到2013年7月18日的文档。但采集时遗漏了几天的数据。我需要在最后五个文档的帮助下插入该数据。
我的数据看起来像

{ 
  _id: 1,
  "timestamp": "2023-07-13T10:00:00Z"
  "code": "bc",
  "energy": 2333
},
{ 
  _id: 2,
  "timestamp": "2023-07-13T10:10:00Z"
  "code": "abc",
  "energy": 2340
},
{ 
  _id: 3,
  "timestamp": "2023-07-14T10:30:00Z"
  "code": "abc",
  "energy": 2349
},
{ 
  _id: 4,
  "timestamp": "2023-07-14T10:40:00Z"
  "code": "ah",
  "energy": 2355
},
{ 
  _id: 5,
  "timestamp": "2023-07-14T10:50:00Z"
  "code": "kk",
  "energy": 2360
},
{ 
  _id: 6,
  "timestamp": "2023-07-14T11:00:00Z"
  "code": "la",
  "energy": 2370
},
{ 
  _id: 7,
  "timestamp": "2023-07-14T10:00:00Z"
  "code": "as",
  "energy": 3455
},
{ 
  _id: 8,
  "timestamp": "2023-07-17T10:10:00Z"
  "code": "uj",
  "energy": 4567659
},
{ 
  _id: 9,
  "timestamp": "2023-07-17T10:30:00Z"
  "code": "la",
  "energy": 564546
},
{ 
  _id: 10,
  "timestamp": "2023-07-17T10:40:00Z"
  "code": "ws",
  "energy": 5654348
},
{ 
  _id: 11,
  "timestamp": "2023-07-18T10:50:00Z"
  "code": "lk",
  "energy": 6765436
},
{ 
  _id: 12,
  "timestamp": "2023-07-18T11:00:00Z"
  "code": "pl",
  "energy": 7654223
}

字符串
此处第15和第16个数据缺失,因此我需要插入这些文档并更新能量字段值,取最后5个文档能量字段值的平均值,然后在第15个日期文档中提及该值,与下一个文档相同,请对此提供帮助。
输出量:

{ 
      _id: 1,
      "timestamp": "2023-07-13T10:00:00Z"
      "code": "bc",
      "energy": 2333
    },
    { 
      _id: 2,
      "timestamp": "2023-07-13T10:10:00Z"
      "code": "abc",
      "energy": 2340
    },
    { 
      _id: 3,
      "timestamp": "2023-07-14T10:30:00Z"
      "code": "abc",
      "energy": 2349
    },
    { 
      _id: 4,
      "timestamp": "2023-07-14T10:40:00Z"
      "code": "ah",
      "energy": 2355
    },
    { 
      _id: 5,
      "timestamp": "2023-07-14T10:50:00Z"
      "code": "kk",
      "energy": 2360
    },
    { 
      _id: 6,
      "timestamp": "2023-07-14T11:00:00Z"
      "code": "la",
      "energy": 2370
    },
    { 
      _id: 7,
      "timestamp": "2023-07-14T10:00:00Z"
      "code": "as",
      "energy": 2380
    },
     { 
      _id: 8,
      "timestamp": "2023-07-15T10:00:00Z"
      "code": "as",
      "energy": 9910
    },
     { 
      _id: 9,
      "timestamp": "2023-07-15T10:15:00Z"
      "code": "as",
      "energy": 11447
    },
     { 
      _id: 10,
      "timestamp": "2023-07-15T10:30:00Z"
      "code": "as",
      "energy": 19309.4
    },
     { 
      _id: 11,
      "timestamp": "2023-07-15T10:45:00Z"
      "code": "as",
      "energy": 29968
    },
     { 
      _id: 12,
      "timestamp": "2023-07-15T11:00:00Z"
      "code": "as",
      "energy": 49040
    },
     { 
      _id: 13,
      "timestamp": "2023-07-16T10:00:00Z"
      "code": "as",
      "energy": 80442
    },
     { 
      _id: 14,
      "timestamp": "2023-07-16T10:15:00Z"
      "code": "as",
      "energy": 1025852
    },
    { 
      _id: 16,
      "timestamp": "2023-07-17T10:10:00Z"
      "code": "uj",
      "energy": 4567659
    },
    { 
      _id: 17,
      "timestamp": "2023-07-17T10:30:00Z"
      "code": "la",
      "energy": 564546
    },
    { 
      _id: 18,
      "timestamp": "2023-07-17T10:40:00Z"
      "code": "ws",
      "energy": 5654348
    },
    { 
      _id: 19,
      "timestamp": "2023-07-18T10:50:00Z"
      "code": "lk",
      "energy": 6765436
    },
    { 
      _id: 20,
      "timestamp": "2023-07-18T11:00:00Z"
      "code": "pl",
      "energy": 7654223
    }

6rqinv9w

6rqinv9w1#

你可以试试这个:

db.collection.aggregate([
   {
      $set: {
         day: {
            $dateTrunc: {
               date: "$timestamp",
               unit: "day",
               timezone: "Europe/Zurich"
            }
         }
      }
   },
   {
      $setWindowFields: {
         sortBy: { timestamp: 1 },
         output: {
            average: {
               $avg: "$energy",
               window: { documents: [-5, "current"] }
            }
         }
      }
   },
   {
      $densify: {
         field: "day",
         range: {
            step: 1,
            unit: "day",
            bounds: "full"
         }
      }
   },
   {
      $fill: {
         sortBy: { day: 1, timestamp: 1 },
         output: {
            average: { method: "locf" }
         }
      }
   },
   {
      $set: {
         timestamp: { $ifNull: ["$timestamp", "$day"] },
         average: "$$REMOVE",
         energy: { $ifNull: ["$energy", "$average"] },
         day: "$$REMOVE",
      }
   }
])

字符串
Mongo Playground
注意,$fill还提供了一个线性回归函数,它可能拟合得更好。

更新

如果你每15分钟需要一次数据,那么就更简单了:

db.collection.aggregate([
   {
      $setWindowFields: {
         sortBy: { timestamp: 1 },
         output: {
            average: {
               $avg: "$energy",
               window: { documents: [-5, "current"] }
            }
         }
      }
   },
   {
      $densify: {
         field: "timestamp",
         range: {
            step: 15,
            unit: "minute",
            bounds: "full"
         }
      }
   },
   {
      $fill: {
         sortBy: { timestamp: 1 },
         output: {
            average: { method: "locf" }
         }
      }
   },
   {
      $set: {
         average: "$$REMOVE",
         energy: { $ifNull: ["$energy", "$average"] }
      }
   }
])

相关问题