MongoDB - For each group, select the records with the maximum value

Posted by 8fsztsew on 2022-11-22 in Go

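In MongoDB, I am trying to filter a collection within each group, keeping only the documents that have the latest date.
In traditional SQL, I would do something like this: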

Select *
From table a
Join (Select my_group, max(date) as max_date
      From table group by my_group) b 
      ON a.my_group = b.my_group AND
         a.date = b.max_date

With the following sample collection:

[
  {
    "_id": "123",
    "item1": "group 1",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-01"
  },
  {
    "_id": "234",
    "item1": "group 1",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-02"
  },
  {
    "_id": "345",
    "item1": "group 1",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-02"
  },
  {
    "_id": "789",
    "item1": "group 2",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-01"
  },
  {
    "_id": "678",
    "item1": "group 2",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-02"
  },
  {
    "_id": "456",
    "item1": "group 2",
    "item2": "abc",
    "item3": "abc",
    "date": "2022-01-02"
  }
]

The expected output is:

[
    {
        "_id": "234",
        "date": "2022-01-02",
        "item1": "group 1",
        "item2": "abc",
        "item3": "abc"
    },
    {
        "_id": "345",
        "date": "2022-01-02",
        "item1": "group 1",
        "item2": "abc",
        "item3": "abc"
    },
    {
        "_id": "678",
        "date": "2022-01-02",
        "item1": "group 2",
        "item2": "abc",
        "item3": "abc"
    },
    {
        "_id": "456",
        "date": "2022-01-02",
        "item1": "group 2",
        "item2": "abc",
        "item3": "abc"
    }
]

My best attempt so far is:

db.collection.aggregate([
  {
    $group: {
      "_id": "$item1",
      "max_date": {
        $max: "$date"
      },
      "records": {
        $push: "$$ROOT"
      }
    }
  },
  {
    "$project": {
      items: {
        "$filter": {
          "input": "$records",
          "as": "records",
          "cond": {
            $eq: [
              "$$records.date",
              "$max_date"
            ]
          }
        }
      }
    }
  },
  {
    $replaceRoot: {
      newRoot: {
        results: "$items"
      }
    }
  }
])

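Unfortunately, this returns the results split up by group, with one document per group instead of one document per record. I have tried several alternatives suggested in other posts and ran into a similar problem.

Here is a playground example containing the query and sample data.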

igetnqfo #1

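You are almost there.
The last 2 stages should be:

  1. $unwind - Deconstruct the items array field into multiple documents.
  2. $replaceWith - Replace each output document with its items document.
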
db.collection.aggregate([
  {
    $group: {
      "_id": "$item1",
      "max_date": {
        $max: "$date"
      },
      "records": {
        $push: "$$ROOT"
      }
    }
  },
  {
    "$project": {
      items: {
        "$filter": {
          "input": "$records",
          "as": "records",
          "cond": {
            $eq: [
              "$$records.date",
              "$max_date"
            ]
          }
        }
      }
    }
  },
  {
    $unwind: "$items"
  },
  {
    $replaceWith: "$items"
  }
])

Sample Mongo Playground
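
To make it easier to see what each stage contributes, here is a minimal plain-JavaScript sketch that mimics the four stages on a trimmed copy of the sample data. It does not talk to MongoDB at all. The helper name latestPerGroup and its parameters are made up for illustration, and the sample documents omit item2 and item3 for brevity.

```javascript
// Plain-JavaScript emulation of the pipeline stages (no MongoDB required).
// `latestPerGroup` is a hypothetical helper name used only for illustration.
function latestPerGroup(docs, groupKey, dateKey) {
  // $group: collect the records of each group and track the maximum date.
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[groupKey];
    if (!groups.has(key)) {
      groups.set(key, { maxDate: doc[dateKey], records: [] });
    }
    const group = groups.get(key);
    if (doc[dateKey] > group.maxDate) group.maxDate = doc[dateKey];
    group.records.push(doc);
  }

  // $project + $filter: keep only the records whose date equals the group maximum.
  // $unwind + $replaceWith: flatMap lifts every kept record to the top level.
  return [...groups.values()].flatMap((group) =>
    group.records.filter((doc) => doc[dateKey] === group.maxDate)
  );
}

// Trimmed copy of the sample collection from the question.
const sample = [
  { _id: "123", item1: "group 1", date: "2022-01-01" },
  { _id: "234", item1: "group 1", date: "2022-01-02" },
  { _id: "345", item1: "group 1", date: "2022-01-02" },
  { _id: "789", item1: "group 2", date: "2022-01-01" },
  { _id: "678", item1: "group 2", date: "2022-01-02" },
  { _id: "456", item1: "group 2", date: "2022-01-02" },
];

const result = latestPerGroup(sample, "item1", "date");
console.log(result.map((doc) => doc._id)); // [ '234', '345', '678', '456' ]
```

Without the flattening step, each group would still be wrapped in its own array, which is the shape the original attempt returned. Note that the sketch, like the sample data, compares dates as strings; that works for zero-padded YYYY-MM-DD values because their string order matches chronological order. Also, the sketch keeps groups in insertion order, whereas $group does not guarantee any output order, so add a $sort stage if order matters.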

Bonus

Although the query above is the better one, I would also like to share a MongoDB query that mirrors the SQL implementation.

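  1. $group - Group by item1 and get the maximum date.
  2. $lookup - Self-join the collection on item1 and date, and return the items array field.
  3. $match - Keep only the documents whose items is not an empty array.
  4. $unwind - Deconstruct the items array into multiple documents.
  5. $replaceWith - Replace each output document with its items document.
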
db.collection.aggregate([
  {
    $group: {
      "_id": "$item1",
      "max_date": {
        $max: "$date"
      }
    }
  },
  {
    $lookup: {
      from: "collection",
      let: {
        item1: "$_id",
        max_date: "$max_date"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                {
                  $eq: [
                    "$item1",
                    "$$item1"
                  ]
                },
                {
                  $eq: [
                    "$date",
                    "$$max_date"
                  ]
                }
              ]
            }
          }
        }
      ],
      as: "items"
    }
  },
  {
    $match: {
      items: {
        $ne: []
      }
    }
  },
  {
    $unwind: "$items"
  },
  {
    $replaceWith: "$items"
  }
])

Sample Mongo Playground (Bonus)
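
For comparison, the join-style pipeline can be sketched in plain JavaScript the same way. Again this is only an illustration that runs without MongoDB; latestPerGroupByJoin is a made-up name, and the sample documents are trimmed to the fields the join uses.

```javascript
// Plain-JavaScript emulation of the join-style pipeline (no MongoDB required).
// `latestPerGroupByJoin` is a hypothetical helper name used only for illustration.
function latestPerGroupByJoin(docs) {
  // $group: one entry per item1 value, holding only the maximum date.
  const maxDates = new Map();
  for (const doc of docs) {
    const current = maxDates.get(doc.item1);
    if (current === undefined || doc.date > current) {
      maxDates.set(doc.item1, doc.date);
    }
  }

  const output = [];
  for (const [item1, maxDate] of maxDates) {
    // $lookup: scan the collection again for documents matching both join keys.
    const items = docs.filter(
      (doc) => doc.item1 === item1 && doc.date === maxDate
    );
    // $match: skip groups whose items array is empty.
    if (items.length === 0) continue;
    // $unwind + $replaceWith: emit each matched document at the top level.
    output.push(...items);
  }
  return output;
}

// Trimmed copy of the sample collection from the question.
const joinSample = [
  { _id: "123", item1: "group 1", date: "2022-01-01" },
  { _id: "234", item1: "group 1", date: "2022-01-02" },
  { _id: "345", item1: "group 1", date: "2022-01-02" },
  { _id: "789", item1: "group 2", date: "2022-01-01" },
  { _id: "678", item1: "group 2", date: "2022-01-02" },
  { _id: "456", item1: "group 2", date: "2022-01-02" },
];

const joined = latestPerGroupByJoin(joinSample);
console.log(joined.map((doc) => doc._id)); // [ '234', '345', '678', '456' ]
```

The main difference from the first approach is that the group step carries only the maximum date rather than every record, at the cost of reading the collection a second time for the join.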
