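First of all, this may be a misguided question; if so, I would appreciate some guidance on how I should proceed.
From what I have found online, MongoDB/Mongoose mapReduce seems to be the best way to do this, but I have been trying to wrap my head around it and I struggle to understand any of it, so I wonder whether someone could help explain it for my problem. I am not necessarily looking for a complete solution; well-explained pseudocode would be much appreciated. What confuses me in particular is how to handle aggregating and combining two or more sets of subdocuments.
I also know this may be due to a bad model/collection design, but unfortunately that is completely out of my hands, so please do not suggest remodelling.
My particular problem is that we have an existing model that looks like the following: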
survey: {
_id: 1111,
name: "name",
questions: [
{_id: 1, text: "a,b, or c?", type: "multipleChoice", options: ["a", "b", "c"]},
{_id: 2, text: "what do you think", type: "freeform"}
],
participants: [{_id: 1, name: "user 1"}, {_id: 2, name: "user 2"}],
results: [{_id: 123, userId: 1, questionId: 1, answer: "a"},
{_id: 124, userId: 2, questionId: 1, answer: "b"},
{_id: 125, userId: 1, questionId: 2, answer: "this is some answer"},
{_id: 126, userId: 2, questionId: 2, answer: "this is another answer"}]
}
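Then we have another, separately developed model that tracks a user's progress through the survey (this is only a basic subset; we also track different events):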
trackings:{
_id:123,
surveyId: 1,
userId: 123,
starttime: "2015-05-13 10:46:20.347Z",
endtime: "2015-05-13 10:59:20.347Z"
}
What I would like to do is get a result like the following:
{
survey: "survey name",
_id : 1,
totalAverageTime: "00:23:00",
fastestTime : "00:23:00",
slowestTime: "00:25:00",
questions: [
{
_id: 1, text: "a,b, or c?",
type: "multipleChoice",
mostPopularAnswer: "a",
averageTime: "00:13:00",
answers : [{ userId: 1, answer: "a", time:"00:14:00"},
{ userId: 2, answer: "a", time:"00:12:00"}]
},{
_id: 2, text:"what do you think",
type:"freeform",
averageTime : "00:10:00",
answers : [{ userId: 1, answer: "this is some answer", time:"00:11:00"},
{ userId: 2, answer: "this is another answer", time:"00:09:00"}]
}
]
}
1 Answer
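The following approach uses the aggregation framework to come up with a solution that is closer to the desired output. It depends on a third collection, which can be seen as a merge of the two collections survey and trackings. First and foremost, suppose both collections contain test documents based on the example in the question.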
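To create the third collection (let's call it output_collection), you need to iterate over the trackings collection using the find() cursor's forEach() method, convert the fields holding date strings into actual ISODate objects, create a field that stores the survey result, and then save the merged object into the third collection. After merging the two collections into output_collection, querying it with db.output_collection.findOne() returns one of the merged documents.
You can then apply the aggregation on this collection. The aggregation pipeline should consist of four $unwind operator stages, which deconstruct the arrays from the input documents to output one document per element; each output document replaces the array with a single element value. The next $project operator stage reshapes each document in the stream, for example by adding a new field duration that holds the time difference in minutes between the starttime and endtime date fields, computed with the arithmetic operators. After this comes the $group operator pipeline stage, which groups the input documents by the "survey" key and applies accumulator expressions to each group. It consumes all input documents and outputs one document for each distinct group.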
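The merge step and the pipeline described above can be sketched as follows. This is only an illustration under stated assumptions, not the exact code behind the result: mergeTracking is a hypothetical helper written as a pure function so it runs outside the shell, the choice of which four arrays to unwind is a guess, and grouping on "$survey._id" is assumed because the reported _id is the survey's.

```javascript
// Hypothetical helper: merges one tracking document with its survey.
// In the mongo shell this logic would sit inside
// db.trackings.find().forEach(...), with the merged object saved to
// output_collection; here it is a pure function so it can run anywhere.
function mergeTracking(tracking, surveys) {
  return Object.assign({}, tracking, {
    // Turn "2015-05-13 10:46:20.347Z" style strings into Date objects.
    starttime: new Date(tracking.starttime.replace(" ", "T")),
    endtime: new Date(tracking.endtime.replace(" ", "T")),
    // Field holding the survey document(s) matching this tracking
    // (assumes surveyId equals the survey's _id).
    survey: surveys.filter(function (s) { return s._id === tracking.surveyId; })
  });
}

// Pipeline sketch: four $unwind stages, one $project, one $group.
const pipeline = [
  // Which arrays are unwound is an assumption.
  { $unwind: "$survey" },
  { $unwind: "$survey.questions" },
  { $unwind: "$survey.participants" },
  { $unwind: "$survey.results" },
  // $subtract on two dates yields milliseconds; divide to get minutes.
  { $project: {
      survey: 1,
      duration: { $divide: [{ $subtract: ["$endtime", "$starttime"] }, 1000 * 60] }
  } },
  // One output document per survey, built with accumulator expressions.
  { $group: {
      _id: "$survey._id",
      survey: { $first: "$survey.name" },
      totalAverageTime: { $avg: "$duration" },
      fastestTime: { $min: "$duration" },
      slowestTime: { $max: "$duration" },
      questions: { $addToSet: "$survey.questions" },
      answers: { $addToSet: "$survey.results" }
  } }
];
```

In the shell, a pipeline like this would be passed to db.output_collection.aggregate(pipeline). Using $addToSet for questions and answers is a design guess that would explain why their order in the result differs from the order in the survey document.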
With the test data, the aggregation returns the following result:
/* 0 */
{
"result" : [
{
"_id" : 1111,
"survey" : "name",
"totalAverageTime" : 23.18333333333334,
"fastestTime" : 13,
"slowestTime" : 33.36666666666667,
"questions" : [
{
"_id" : 2,
"text" : "what do you think",
"type" : "freeform"
},
{
"_id" : 1,
"text" : "a,b, or c?",
"type" : "multipleChoice",
"options" : [
"a",
"b",
"c"
]
}
],
"answers" : [
{
"_id" : 126,
"userId" : 2,
"questionId" : 2,
"answer" : "this is another answer"
},
{
"_id" : 124,
"userId" : 2,
"questionId" : 1,
"answer" : "b"
},
{
"_id" : 125,
"userId" : 1,
"questionId" : 2,
"answer" : "this is some answer"
},
{
"_id" : 123,
"userId" : 1,
"questionId" : 1,
"answer" : "a"
}
]
}
],
"ok" : 1
}
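To see how the three timing figures in this result relate to each other, the arithmetic can be replayed in plain JavaScript. This is only a sketch: the durations array is an assumption, chosen because the reported fastestTime and slowestTime average to the reported totalAverageTime, which would be consistent with two tracking records of 13 and roughly 33.37 minutes. The function name summarize is hypothetical.

```javascript
// Assumed per-tracking durations in minutes, taken from fastestTime and
// slowestTime in the result above; the actual test data is not shown.
const durations = [13, 33.36666666666667];

// Plain-JavaScript equivalent of the $avg, $min and $max accumulators
// applied to a single group.
function summarize(values) {
  const total = values.reduce(function (sum, v) { return sum + v; }, 0);
  return {
    totalAverageTime: total / values.length,
    fastestTime: Math.min.apply(null, values),
    slowestTime: Math.max.apply(null, values)
  };
}

const stats = summarize(durations);
console.log(stats);
```

Because every unwound document of one tracking carries the same duration, repeating each value the same number of times would leave the average, minimum and maximum unchanged.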
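Assuming the documents in the result array above are stored in a collection named survey_results, native JavaScript with the find() cursor's forEach() method can then nest each answer under its question:
db.survey_results.find().forEach(function(doc){
    var questions = [];
    doc.questions.forEach(function(q){
        var answers = [];
        doc.answers.forEach(function(a){
            // collect the answers that belong to this question
            if(a.questionId === q._id){
                delete a.questionId;
                answers.push(a);
            }
        });
        q.answers = answers;
        questions.push(q);
    });
    // replace the flat arrays with the nested questions and write the
    // document back (save() is the legacy mongo shell method)
    doc.questions = questions;
    delete doc.answers;
    db.survey_results.save(doc);
});
The document in survey_results then looks like this: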
/* 0 */
{
"_id" : 1111,
"survey" : "name",
"totalAverageTime" : 23.18333333333334,
"fastestTime" : 13,
"slowestTime" : 33.36666666666667,
"questions" : [
{
"_id" : 2,
"text" : "what do you think",
"type" : "freeform",
"answers" : [
{
"_id" : 126,
"userId" : 2,
"answer" : "this is another answer"
},
{
"_id" : 125,
"userId" : 1,
"answer" : "this is some answer"
}
]
},
{
"_id" : 1,
"text" : "a,b, or c?",
"type" : "multipleChoice",
"options" : [
"a",
"b",
"c"
],
"answers" : [
{
"_id" : 124,
"userId" : 2,
"answer" : "b"
},
{
"_id" : 123,
"userId" : 1,
"answer" : "a"
}
]
}
]
}
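The question asks for times as HH:MM:SS strings and for a mostPopularAnswer per question, neither of which appears in the document above. As a hedged sketch, two hypothetical helpers (formatMinutes and mostPopularAnswer are illustrative names, not part of MongoDB) could post-process the final document on the client side. The sketch assumes durations are held as minutes, as in the result above.

```javascript
// Hypothetical helper: converts a duration in minutes to "HH:MM:SS",
// rounding to the nearest second.
function formatMinutes(minutes) {
  const totalSeconds = Math.round(minutes * 60);
  const pad = function (n) { return (n < 10 ? "0" : "") + n; };
  const h = Math.floor(totalSeconds / 3600);
  const m = Math.floor((totalSeconds % 3600) / 60);
  const s = totalSeconds % 60;
  return pad(h) + ":" + pad(m) + ":" + pad(s);
}

// Hypothetical helper: returns the most frequent answer value in an
// answers array, or null for an empty array. On a tie, the answer that
// reached the top count first wins.
function mostPopularAnswer(answers) {
  const counts = {};
  let best = null;
  answers.forEach(function (a) {
    counts[a.answer] = (counts[a.answer] || 0) + 1;
    if (best === null || counts[a.answer] > counts[best]) {
      best = a.answer;
    }
  });
  return best;
}
```

For example, formatMinutes(doc.fastestTime) could replace the numeric value before the document is returned to the caller, and mostPopularAnswer(q.answers) could be attached to each multipleChoice question. Per-question times are not covered here, since the tracking model shown in the question records only one start and end time per survey.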