LINQ聚合并按时间段分组

ljsrvy3e  于 2022-12-15  发布在  其他
关注(0)|答案(9)|浏览(117)

我试图理解如何使用LINQ按时间间隔对数据进行分组;然后理想地聚集每组。
我发现了大量具有明确日期范围的示例,我尝试按时间段进行分组,如5分钟、1小时、1天。
例如,我有一个类,它用一个值 Package DateTime:

public class Sample
{
     public DateTime timestamp;
     public double value;
}

这些观测作为一个系列包含在List集合中:

List<Sample> series;

因此,要按小时时间段分组并按平均值合计值,我将尝试执行以下操作:

var grouped = from s in series
              group s by new TimeSpan(1, 0, 0) into g
              select new { timestamp = g.Key, value = g.Average(s => s.value };

这是一个根本性的缺陷,因为它将TimeSpan本身分组。我不明白如何在查询中使用TimeSpan(或任何表示间隔的数据类型)。

mepcadol

mepcadol1#

您可以将时间戳舍入到下一个边界(即,向下舍入到过去最接近的5分钟边界),并将其用作您的分组:

var groups = series.GroupBy(x =>
{
    var stamp = x.timestamp;
    stamp = stamp.AddMinutes(-(stamp.Minute % 5));
    stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
    return stamp;
})
.Select(g => new { TimeStamp = g.Key, Value = g.Average(s => s.value) })
.ToList();

以上通过在分组中使用修改的时间戳来实现这一点,该修改的时间戳将分钟设置为前5分钟边界并移除秒和毫秒。当然,相同的方法可用于其他时间段,即小时和天。

编辑:

基于该补充样本输入:

var series = new List<Sample>();
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(3) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(4) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(5) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(6) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(7) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(15) });

为我生成了3个组,一个组的时间戳为3:05,一个组的时间戳为3:10,一个组的时间戳为3:20 pm(您的结果可能会因当前时间而异)。

5vf7fwbs

5vf7fwbs2#

我很晚才开始这个游戏,但我在寻找其他东西的时候遇到了这个,我想我有一个更好的方法。

series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();

下面是一个linqpad示例程序,您可以验证和测试它

void Main()
{
    List<Sample> series = new List<Sample>();

    Random random = new Random(DateTime.Now.Millisecond);
    for (DateTime i = DateTime.Now.AddDays(-5); i < DateTime.Now; i += TimeSpan.FromMinutes(1))
    {
        series.Add(new UserQuery.Sample(){ timestamp = i, value = random.NextDouble() * 100 });
    }
    //series.Dump();
    series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();
}

// Define other methods and classes here
public class Sample
{
     public DateTime timestamp;
     public double value;
}
g2ieeal7

g2ieeal73#

要按小时分组,您需要按时间戳的小时部分分组,具体操作如下:

var groups = from s in series
  let groupKey = new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0)
  group s by groupKey into g select new
                                      {
                                        TimeStamp = g.Key,
                                        Value = g.Average(a=>a.value)
                                      };
0ejtzxu1

0ejtzxu14#

我建议使用new DateTime()避免任何亚毫秒级差异问题

var versionsGroupedByRoundedTimeAndAuthor = db.Versions.GroupBy(g => 
new
{
                UserID = g.Author.ID,
                Time = RoundUp(g.Timestamp, TimeSpan.FromMinutes(2))
});

private DateTime RoundUp(DateTime dt, TimeSpan d)
        {
            return new DateTime(((dt.Ticks + d.Ticks - 1) / d.Ticks) * d.Ticks);
        }

注意:我在这里按作者ID和四舍五入的时间戳分组。
取整函数取自@dtb此处应答https://stackoverflow.com/a/7029464/661584
了解精确到毫秒的相等在这里并不总是相等Why does this unit test fail when testing DateTime equality?

xxls0lw8

xxls0lw85#

我改进了BrokenGlass的答案,使其更加通用,并增加了安全措施。对于他目前的答案,如果你选择间隔9,它将不会像你所期望的那样。任何数字60都不能被整除也是如此。在这个例子中,我使用9,从午夜(0:00)开始。

  • 从0:00到0:08.999的所有内容都将按照您的预期放入0:00组中。它将一直这样做,直到您到达从0:54开始的分组。
  • 在0:54时,它只会将0:54到0:59.999之间的内容分组,而不会向上分组到01:03.999。

对我来说,这是一个巨大的问题。
我不知道该怎么解决,但你可以增加安全措施。
变更:

  1. 60% [interval]等于0的任何分钟都是可接受的间隔。下面的if语句保护了这一点。
    1.小时间隔也可以。
double minIntervalAsDouble = Convert.ToDouble(minInterval);
        if (minIntervalAsDouble <= 0)
        {
            string message = "minInterval must be a positive number, exiting";
            Log.getInstance().Info(message);
            throw new Exception(message);
        }
        else if (minIntervalAsDouble < 60.0 && 60.0 % minIntervalAsDouble != 0)
        {
            string message = "60 must be divisible by minInterval...exiting";
            Log.getInstance().Info(message);
            throw new Exception(message);
        }
        else if (minIntervalAsDouble >= 60.0 && (24.0 % (minIntervalAsDouble / 60.0)) != 0 && (24.0 % (minIntervalAsDouble / 60.0) != 24.0))
        {
            //hour part must be divisible...
            string message = "If minInterval is greater than 60, 24 must be divisible by minInterval/60 (hour value)...exiting";
            Log.getInstance().Info(message);
            throw new Exception(message);
        }
        var groups = datas.GroupBy(x =>
        {
            if (minInterval < 60)
            {
                var stamp = x.Created;
                stamp = stamp.AddMinutes(-(stamp.Minute % minInterval));
                stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                stamp = stamp.AddSeconds(-stamp.Second);
                return stamp;
            }
            else
            {
                var stamp = x.Created;
                int hourValue = minInterval / 60;
                stamp = stamp.AddHours(-(stamp.Hour % hourValue));
                stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                stamp = stamp.AddSeconds(-stamp.Second);
                stamp = stamp.AddMinutes(-stamp.Minute);
                return stamp;
            }
        }).Select(o => new
        {
            o.Key,
            min = o.Min(f=>f.Created),
            max = o.Max(f=>f.Created),
            o
        }).ToList();

在select语句中输入你想输入的内容!我输入min/max是因为这样更容易测试。

wtzytmuj

wtzytmuj6#

尽管我真的迟到了,这里是我的两分钱:
我想以5分钟为间隔将时间值上下舍入():

10:31 --> 10:30
10:33 --> 10:35
10:36 --> 10:35

这可以通过转换为TimeSpan.Tick并转换回DateTime,然后使用Math.Round()来实现:

public DateTime GetShiftedTimeStamp(DateTime timeStamp, int minutes)
{
    return
        new DateTime(
            Convert.ToInt64(
                Math.Round(timeStamp.Ticks / (decimal)TimeSpan.FromMinutes(minutes).Ticks, 0, MidpointRounding.AwayFromZero)
                    * TimeSpan.FromMinutes(minutes).Ticks));
}

shiftedTimeStamp可用于如上所示的链接分组。

kdfy810k

kdfy810k7#

我知道这并不能直接回答这个问题,但我在Google上四处搜索,寻找一个非常类似的解决方案,以聚合股票/加密货币的蜡烛数据,从一个较小的分钟周期到一个较高的分钟周期(5,10,15,30)。你不能简单地从当前的一分钟回到X,因为聚合周期的时间戳不一致。您还必须注意列表的开始和结束处是否有足够的数据来填充较大周期的完整烛台。鉴于此,我提出的解决方案如下所示。2(它假设较小周期的蜡烛,如rawPeriod所示,按时间戳升序排序。3)

public class Candle
{
    public long Id { get; set; }
    public Period Period { get; set; }
    public DateTime Timestamp { get; set; }
    public double High { get; set; }
    public double Low { get; set; }
    public double Open { get; set; }
    public double Close { get; set; }
    public double BuyVolume { get; set; }
    public double SellVolume { get; set; }
}

public enum Period
{
    Minute = 1,
    FiveMinutes = 5,
    QuarterOfAnHour = 15,
    HalfAnHour = 30
}

    private List<Candle> AggregateCandlesIntoRequestedTimePeriod(Period rawPeriod, Period requestedPeriod, List<Candle> candles)
    {
        if (rawPeriod != requestedPeriod)
        {
            int rawPeriodDivisor = (int) requestedPeriod;
            candles = candles
                        .GroupBy(g => new { TimeBoundary = new DateTime(g.Timestamp.Year, g.Timestamp.Month, g.Timestamp.Day, g.Timestamp.Hour, (g.Timestamp.Minute / rawPeriodDivisor) * rawPeriodDivisor , 0) })
                        .Where(g => g.Count() == rawPeriodDivisor )
                        .Select(s => new Candle
                        {
                            Period = requestedPeriod,
                            Timestamp = s.Key.TimeBoundary,
                            High = s.Max(z => z.High),
                            Low = s.Min(z => z.Low),
                            Open = s.First().Open,
                            Close = s.Last().Close,
                            BuyVolume = s.Sum(z => z.BuyVolume),
                            SellVolume = s.Sum(z => z.SellVolume),
                        })
                        .OrderBy(o => o.Timestamp)
                        .ToList();
        }

        return candles;
    }
46qrfjad

46qrfjad8#

通用解决方案:

static IEnumerable<IGrouping<DateRange, T>> GroupBy<T>(this IOrderedEnumerable<T> enumerable, TimeSpan timeSpan, Func<T, DateTime> predicate)
    {
        Grouping<T> grouping = null;
        foreach (var (a, dt) in from b in enumerable select (b, predicate.Invoke(b)))
        {
            if (grouping == null || dt > grouping.Key.End)
                yield return grouping = new Grouping<T>(new DateRange(dt, dt + timeSpan), a);
            else
                grouping.Add(a);
        }
    }

    class Grouping<T> : IGrouping<DateRange, T>
    {

        readonly List<T> elements = new List<T>();

        public DateRange Key { get; }

        public Grouping(DateRange key) => Key = key;

        public Grouping(DateRange key, T element) : this(key) => Add(element);

        public void Add(T element) => elements.Add(element);

        public IEnumerator<T> GetEnumerator()=> this.elements.GetEnumerator();

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    class DateRange
    {
    
        public DateRange(DateTime start, DateTime end)
        {
            this.Start = start;
            this.End = end;
        }

        public DateTime Start { get; set; }
        public DateTime End { get; set; }
    }

基于问题的测试(使用AutoFixture库)

void Test()
    {
        var many = new Fixture().CreateMany<Sample>(100);

        var groups = many.OrderBy(a => a.timestamp).GroupBy(TimeSpan.FromDays(365), a => a.timestamp).Select(a => a.Average(b => b.value)).ToArray();

    }

    public class Sample
    {
        public DateTime timestamp;
        public double value;
    }
kr98yfug

kr98yfug9#

如果BrokenGlass版本不起作用,那么您的值中也可能有勾号。以下是一个工作版本:

var groups = series.GroupBy(x =>
        {
            var stamp = x.timestamp;
            stamp = stamp.AddMinutes(-(stamp.Minute % 5));
            stamp = stamp.AddTicks(-(stamp.Ticks % TimeSpan.TicksPerMinute));
            return stamp;
        })
        .Select(g => new { TimeStamp = g.Key, Value = g.Average(s => s.value) })
        .ToList();

相关问题