How can I use LINQ in C# to find the average difference between timestamps in a sequence?

Posted by weylhg0b on 2022-12-06 in C#

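I have an *unordered* sequence of timestamps. I need to compute the *minimum*, *maximum*, and *average* difference between each pair of consecutive timestamps. For example: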

DateTimeOffset now = new DateTimeOffset(new DateTime(2022, 1, 1, 0, 0, 0, 0));
DateTimeOffset[] timestamps = new[] {
    now,
    now.AddSeconds(5),
    now.AddSeconds(10),
    now.AddSeconds(15),
    now.AddSeconds(30),
    now.AddSeconds(31)
};
    
IEnumerable<DateTimeOffset> timestampsSorted = timestamps.OrderByDescending(x => x);

should produce:

2022-01-01 00:00:31->2022-01-01 00:00:30 | 00:00:01
2022-01-01 00:00:30->2022-01-01 00:00:15 | 00:00:15
2022-01-01 00:00:15->2022-01-01 00:00:10 | 00:00:05
2022-01-01 00:00:10->2022-01-01 00:00:05 | 00:00:05
2022-01-01 00:00:05->2022-01-01 00:00:00 | 00:00:05

Min 00:00:01
Max 00:00:15
Avg 00:00:06.2000000

Below is the procedural solution I came up with. It would be great if I could simplify it using LINQ.

TimeSpan min = TimeSpan.MaxValue;
TimeSpan max = TimeSpan.MinValue;
List<TimeSpan> deltas = new();

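// Sort ascending into an array so adjacent elements can be indexed
DateTimeOffset[] ordered = timestamps.OrderBy(x => x).ToArray();

for (int i = ordered.Length - 1; i > 0; i--)
{
    DateTimeOffset later = ordered[i];
    DateTimeOffset prev = ordered[i - 1];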

    TimeSpan delta = later - prev;
    
    if (delta > max) { max = delta; }
    if (delta < min) { min = delta; }

    deltas.Add(delta);
    Console.WriteLine($"{later:yyyy-MM-dd HH:mm:ss}->{prev:yyyy-MM-dd HH:mm:ss} | {delta}");
}

var result = new { 
    Min = min,
    Max = max,
    Avg = TimeSpan.FromMilliseconds(deltas.Average(d => d.TotalMilliseconds))
};

xkftehaa1#

Use LINQ's built-in Min, Max, and Average functions.

var timestampsSorted = timestamps.OrderByDescending(o => o).ToArray();
var data = timestampsSorted
    .Skip(1)
    .Select((o, i) => timestampsSorted[i] - o)
    .ToArray();
var min = data.Min();
var max = data.Max();
var avg = TimeSpan.FromSeconds(data.Average(o => o.TotalSeconds));

Note that calling Min, Max, and Average separately iterates over the items in the data array three times.

vybvopom2#

This is far from optimal, because the collection is enumerated several times, but since you asked...

// Expensive, multiple enumeration
var diffs = timestampsSorted.Skip(1).Zip(timestampsSorted, (first, second) => second.Subtract(first));
Console.WriteLine($"Min = {diffs.Min().TotalSeconds:0.000}, Max = {diffs.Max().TotalSeconds:0.000}, Ave = {diffs.Select(t => t.TotalMilliseconds).Average() / 1000:0.000}");

References: Zip(), Skip()
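
To make the pairing concrete, here is a minimal sketch of how Skip(1) combined with Zip() lines each element up with its predecessor. It assumes ascending order and C# 9 top-level statements; the names start, sorted, and deltas are illustrative and do not come from the answer above. Unlike the snippet above, it materializes the deltas once with ToArray() so the aggregates do not re-run the query.

```csharp
using System;
using System.Linq;

// Illustrative data: offsets of 0, 5, 10, 15, 30 and 31 seconds, deliberately unordered
DateTimeOffset start = new DateTimeOffset(2022, 1, 1, 0, 0, 0, TimeSpan.Zero);
DateTimeOffset[] sorted = new[] { 31, 0, 15, 5, 30, 10 }
    .Select(seconds => start.AddSeconds(seconds))
    .OrderBy(t => t)
    .ToArray();

// Skip(1) yields t1, t2, ...; zipping it with the full array (t0, t1, ...)
// pairs every element with the one before it. Zip() stops at the shorter sequence.
TimeSpan[] deltas = sorted
    .Skip(1)
    .Zip(sorted, (later, earlier) => later - earlier)
    .ToArray();

TimeSpan min = deltas.Min();
TimeSpan max = deltas.Max();
TimeSpan avg = TimeSpan.FromTicks((long)deltas.Average(d => d.Ticks));

Console.WriteLine($"Min {min}, Max {max}, Avg {avg}");
```

Note that Min(), Max(), and Average() throw InvalidOperationException on an empty sequence, so this sketch needs at least two timestamps.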

yeotifhr3#

First, the date difference with the next item is calculated and stored in the diff list:

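// Assumes timestampsSorted is a List<DateTimeOffset> sorted in descending order.
// Using the element index avoids IndexOf(), which is O(n) per call and
// returns the wrong position when the list contains duplicate timestamps.
var diff = timestampsSorted
    .Take(timestampsSorted.Count - 1)
    .Select((x, i) => x - timestampsSorted[i + 1])
    .ToList();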

00:00:01
00:00:15
00:00:05
00:00:05
00:00:05

The minimum, maximum, and average are then easily calculated:

Console.WriteLine("Min {0}",diff.Min());

Console.WriteLine("Max {0}", diff.Max());

var averageTicks = (long)diff.Select(d => d.Ticks).Average();
// The average is a duration, so format it as a TimeSpan rather than a DateTime
Console.WriteLine("Avg {0}", new TimeSpan(averageTicks).ToString(@"hh\:mm\:ss\.fff"));

Min 00:00:01
Max 00:00:15
Avg 00:00:06.200

c8ib6hqw4#

You don't need to store all of the delta values in a List<TimeSpan> just to call Average() on it; it's more efficient to keep a running sum and then divide it by the number of pairs compared (timestamps.Length - 1). So this...

// ...
List<TimeSpan> deltas = new();

for (int i = timestamps.Length - 1; i > 0; i--)
{
    // ...
    deltas.Add(delta);
    // ...
}

var result = new {
    // ...
    Avg = TimeSpan.FromMilliseconds(deltas.Average(d => d.TotalMilliseconds))
};

...would be changed to...

// ...
TimeSpan sum = TimeSpan.Zero;

for (int i = timestamps.Length - 1; i > 0; i--)
{
    // ...
    sum += delta;
    // ...
}

var result = new { 
    // ...
    //TODO: Avoid division for sequences with less than 2 elements, if expected
    Avg = TimeSpan.FromMilliseconds(sum.TotalMilliseconds / (timestamps.Length - 1))
};

Aggregate() is what you'd use to accumulate one or more values over the course of a sequence. Here's a method that uses Aggregate() to calculate the same values as your for loop...

static (TimeSpan? Minimum, TimeSpan? Maximum, TimeSpan? Average, int Count) GetDeltaStatistics(IEnumerable<DateTimeOffset> timestamps)
{
    var seed = (
        Previous: (DateTimeOffset?) null,
        Minimum: (TimeSpan?) null,
        Maximum: (TimeSpan?) null,
        Sum: TimeSpan.Zero,
        Count: 0
    );

    return timestamps.Aggregate(
        seed,
        (accumulator, current) => {
            if (accumulator.Previous != null)
            {
                TimeSpan delta = current - accumulator.Previous.Value;

                if (++accumulator.Count > 1)
                {
                    // This is not the first comparison; Minimum and Maximum are non-null
                    if (delta < accumulator.Minimum.Value)
                        accumulator.Minimum = delta;
                    if (delta > accumulator.Maximum.Value)
                        accumulator.Maximum = delta;
                }
                else
                {
                    // No prior comparisons have been performed
                    // Minimum and Maximum must be null so unconditionally overwrite them
                    accumulator.Minimum = accumulator.Maximum = delta;
                }
                accumulator.Sum += delta;

                Console.WriteLine($"{current:yyyy-MM-dd HH:mm:ss}->{accumulator.Previous:yyyy-MM-dd HH:mm:ss} | {delta}");
            }
            accumulator.Previous = current;

            return accumulator;
        },
        accumulator => (
            accumulator.Minimum,
            accumulator.Maximum,
            Average: accumulator.Count > 0
                ? new TimeSpan(accumulator.Sum.Ticks / accumulator.Count)
                : (TimeSpan?) null,
            accumulator.Count
        )
    );
}

The second parameter of this overload of Aggregate() is a Func<> that is passed the state returned from the previous invocation of the Func<> (accumulator) and the current element in the sequence (current). The first parameter provides the initial value of accumulator. The third parameter is a Func<> that transforms the final value of this state into the return value of Aggregate(). The state and the return value are both value tuples.
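
As a smaller illustration of the same three-argument overload, here is a sketch that averages plain integers. The values array and the tuple field names Sum and Count are made up for this example; it assumes C# 9 top-level statements.

```csharp
using System;
using System.Linq;

int[] values = { 4, 8, 15, 16, 23, 42 };

// seed: the initial state, here a value tuple holding a running sum and a count
// func: receives the state from the previous call plus the current element
// resultSelector: converts the final state into the value Aggregate() returns
double? mean = values.Aggregate(
    (Sum: 0L, Count: 0),
    (state, current) => (Sum: state.Sum + current, Count: state.Count + 1),
    state => state.Count > 0 ? (double)state.Sum / state.Count : (double?)null
);

Console.WriteLine(mean);
```

Returning null for an empty sequence mirrors how GetDeltaStatistics() reports a missing average instead of dividing by zero.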
Note that GetDeltaStatistics() only needs an IEnumerable<DateTimeOffset>, not an IList<DateTimeOffset> or DateTimeOffset[]. Since there is no random access to adjacent elements, though, the value of current is carried forward to the next invocation via accumulator.Previous. I also made it the caller's responsibility to provide sorted input, but you could just as easily sort inside the method.
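
For instance, sorting inside the method and carrying the previous element forward without indexing could look roughly like this. GetSortedDeltas() is a hypothetical helper written for this sketch, not part of the code above, and it only yields the deltas rather than the full statistics.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: sorts inside the method and yields one delta per adjacent pair
static IEnumerable<TimeSpan> GetSortedDeltas(IEnumerable<DateTimeOffset> timestamps)
{
    DateTimeOffset? previous = null;

    foreach (DateTimeOffset current in timestamps.OrderBy(t => t))
    {
        // No indexing: the prior element is carried forward in a local instead
        if (previous != null)
            yield return current - previous.Value;
        previous = current;
    }
}

DateTimeOffset start = new DateTimeOffset(2022, 1, 1, 0, 0, 0, TimeSpan.Zero);
DateTimeOffset[] unordered = { start.AddSeconds(30), start, start.AddSeconds(31), start.AddSeconds(5) };

List<TimeSpan> deltas = GetSortedDeltas(unordered).ToList();
Console.WriteLine(string.Join(", ", deltas));
```

Because the helper is an iterator, nothing is sorted or compared until the result is enumerated, here by ToList().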
Calling GetDeltaStatistics() with...

static void Main()
{
    DateTimeOffset now = new DateTimeOffset(new DateTime(2022, 1, 1, 0, 0, 0, 0));
    DateTimeOffset[] timestamps = new[] {
        now,
        now.AddSeconds(5),
        now.AddSeconds(10),
        now.AddSeconds(15),
        now.AddSeconds(30),
        now.AddSeconds(31)
    };

    IEnumerable<IEnumerable<DateTimeOffset>> timestampSequences = new IEnumerable<DateTimeOffset>[] {
        timestamps,
        timestamps.Take(2),
        timestamps.Take(1),
        timestamps.Take(0)
    };
    foreach (IEnumerable<DateTimeOffset> sequence in timestampSequences)
    {
        var (minimum, maximum, average, count) = GetDeltaStatistics(sequence.OrderBy(offset => offset));

        Console.WriteLine($"Minimum: {GetDisplayText(minimum)}");
        Console.WriteLine($"Maximum: {GetDisplayText(maximum)}");
        Console.WriteLine($"Average: {GetDisplayText(average)}");
        Console.WriteLine($"  Count: {count}");
        Console.WriteLine();
    }
}

static string GetDisplayText(TimeSpan? delta) => delta == null ? "(null)" : delta.Value.ToString();

...produces this output...

2022-01-01 00:00:05->2022-01-01 00:00:00 | 00:00:05
2022-01-01 00:00:10->2022-01-01 00:00:05 | 00:00:05
2022-01-01 00:00:15->2022-01-01 00:00:10 | 00:00:05
2022-01-01 00:00:30->2022-01-01 00:00:15 | 00:00:15
2022-01-01 00:00:31->2022-01-01 00:00:30 | 00:00:01
Minimum: 00:00:01
Maximum: 00:00:15
Average: 00:00:06.2000000
  Count: 5

2022-01-01 00:00:05->2022-01-01 00:00:00 | 00:00:05
Minimum: 00:00:05
Maximum: 00:00:05
Average: 00:00:05
  Count: 1

Minimum: (null)
Maximum: (null)
Average: (null)
  Count: 0

Minimum: (null)
Maximum: (null)
Average: (null)
  Count: 0

Whereas the original code would throw an exception, for sequences with fewer than two elements the result has a Count of 0 and the other fields are null.
