.NET: Improving enumerator performance? [closed]

Posted by irlmq6kh on 2023-01-14 in .NET

Closed yesterday.
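Intuition
To simplify my API, I want to give users some enumerators for iterating over a data set. I won't go into detail about the data set itself, since that would be out of scope.
While profiling and benchmarking, I noticed that C# enumerators and custom (struct) enumerators are much slower than raw array access.
Since my API and application need high performance and should be as fast as possible, I'm looking for ways to influence this.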

Problem

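MoveNext() and Current are never inlined, even when marked with MethodImplOptions.AggressiveInlining. This causes a lot of address jumps, which hurts performance.
For example: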

using System;
using System.Runtime.CompilerServices;

/// <summary>
///     The <see cref="Enumerator{T}"/> struct
///     represents an enumerator with which one can iterate over all items of an array or span.
/// </summary>
/// <typeparam name="T">The generic type.</typeparam>
public ref struct Enumerator<T>
{
    private readonly Span<T> _span;

    private int _index;
    private readonly int _size;

    /// <summary>
    ///     Initializes a new instance of the <see cref="Enumerator{T}"/> struct.
    /// </summary>
    /// <param name="span">The <see cref="Span{T}"/> with items to iterate over.</param>
    public Enumerator(Span<T> span)
    {
        _span = span;
        _index = -1;
        _size = span.Length;
    }

    /// <summary>
    ///     Initializes a new instance of the <see cref="Enumerator{T}"/> struct.
    /// </summary>
    /// <param name="span">The <see cref="Span{T}"/> with items to iterate over.</param>
    /// <param name="length">Its length or size.</param>
    public Enumerator(Span<T> span, int length)
    {
        _span = span;
        _index = -1;
        _size = length;
    }

    /// <summary>
    ///     Moves to the next item.
    /// </summary>
    /// <returns>True if there are still items, otherwise false.</returns>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool MoveNext()
    {
        return unchecked(++_index) < _size;
    }

    /// <summary>
    ///     Resets this instance.
    /// </summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Reset()
    {
        _index = -1;
    }

    /// <summary>
    ///     Returns a reference to the current item.
    /// </summary>
    public readonly ref T Current
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => ref _span[_index];
    }
}

public void EnumerateSample()
{
    var items = new int[100000];
    var enumerator = new Enumerator<int>(items);

    // MoveNext and Current are not inlined here, which is bad:
    // an address jump on every single item is incredibly bad.
    while (enumerator.MoveNext())
    {
        ref var i = ref enumerator.Current;
        i++;
    }
}

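Adding SkipLocalsInit did not help either.
foreach (ref var item in myEnumerator) is very slow because these calls are not inlined, which becomes very noticeable over a large number of iterations.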
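For context, foreach only works on a type that exposes a public GetEnumerator() whose result has MoveNext() and Current, and the post does not show that part. The following is a minimal sketch of how such a ref foreach could be wired up. SpanEnumerable<T> and Demo.IncrementAll are hypothetical names used for illustration, not the asker's actual code.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical wrapper (not from the original post): exposes GetEnumerator()
// so that the C# compiler accepts it as the source of a foreach loop.
public readonly ref struct SpanEnumerable<T>
{
    private readonly Span<T> _span;

    public SpanEnumerable(Span<T> span) => _span = span;

    public Enumerator GetEnumerator() => new Enumerator(_span);

    // Same shape as the enumerator in the question: index starts at -1,
    // MoveNext advances it, Current returns a reference into the span.
    public ref struct Enumerator
    {
        private readonly Span<T> _span;
        private int _index;

        internal Enumerator(Span<T> span)
        {
            _span = span;
            _index = -1;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public bool MoveNext() => ++_index < _span.Length;

        public readonly ref T Current
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            get => ref _span[_index];
        }
    }
}

public static class Demo
{
    // Increments every element in place through a ref iteration variable.
    public static int[] IncrementAll(int[] items)
    {
        foreach (ref var item in new SpanEnumerable<int>(items))
        {
            item++;
        }
        return items;
    }
}
```

Calling Demo.IncrementAll(new[] { 1, 2, 3 }) modifies the array in place, because each item is a reference to the underlying element rather than a copy. Whether the JIT inlines MoveNext and Current here depends on the runtime and build configuration, so it is worth checking the Release disassembly rather than assuming either way.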

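Question

Is there any way to further influence the enumerator's performance, or to really force the MoveNext and Current calls to be inlined?
Any help or ideas are welcome!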

Answer #1 (7rfyedvj)

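Here is what this method looks like when disassembled in the Release configuration. The only call is the one for new int[]. Conditional logic, such as the loop, requires jumps. There are no call/return pairs, so I don't see any problem here.
BenchmarkDotNet shows that this method takes about 320 µs if I move new int[] out of the method, leaving only the foreach, and about 630 µs with the memory allocation included.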
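To illustrate the distinction between the loop cost and the allocation cost, here is a rough sketch that times both variants with Stopwatch. This is a simplified stand-in, not the BenchmarkDotNet setup used for the numbers above: AllocationVsLoopTiming, IncrementAll and Measure are hypothetical names, and a plain span loop stands in for the enumerator. Results from a hand-rolled timer like this are only indicative, since it does not control warm-up, tiered JIT or statistical noise the way BenchmarkDotNet does.

```csharp
using System;
using System.Diagnostics;

public static class AllocationVsLoopTiming
{
    // Hypothetical workload: increments every element, standing in for the
    // enumerator loop from the question.
    public static void IncrementAll(Span<int> items)
    {
        for (int i = 0; i < items.Length; i++)
        {
            items[i]++;
        }
    }

    // Returns the average time per iteration in microseconds for
    // (a) the loop over a preallocated array and
    // (b) the loop plus a fresh array allocation each time.
    public static (double LoopOnlyUs, double WithAllocationUs) Measure(int length, int iterations)
    {
        var preallocated = new int[length];

        // One warm-up call so the method is JIT-compiled before timing starts.
        IncrementAll(preallocated);

        var stopwatch = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            IncrementAll(preallocated);
        }
        stopwatch.Stop();
        double loopOnlyUs = stopwatch.Elapsed.TotalMilliseconds * 1000.0 / iterations;

        stopwatch.Restart();
        for (int n = 0; n < iterations; n++)
        {
            var fresh = new int[length]; // allocation is now part of the measurement
            IncrementAll(fresh);
        }
        stopwatch.Stop();
        double withAllocationUs = stopwatch.Elapsed.TotalMilliseconds * 1000.0 / iterations;

        return (loopOnlyUs, withAllocationUs);
    }
}
```

For example, AllocationVsLoopTiming.Measure(100000, 1000) returns both averages, which makes it easier to see how much of a measured time is due to new int[] rather than to the iteration itself.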

public void EnumerateSample()
        {
            int[] items = new int[3];
00007FF9947180A0  sub         rsp,38h  
00007FF9947180A4  vxorps      xmm4,xmm4,xmm4  
00007FF9947180A9  vmovdqa     xmmword ptr [rsp+20h],xmm4  
00007FF9947180B0  xor         eax,eax  
00007FF9947180B2  mov         qword ptr [rsp+30h],rax  
00007FF9947180B7  mov         rcx,7FF994792520h  
00007FF9947180C1  mov         edx,3  
00007FF9947180C6  call        CORINFO_HELP_NEWARR_1_VC (07FF9F429B360h)  
            var enumerator = new Enumerator<int>(items);
00007FF9947180CB  add         rax,10h  
00007FF9947180CF  mov         edx,3  
00007FF9947180D4  mov         ecx,edx  
00007FF9947180D6  lea         r8,[rsp+20h]  
00007FF9947180DB  mov         qword ptr [r8],rax  
00007FF9947180DE  mov         dword ptr [r8+8],ecx  
00007FF9947180E2  mov         dword ptr [rsp+30h],0FFFFFFFFh  
00007FF9947180EA  mov         dword ptr [rsp+34h],edx  
            while (enumerator.MoveNext())
00007FF9947180EE  mov         eax,dword ptr [rsp+30h]  
00007FF9947180F2  inc         eax  
00007FF9947180F4  mov         dword ptr [rsp+30h],eax  
00007FF9947180F8  cmp         eax,dword ptr [rsp+34h]  
00007FF9947180FC  jge         TestArea.EnumeratorTests.EnumerateSample()+08Ah (07FF99471812Ah)  
00007FF9947180FE  nop  
            {
                ref var i = ref enumerator.Current;
00007FF994718100  mov         eax,dword ptr [rsp+30h]  
00007FF994718104  lea         rdx,[rsp+20h]  
00007FF994718109  cmp         eax,dword ptr [rdx+8]  
00007FF99471810C  jae         TestArea.EnumeratorTests.EnumerateSample()+08Fh (07FF99471812Fh)  
00007FF99471810E  mov         rdx,qword ptr [rdx]  
00007FF994718111  movsxd      rax,eax  
00007FF994718114  lea         rax,[rdx+rax*4]  
00007FF994718118  inc         dword ptr [rax]  
            while (enumerator.MoveNext())
00007FF99471811A  mov         eax,dword ptr [rsp+30h]  
00007FF99471811E  inc         eax  
00007FF994718120  mov         dword ptr [rsp+30h],eax  
00007FF994718124  cmp         eax,dword ptr [rsp+34h]  
00007FF994718128  jl          TestArea.EnumeratorTests.EnumerateSample()+060h (07FF994718100h)  
00007FF99471812A  add         rsp,38h  
00007FF99471812E  ret
