为什么这个本地的C++ lambda这么慢？

nbnkbykc 于 2022-12-27 发布在其他

关注(0)|答案(3)|浏览(174)

为了便于理解，我使用了一个带有递归调用的局部lambda（tail recursive），运行这个（例如在http://cpp.sh/或https://coliru.stacked-crooked.com/上）总是显示lambda调用比其他解决方案慢很多。

我的问题是为什么会这样？

#include <iostream>
#include <chrono>
#include <functional>
 
//tail recursive lamda
long long int Factorial5(long long int n)
{ 
  std::function<long long int(long long int,long long int)> aux
     = [&aux](long long int n, long long int acc)
  { 
    return n < 1 ? acc : aux(n - 1, acc * n);
  };
  return aux(n,1);
}
 
//tail recursive inline class
long long int Factorial6(long long int n)
{ 
  class aux {
    public: inline static long long int tail(long long int n, long long int acc)
    { 
      return n < 1 ? acc : tail(n - 1, acc * n);
    }
  };
  return aux::tail(n,1);
}

int main()
{
    int v = 55;
    {
        auto t1 = std::chrono::high_resolution_clock::now();
        auto result = Factorial5(v);
        auto t2 = std::chrono::high_resolution_clock::now();
        std::chrono::duration<double, std::milli> ms = t2 - t1;
        std::cout << std::fixed << "lamda(5)\tresult " << result
                  << " took " << ms.count() << " ms\n";
    }
    {
        auto t1 = std::chrono::high_resolution_clock::now();
        auto result = Factorial6(v);
        auto t2 = std::chrono::high_resolution_clock::now();
        std::chrono::duration<double, std::milli> ms = t2 - t1;
        std::cout << std::fixed << "inner class(6)\tresult " << result
                  << " took " << ms.count() << " ms\n";
    }
}

输出：

lamda(5)        result 6711489344688881664 took 0.076737 ms
inner class(6)  result 6711489344688881664 took 0.000140 ms

c++

来源：https://stackoverflow.com/questions/57413429/why-is-this-local-recrusive-c-lambda-so-terrible-slow

3条答案

按热度按时间

b0zn9rqh1#

lambda不是std::function，每个lambda都有自己唯一的类型，大致如下：

struct /* unnamed */ {
    auto operator()(long long int n, long long int acc) const
    { 
        return n < 1 ? acc : aux(n - 1, acc * n);
    }
} aux{};

所以原则上，lambda非常快，和任何函数或者非虚成员函数一样快。
缓慢的原因是std::function，它是lambda的类型擦除 Package 器，它将函数调用转换为虚拟调用，并可能动态分配lambda，这代价很高，并且妨碍了内联。
要创建递归lambda，必须使用C++14泛型lambda并将lambda发送给它自己：

auto aux = [](auto aux, long long int n, long long int acc) -> long long int
{ 
  return n < 1 ? acc : aux(aux, n - 1, acc * n);
};

return aux(aux, n, 1);

赞(0）回复(0）举报 2022-12-27

f0ofjuux2#

不管怎样，我的系统结果是：

lamda(5)        result 6711489344688881664 took 0.003501 ms
inner class(6)  result 6711489344688881664 took 0.002312 ms

这远没有那么激烈，尽管仍然有可测量的差异。
产生差异的原因是函数几乎不做任何工作，但是有非常非常多的函数调用。因此，任何函数调用开销都会支配结果。std::function有开销，这在度量中显示出来。
注意，示例程序溢出long long，因此结果没有意义，因为行为完全未定义。正确的结果应该是大约6.689503e+198。理论上，UB也使运行时间测量的结果无效，但实际上不一定。

赞(0）回复(0）举报 2022-12-27

mv1qrgav3#

lambda是一个未命名类型的对象，它调用另一个对象（std::function）的operator()，std::function调用lambda，...
也就是说，存在通过对象的相互递归和间接性，这使得优化非常困难。
您的代码或多或少与下面的代码等效（我将类型缩短为int）：

struct Aux;

struct Fun
{
    const Aux& aux;
    Fun(const Aux& aux) : aux(aux) {}
    int work(int n, int acc) const;
};

struct Aux
{
    const Fun& fun;
    Aux(const Fun& fun) : fun(fun) {}
    int actual_work(int n, int acc) const
    {
        return n < 1 ? acc : fun.work(n-1, acc*n);
    }
};

int Fun::work(int n, int acc) const
{
    return aux.actual_work(n, acc);
}

int factorial5(int n)
{
    Fun fun = Aux(fun);
    return fun.work(n, 1);
}

这就更清楚地表明，有很多看不见的东西在发生。

赞(0）回复(0）举报 2022-12-27

我来回答

为什么这个本地的C++ lambda这么慢？

我的问题是为什么会这样？

3条答案

相关问题

热门标签

最新问答