Garbage collection and memory management in Erlang

Posted by osh3o9ms on 2022-12-08 in Erlang

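I would like to understand the technical details of garbage collection (GC) and memory management in Erlang/OTP.
However, I cannot find documentation for this on erlang.org.
I have found some articles online, but they discuss the GC only in very general terms, such as which garbage collection algorithm is used.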

Answer 1 (ldioqlga)

To organize the answer, let's first define the memory layout and then look at how GC works.

Memory Layout

In Erlang, each thread of execution is called a process. Each process has its own memory, laid out in three parts: the Process Control Block, the stack, and the heap.

  • PCB: The Process Control Block holds information such as the process identifier (PID), current status (running, waiting), registered name, and similar metadata.
  • Stack: A downward-growing memory area that holds incoming and outgoing parameters, return addresses, local variables, and temporary space for evaluating expressions.
  • Heap: An upward-growing memory area that holds the process's mailbox messages and compound terms. Binaries larger than 64 bytes are NOT stored in the process's private heap; they are stored in a large shared heap that is accessible to all processes.
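
The sizes of these areas can be inspected from inside a running system. The following is a minimal sketch, assuming a standard OTP installation; the module name `mem_layout` and the extra `total_heap_bytes` entry are made up for illustration, while `erlang:process_info/2` and `erlang:system_info/1` are regular BIFs.

```erlang
-module(mem_layout).
-export([inspect/0]).

%% Return stack, heap and mailbox figures for the calling process.
%% The VM reports sizes in machine words, not bytes.
inspect() ->
    Items = [heap_size, stack_size, total_heap_size, message_queue_len],
    Info = erlang:process_info(self(), Items),
    WordSize = erlang:system_info(wordsize),   % bytes per word
    {total_heap_size, Total} = lists:keyfind(total_heap_size, 1, Info),
    %% total_heap_bytes is a hypothetical convenience entry added here
    [{total_heap_bytes, Total * WordSize} | Info].
```

Calling `mem_layout:inspect()` in the shell returns a property list for the shell process itself; pass another PID to `erlang:process_info/2` to look at a different process.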

Garbage Collection

Erlang currently uses generational garbage collection that runs independently inside each process's private heap, plus reference-counting garbage collection for the global shared heap.

  • Private heap GC: It is generational, so it divides the heap into two segments: the young and the old generation. There are two collection strategies: generational (minor) and fullsweep (major). A generational GC collects only the young heap, whereas a fullsweep collects both the young and the old heap.
  • Shared heap GC: It uses reference counting. Each object in the shared heap (a Refc binary) has a counter of the references to it held by other objects (ProcBins), which are stored in the private heaps of Erlang processes. When an object's reference counter reaches zero, the object has become inaccessible and is destroyed.
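
The two private-heap strategies can be observed through the `minor_gcs` counter that the VM keeps per process. This is a rough sketch only: the module name `gc_kinds` is invented, and the `{type, minor}` option of `erlang:garbage_collect/2` is assumed to be available, which is the case only on reasonably recent OTP releases.

```erlang
-module(gc_kinds).
-export([demo/0]).

%% Minor collections since the last fullsweep, as reported by the VM.
minor_gcs() ->
    {garbage_collection, Info} =
        erlang:process_info(self(), garbage_collection),
    proplists:get_value(minor_gcs, Info).

demo() ->
    %% garbage_collect/0 forces a major (fullsweep) collection.
    true = erlang:garbage_collect(),
    AfterMajor = minor_gcs(),
    %% Ask for a minor collection of the young heap only.
    true = erlang:garbage_collect(self(), [{type, minor}]),
    AfterMinor = minor_gcs(),
    {AfterMajor, AfterMinor}.
```

The same `garbage_collection` list also contains `fullsweep_after`, the number of minor collections allowed before a fullsweep is forced. It can be set per process, for example with `spawn_opt(Fun, [{fullsweep_after, 0}])`.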

For more details and performance hints, see my article, which is the source of this answer: Erlang Garbage Collection Details and Why It Matters

Answer 2 (busg9geu)

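Algorithm reference: One Pass Real-Time Generational Mark-Sweep Garbage Collection, written by Joe Armstrong and Robert Virding in 1995 (available on CiteSeerX).
Abstract:
Traditional mark-sweep garbage collection algorithms do not allow reclamation of data until the mark phase of the algorithm has terminated. For the class of languages in which destructive operations are not allowed, we can arrange that all pointers in the heap always point backwards towards "older" data. In this paper we present a simple scheme for reclaiming data for such language classes with a single-pass mark-sweep collector. We also show how the simple scheme can be modified so that the collection can be done in an incremental manner (making it suitable for real-time collection). Following this, we show how the collector can be modified for generational garbage collection, and finally how the scheme can be used for a language with concurrent processes.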

Answer 3 (mqkwyuun)

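Erlang has a few properties that make GC actually pretty easy.
1 - Every variable is immutable, so a variable can never point to a value that was created after it.
2 - Values are copied between Erlang processes, so the memory referenced in a process is almost always completely isolated.
Both of these (especially the latter) significantly limit the amount of the heap that the GC has to scan during a collection.
Erlang uses a copying GC. During a GC, the process is stopped and the live pointers are copied from the from-space to the to-space. I forget the exact percentages, but the heap is increased if only something like 25% of the heap can be collected during a collection, and it is decreased if 75% of the process heap can be collected. A collection is triggered when a process's heap becomes full.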
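
Heap growth and shrinkage can be watched from inside a process. The sketch below is an illustration under assumptions, not a statement about the VM's exact thresholds: the module `heap_growth` and its helper names are invented, and the measurement runs in a freshly spawned process so that it starts with a small heap.

```erlang
-module(heap_growth).
-export([demo/0]).

%% Run the measurement in a fresh process and return its report.
demo() ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() -> Parent ! {Ref, measure()} end),
    receive
        {Ref, Result} -> Result
    after 5000 ->
        timeout
    end.

measure() ->
    Start = total_heap_words(),
    {Grown, Len} = with_live_list(),
    %% The list is garbage by now; force a fullsweep to reclaim it.
    true = erlang:garbage_collect(),
    Collected = total_heap_words(),
    [{start, Start}, {with_live_list, Grown},
     {after_gc, Collected}, {list_length, Len}].

with_live_list() ->
    L = lists:seq(1, 100000),      % roughly 2 words per list cell
    Words = total_heap_words(),
    {Words, length(L)}.            % L stays live until after the measurement

total_heap_words() ->
    {total_heap_size, Words} =
        erlang:process_info(self(), total_heap_size),
    Words.
```

`with_live_list` is much larger than `start` because the list has to fit on the heap while it is live. `after_gc` is typically much smaller again, although how far the heap shrinks depends on the VM's resizing policy.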
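The only exception is when large values are sent to another process. These are copied into a shared space and are reference counted. When a reference to a shared object is collected, the count is decreased; when the count reaches 0, the object is freed. No attempt is made to handle fragmentation in the shared heap.
An interesting consequence is that, for a shared object, the size of the object does not contribute to the calculated size of a process's heap; only the size of the reference does. This means that if you have lots of large shared objects, the VM could run out of memory before a GC is triggered.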
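
The gap between the size of a shared object and the size of the heap that refers to it can be made visible with a large binary. This is a minimal sketch, assuming the default allocators are enabled so that `erlang:memory(binary)` is supported; the module name `shared_bin` and the keys in the result list are made up for illustration.

```erlang
-module(shared_bin).
-export([demo/0]).

demo() ->
    Size = 10 * 1024 * 1024,
    Big = binary:copy(<<0>>, Size),     % 10 MB, stored off the process heap
    {total_heap_size, Words} =
        erlang:process_info(self(), total_heap_size),
    HeapBytes = Words * erlang:system_info(wordsize),
    VmBinaryBytes = erlang:memory(binary),   % all binaries in the VM
    [{binary_bytes, byte_size(Big)},
     {process_heap_bytes, HeapBytes},
     {vm_binary_bytes, VmBinaryBytes}].
```

The process heap stays far smaller than the binary it references, while the VM-wide binary figure includes the full 10 MB. That is why watching `erlang:memory(binary)` alongside per-process heap sizes helps when tracking down binary-related memory growth.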
Most of this is taken from the talk Jesper Wilhelmsson gave at EUC 2012.

Answer 4 (bqjvbblv)

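I don't know your background, but apart from the paper jj1bdx has already pointed out, you could also give Jesper Wilhelmsson's thesis a chance.
By the way, if you want to monitor memory usage in Erlang and compare it with, for example, C++, you can check out: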

Hope this helps!
