What is the typical approach to achieving high availability in Erlang?
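Suppose some gen_server is registered locally as ?MODULE, and there are N independent Erlang nodes, interconnected in the default way, each running one instance of that gen_server. How do I 1) ensure that no request is lost because some participating node fails (as long as at least one of them is online), and 2) load-balance the requests, so that some nodes are not overloaded while others sit idle waiting for new messages? As far as I can tell, there is no built-in load balancer: neither pg2 nor the newer pg is sufficient (though either could still be a good basis for further work in this direction).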
I bet this is a common problem and that battle-tested, "Erlang-ish" solutions do exist.
1 Answer
I think that for 1), to have an exactly-once guarantee, you need some kind of distributed transaction algorithm, because connections might fail and you don't know the state of the request on the remote node: Is the remote node dead? Is it alive and just disconnected because of a network failure? How far into processing the request did it get before the failure?
You should check out Mnesia; it's deeply integrated with Erlang.
If you relax the requirements for 1) (for instance, if the requests are idempotent, you only care about at-least-once delivery, or failures are uncommon), it may suffice to monitor the remote gen_server and simply replay the request if the connection to the remote server is lost for whatever reason.
For 2), we use HAProxy or the nginx web server in a least-connections fashion in front of the nodes, although I believe you mean 'inside' Erlang. In that case, I'd do the following to have a local ETS table with the load info:
- A ?MODULE sidekick that periodically broadcasts the local ?MODULE's mailbox size (or another metric) to the other sidekicks in the cluster.
Regarding OTP 23's pg, don't discard it so easily. According to the docs:
"Process Groups implement strong eventual consistency."
So you can have overloaded servers leave the process group temporarily, and they will eventually stop receiving requests. You can also run several servers per node, with a low threshold for leaving the group, for a more uniform distribution.
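To illustrate the relaxed, at-least-once variant of 1), here is a minimal sketch. It relies on the fact that gen_server:call/3 already monitors the callee and exits the caller if the remote process or node goes away. The module name replay_call, the 5-second timeout, and the idea of passing an explicit list of candidate nodes are assumptions made for the example, not part of the question. Because a timed-out request may already have been processed remotely, this is only safe for idempotent requests.

```erlang
-module(replay_call).
-export([call/3]).

%% Hypothetical helper: try each node in turn and replay the same request
%% on the next node when the call fails (noproc, nodedown, timeout, ...).
%% Name is the locally registered name of the gen_server on every node.
call(_Name, [], _Request) ->
    {error, no_node_available};
call(Name, [Node | Rest], Request) ->
    try
        %% gen_server:call/3 monitors the remote server and exits on failure.
        {ok, gen_server:call({Name, Node}, Request, 5000)}
    catch
        exit:_Reason ->
            %% The request may or may not have been handled: replay it.
            call(Name, Rest, Request)
    end.
```

A caller would use something like replay_call:call(my_server, [node() | nodes()], Request), where my_server stands for whatever ?MODULE expands to.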
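The sidekick idea for 2) could look roughly like the sketch below. Everything named here is hypothetical: the module load_sidekick, the ETS table load_info, the 500 ms interval, and the least_loaded/0 helper. It assumes one sidekick per node, started with the registered name of the gen_server it watches, and it uses the mailbox length as the only metric.

```erlang
-module(load_sidekick).
-behaviour(gen_server).
-export([start_link/1, least_loaded/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TABLE, load_info).   %% hypothetical table name
-define(INTERVAL, 500).      %% broadcast period in ms (assumption)

%% Server is the registered name of the gen_server being watched.
start_link(Server) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Server, []).

%% Read the local ETS copy and return the node with the shortest mailbox.
least_loaded() ->
    case lists:keysort(2, ets:tab2list(?TABLE)) of
        [{Node, _Len} | _] -> {ok, Node};
        [] -> undefined
    end.

init(Server) ->
    ?TABLE = ets:new(?TABLE, [named_table, protected, set]),
    ok = net_kernel:monitor_nodes(true),
    erlang:send_after(?INTERVAL, self(), tick),
    {ok, Server}.

handle_call(_Request, _From, Server) ->
    {reply, ok, Server}.

%% A sidekick (local or remote) reported its node's load.
handle_cast({load, Node, Len}, Server) ->
    ets:insert(?TABLE, {Node, Len}),
    {noreply, Server};
handle_cast(_Msg, Server) ->
    {noreply, Server}.

handle_info(tick, Server) ->
    %% abcast/2 sends to the sidekicks on [node() | nodes()].
    gen_server:abcast(?MODULE, {load, node(), mailbox_len(Server)}),
    erlang:send_after(?INTERVAL, self(), tick),
    {noreply, Server};
handle_info({nodedown, Node}, Server) ->
    ets:delete(?TABLE, Node),
    {noreply, Server};
handle_info(_Info, Server) ->
    {noreply, Server}.

%% 'infinity' sorts after every integer, so a missing server is picked last.
mailbox_len(Name) ->
    case whereis(Name) of
        undefined -> infinity;
        Pid ->
            case erlang:process_info(Pid, message_queue_len) of
                {message_queue_len, Len} -> Len;
                undefined -> infinity
            end
    end.
```

Clients then route with load_sidekick:least_loaded() and a plain gen_server:call({Name, Node}, Request). The data is always slightly stale, so this smooths load rather than guaranteeing an even split.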
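The pg-based approach can be sketched as follows, assuming OTP 23 or later with the default pg scope already started (for example via pg:start_link/0). The module name pg_backpressure, the threshold of 100 messages, and both function names are made up for the illustration. Each server would call update_membership/2 on itself periodically and keep the returned flag in its state.

```erlang
-module(pg_backpressure).
-export([update_membership/2, pick/1]).

-define(HIGH_WATERMARK, 100).  %% hypothetical mailbox threshold

%% Called by the server process itself. Joined says whether it is
%% currently a member of Group; the new membership flag is returned.
update_membership(Group, Joined) ->
    {message_queue_len, Len} = erlang:process_info(self(), message_queue_len),
    case {Joined, Len > ?HIGH_WATERMARK} of
        {true, true} ->
            %% Overloaded: step out so clients eventually stop picking us.
            _ = pg:leave(Group, self()),
            false;
        {false, false} ->
            %% Mailbox has drained: rejoin the group.
            ok = pg:join(Group, self()),
            true;
        _ ->
            Joined
    end.

%% Client side: pick a random member among those currently in the group.
pick(Group) ->
    case pg:get_members(Group) of
        [] -> {error, no_members};
        Members -> {ok, lists:nth(rand:uniform(length(Members)), Members)}
    end.
```

Since pg is only eventually consistent, a server that has just left may still receive a few requests from nodes that have not yet seen the change.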