Mesos and Marathon mention checkpointing from time to time, but I can't find a good explanation anywhere of how it works. Also, what does it mean in practice?
1) Is the task's current state continuously stored, or is only the task ID stored? Where is it stored, and what does it contain?
2) Suppose there are two Marathon instances. Marathon has been running Nginx for a week and then goes down. Does the Nginx application keep its actual state and continue running under the second Marathon instance, or is the task simply restarted from the beginning? If the task's actual state is copied, isn't that a lot of data to continuously persist and pass around between slaves?
1 Answer
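Slave recovery is a Mesos feature that allows:
executors/tasks to keep running while the slave process is down or stopped;
a restarted slave process to reconnect with the executors/tasks still running on that slave (Mesos slave recovery).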
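For your questions, this means:
1) Enough information is stored (a little more than the task ID) for the new slave process to reconnect to the executor/task that is still running.
2) Because the task's state is not checkpointed, a relaunched task starts from the beginning.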
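To make the two points above concrete, here is a minimal toy sketch in Python. It is not Mesos code and does not reflect the real on-disk checkpoint layout: the function names, the JSON record, and the `live_pids` set are all assumptions made up for illustration. It only shows the idea that the slave persists identifiers (task ID, executor ID, executor PID), not application state, and that a restarted slave uses them to reconnect to executors that are still alive.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Set


def checkpoint_task(meta_dir: Path, task_id: str, executor_id: str, executor_pid: int) -> None:
    """Persist only the identifiers needed to find the executor again.

    Hypothetical record format; no application state is written.
    """
    record = {"task_id": task_id, "executor_id": executor_id, "executor_pid": executor_pid}
    (meta_dir / f"{task_id}.json").write_text(json.dumps(record))


def recover(meta_dir: Path, live_pids: Set[int]) -> Dict[str, str]:
    """Simulate a restarted slave reading its checkpoints.

    A task whose executor is still alive is reconnected; otherwise it is
    reported as lost, and the framework would have to start it from scratch.
    """
    outcome = {}
    for path in sorted(meta_dir.glob("*.json")):
        record = json.loads(path.read_text())
        alive = record["executor_pid"] in live_pids
        outcome[record["task_id"]] = "reconnected" if alive else "lost"
    return outcome


def demo() -> Dict[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        meta_dir = Path(tmp)
        checkpoint_task(meta_dir, "nginx.1", "exec-1", 4242)
        checkpoint_task(meta_dir, "nginx.2", "exec-2", 4343)
        # The slave process restarts; assume only the executor with PID 4242 survived.
        return recover(meta_dir, live_pids={4242})


if __name__ == "__main__":
    print(demo())
```

In this sketch the checkpoint stays tiny no matter how long Nginx has been running, which is why nothing large needs to be persisted or passed between slaves.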
Hope this helps, Joerg