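Flink checkpointing has a concept called externalized checkpoints. What does "externalized" mean here? Is there a corresponding concept that could be called internal checkpoints?
Even if I never call the enableExternalizedCheckpoints method, when I specify a checkpoint path on HDFS I believe I am persisting checkpoints externally. Can I say that I am doing externalized checkpoints?
I am a bit confused about this.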
/**
 * Enables checkpoints to be persisted externally.
 *
 * <p>Externalized checkpoints write their meta data out to persistent
 * storage and are <strong>not</strong> automatically cleaned up when
 * the owning job fails or is suspended (terminating with job status
 * {@link JobStatus#FAILED} or {@link JobStatus#SUSPENDED}). In this
 * case, you have to manually clean up the checkpoint state, both
 * the meta data and actual program state.
 *
 * <p>The {@link ExternalizedCheckpointCleanup} mode defines how an
 * externalized checkpoint should be cleaned up on job cancellation. If you
 * choose to retain externalized checkpoints on cancellation you have to
 * handle checkpoint clean up manually when you cancel the job as well
 * (terminating with job status {@link JobStatus#CANCELED}).
 *
 * <p>The target directory for externalized checkpoints is configured
 * via {@link org.apache.flink.configuration.CheckpointingOptions#CHECKPOINTS_DIRECTORY}.
 *
 * @param cleanupMode Externalized checkpoint cleanup behaviour.
 */
@PublicEvolving
public void enableExternalizedCheckpoints(ExternalizedCheckpointCleanup cleanupMode) {
    this.externalizedCheckpointCleanup = checkNotNull(cleanupMode);
}
1 Answer
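Actually, the fact that checkpoints are stored in HDFS does not by itself make them externalized. Externalized checkpoints are externalized with respect to a specific job instance. Standard checkpoints are only used to recover from failures: they are cleaned up automatically if the job is cancelled or fails, and they have no metadata, which means they are not meant to be used separately from that particular job instance.
Externalized checkpoints, by contrast, save the metadata along with the checkpoint and are not deleted automatically (the cleanup mode lets you configure this behavior to some extent). You can therefore think of an externalized checkpoint as something like a savepoint, in the sense that you can use it to start another job instance after an update, a failure, or a cancellation.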
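
The retention rules described above can be sketched as a small decision function. This is a hypothetical model for illustration only, not Flink's implementation: the class CheckpointRetentionSketch, the method survives, and the NOT_EXTERNALIZED value (standing in for "enableExternalizedCheckpoints was never called") are invented names. It also assumes that a standard checkpoint is discarded in every listed terminal state, including SUSPENDED, which the answer does not state explicitly. In a real job, the retaining behavior corresponds to calling env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION).

```java
// Hypothetical model of checkpoint retention; NOT Flink source code.
final class CheckpointRetentionSketch {

    // Terminal job states named in the Javadoc quoted in the question.
    enum JobStatus { FAILED, SUSPENDED, CANCELED }

    // NOT_EXTERNALIZED is an invented stand-in for "externalization was never enabled".
    // The other two values mirror Flink's ExternalizedCheckpointCleanup modes.
    enum Cleanup { NOT_EXTERNALIZED, RETAIN_ON_CANCELLATION, DELETE_ON_CANCELLATION }

    // Returns true if the checkpoint is still on storage after the job terminates,
    // so that another job instance could be started from it.
    static boolean survives(Cleanup cleanup, JobStatus status) {
        if (cleanup == Cleanup.NOT_EXTERNALIZED) {
            // Assumption: standard checkpoints are cleaned up automatically
            // in every terminal state modeled here.
            return false;
        }
        if (status == JobStatus.CANCELED) {
            // On cancellation, the cleanup mode decides.
            return cleanup == Cleanup.RETAIN_ON_CANCELLATION;
        }
        // FAILED or SUSPENDED: externalized checkpoints are never cleaned up
        // automatically, so manual cleanup is required.
        return true;
    }

    public static void main(String[] args) {
        // Print the full decision table for every mode and terminal state.
        for (Cleanup cleanup : Cleanup.values()) {
            for (JobStatus status : JobStatus.values()) {
                System.out.println(cleanup + " + " + status + " -> survives="
                        + survives(cleanup, status));
            }
        }
    }
}
```

Read this way, writing checkpoints to an HDFS path only decides where the data lives, while the cleanup mode decides whether the checkpoint outlives the job that created it.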