flink作业创建rocksdb示例失败

dsf9zpds  于 2021-06-21  发布在  Flink
关注(0)|答案(1)|浏览(766)

我在flink上运行了很多作业,后端使用rocksdb,我的一个作业出现错误,整夜重启,
错误消息如下:

java.lang.IllegalStateException: Could not initialize keyed state backend.
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:330)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:221)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:679)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:666)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:708)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Error while opening RocksDB instance.
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.openDB(RocksDBKeyedStateBackend.java:1063)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.access$3300(RocksDBKeyedStateBackend.java:128)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBIncrementalRestoreOperation.restoreInstance(RocksDBKeyedStateBackend.java:1472)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBIncrementalRestoreOperation.restore(RocksDBKeyedStateBackend.java:1569)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.restore(RocksDBKeyedStateBackend.java:996)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(StreamTask.java:775)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:319)
Caused by: org.rocksdb.RocksDBException: Corruption: Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008919.sst. Size recorded in manifest 132174005, actual size 2674688
Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008626.sst. Size recorded in manifest 111956797, actual size 14286848
Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008959.sst. Size recorded in manifest 43157714, actual size 933888
    at org.rocksdb.TtlDB.openCF(Native Method)
    at org.rocksdb.TtlDB.open(TtlDB.java:132)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.openDB(RocksDBKeyedStateBackend.java:1054)
    ... 12 more

当我发现这个,它手动杀死它,然后重新启动。然后它工作得很好。
这个错误是怎么发生的,我找不到任何来自谷歌或其他地方的消息

slsn1g29

slsn1g291#

我在这个异常之前发现了错误。

No space left on device

我认为这个问题要以这个问题为代价

相关问题