We are using the Kafka Connect HDFS connector, which continuously pulls data from Kafka topics and commits it to HDFS.
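For context, the sink configuration is roughly along the following lines. This is only a sketch: the connector name, task count, HDFS URL, and flush size are placeholders rather than our exact values; the topic name is taken from the error below.

# Placeholder connector name and sizing, not our exact values
name=hdfs-sink-general-events
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=10
# Topic taken from the failing partition in the stack trace below
topics=Prd_IN_GeneralEvents
# Placeholder NameNode address and commit threshold
hdfs.url=hdfs://namenode:8020
flush.size=10000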
After more than 12 hours of successful loading, we suddenly saw this error on the connector side:
org.apache.kafka.clients.consumer.NoOffsetForPartitionException: Undefined offset with no reset policy for partition:Prd_IN_GeneralEvents-39
at org.apache.kafka.clients.consumer.internals.Fetcher.resetOffset(Fetcher.java:374)
at org.apache.kafka.clients.consumer.internals.Fetcher.resetOffsetsIfNeeded(Fetcher.java:227)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:1592)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1035)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:360)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:245)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
After that, some of the HDFS worker threads (9 out of 100) were killed, and we started losing data.
What is the root cause of this error?
We have already set auto.offset.reset=latest in the connect-distributed.properties file.
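For reference, a minimal sketch of the relevant line as it would need to appear in connect-distributed.properties, assuming the unprefixed form is what we have today. In Kafka Connect, consumer settings for sink tasks are passed through the worker config with the consumer. prefix, so a bare auto.offset.reset entry is not forwarded to the sink task's consumer:

# Worker-level consumer override: Connect forwards "consumer."-prefixed
# properties to the consumers created for sink tasks; an unprefixed
# auto.offset.reset entry in the worker file is never seen by that consumer.
consumer.auto.offset.reset=latest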