Kafka generates huge extra data when creating new topics

Posted by xpszyzbs on 2021-06-06 in Kafka

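I have a 3-node ZooKeeper cluster (version 3.4.11) and a 2-node Kafka cluster (version 0.11.3). We wrote a producer that sends messages to specific topics and partitions of the Kafka cluster (I have done this before, and the producer is tested). Here are the broker configs: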

broker.id=1
listeners=PLAINTEXT://node1:9092
num.partitions=24
delete.topic.enable=true
default.replication.factor=2
log.dirs=/data
zookeeper.connect=zoo1:2181,zoo2:2181,zoo3:2181
log.retention.hours=168
zookeeper.session.timeout.ms=40000
zookeeper.connection.timeout.ms=10000
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=2

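At first there are no topics on the brokers; they are created automatically. When I start the producer, the Kafka cluster shows strange behavior:
1- It creates all the topics, but although data is produced at only 10 KB per second, in less than a minute each broker's log directory grows from zero to 9.0 GB! Then all the brokers shut down (because the log directory runs out of capacity).
2- Right after production starts, I try to read the data with the console consumer, but it only reports this error: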

WARN Error while fetching metadata with correlation id 2 : {Topic1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

3- Here is the error that appears repeatedly in the broker logs:

INFO Updated PartitionLeaderEpoch. New: {epoch:0, offset:0}, Current: {epoch:-1, offset-1} for Partition: Topic6-6. Cache now contains 0 entries. (kafka.server.epoch.LeaderEpochFileCache)
WARN Newly rolled segment file 00000000000000000000.log already exists; deleting it first (kafka.log.Log)
WARN Newly rolled segment file 00000000000000000000.index already exists; deleting it first (kafka.log.Log)
WARN Newly rolled segment file 00000000000000000000.timeindex already exists; deleting it first (kafka.log.Log)
ERROR [Replica Manager on Broker 1]: Error processing append operation on partition Topic6-6 (kafka.server.ReplicaManager)
kafka.common.KafkaException: Trying to roll a new log segment for topic partition Topic6-6 with start offset 0 while it already exists.

After many repetitions of the log above, we get:

ERROR [ReplicaManager broker=1] Error processing append operation on partition Topic24-10 (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.InvalidOffsetException: Attempt to append an offset (402) to position 5 no larger than the last offset appended (402)

And finally (when there is no space left in the log dir) it fails with:

FATAL [Replica Manager on Broker 1]: Error writing to highwatermark file:  (kafka.server.ReplicaManager)
java.io.FileNotFoundException: /data/replication-offset-checkpoint.tmp (No space left on device)

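and then shuts down!
4- I set up a new single-node Kafka (version 0.11.3) on another machine, and it works well with the same producer and the same ZooKeeper cluster.
5- I shut down one of the two Kafka brokers, and with just one broker (of the cluster) it behaves the same as the two-node Kafka cluster did.
What is the problem?
UPDATE 1: I tried Kafka version 2.1.0, but got the same result!
UPDATE 2: I found the root of the problem. While producing, I create 25 topics, each with 24 partitions. Surprisingly, each topic takes up 481 MB of space right after creation (using the kafka-topics.sh command, with no data stored)! For example, in the log directory of topic "20", each partition directory holds the following files, 21 MB in total: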

00000000000000000000.index (10MB)  00000000000000000000.log(0MB)  00000000000000000000.timeindex(10MB)  leader-epoch-checkpoint(4KB)

Kafka writes the following lines to the server.log file for each topic-partition:

[2019-02-05 10:10:54,957] INFO [Log partition=topic20-14, dir=/data] Loading producer state till offset 0 with message format version 2 (kafka.log.Log)
[2019-02-05 10:10:54,957] INFO [Log partition=topic20-14, dir=/data] Completed load of log with 1 segments, log start offset 0 and log end offset 0 in 1 ms (kafka.log.Log)
[2019-02-05 10:10:54,958] INFO Created log for partition topic20-14 in /data with properties {compression.type -> producer, message.format.version -> 2.1-IV2, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, message.downconversion.enable -> true, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> false, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2019-02-05 10:10:54,958] INFO [Partition topic20-14 broker=0] No checkpointed highwatermark is found for partition topic20-14 (kafka.cluster.Partition)
[2019-02-05 10:10:54,958] INFO Replica loaded for partition topic20-14 with initial high watermark 0 (kafka.cluster.Replica)
[2019-02-05 10:10:54,958] INFO [Partition topic20-14 broker=0] topic20-14 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)

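There are no errors in the server log. I can even consume data if I produce to the topic. The log directory has 10 GB in total, and in my scenario Kafka needs 12025 MB for the 25 topics, which is more than the directory can hold, so Kafka fails and shuts down!
Just as a test, I set up another Kafka broker (Broker2) against the same ZooKeeper cluster and created a new topic with 24 partitions there; all the empty partitions together take up only 100 KB!
So I am really confused! Broker1 and Broker2 run the same version of Kafka (0.11.3); only the OS and the file system differ:
For Broker1 (a new topic occupies 481 MB):
OS: CentOS 7, file system: XFS
For Broker2 (a new topic occupies 100 KB):
OS: Ubuntu 16.04, file system: ext4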
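
One way to compare the two brokers is to look at both the apparent size of each file and the space actually allocated for it on disk, since a preallocated file can report 10 MB while occupying far less. The following is only a rough diagnostic sketch, not something from the original post: `size_report` and `partition_report` are hypothetical helper names, and the `st_blocks` field it relies on is available on Unix-like systems only.

```python
import os


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a single file."""
    st = os.stat(path)
    # st_size is the length the file reports; st_blocks counts the
    # 512-byte blocks actually allocated for it (Unix-like systems only).
    return st.st_size, st.st_blocks * 512


def partition_report(partition_dir):
    """Map each regular file in a partition directory to its two sizes."""
    report = {}
    for name in sorted(os.listdir(partition_dir)):
        full_path = os.path.join(partition_dir, name)
        if os.path.isfile(full_path):
            report[name] = size_report(full_path)
    return report


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python sizes.py /data/topic20-14
    for name, (apparent, allocated) in partition_report(sys.argv[1]).items():
        print(f"{name}: apparent={apparent} B, allocated={allocated} B")
```

Running it against the same partition directory on both machines would show whether the 10 MB index files are really backed by allocated blocks on one file system and not on the other.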

apache-kafka

Source: https://stackoverflow.com/questions/54511789/kafka-generate-huge-extra-data-when-creating-new-topics

1 Answer

pftdvrlh1#

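Why does Kafka preallocate 21 MB for each partition?
This is normal behavior. The preallocated size of the indices is controlled by the property segment.index.bytes (the topic-level name; the corresponding broker setting is log.index.size.max.bytes), whose default value is 10485760 bytes, or 10 MB. That is why each index file in a partition directory is allocated 10 MB: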

00000000000000000000.index (10MB)  
00000000000000000000.log(0MB)  
00000000000000000000.timeindex(10MB)  
leader-epoch-checkpoint(4KB)
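
The totals reported in the question can be reproduced approximately from this default. The sketch below is only back-of-the-envelope arithmetic: `preallocated_bytes` is a hypothetical helper, and it assumes one active segment per partition, two fully allocated index files per segment, and no file system overhead (it also ignores the small leader-epoch-checkpoint file).

```python
SEGMENT_INDEX_BYTES = 10485760   # default quoted above (10 MB)
INDEX_FILES_PER_SEGMENT = 2      # the .index and .timeindex files
MIB = 1024 * 1024


def preallocated_bytes(topics, partitions_per_topic,
                       index_bytes=SEGMENT_INDEX_BYTES):
    """Bytes preallocated for empty topics, under the assumptions above."""
    return topics * partitions_per_topic * INDEX_FILES_PER_SEGMENT * index_bytes


per_topic = preallocated_bytes(topics=1, partitions_per_topic=24)
all_topics = preallocated_bytes(topics=25, partitions_per_topic=24)

print(per_topic // MIB)    # 480
print(all_topics // MIB)   # 12000
```

The estimate of 480 MB per topic and about 12000 MB for 25 topics is close to the 481 MB and 12025 MB observed in the question, and already exceeds the 10 GB log directory.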

On the other hand, the Kafka documentation says this about the property:

We preallocate this index file and shrink it only after log rolls.

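But in my case the indices were never shrunk. After a lot of searching, I found that some builds of Java 8 (update 192 in my case) have a bug when working with many small files, and that it was fixed in update 202. So I upgraded Java to update 202, and that solved the problem.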
