When working with Apache Kafka, we sometimes need to delete data from a topic. For example, junk data may have been sent during testing before we implemented handling for such errors, resulting in a so-called “poison pill”: a record (or records) that makes our processing fail every time we try to consume it from Kafka.
Kafka Topic And Partitions
The following diagram presents how data is stored in a Kafka topic.
A Kafka topic consists of one or more partitions. The example shows a topic with 4 partitions. Each partition stores data (each message is shown as a single rectangle). Each message also has an offset: a sequential ID that determines its position within the partition. When a consumer reads from a partition, it uses this offset to read the messages in order.
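To make the role of partitions and offsets more concrete, here is a minimal sketch that reads a few records from one partition, starting at a chosen offset, with the kafka-console-consumer tool. The function name read_from_offset and the broker address kafka:9092 are assumptions made for illustration; they are not part of the setup described in this post.

```shell
# Broker address is an assumption; adjust it to your cluster.
BOOTSTRAP="${BOOTSTRAP:-kafka:9092}"

# Hypothetical helper: read up to <count> records from <partition> of <topic>,
# starting at <offset>. The --offset option only works together with --partition.
read_from_offset() {
  topic="$1"; partition="$2"; offset="$3"; count="${4:-5}"
  kafka-console-consumer --bootstrap-server "$BOOTSTRAP" \
    --topic "$topic" --partition "$partition" \
    --offset "$offset" --max-messages "$count"
}
```

For example, read_from_offset bigdata-etl-file-source 0 0 3 would read up to three records from partition 0, starting at offset 0 (if that offset still exists).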
#1 Method (Not Recommended)
We can simply delete the topic and create it again. Personally, I think it is better to use the second method, i.e. changing the retention.
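For completeness, a rough sketch of the delete-and-recreate approach could look like this. The function name recreate_topic, the ZooKeeper address default and the 30-second wait limit are assumptions; the partition count and replication factor must be supplied by you, since a recreated topic does not inherit them. Topic deletion is asynchronous (and must be enabled on the broker via delete.topic.enable), so the sketch polls the topic list before creating the topic again.

```shell
# ZooKeeper address is an assumption; adjust it to your cluster.
ZK="${ZK:-kafka:2181}"

# Hypothetical helper: delete <topic>, wait until it is gone, then create it again.
recreate_topic() {
  topic="$1"; partitions="$2"; replication="$3"
  kafka-topics --zookeeper "$ZK" --delete --topic "$topic" || return 1

  # Deletion is asynchronous: poll (for up to ~30 s) until the topic is no longer listed.
  tries=0
  while kafka-topics --zookeeper "$ZK" --list |
        awk -v t="$topic" '$1 == t { found = 1 } END { exit !found }'; do
    tries=$((tries + 1))
    [ "$tries" -ge 30 ] && { echo "topic still present, giving up" >&2; return 1; }
    sleep 1
  done

  kafka-topics --zookeeper "$ZK" --create --topic "$topic" \
    --partitions "$partitions" --replication-factor "$replication"
}
```

A call such as recreate_topic bigdata-etl-file-source 1 1 would recreate the topic with one partition and one replica. Any topic-level configuration overrides are lost with the old topic, which is one more reason to prefer the retention change.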
#2 Method: Retention Change
The second way is to change the data retention on the topic to some low value, e.g. 1 second. The data will be automatically deleted by Kafka’s internal processes. We don’t have to worry about anything.
First, let’s check the current configuration of the topic. In this example retention.ms=86400000, which is 24 hours (1 day):
kafka-topics --zookeeper kafka:2181 --topic bigdata-etl-file-source --describe
Topic:bigdata-etl-file-source PartitionCount:1 ReplicationFactor:1 Configs:retention.ms=86400000
    Topic: bigdata-etl-file-source Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Set Retention To 1 Second
kafka-configs --zookeeper <zookeeper>:2181 --entity-type topics --alter --entity-name bigdata-etl-file-source --add-config retention.ms=1000
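The commands in this post use the --zookeeper option, which newer Kafka releases have deprecated and eventually removed in favour of talking to the brokers directly. On such a version, the same retention change might be sketched as follows; the broker address kafka:9092 and the function name set_topic_retention are assumptions, not part of the original setup.

```shell
# Broker address is an assumption; adjust it to your cluster.
BOOTSTRAP="${BOOTSTRAP:-kafka:9092}"

# Hypothetical helper: set retention.ms (in milliseconds) on a topic via the brokers.
set_topic_retention() {
  topic="$1"; retention_ms="$2"
  kafka-configs --bootstrap-server "$BOOTSTRAP" \
    --entity-type topics --entity-name "$topic" \
    --alter --add-config "retention.ms=$retention_ms"
}
```

Calling set_topic_retention bigdata-etl-file-source 1000 would then set the retention to 1 second.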
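Remember to wait a while (about 1 minute, possibly longer) for the data to be deleted. Deletion is not instantaneous, because the broker only checks for expired log segments periodically (log.retention.check.interval.ms, 5 minutes by default).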
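One way to check whether the data is gone is to compare the earliest and latest offset of every partition: when they are equal, the partition holds no records. The sketch below does this with the GetOffsetShell tool, whose output lines have the form topic:partition:offset. The function name remaining_per_partition and the broker address kafka:9092 are assumptions; newer Kafka versions use --bootstrap-server instead of --broker-list for this tool, so adjust the option to your version.

```shell
# Broker address is an assumption; adjust it to your cluster.
BROKERS="${BROKERS:-kafka:9092}"

# Hypothetical helper: print how many records are still stored in each partition.
remaining_per_partition() {
  topic="$1"
  # --time -2 asks for the earliest offsets, --time -1 for the latest ones.
  earliest=$(kafka-run-class kafka.tools.GetOffsetShell \
    --broker-list "$BROKERS" --topic "$topic" --time -2)
  latest=$(kafka-run-class kafka.tools.GetOffsetShell \
    --broker-list "$BROKERS" --topic "$topic" --time -1)

  # The first line seen for a partition is its earliest offset, the second its latest.
  printf '%s\n%s\n' "$earliest" "$latest" | awk -F: '
    !($2 in first) { first[$2] = $3; next }
    { printf "partition %s: %d record(s) remaining\n", $2, $3 - first[$2] }
  '
}
```

Running remaining_per_partition bigdata-etl-file-source should report 0 records for every partition once the cleanup has finished.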
After we verify that the data has been removed from the topic, we can restore the previous settings.
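Restoring the previous retention could be sketched like this, reusing the same kafka-configs options as above and then printing the topic configuration to confirm the change. The function name restore_retention and the ZooKeeper address default are assumptions for illustration.

```shell
# ZooKeeper address is an assumption; adjust it to your cluster.
ZK="${ZK:-kafka:2181}"

# Hypothetical helper: put retention.ms back to its previous value and show the result.
restore_retention() {
  topic="$1"; previous_ms="$2"
  kafka-configs --zookeeper "$ZK" --entity-type topics --entity-name "$topic" \
    --alter --add-config "retention.ms=$previous_ms" &&
  kafka-configs --zookeeper "$ZK" --entity-type topics --entity-name "$topic" \
    --describe
}
```

In our example that would be restore_retention bigdata-etl-file-source 86400000. If the topic had no explicit retention.ms override before, replacing --add-config retention.ms=... with --delete-config retention.ms should remove the override so that the broker default applies again.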