Apache Kafka: How to delete data from a Kafka topic? You probably didn't know these 2 cool methods!
Photo by John Schnobrich on Unsplash

When working with Apache Kafka, we may need to delete data from a topic. For example, junk data may have been sent during testing before we implemented handling for such errors, resulting in a so-called "poison pill": one or more records that cause our processing to fail every time we try to consume them from Kafka.

Kafka Topic And Partitions

The following diagram shows how data is stored in a Kafka topic.

A Kafka topic consists of one or more partitions. The following example shows a topic with 4 partitions. Data is stored on each partition (each message is shown as a single rectangle). Each message also has an offset, a sequential ID that determines its position within the partition. When a consumer reads from a partition, it uses this offset to read the messages in order.
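To see partitions on a real cluster, a minimal sketch could create a small topic and describe it. The topic name partition-demo is hypothetical, and the ZooKeeper address kafka:2181 is assumed to match the commands used later in this post. In the Apache tarball the script is called kafka-topics.sh, hence the overridable variable. Nothing runs until you call the function.

```shell
# Assumed values: adjust to your environment.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics}"   # kafka-topics.sh in the Apache tarball
ZOOKEEPER="${ZOOKEEPER:-kafka:2181}"
DEMO_TOPIC="${DEMO_TOPIC:-partition-demo}"     # hypothetical topic name

# Create a topic with 4 partitions, then print one line per partition
# (leader, replicas, in-sync replicas).
create_demo_topic() {
  "$KAFKA_TOPICS" --zookeeper "$ZOOKEEPER" --create --topic "$DEMO_TOPIC" \
    --partitions 4 --replication-factor 1
  "$KAFKA_TOPICS" --zookeeper "$ZOOKEEPER" --describe --topic "$DEMO_TOPIC"
}
```

After sourcing the snippet, run create_demo_topic; the describe output should list partitions 0 to 3.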


Apache Kafka How To Delete Data From Kafka Topic?

#1 Method: Delete And Recreate The Topic

We can simply delete the topic and create it again. Personally, I think it is better to use the second method, i.e. changing the retention.
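The delete-and-recreate approach could look roughly like the sketch below. It assumes the same ZooKeeper address and topic name as the commands later in this post, and it recreates the topic with 1 partition and replication factor 1 to match the describe output shown there. Topic deletion also requires delete.topic.enable=true on the brokers. The function is only defined here, not executed.

```shell
# Assumed values: adjust to your environment.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics}"   # kafka-topics.sh in the Apache tarball
ZOOKEEPER="${ZOOKEEPER:-kafka:2181}"
TOPIC="${TOPIC:-bigdata-etl-file-source}"

# Delete the topic and, if the delete command succeeds, create it again
# with the same partition count and replication factor.
recreate_topic() {
  "$KAFKA_TOPICS" --zookeeper "$ZOOKEEPER" --delete --topic "$TOPIC" &&
  "$KAFKA_TOPICS" --zookeeper "$ZOOKEEPER" --create --topic "$TOPIC" \
    --partitions 1 --replication-factor 1
}
```

Keep in mind that deletion is asynchronous: the topic is first only marked for deletion, so the create step may fail until the brokers have finished removing it and may need to be retried.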

#2 Method: Retention Change

The second way is to temporarily lower the data retention on the topic to a small value, e.g. 1 second. Kafka's internal cleanup process will then delete the data automatically, with no manual steps on our side.

First, let's check the current configuration of the topic: retention.ms=86400000 (1 day).

kafka-topics --zookeeper kafka:2181 --topic bigdata-etl-file-source --describe

Topic:bigdata-etl-file-source	PartitionCount:1	ReplicationFactor:1	Configs:retention.ms=86400000
	Topic: bigdata-etl-file-source	Partition: 0	Leader: 0	Replicas: 0	Isr: 0

Set Retention To 1 Second

kafka-configs --zookeeper <zookeeper>:2181 --entity-type topics --alter --entity-name bigdata-etl-file-source --add-config retention.ms=1000

Check configuration:

kafka-configs --zookeeper kafka:2181 --entity-type topics --describe --entity-name bigdata-etl-file-source

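Remember to wait a while (about 1 minute) for the data to be deleted. The exact delay depends on how often the broker checks for expired log segments (log.retention.check.interval.ms, 5 minutes by default).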
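One way to check whether the data is gone is to compare the earliest and latest offsets of every partition: when they are equal, the topic holds no records. The sketch below uses the GetOffsetShell tool that ships with Kafka. The broker address kafka:9092 is an assumption, and the --broker-list flag matches older Kafka versions like the one used in this post (newer releases use --bootstrap-server instead).

```shell
# Assumed values: adjust to your environment.
KAFKA_RUN_CLASS="${KAFKA_RUN_CLASS:-kafka-run-class}"   # kafka-run-class.sh in the Apache tarball
BROKERS="${BROKERS:-kafka:9092}"                        # assumed broker address
TOPIC="${TOPIC:-bigdata-etl-file-source}"

# Print topic:partition:offset lines; $1 is -2 (earliest) or -1 (latest).
topic_offsets() {
  "$KAFKA_RUN_CLASS" kafka.tools.GetOffsetShell \
    --broker-list "$BROKERS" --topic "$TOPIC" --time "$1" | sort
}

# Succeeds (exit code 0) when earliest and latest offsets match on every partition.
topic_is_empty() {
  [ "$(topic_offsets -2)" = "$(topic_offsets -1)" ]
}
```

For example: topic_is_empty && echo "topic is empty".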

After we verify that the data has been removed from the topic, we can restore the previous settings.
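Restoring the settings could look like the sketch below, assuming the original value was retention.ms=86400000 as shown in the describe output above. The second function is an alternative that removes the topic-level override entirely, so the topic falls back to the broker default. Both functions are only defined here, not executed.

```shell
# Assumed values: adjust to your environment.
KAFKA_CONFIGS="${KAFKA_CONFIGS:-kafka-configs}"   # kafka-configs.sh in the Apache tarball
ZOOKEEPER="${ZOOKEEPER:-kafka:2181}"
TOPIC="${TOPIC:-bigdata-etl-file-source}"

# Set retention.ms back to the given value (default: the original 86400000 ms = 1 day).
restore_retention() {
  "$KAFKA_CONFIGS" --zookeeper "$ZOOKEEPER" --entity-type topics --alter \
    --entity-name "$TOPIC" --add-config "retention.ms=${1:-86400000}"
}

# Alternative: drop the topic-level override so the broker default applies again.
reset_retention_to_default() {
  "$KAFKA_CONFIGS" --zookeeper "$ZOOKEEPER" --entity-type topics --alter \
    --entity-name "$TOPIC" --delete-config retention.ms
}
```

Call restore_retention with no argument to go back to 1 day, or pass another value in milliseconds.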
