Apache Kafka Partitions Example

By Prasad Bonam | Last updated: 2023-08-05 09:38:47



In Apache Kafka, partitions are fundamental units for data organization and distribution. A Kafka topic is divided into multiple partitions, and each partition is an ordered, immutable sequence of records. Partitions enable horizontal scaling of data and provide data parallelism, allowing multiple consumers to read from a topic concurrently.
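To make the idea of an "ordered, immutable sequence of records" concrete, here is a minimal sketch in plain Python. It does not use Kafka at all; the `Partition` class and its `append` and `read` methods are hypothetical names chosen for illustration. It only shows that each partition is an append-only log and that positions (offsets) are counted separately in each partition.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy model of one partition: an append-only list of records."""
    records: list = field(default_factory=list)

    def append(self, value: str) -> int:
        """Add a record at the end of the log and return its offset."""
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset: int) -> list:
        """Return every record from the given offset onward."""
        return self.records[offset:]


# A hypothetical topic with three partitions.
topic = [Partition() for _ in range(3)]

# Offsets are per partition: each partition counts from 0 on its own.
first = topic[0].append("a")
second = topic[0].append("b")
other = topic[1].append("c")

print(first, second, other)  # 0 1 0
print(topic[0].read(1))      # ['b']
```

Because records are only ever appended, a reader that remembers its last offset can resume from exactly that position later.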

Let's go through an example to understand Kafka partitions:

  1. Setting Up Kafka: For this example, assume you have set up a Kafka cluster with a single broker: localhost:9092.

  2. Topic Creation: Create a Kafka topic named "my_topic" with three partitions and a replication factor of 1. This means there will be three partitions, and each partition will have a single replica.

    Using the command-line tool on Unix/Linux/Mac:

    bash
    bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my_topic --partitions 3 --replication-factor 1

    Using the command-line tool on Windows:

    batch
    bin\windows\kafka-topics.bat --bootstrap-server localhost:9092 --create --topic my_topic --partitions 3 --replication-factor 1

  3. Producing Messages: Start a producer to send messages to the "my_topic" topic.

    Using the command-line tool on Unix/Linux/Mac:

    bash
    bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my_topic

    Using the command-line tool on Windows:

    batch
    bin\windows\kafka-console-producer.bat --bootstrap-server localhost:9092 --topic my_topic

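    Now you can type messages in the console, one per line, and Kafka will publish each of them to one of the three partitions of the "my_topic" topic. Messages entered this way have no key. For records without a key, older clients assign partitions in round-robin order, while clients from Kafka 2.4 onward use a sticky strategy by default: they fill a batch for one partition before moving on to another, so the load evens out over many messages rather than message by message. Records that carry a key are assigned by hashing the key, so all records with the same key go to the same partition.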
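The choice of partition can be sketched in a few lines of Python. This is an illustration only, not the Kafka client's partitioner: `ToyPartitioner` is a hypothetical class, `zlib.crc32` stands in for the hash that a real producer applies to the key, and records without a key are spread in simple round-robin order, which is a simplification of what a real producer does.

```python
import zlib


class ToyPartitioner:
    """Illustrative partition chooser; not the Kafka client's algorithm."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self._counter = 0  # used only for records without a key

    def partition(self, key=None) -> int:
        if key is None:
            # No key: rotate through the partitions in order.
            chosen = self._counter % self.num_partitions
            self._counter += 1
            return chosen
        # Key present: hash it so the same key always maps to the same partition.
        # crc32 is an assumption made for this sketch.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)

# Four keyless records cycle through the three partitions.
print([partitioner.partition() for _ in range(4)])  # [0, 1, 2, 0]

# A keyed record is routed by its key alone, however often it is sent.
print(partitioner.partition("user-42") == partitioner.partition("user-42"))  # True
```

Routing by key is what lets an application keep all events for one entity (for example, one user) in order, since ordering is maintained within a partition.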

  4. Consuming Messages: Start a consumer to read messages from the "my_topic" topic.

    Using the command-line tool on Unix/Linux/Mac:

    bash
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic

    Using the command-line tool on Windows:

    batch
    bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic my_topic

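    Because this is the only consumer in its consumer group, it is assigned all three partitions of "my_topic" and reads messages from each of them. Ordering is guaranteed only within a partition, so messages from different partitions may appear interleaved. By default the console consumer shows only messages produced after it starts; add --from-beginning to read earlier ones. If another consumer joins the same consumer group (for the console consumer, by starting it with the same --group value), Kafka rebalances the partitions among the consumers in the group.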
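The effect of consumers joining a group can be sketched as a simple assignment function. This is a simplified round-robin illustration, not one of Kafka's actual assignors, and the function name `assign` and the consumer names `c1` to `c4` are hypothetical.

```python
def assign(partitions, consumers):
    """Deal partitions out to the group's consumers in round-robin order."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# One consumer in the group: it owns every partition.
print(assign(partitions, ["c1"]))
# {'c1': [0, 1, 2]}

# A second consumer joins: the partitions are split between the two.
print(assign(partitions, ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}

# More consumers than partitions: the extra consumer gets nothing.
print(assign(partitions, ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

Each partition always has exactly one owner within the group, which is why adding consumers beyond the number of partitions does not increase parallelism.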

  5. Partition Assignment: With only one broker in the cluster (localhost:9092), Kafka places all three partitions of "my_topic" on that broker:

    • Partition 0: Broker localhost:9092
    • Partition 1: Broker localhost:9092
    • Partition 2: Broker localhost:9092

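    As the cluster has only one broker, all the partitions reside on that broker. You can check the placement with the --describe option of kafka-topics.sh. In a real-world scenario with multiple brokers, partitions would be distributed across different brokers for better scalability, and a replication factor greater than 1 would keep copies of each partition on more than one broker for fault tolerance.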
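How partitions might spread over brokers can be sketched as follows. This is a simplified placement rule for illustration only; Kafka's real assignment logic differs in its details. The function `place_partitions` and the broker names `broker-1:9092` to `broker-3:9092` are hypothetical.

```python
def place_partitions(num_partitions, brokers, replication_factor=1):
    """Map each partition to the brokers holding its replicas (simplified)."""
    if replication_factor > len(brokers):
        # A replica set cannot contain the same broker twice.
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for p in range(num_partitions):
        # Start at broker p and take the next brokers in order, wrapping around.
        placement[p] = [
            brokers[(p + r) % len(brokers)] for r in range(replication_factor)
        ]
    return placement


# The single-broker setup from this example: everything lands on one broker.
print(place_partitions(3, ["localhost:9092"]))
# {0: ['localhost:9092'], 1: ['localhost:9092'], 2: ['localhost:9092']}

# A hypothetical three-broker cluster with two replicas per partition.
cluster = ["broker-1:9092", "broker-2:9092", "broker-3:9092"]
for partition, replicas in place_partitions(3, cluster, replication_factor=2).items():
    print(partition, replicas)
# 0 ['broker-1:9092', 'broker-2:9092']
# 1 ['broker-2:9092', 'broker-3:9092']
# 2 ['broker-3:9092', 'broker-1:9092']
```

With three brokers, each one holds a share of the partitions, so reads and writes are spread across the cluster instead of landing on a single machine.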

Kafka partitions provide data parallelism, enabling multiple producers and consumers to work concurrently and process data in a distributed manner. Within a consumer group, each partition is consumed by only one consumer, which allows for load balancing and parallel processing of data; it also means that consumers beyond the number of partitions sit idle. The number of partitions for a topic should be chosen carefully based on the application's scalability and throughput requirements: the count can be increased later but not decreased, and increasing it changes which partition a given key maps to.
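As a rough starting point for sizing, a commonly cited rule of thumb is to divide the target throughput by the throughput a single partition can sustain on the producer side and on the consumer side, and take the larger result. The sketch below assumes you have measured those per-partition figures yourself; the function name `estimate_partitions` and the numbers used are hypothetical.

```python
import math


def estimate_partitions(target_mb_s, producer_mb_s_per_partition,
                        consumer_mb_s_per_partition):
    """Rule-of-thumb partition count: enough for both producers and consumers."""
    needed_for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    needed_for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(needed_for_producers, needed_for_consumers)


# Hypothetical figures: 100 MB/s target, 20 MB/s per partition when producing,
# 10 MB/s per partition when consuming.
print(estimate_partitions(100, 20, 10))  # 10
```

Treat the result as an estimate to validate with your own load tests, not as a fixed answer.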
