Apache Kafka - The Event Streaming Platform
What is an event streaming platform and when do we need it?
A stream of events is an ordered sequence of events occurring in your application. An event can be as simple as clicking a link, creating a new account, or updating it, or it can be a little more complex, such as adding a product to your cart or generating an invoice for it. An event streaming platform takes a continuous stream of events and processes each one as soon as it occurs.
When do we need an event streaming platform for an application?
We need it when we want the system to react in real time to the events occurring. Example: A fraud detection system that monitors credit card transactions and flags anomalies in real time.
We need it in event-driven micro-services - scalable and loosely coupled architectures where services communicate asynchronously by publishing and consuming events instead of making direct API calls. For example: in an e-commerce system, when a customer places an order, an "order placed" event triggers inventory updates, payment processing, and shipping notifications independently.
We need it when storing and replaying historical events is a necessity. Unlike traditional message queues, which typically delete a message once it has been consumed, an event streaming platform stores events for a configurable period of time.
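To make the "store and replay" point concrete, here is a minimal in-memory sketch. It is not Kafka code: `RetainedLog`, its methods, and the 60-second retention window are all made-up names and values for illustration. It contrasts a plain queue, where consuming removes the message, with an append-only log, where reads leave the data in place and only the retention window removes it.

```python
from collections import deque


class RetainedLog:
    """Toy append-only log: reading never removes events; only retention does."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.events = []  # list of (offset, timestamp, payload)
        self.next_offset = 0

    def append(self, payload, timestamp):
        self.events.append((self.next_offset, timestamp, payload))
        self.next_offset += 1

    def read_from(self, offset):
        """Return payloads at or after `offset`; the log itself is unchanged."""
        return [payload for off, _, payload in self.events if off >= offset]

    def enforce_retention(self, now):
        """Drop events older than the retention window."""
        cutoff = now - self.retention_seconds
        self.events = [e for e in self.events if e[1] >= cutoff]


# A classic queue: once a message is consumed, it is gone.
queue = deque(["login", "add_to_cart", "checkout"])
first_pass = [queue.popleft() for _ in range(len(queue))]
second_pass = list(queue)  # nothing left to replay

# A retained log: the same events can be read again from any offset.
log = RetainedLog(retention_seconds=60)
for t, name in enumerate(["login", "add_to_cart", "checkout"]):
    log.append(name, timestamp=t)

replay_all = log.read_from(0)
replay_again = log.read_from(0)  # identical: reads are non-destructive
replay_tail = log.read_from(2)

log.enforce_retention(now=61)    # cutoff is t=1, so the event at t=0 expires
after_retention = log.read_from(0)
```

The second read of the log returns the same events as the first, while the queue has nothing left after one pass. Timestamps are passed in by hand here only to keep the sketch deterministic.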
A little context before we dig deeper!
Before micro-services architecture, the software industry primarily used monolithic and service-oriented architectures (SOA).
Monolithic architecture is when all the application components (UI, business logic, database) are tightly coupled and deployed as one big unit. The interaction between different parts of the application happens internally using function calls or shared memory. This architecture works fine for small-scale applications that do not need high scalability.
SOA was a step towards micro-services architecture: applications are split into services, but they still communicate over an Enterprise Service Bus (ESB), a middleware that enables communication between different services. Traditional ESBs typically rely on synchronous SOAP (Simple Object Access Protocol)/XML-based messaging, a request-response style in which the caller waits for each reply, and hence they are slower than modern async event-driven models. They have therefore mostly been replaced by lighter-weight approaches: event-driven platforms such as Kafka and, for direct service-to-service calls, RPC frameworks such as gRPC.
ESB-Based Approach (SOAP/XML)
1. A customer places an order on an e-commerce site.
2. The frontend calls the Order Service via a SOAP request.
3. The Order Service makes a synchronous SOAP call to the Payment Service.
4. If the Payment Service is down, the request fails, and the order is lost.
5. The system waits for a response from each service before proceeding.
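The steps above can be sketched as a chain of blocking calls. This is a toy illustration only: the function names, the `ServiceDown` exception, and the returned dictionaries are invented for the example, and plain function calls stand in for SOAP requests.

```python
class ServiceDown(Exception):
    """Raised when a downstream service cannot be reached."""


def payment_service(order, available=True):
    if not available:
        raise ServiceDown("payment service unavailable")
    return {"order_id": order["id"], "status": "paid"}


def order_service(order, payment_available=True):
    """Blocks on the payment call; a failure there fails the whole request."""
    receipt = payment_service(order, available=payment_available)
    return {"order_id": order["id"], "state": "confirmed", "payment": receipt["status"]}


def place_order(order, payment_available=True):
    try:
        return order_service(order, payment_available)
    except ServiceDown as err:
        # Nothing stored the order along the way, so it cannot be retried later.
        return {"order_id": order["id"], "state": "lost", "reason": str(err)}


ok = place_order({"id": 1})
failed = place_order({"id": 2}, payment_available=False)
```

Because every call waits on the next one, a single unavailable service is enough to turn the second order into a failure with nothing left behind to retry.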
Kafka-Based Approach (Event-Driven)
1. A customer places an order → Order Created Event is published to Kafka.
2. The Payment Service subscribes to the event and processes payment asynchronously.
3. The Inventory Service listens for the event and updates stock independently.
4. If the Payment Service is down, Kafka retains the event. Once the service is back online, it picks up the event and processes the payment, and the order is then handled according to the payment status.
5. No service blocks another, improving resilience and scalability.
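Here is a rough in-memory sketch of the same flow, assuming a topic is just an append-only list and each subscriber remembers its own read position. `Topic`, `Subscriber`, and the `online` flag are hypothetical names for this example, not Kafka classes.

```python
class Topic:
    """Toy stand-in for a Kafka topic: an append-only list of events."""

    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)


class Subscriber:
    """Tracks its own position, so an offline subscriber never blocks the others."""

    def __init__(self, name, topic):
        self.name = name
        self.topic = topic
        self.offset = 0
        self.handled = []
        self.online = True

    def poll(self):
        """Process every event not seen yet; return how many were handled."""
        if not self.online:
            return 0
        new_events = self.topic.events[self.offset:]
        for event in new_events:
            self.handled.append((self.name, event["order_id"]))
        self.offset += len(new_events)
        return len(new_events)


orders = Topic()
payment = Subscriber("payment", orders)
inventory = Subscriber("inventory", orders)

payment.online = False                      # Payment Service goes down
orders.publish({"type": "OrderCreated", "order_id": 1})
orders.publish({"type": "OrderCreated", "order_id": 2})

inventory_count = inventory.poll()          # Inventory proceeds on its own
payment_count_while_down = payment.poll()   # nothing processed, nothing lost

payment.online = True                       # Payment Service recovers
payment_count_after_recovery = payment.poll()
```

The events stay in the topic while the payment subscriber is offline, so when it comes back it simply continues from where it stopped.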
Event Streaming Platform - Kafka Terminologies
There has to be a stream of events. That stream lives in topics, which are persisted on Kafka brokers. Huff, let’s break it down :
Kafka Topics (Logical Storage) : Events are stored in topics. A topic is like a folder where a stream of related events is kept. For example, all events for received orders could go into a topic named orders_received_topic.
Kafka Partitions : Each topic is split into partitions. Events within a partition are ordered, and each has an offset: a sequential number that is unique within that partition.
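A small sketch of a topic split into partitions, where each appended event gets the next offset in its partition. It assumes events are routed by hashing a key, and it uses CRC32 purely as a deterministic stand-in (Kafka's default partitioner uses a different hash, murmur2). The class and the customer keys are made up for the example.

```python
import zlib


class PartitionedTopic:
    """Toy topic: a list of partitions, each an ordered list of events."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append to the key's partition and return (partition, offset)."""
        p = self.partition_for(key)
        offset = len(self.partitions[p])  # offsets count up per partition
        self.partitions[p].append((offset, key, value))
        return p, offset


topic = PartitionedTopic("orders_received_topic", num_partitions=3)
placements = [
    topic.append("customer-1", "order placed"),
    topic.append("customer-2", "order placed"),
    topic.append("customer-1", "order paid"),
    topic.append("customer-1", "order shipped"),
]
```

All three `customer-1` events end up in the same partition with increasing offsets, which is what keeps them in order relative to each other.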
Why do we even need partitions, duh?!
If a topic had only one partition, only one consumer in a consumer group (a set of consumers that share the work of reading a topic) could read from it at a time. By dividing a topic into multiple partitions, Kafka allows multiple consumers to process related events in parallel. Example: A topic with one partition → a single consumer processes 1000 events/sec. A topic with 3 partitions → three consumers process up to 3000 events/sec in total.
Kafka assigns partitions to consumers. Each consumer gets one or more partitions in a balanced way. For example, if a topic has 6 partitions and there are 3 consumers, Kafka assigns 2 partitions per consumer. If a new consumer joins, partitions are rebalanced to distribute the workload evenly. Events are ordered only within a partition and not across the entire topic. As more events are produced to a topic, each one is assigned to a partition based on its partition key, so events with the same key land in the same partition.
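The 6-partitions, 3-consumers example can be sketched with a simple round-robin assignment. This is a simplification: Kafka ships several assignment strategies, and `assign_partitions` and the consumer names are hypothetical, not part of any Kafka API.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partition ids over consumers round-robin.

    Returns a dict mapping each consumer to its list of partition ids.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


three = assign_partitions(6, ["c1", "c2", "c3"])
# A fourth consumer joins: the assignment is recomputed (a "rebalance").
four = assign_partitions(6, ["c1", "c2", "c3", "c4"])
# More consumers than partitions: the extra consumer gets nothing to read.
crowded = assign_partitions(2, ["c1", "c2", "c3"])
```

With three consumers each one owns two partitions. After the fourth joins, the split becomes 2, 2, 1, 1. The last case shows why adding consumers beyond the partition count does not add throughput.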
Kafka Brokers (Physical Storage) : Brokers are Kafka servers where event data is stored on disk. Each broker manages topics (and their partitions), and partitions are replicated across multiple brokers for fault tolerance, preventing data loss if a broker fails.
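To see why replication protects against a broker failure, here is a toy placement sketch. The placement rule (shift by one broker per partition), the broker names, and the replication factor of 2 are assumptions for illustration. Kafka's real replica placement is more involved.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Return a dict mapping each partition id to the brokers holding a copy."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for partition in range(num_partitions):
        placement[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return placement


def surviving_copies(placement, failed_broker):
    """Copies still available for each partition after one broker fails."""
    return {
        partition: [b for b in replicas if b != failed_broker]
        for partition, replicas in placement.items()
    }


placement = place_replicas(3, ["broker-1", "broker-2", "broker-3"], replication_factor=2)
after_failure = surviving_copies(placement, "broker-2")
```

Even with `broker-2` gone, every partition still has at least one copy on another broker, so no events are lost in this setup.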
Kafka Cluster is a group of Kafka brokers working together to store and process event streams (in a distributed, fault-tolerant, and scalable manner).
Kafka Client APIs
1. Producer API : Used by a service to send events to Kafka topics. Sends messages asynchronously. Messages with a key go to a partition chosen from that key; messages without a key are spread across partitions (round-robin in older clients, sticky batching in newer ones).
2. Consumer API : Used by a service to consume events from Kafka topics. Multiple consumers can share a topic's partitions for load balancing. Consumers use offsets to remember where they left off in the event stream.
3. Streams API : Used to process events from Kafka topics in real time and generate new events. The application receives an input stream from one or more topics, processes it, and sends the newly generated events to one or more output topics.
4. Admin API : Used to create, delete, and manage Kafka topics. Used by DevOps & Admins to manage Kafka clusters.
5. Connector API : The Kafka Connector API (Kafka Connect) is used to integrate external systems (like databases, cloud storage) with Kafka without writing custom producer/consumer code. Example: Reading from a database and putting data into a Kafka topic.
Types of Connectors : Source Connectors – Pull data from external systems (for example, a DB or cloud storage) into Kafka. Sink Connectors – Push data from Kafka to external systems.
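To tie the Streams and Connector ideas together, here is a toy end-to-end pipeline: a source pulls rows from a database into an input topic, a processing step generates new events into an output topic, and a sink pushes those events back out. Everything here is a stand-in: topics are plain lists, the "external system" is an in-memory SQLite database, and the function names, table names, and "amount above 1000" rule are made up. Real Kafka Connect connectors are configured rather than hand-written like this, and Kafka Streams itself is a Java library.

```python
import sqlite3


def source_connector(db, topic):
    """Source: pull rows from an external system (a SQLite table) into a topic."""
    for order_id, amount in db.execute("SELECT id, amount FROM orders ORDER BY id"):
        topic.append({"order_id": order_id, "amount": amount})


def stream_processor(input_topic, output_topic):
    """Streams-style step: read input events, emit newly generated events."""
    for event in input_topic:
        if event["amount"] > 1000:  # hypothetical rule for this sketch
            output_topic.append({"type": "LargeOrderFlagged", "order_id": event["order_id"]})


def sink_connector(topic, db):
    """Sink: push events from a topic out to an external system (another table)."""
    db.executemany(
        "INSERT INTO flagged_orders (order_id) VALUES (?)",
        [(event["order_id"],) for event in topic],
    )


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
db.execute("CREATE TABLE flagged_orders (order_id INTEGER PRIMARY KEY)")
db.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    [(1, 250), (2, 4800), (3, 1200)],
)

orders_topic, flagged_topic = [], []            # toy topics: plain lists
source_connector(db, orders_topic)              # database -> input topic
stream_processor(orders_topic, flagged_topic)   # input topic -> output topic
sink_connector(flagged_topic, db)               # output topic -> database
flagged = db.execute("SELECT order_id FROM flagged_orders ORDER BY order_id").fetchall()
```

The services on either side of the topics never call each other directly. Each step only reads from or writes to a topic, which is the loose coupling the earlier sections describe.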
That's it for now, but there’s definitely a lot more to explore. Here is the course I referred to, on Udemy ~




