Don’t Block, Just Queue: The Art of Asynchronous Traffic Control

Modern systems handle traffic at a staggering scale. Google processes roughly 190,000 searches per second, based on an estimated 16.4 billion searches per day. IoT devices generate anywhere from a few kilobytes to 1GB of data per second. A single Boeing 787 produces ~5GB of data per second while in flight.
But these numbers only tell half the story. This is just the data arriving at the "front door."
Once that data enters the system, the volume explodes. A single request often triggers multiple downstream actions—database writes, analytics logging, notifications, and third-party API calls. 5GB of external traffic can easily turn into 50GB of internal traffic.
How do we manage this load without the servers catching fire? Message Queues.
Message Queues do not magically reduce the size of the data. Instead, they manage how and when we process it. When that Boeing 787 dumps 5GB of data, a traditional system tries to process it all immediately. The database locks up, the CPU spikes to 100%, and the system crashes. This is a synchronous failure.

Message Queues decouple the components of a system, so each one can be scaled and updated independently.
They also smooth out traffic flow: during a spike, the system does not crash; it simply takes longer to drain the backlog, continuing to process at its steady rate.
Instead of a single service trying to do all the work, the work is distributed among components. The Ingestion Service does not wait for the Analytics Service to finish; it keeps processing its batch even if the Analytics Service fails.
In short, Message Queues increase both the availability and the performance of the system.
Event streaming platforms are the more advanced counterparts of Message Queues and are heavily used in modern systems. Amazon SQS, ZeroMQ, RabbitMQ, Apache RocketMQ, and Apache ActiveMQ are some prominent Message Queues, while Apache Kafka and Apache Pulsar are popular event streaming platforms.
The Anatomy: How It Works
To understand this further, we need to define the four organs of the system —
The Producer: The service creating the data (e.g., The User Interface or an IoT sensor).
The Message: The data packet itself (containing the Payload + Headers).
The Broker (The Post Office): The physical server that receives, routes, and stores the message.
The Consumer: The service that takes the message off the queue and processes it.
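These four organs can be sketched in a few lines of Python, using an in-process `queue.Queue` as a stand-in for the broker (names like `produce` and `sensor-42` are illustrative, not a real broker API):

```python
import queue

# The Broker: a simple in-process queue stands in for a real broker server.
broker = queue.Queue()

# The Producer: an illustrative IoT sensor emitting readings.
def produce(sensor_id, reading):
    # The Message: the payload plus headers travel together as one packet.
    message = {"headers": {"source": sensor_id}, "payload": reading}
    broker.put(message)

# The Consumer: takes the message off the queue and processes it.
def consume():
    message = broker.get()
    print(f"{message['headers']['source']} -> {message['payload']}")
    broker.task_done()

produce("sensor-42", {"temp_c": 21.5})
consume()
```

In a real deployment the broker runs on its own server, so the producer and consumer never talk to each other directly.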

Queues vs. Event Streams
While they look similar on the surface (data in, data out), under the hood they operate very differently. We can divide them into the Queue Model (Traditional) and the Stream Model (Modern).
- The Queue Model (RabbitMQ, SQS)
Think of this like a task list: once a task is done, it is crossed off the list. The focus is on delivery. The goal is to get the message to a consumer and delete it as fast as possible.
Producer: Sends the message.
The Exchange (Router): Specific to advanced queues like RabbitMQ. The producer sends data here first. Based on configured rules, the exchange decides which specific queue receives the message.
The Queue (Buffer): A transient storage area that holds messages until they are consumed.
The Consumer: The worker that processes the message.
ACK (The Delete Command): Once the consumer finishes, it sends an acknowledgment (ACK). The queue then deletes the message immediately.
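The ACK-then-delete lifecycle can be sketched with a toy in-memory broker (the class and method names below are illustrative, not RabbitMQ's actual API):

```python
from collections import deque

# A toy broker illustrating ACK semantics: a message is only deleted once the
# consumer acknowledges it; an unacknowledged message can be redelivered.
class ToyQueue:
    def __init__(self):
        self.ready = deque()
        self.in_flight = {}   # delivery_tag -> message awaiting ACK
        self.next_tag = 0

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        message = self.ready.popleft()
        self.next_tag += 1
        self.in_flight[self.next_tag] = message
        return self.next_tag, message

    def ack(self, tag):
        # The delete command: the broker forgets the message for good.
        del self.in_flight[tag]

    def nack(self, tag):
        # No ACK (e.g. the consumer crashed): requeue for redelivery.
        self.ready.append(self.in_flight.pop(tag))

q = ToyQueue()
q.publish("resize image 1001")
tag, msg = q.deliver()
q.nack(tag)               # consumer failed -> message goes back on the queue
tag, msg = q.deliver()    # redelivered to the next available consumer
q.ack(tag)                # processed -> deleted immediately
```

The `in_flight` map is why unacknowledged messages survive a consumer crash: the broker still holds them until an ACK arrives.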
- The Event Streaming Model (Kafka/Pulsar)
This model focuses on storage and history. The goal is to record what happened, in order, and let different workers consume the history at their own pace.
Producer: Emits "events" (facts that happened, e.g., "User Clicked").
The Topic: The category of events (e.g., 'Clicks', 'Invoices').
The Partition (Scalability Engine): A topic can be too massive for one server, so it is split into partitions. Messages are appended to the end of a partition log file. Note: Order is guaranteed only within a partition.
Offset (The Bookmark): Unlike queues, messages stay in the stream. The offset is just a number tracking where a consumer is. Different consumers can be at different offsets.
Consumer Group: If a topic has 4 partitions and you have a Consumer Group with 4 consumers, the system assigns one consumer to each partition for parallel processing.
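One simple partition-assignment strategy (round-robin, a sketch rather than Kafka's exact assignor) shows how a consumer group splits the work:

```python
# Sketch of consumer-group assignment: partitions are dealt out round-robin,
# so 4 partitions and 4 consumers means one partition per consumer.
def assign_partitions(partitions, consumers):
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

partitions = [0, 1, 2, 3]
consumers = ["worker-a", "worker-b", "worker-c", "worker-d"]
print(assign_partitions(partitions, consumers))
```

With only two consumers, each would own two partitions instead; the parallelism ceiling is always the partition count.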

Designing the Rails — Key Architecture Decisions
Once you are ready to scale, you cannot just "pick a queue." You need to match the tool to your traffic shape. Here are the seven critical decisions you must make.
Format & Size of Messages
Are you sending small JSON notifications or massive video logs? Most messaging tools (like Kafka and SQS) are optimized for small payloads (typically under 1MB). If you push large files directly into the queue, performance will degrade instantly. The Fix: Use the Claim Check Pattern. Upload the large file to object storage (like S3) first, then pass only the reference URL in the message payload.
Topology: Point-to-Point vs. Pub/Sub
This defines the relationship between the sender and the receiver.
Point-to-Point (Work Queues): This is about competition. Even if 50 consumers are listening, a specific message is handled by only one of them. Use this for heavy background tasks (e.g., resizing an image) where you want to distribute the workload without duplicating effort.
Publish/Subscribe (Pub/Sub): This is about broadcasting. The sender sends a message to a topic, and every active service subscribed to that topic gets its own copy. Use this for event notifications (e.g., a "User Signup" event triggers a Welcome Email, a Database Entry, and an Analytics Log simultaneously).
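The two topologies can be contrasted in a few lines (subscriber names are illustrative):

```python
from collections import deque

# Point-to-Point: consumers compete; each message is handled by exactly one.
work_queue = deque(["task-1", "task-2"])

def next_task(worker):
    task = work_queue.popleft()  # claimed by exactly one worker
    return f"{worker} handles {task}"

# Pub/Sub: every subscriber to the topic gets its own copy of the event.
subscribers = {"email": [], "analytics": [], "database": []}

def publish(event):
    for inbox in subscribers.values():
        inbox.append(event)  # fan-out: one copy per subscriber

publish("user-signup")
print(next_task("worker-a"))                   # handled once, by one worker
print([len(q) for q in subscribers.values()])  # every subscriber got a copy
```

The key difference is in `publish`: a work queue hands a message to one claimant, while pub/sub duplicates it per subscriber.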

Delivery Semantics (Reliability)
How catastrophic is it if a message is lost? You typically choose between three tiers:
At-Most-Once (Fire & Forget): High performance, but if the server crashes, the data is gone. Best for IoT sensor data where missing one temperature reading is acceptable.
At-Least-Once (The Standard): The message is guaranteed to arrive, but it might arrive twice. Your consumer must be idempotent (able to handle duplicates without errors).
Exactly-Once: The holy grail. It guarantees a single delivery but is expensive in terms of latency and complexity. Best reserved for financial transactions.
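For the at-least-once tier, a common idempotency sketch is to deduplicate on a message ID before applying side effects (the payment shape below is illustrative):

```python
# At-least-once delivery means duplicates happen, so the consumer must be
# idempotent. One sketch: remember processed message IDs and skip repeats.
processed_ids = set()
balance = {"account-1": 0}

def handle_payment(message):
    if message["id"] in processed_ids:
        return "duplicate: skipped"  # redelivery is harmless
    processed_ids.add(message["id"])
    balance[message["account"]] += message["amount"]
    return "applied"

payment = {"id": "tx-9001", "account": "account-1", "amount": 50}
handle_payment(payment)
handle_payment(payment)  # the broker redelivered the same message
print(balance)           # the amount is applied exactly once
```

In production the `processed_ids` set would live in a database or cache with a TTL, since an in-memory set is lost on restart.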
Order of Delivery
Does Message A have to be processed before Message B? Strict ordering is the enemy of scale because it forces you to use a single consumer. To get around this, we use Partitioned Ordering. You guarantee order only within a specific group (e.g., "All events for User ID 101 are ordered"), but different groups are processed in parallel.
Delivery Mechanism: Push vs. Pull
This defines who controls the flow of data.
Push (e.g., RabbitMQ): The Broker forces messages onto the consumer. This offers the lowest latency but risks overwhelming the consumer during traffic spikes.
Pull (e.g., Kafka, SQS): The Consumer requests messages when it is ready. This provides excellent flow control (the consumer is never overwhelmed) but introduces a slight latency delay due to polling.
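A pull-model consumer loop can be sketched as follows (the `poll` function is illustrative, not a real client API):

```python
from collections import deque

# Pull-model sketch: the consumer polls for at most `max_batch` messages when
# it has spare capacity, so it can never be overwhelmed by the broker.
broker_log = deque(f"event-{i}" for i in range(10))

def poll(max_batch):
    batch = []
    while broker_log and len(batch) < max_batch:
        batch.append(broker_log.popleft())
    return batch  # may be empty: that empty round-trip is the latency cost

while True:
    batch = poll(max_batch=4)
    if not batch:
        break  # nothing left; a real consumer would sleep and poll again
    print("processing", batch)
```

Note how flow control falls out naturally: the consumer only asks for what it can handle, whereas a push broker would have to be told to back off.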
Data Retention: Queue vs. Log
Do you want to delete the message or keep it?
Traditional Queues (RabbitMQ/SQS): These are ephemeral. The goal is to delete the message as soon as it is processed.
Event Streams (Kafka/Pulsar): These are durable. The message is stored like a log entry for a set time (e.g., 7 days) even after it is read. This allows you to "replay" history if a bug corrupts your database.
Throughput vs. Latency
You generally cannot maximize both.
Low Latency: You process messages immediately as they arrive. This limits your total throughput.
High Throughput: You move massive amounts of data (e.g., 10GB/sec) by batching messages together. This increases latency because the system must wait to fill the batch before sending it.
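The batching trade-off is easy to see in code (a sketch; real producers also flush on a timer, not just on batch size):

```python
# Batching sketch: trade latency for throughput by grouping messages and
# flushing only when the batch is full.
BATCH_SIZE = 3
buffer, flushed = [], []

def send(message):
    buffer.append(message)
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    # One network round-trip carries many messages -> higher throughput.
    flushed.append(list(buffer))
    buffer.clear()

for i in range(7):
    send(f"msg-{i}")
flush()  # drain the straggler that was waiting in the buffer (the latency!)
print(flushed)
```

The straggler in the final partial batch is the latency cost made visible: it sat in the buffer until something forced a flush.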
Message Queues are the unsung heroes of modern architecture. They allow our systems to breathe, to buffer, and to survive the inevitable chaos of the real world. Whether you are building a simple email notifier or a global streaming platform, the principle remains the same: Don't block. Just queue.




