Event streaming
Event streaming is our message-driven, publish-and-subscribe mechanism for asynchronous inter-component communication. Asynchronous, event-based messaging has long been known for its ability to create loosely coupled systems. Producers publish events without knowledge of the consumers, and the consumers are coupled only to the event type definitions, not to the producers of the events. The location of producers and consumers is transparent, which facilitates elasticity and resilience. Producers and consumers need only access the messaging system; there is no need for service discovery to determine the location of other components. The components operate in complete isolation. Messages can be produced even when the consumers are not available. When an unavailable consumer comes back online, it can scale up to process the backlog of messages. Producers can increase their responsiveness by delegating processing to other components, so that they can focus their resources on their single responsibility.
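To make that coupling concrete, here is a minimal TypeScript sketch of a producer and a consumer that share only an event type definition. The OrderSubmitted type, the Stream interface, and the InMemoryStream class are hypothetical stand-ins for the messaging system, not any particular product's API.

```typescript
// Hypothetical event type definition: the only contract shared by producers and consumers.
interface OrderSubmitted {
  type: 'order-submitted';
  id: string;          // unique event id
  timestamp: number;
  order: { orderId: string; total: number };
}

// A minimal in-memory stream standing in for the messaging system.
// Producers and consumers know this interface, not each other.
interface Stream {
  publish(event: OrderSubmitted): void;
  subscribe(handler: (event: OrderSubmitted) => void): void;
}

class InMemoryStream implements Stream {
  private handlers: Array<(event: OrderSubmitted) => void> = [];

  publish(event: OrderSubmitted): void {
    // Deliver asynchronously so the producer never waits on consumers.
    this.handlers.forEach((h) => setImmediate(() => h(event)));
  }

  subscribe(handler: (event: OrderSubmitted) => void): void {
    this.handlers.push(handler);
  }
}

// The producer publishes and moves on; it has no knowledge of who, if anyone, is listening.
const stream = new InMemoryStream();
stream.subscribe((event) => console.log('consumer received', event.order.orderId));
stream.publish({
  type: 'order-submitted',
  id: 'evt-1',
  timestamp: Date.now(),
  order: { orderId: 'ord-42', total: 19.99 },
});
```

The consumer here could be taken offline or replaced without touching the producer; only the event type definition binds them.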
There are two separate but related aspects of event streaming that make it stand out from traditional messaging systems: scale and eventual consistency. Event streaming belongs to the dumb pipes, smart endpoints generation. Traditional messaging middleware of the monolithic generation can be considered smart pipes, too smart for their own good. These tools receive messages, perform routing and delivery, track acknowledgments, and often take on message transformation and orchestration responsibilities. All this additional processing is a drain on their resources that limits their scalability. Their value-added features of transformation and orchestration increase coupling and decrease cohesion, which limits isolation and turns them into a single point of failure.
Event streams have a single responsibility: to receive and durably store events, lots of events, at massive scale. An event stream is an append-only, sharded database that maintains an ordered log of events and scales horizontally to accommodate massive volumes. It is important to note that an event stream is a modern, sharded database; we will discuss the implications of modern, sharded databases shortly. It is the consumers of the stream (that is, stream processors) that have all the smarts, which first and foremost allows all the necessary processing to be spread across all the components. Stream processors read events from the stream and are responsible for checkpointing their current position in the stream. This is important for many reasons, but for now the key point is that the streaming engine is relieved of the need to track acknowledgments and all the overhead that this entails. Stream processors read events in batches to improve efficiency. Traditional messaging systems do this as well, but they typically do it under the covers and deliver the messages to consumers one at a time, which means that traditional consumers cannot capitalize on batch efficiencies as well. Stream processors leverage these batches in combination with functional reactive programming and asynchronous non-blocking IO to create robust, elegant, and highly concurrent processing logic. What's more, stream processors are replicated per shard to further increase concurrency. We will delve into more detail with the Event Streaming pattern in Chapter 3, Foundation Patterns.
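As a rough illustration of these mechanics, the following TypeScript sketch models a single shard of an append-only log and a stream processor that reads in batches and maintains its own checkpoint. The Shard, readBatch, and checkpoint names are assumptions made for the example, not the interface of any real streaming engine.

```typescript
type StreamEvent = { id: string; type: string; payload: unknown };

// One shard of the append-only log: ordered within the shard, append-only, no consumer tracking.
class Shard {
  private log: StreamEvent[] = [];

  append(event: StreamEvent): number {
    this.log.push(event);
    return this.log.length - 1;              // the event's sequence number within the shard
  }

  // The stream only hands out batches; it never tracks who has read what.
  readBatch(from: number, maxCount: number): StreamEvent[] {
    return this.log.slice(from, from + maxCount);
  }
}

// The stream processor owns its own position (checkpoint) in the shard.
class StreamProcessor {
  private checkpoint = 0;

  constructor(
    private shard: Shard,
    private handle: (batch: StreamEvent[]) => Promise<void>,
  ) {}

  async poll(batchSize = 100): Promise<void> {
    const batch = this.shard.readBatch(this.checkpoint, batchSize);
    if (batch.length === 0) return;
    await this.handle(batch);                // process the whole batch, e.g. with async, non-blocking IO
    this.checkpoint += batch.length;         // advance only after the batch succeeds
  }
}

// Usage: one processor instance per shard, so processing concurrency scales with the shard count.
const shard = new Shard();
['a', 'b', 'c'].forEach((id) => shard.append({ id, type: 'example', payload: {} }));
const processor = new StreamProcessor(shard, async (batch) => {
  console.log(`processing ${batch.length} events`);
});
processor.poll().catch(console.error);
```

Because the checkpoint lives with the processor rather than the stream, a failed batch is simply retried from the last checkpoint, and the stream itself remains a dumb, durable pipe.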
The CAP theorem states that in the presence of a network partition, one has to choose between consistency and availability. In the context of modern consumer-facing applications, even a temporary increase in latency is considered equivalent to a network partition, because of the opportunity cost of lost customers. Therefore, it is widely preferred to choose availability over consistency and thus to design systems around eventual consistency and session consistency. Embracing eventual consistency is a significant advancement for our industry. Event streaming and eventual consistency are interdependent: eventual consistency is simply a reality of asynchronous messaging, and event streaming is the mechanism for implementing eventual consistency.
Without both, we cannot increase responsiveness and scalability by allowing components to delegate processing to downstream components. More importantly, without both, we cannot build bounded isolated components that are resilient and elastic, because event streaming and eventual consistency are crucial to turning the database inside out and ultimately turning the cloud into the database.
There is a common misconception that eventual consistency equates to a lack of transactionality. Nothing could be further from the truth. Each hand-off in the flow of events through the system is implemented transactionally by the smart endpoints (that is, the components) and their supporting architecture. It is the design of the overall, aggregate flow that produces the eventual consistency. When the system is operating normally, eventual consistency happens in near real-time, but it degrades gracefully when anomalies occur. We will discuss the architectural mechanisms, such as idempotency and the Event Sourcing pattern, in Chapter 3, Foundation Patterns, and the Saga pattern in Chapter 5, Control Patterns. For now, keep in mind that the world we live in is eventually consistent. The classic example of event-driven, non-blocking, eventual consistency is the coffee shop. You stand in line at the coffee shop and wait for your turn to place and pay for your order. Then you wait for your cup of coffee to be prepared. If all goes to plan, your order is ready promptly. Otherwise, you inquire about your order when it is taking too long. If a mistake has been made, it is corrected and you are likely compensated with a discount to use on your next visit. In the end, you get your coffee, and the coffee shop has a more effective, scalable operation. This is analogous to how event streaming and eventual consistency play their role in helping cloud-native systems achieve scale and resilience.
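To illustrate the transactional hand-off, here is a minimal TypeScript sketch of an idempotent consumer that records the ids of events it has already applied, so that a redelivered event has no additional effect. The IdempotentConsumer class and its members are hypothetical; a real implementation would persist the processed ids and the state durably, in the same transaction.

```typescript
type OrderEvent = { id: string; orderId: string; amount: number };

class IdempotentConsumer {
  private processedIds = new Set<string>();   // in practice this would be durable storage
  private balances = new Map<string, number>();

  applyEvent(event: OrderEvent): void {
    if (this.processedIds.has(event.id)) {
      return;                                 // duplicate delivery: safely ignored
    }
    const current = this.balances.get(event.orderId) ?? 0;
    this.balances.set(event.orderId, current + event.amount);
    this.processedIds.add(event.id);          // recorded alongside the state change
  }

  balance(orderId: string): number {
    return this.balances.get(orderId) ?? 0;
  }
}

// Delivering the same event twice leaves the state unchanged, so retries and redeliveries
// simply drive the system toward the same, eventually consistent result.
const consumer = new IdempotentConsumer();
const event: OrderEvent = { id: 'evt-1', orderId: 'ord-42', amount: 5 };
consumer.applyEvent(event);
consumer.applyEvent(event);                   // redelivery
console.log(consumer.balance('ord-42'));      // 5, not 10
```

Each hand-off is atomic from the consumer's point of view, while the end-to-end flow, like the coffee shop's, converges on the correct outcome over time.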