The definitive book on Apache Kafka

About the Book

The software architecture landscape has evolved dramatically over the past decade. Microservices have displaced monoliths. Data and applications are increasingly becoming distributed and decentralised. But composing disparate systems is a hard problem. More recently, software practitioners have been rapidly converging on event-driven architecture as a sustainable way of dealing with complexity — integrating systems without increasing their coupling.

In Effective Kafka, Emil Koutanov explores the fundamentals of Event-Driven Architecture using Apache Kafka, the world's most popular and best-supported open-source event streaming platform.

You'll learn:

  • The fundamentals of event-driven architecture and event streaming platforms
  • The background and rationale behind Apache Kafka, and its numerous potential uses and applications
  • The architecture and core concepts — the underlying software components, partitioning and parallelism, load-balancing, record ordering and consistency modes
  • Installation of Kafka and related tooling — using standalone deployments, clusters, and containerised deployments with Docker
  • Using CLI tools to interact with and administer Kafka clusters, as well as publishing data and browsing topics
  • Using third-party web-based tools for monitoring a cluster and gaining insights into the event streams
  • Building stream processing applications in Java 11 using off-the-shelf client libraries
  • Patterns and best practices for organising the application architecture, with emphasis on maintainability and testability of the resulting code
  • The numerous gotchas that lurk in Kafka's client and broker configuration, and how to counter them
  • Theoretical background on distributed and concurrent computing, exploring factors affecting their liveness and safety
  • Best practices for running multi-tenanted clusters across diverse engineering teams, how teams collaborate to build complex systems at scale and equitably share the cluster with the aid of quotas
  • Operational aspects of running Kafka clusters at scale, performance tuning and methods for optimising network and storage utilisation
  • All aspects of Kafka security, including network segregation, encryption, certificates, authentication and authorization

The coverage is progressively delivered and carefully aimed at giving you a journey-like experience into becoming proficient with Apache Kafka and Event-Driven Architecture. The goal is to get you designing and building applications. And by the conclusion of this book, you will be a confident practitioner and a Kafka evangelist within your organisation — wielding the knowledge necessary to teach others.

About the Author

Emil Koutanov

Event-driven architecture and microservices evangelist, engineer, and a dad. Storyteller and open-source contributor.

Table of Contents

  • Event Streaming Fundamentals
    • The real challenges of distributed systems
    • Event-Driven Architecture
    • What is event streaming?
  • Introducing Apache Kafka
    • The history of Kafka
    • The present day
    • Uses of Kafka
  • Architecture and Core Concepts
    • Architecture Overview
    • Total and partial order
    • Records
    • Partitions
    • Topics
    • Consumer groups and load balancing
    • Free consumers
    • Summary of core concepts
  • Installation
    • Installing Kafka and ZooKeeper
    • Launching Kafka and ZooKeeper
    • Running in the background
    • Installing Kafdrop
  • Getting Started
    • Publishing and consuming using the CLI
    • A basic Java producer and consumer
  • Design Considerations
    • Roles and responsibilities
    • Parallelism
    • Idempotence and exactly-once delivery
  • Serialization
    • Key and value serializer
    • Key and value deserializer
  • Bootstrapping and Advertised Listeners
    • A gentle introduction to bootstrapping
    • A simple scenario
    • Multiple listeners
    • Listeners and the Docker Network
  • Broker Configuration
    • Entity types
    • Dynamic update modes
    • Configuration precedence and defaults
    • Applying broker configuration
    • Applying topic configuration
    • Users and Clients
  • Client Configuration
    • Configuration gotchas
    • Applying client configuration
    • Common configuration
    • Producer configuration
    • Consumer configuration
    • Admin client configuration
  • Robust Configuration
    • Using constants
    • Type-safe configuration
  • Batching and Compression
    • Comparing disk and network I/O
    • Producer record batching
    • Compression
  • Replication and Acknowledgements
    • Replication basics
    • Leader election
    • Setting the initial replication factor
    • Changing the replication factor
    • Decommissioning broker nodes
    • Acknowledgements
  • Data Retention
    • Kafka storage internals
    • Deletion
    • Compaction
    • Combining compaction with deletion
  • Group Membership and Partition Assignment
    • Group membership basics
    • Liveness and safety
    • Partition assignment strategy
  • Security
    • State of security in Kafka
    • Target state security
    • Network traffic policy
    • Confidentiality
    • Authentication
    • Authorization
  • Quotas
    • The rationale behind quotas
    • Types of quotas
    • Subject affinity and precedence order
    • Applying quotas
    • Buffering and timeouts
    • Sensing quota enforcement
    • Tuning the duration and number of sampling windows
  • Transactions
    • Preamble
    • The rationale behind transactions
    • Transactions under the hood
    • Simple stream processing example
    • Limitations
    • Are transactions over-hyped?