Skip to main content
Spring for Apache Kafka 4: Migration, Share Groups, and the New Consumer Protocol
  1. Posts/

Spring for Apache Kafka 4: Migration, Share Groups, and the New Consumer Protocol

·2475 words·12 mins
NeatGuyCoding
Author
NeatGuyCoding
Table of Contents

Spring for Apache Kafka 4: Migration, Share Groups, and the New Consumer Protocol
#

Spring for Apache Kafka (Spring Kafka below) 4.0 ships with Spring Boot 4.0, bundles Kafka client 4.1 and Jackson 3, and on the client side integrates KIP-932 (Queues / Share Group) and KIP-848 (group.protocol=consumer). This article is organized around engineering decisions: a reproducible 3→4 migration first, then error handling that still applies on classic consumer groups, then the manual wiring boundaries for Share Groups, and finally opt-in adoption of the next-generation rebalance protocol.

Version note: Share consumers are marked preview / early access in the Spring Kafka 4.0 documentation; the ShareAckMode enum and renew() appear in the 4.1 docs. Where 4.0.x and 4.1 differ, that is called out below.

If you maintain a “Boot 3.5 + standalone spring-kafka 3.x” stack, the upgrade path can be summarized in three layers: dependencies and modular starters (Boot 4 BOM), serialization and test infrastructure (Jackson 3, KRaft embedded broker), and consumption model extensions (KIP-848 opt-in on classic groups, plus optional manual Share Group setup). The first two layers suit whole-application upgrades; the third should be chosen per workload, not enabled by default.


Migrating Boot 3→4 and Kafka 3→4 with OpenRewrite
#

Why
#

Hand-editing pom.xml, package names, and test annotations makes it easy to miss Jackson starter swaps or deprecated @EmbeddedKafka properties. OpenRewrite codifies repeatable AST-level changes as recipes, suitable for batch runs on a branch or in CI.

Mechanism and constraints
#

Official recipes (verify in the rewrite-spring repository) include:

The demo repository also includes custom recipes (e.g. MigrateSpringBootJsonToJacksonStarter, RemoveDeprecatedEmbeddedKafkaParameters) that are not in the official recipes.csv; you must maintain those yourself during migration.

How
#

cd spring-kafka-3-to-4 && ./mvnw rewrite:run

In rewrite.yml, at minimum attach the official Boot/Kafka upgrade recipes, then add custom visitors as needed:

recipeList:
  - org.openrewrite.java.spring.boot4.UpgradeSpringBoot_4_0
  - org.openrewrite.java.spring.kafka.UpgradeSpringKafka_4_0
  # Demo repo custom (not official)
  - com.example.spring.kafka.RemoveDeprecatedEmbeddedKafkaParameters

Demo terminal: cd spring-kafka-3-to-4 and ./mvnw rewrite:run, with Upgrade Spring Boot / Spring Kafka recipe descriptions visible in the IDE.

Common pitfalls
#

  • Treating the demo’s CustomUpgradeSpringkafka_4_0 as the official recipe name — the official Kafka migration recipe is UpgradeSpringKafka_4_0.
  • Running OpenRewrite without running tests — ZK-related annotations and serializer changes still need test coverage.
  • Running only the parent POM recipe in a multi-module monorepo — child modules that still declare spring-kafka versions directly may need per-module runs or supplemental ChangeDependency visitors.

Modular starters and Jackson 3 serialization
#

Why
#

Boot 4 splits Kafka and JSON capabilities into dedicated starters and aligns with Jackson 3 (tools.jackson). Continuing to depend directly on spring-kafka + spring-boot-starter-json (Jackson 2) causes autoconfigure packages and serialization behavior to drift.

Mechanism and constraints
#

Before migration (illustrative)After migration
spring-kafkaspring-boot-starter-kafka
spring-boot-starter-jsonspring-boot-starter-jackson
spring-kafka-testspring-boot-starter-kafka-test

JsonSerializer / JsonDeserializer are deprecated from 4.0 onward; use JacksonJsonSerializer / JacksonJsonDeserializer (based on Jackson 3’s JsonMapper).

How
#

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-kafka</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-jackson</artifactId>
</dependency>
spring:
  kafka:
    producer:
      value-serializer: org.springframework.kafka.support.serializer.JacksonJsonSerializer
    consumer:
      value-deserializer: org.springframework.kafka.support.serializer.JacksonJsonDeserializer

Side-by-side: POM diff between spring-kafka 3.5.0 and spring-boot-starter-kafka 4.0.5.

Pre-migration application.yml still using JsonSerializer and ErrorHandlingDeserializer delegate configuration.

Common pitfalls
#

  • Changing dependencies only, not serializer class names — runtime may still use the deprecated Jackson 2 adapter layer.
  • Ignoring security-related properties such as spring.json.trusted.packages — under Jackson 3 you still need to verify JSON trusted package configuration per What’s New.
  • Mixing third-party libraries bound to Jackson 2 — Boot 4 applications use Jackson 3 on the classpath; third-party starters that hard-depend on com.fasterxml.jackson need separate compatibility review.

ErrorHandlingDeserializer still pairs with a JacksonJsonDeserializer delegate: the outer layer catches deserialization exceptions; the inner delegate is the Jackson 3 deserializer, so poison pills can be classified before they reach the listener.


@EmbeddedKafka and KRaft in tests
#

Why
#

Kafka 4 brokers support KRaft only; ZooKeeper is no longer used. Integration tests that keep zookeeperPort, kraft = false, and similar properties fail at embedded broker startup and no longer match production topology.

Mechanism and constraints
#

Spring Kafka 4.0 What’s New removes zookeeperPort, zkConnectionTimeout, zkSessionTimeout, kraft, and related @EmbeddedKafka properties; the implementation is now EmbeddedKafkaKraftBroker.

How
#

@SpringBootTest
@ActiveProfiles("test")
@EmbeddedKafka(partitions = 3)
class SpringKafkaApplicationTests { }

A custom OpenRewrite recipe can bulk-remove deprecated properties (as in the demo repository).

Migration comparison: zookeeperPort = 2181, kraft = false versus streamlined @EmbeddedKafka(partitions = 3).

Common pitfalls
#

  • Assuming removing kraft = false means “turn off KRaft” — the effect is KRaft-only enforced.
  • Upgrading directly to Kafka 4 on production brokers still on ZK — unrelated to test annotations; that is broker upgrade scope.

Poison pills and DLT on classic consumer groups
#

Why
#

Messages that cannot be deserialized or fail permanently in business logic, if retried indefinitely within the group, stall partition progress for the entire group.id (head-of-line blocking). On classic consumer groups, Spring Kafka’s mature pattern is routing failed records to a Dead Letter Topic (DLT).

Mechanism and constraints
#

DeadLetterPublishingRecoverer implements ConsumerRecordRecoverer, publishing failed ConsumerRecords to a DLT, often combined with ErrorHandlingDeserializer and DLT Strategies.

Boundary (speaker’s view, aligned with official documentation direction): Share Groups have no DLT story today; poison pills rely on reject(), delivery attempt limits, and other KIP-932 semantics — you cannot copy @RetryableTopic verbatim.

How
#

@Configuration
class KafkaExceptionHandlingConfiguration {

  @Bean
  DeadLetterPublishingRecoverer recoverer(
      KafkaTemplate<?, ?> bytesKafkaTemplate,
      Map<Class<?>, KafkaTemplate<?, ?>> templates) {
    return new DeadLetterPublishingRecoverer(bytesKafkaTemplate, templates);
  }
}

Demo class KafkaExceptionHandlingConfiguration with DeadLetterPublishingRecoverer bean definition.

Common pitfalls
#

  • Configuring a DLT recoverer on a Share consumer — official kafka-queues does not cover that combination.
  • Not distinguishing deserialization failures from business failures — prefer ErrorHandlingDeserializer for the former; use recoverer / retry topic for the latter.

KIP-932: what Share Groups solve
#

Why
#

Under a classic consumer group, the number of active consumers is usually no greater than the partition count; a single slow message blocks the partition that consumer owns (partition-level head-of-line blocking). Under traffic spikes, teams often over-provision partitions just to gain horizontal scale.

KIP-932 introduces Share Groups: multiple ShareConsumers in the same share group can cooperate at record granularity, yielding queue-like semantics — per-record acknowledgment and release for retry by others.

Mechanism and constraints
#

The broker maintains per-record state (KIP and Spring docs agree):

Mermaid diagram 1

  • Available → Acquired: delivery count increases when the consumer fetches (config name in KIP: group.share.delivery.attempt.limit).
  • Acquired → Acknowledged: successful processing.
  • release: temporary failure; another instance in the group can acquire again.
  • reject: permanent failure; enters Archived (similar to discard / archive — not Spring DLT).

Speaker’s view (default values in this article not line-by-line verified against Kafka 4.1): roughly five delivery attempts and ~30s processing lock by default; Kafka 4.2 GA and Boot 4.1 were described as aligning in May — verify against your broker / BOM version. Boot 4.0 Release Notes state Kafka 4.1.0.

Unlike classic offset commits, Share consumption does not rely on “one logical position per partition” for progress; it relies on the broker’s per-record state machine. In operations, watch delivery count and Archived ratios, not only consumer lag (classic metrics may interpret share consumption differently — check monitoring docs for your Kafka version).

Slide “Queues for Kafka – How does it work?”: parallel models for consumer group vs. share group.

Message states: Available → Acquired (delivery count++) → Acknowledged / Archived.

Common pitfalls
#

  • Treating Share Group as “consumer group with more partitions” — it does not guarantee order within a partition.
  • Forcing Share where you need Kafka Streams, batch @KafkaListener, regex topics, or static partition assignment — Spring docs list explicit non-support.

Manual wiring for Share consumers
#

Why
#

As of Spring Kafka 4.0 documentation, Share consumers are early access with no Boot autoconfiguration equivalent to spring.kafka.consumer.*. Without ShareKafkaListenerContainerFactory, listeners cannot create share containers correctly at runtime.

Mechanism and constraints
#

Core types:

4.0 vs 4.1: 4.0.x uses setExplicitShareAcknowledgment(true) for listener-side ShareAcknowledgment; 4.1 documents ShareAckMode.MANUAL. Demo code’s setShareAckMode targets 4.1; on 4.0.5 use the explicit flag instead.

Some classic ConsumerConfig keys are invalid for share groups — the minimal set is often just bootstrap.servers (as in the talk demo; see Kafka docs for the full list).

Speaker’s view: Micrometer tracing and metrics comparable to classic groups are still immature for Share scenarios.

How
#

@Configuration
@EnableKafka
class ShareConsumerConfiguration {

  @Bean
  ShareConsumerFactory<String, TransactionEvent> shareConsumerFactory() {
  Map<String, Object> props = Map.of(
      ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    return new DefaultShareConsumerFactory<>(props);
  }

  @Bean
  ShareKafkaListenerContainerFactory<String, TransactionEvent>
      manualShareKafkaListenerContainerFactory(
          ShareConsumerFactory<String, TransactionEvent> factory) {
    var f = new ShareKafkaListenerContainerFactory<>(factory);
    f.setConcurrency(6);
    f.getContainerProperties().setShareAckMode(ShareAckMode.MANUAL); // 4.1+
    f.getContainerProperties()
        .setShareAcknowledgmentTimeout(Duration.ofSeconds(30));
    return f;
  }
}

4.0.x equivalent (without ShareAckMode):

f.getContainerProperties().setExplicitShareAcknowledgment(true);

IDE: ShareConsumerConfiguration and DefaultShareConsumerFactory imports.

ManualShareConsumerConfiguration: ShareKafkaListenerContainerFactory, setConcurrency(6), ShareAckMode MANUAL.

Demo repository modules: spring-kafka-4-producer and spring-kafka-4-share-consumer.

Common pitfalls
#

  • Adding @KafkaListener without declaring containerFactory — falls back to the classic container factory.
  • Treating spring.kafka.listener.concurrency as the only Share concurrency knob — share containers need separate setConcurrency, with no classic hard cap tied to partition count (different cooperative consumption model).

ShareAcknowledgment: acknowledge, release, reject
#

Why
#

Work-queue scenarios must separate “downstream temporarily unavailable” from “permanently invalid for the business.” A single exception path cannot map precisely to KIP-932 release (retryable) vs reject (archive).

Mechanism and constraints
#

MethodSemantics (Javadoc)
acknowledge()Success
release()Temporary failure; can be consumed again within the group
reject()Permanent failure; no further delivery

In explicit / MANUAL mode, unacknowledged records on the same thread block subsequent polls (Poll Constraints). In implicit mode, normal method return can trigger auto-ack — do not confuse with classic AckMode.

4.1+: ShareAcknowledgment.renew() extends the acquisition lock; default lock duration relates to group.share.record.lock.duration.ms (documentation example 30 seconds). Long-running work should plan for renew(), not only a larger timeout.

How
#

@KafkaListener(
    id = "annotation-manual-transaction-event-processor",
    topics = "transaction-events",
    containerFactory = "manualShareKafkaListenerContainerFactory",
    groupId = "annotation-manual-group")
void handle(ConsumerRecord<String, TransactionEvent> record,
            ShareAcknowledgment ack) {
  try {
    service.process(record.value());
    ack.acknowledge();
  } catch (TransientException e) {
    ack.release();
  } catch (Exception e) {
    ack.reject();
  }
}

The demo also injects test events via HTTP POST /api/transactions (REST controller plus the listener above).

AnnotationDrivenManualKafkaShareConsumer: @KafkaListener with containerFactory = "manualShareKafkaListenerContainerFactory".

Console logs: Acknowledged / Rejected / Released for retry.

Package layout: explicit, implicit, manual, and related examples side by side.

Common pitfalls
#

  • In MANUAL mode, neither calling ack nor expecting auto-commit — blocks consumption on the same thread.
  • Treating release() like classic consumer seek rewind — semantics re-enter Available for group competition, not a fixed partition offset.
  • Compiling with ShareAckMode.MANUAL on 4.0.5 — use setExplicitShareAcknowledgment(true) instead.

Choosing Share Group vs. classic group
#

DimensionClassic consumer groupShare Group
Parallelism vs. partitionsConsumers usually ≤ partition countCan exceed partition count (record-level cooperation)
OrderingIn-partition orderNot guaranteed
DLT / @RetryableTopicMatureNot yet (speaker summary + docs uncovered)
Boot autoconfigurationspring.kafka.*Manual factory required
Batch / regex topic / StreamsSupportedListed as unsupported in docs
ObservabilityRelatively completeSpeaker’s view: metrics/tracing gaps remain

If production depends on ordered event streams, Streams, batch listeners, or mature DLT, stay on classic groups. Share Group fits order-tolerant work queues that need multi-instance work stealing. Speaker’s view: share remains preview in current Spring Boot releases; production should wait for a Boot/Kafka combination aligned with Kafka Queues GA — confirm timing with official releases.

Before rollout, self-check: (1) Can you accept out-of-order delivery for the same key? (2) Is retry acceptable as “another instance takes over” rather than fixed-partition replay? (3) For permanent failure, is Archived acceptable instead of DLT audit? (4) Do monitoring and alerts cover share-specific metrics? If any answer is no, stay on classic groups and prioritize KIP-848 to reduce rebalance cost.


KIP-848: group.protocol=consumer
#

Why
#

Classic groups with many partitions and consumers can stall the whole group during rebalance, conflicting with K8s HPA and frequent rolling deploys. KIP-848 moves much assignment logic to the broker; consumers opt in via the new protocol.

Mechanism and constraints
#

From Kafka client source (ConsumerConfig):

  • Property: group.protocol
  • Default: classic
  • Valid values: classic | consumer

Spring binding:

spring:
  kafka:
    consumer:
      properties:
        group.protocol: consumer

With consumer enabled, classic settings such as partition.assignment.strategy trigger ConfigException (consistent with custom PartitionAssignor no longer applying). Speaker’s view: quantify rebalance improvements from the KIP text; do not oversell “zero downtime” without version context.

This protocol applies only to classic consumer groups, not Share Groups (KIP-932).

How
#

group.protocol=consumer

Ensure the broker has KIP-848 upgrade strategy enabled (bidirectional / upgrade cluster settings in the KIP — see KIP Case Studies).

Common pitfalls
#

  • Setting group.protocol on a Share consumer — wrong path.
  • Keeping a custom assignor alongside the consumer protocol — fails at startup.
  • Using old Micrometer dashboards after upgrade — speaker’s view: broker metrics include protocol=consumer|classic changes; update alerts per the KIP.
  • Blue/green with two different group.id values running the same logic — unrelated to protocol upgrade but can mask rebalance issues; migrate protocol under the same group.id with rolling instances.

For Spring applications, ensure ConsumerFactory / @KafkaListener consumers ultimately include group.protocol=consumer; no Spring API change is required. To roll back, set the property back to classic and roll instances, provided your broker version still allows the downgrade path KIP-848 describes.


Rolling from classic to the consumer protocol
#

Why
#

Large consumer groups hesitate to change protocol for fear of a one-shot rebalance creating consumption gaps. KIP-848 documents rolling consumers for upgrade and downgrade.

Mechanism and constraints
#

KIP-848 excerpt (direction verified):

“Upgrading to the new protocol or downgrading from it is possible by rolling the consumers

Speaker’s view (not line-by-line verified against Kafka 4.1 broker source): you can temporarily mix classic and consumer instances; when the broker detects a new-protocol member joining, it migrates the group; then scale down classic instances; if old-protocol members return, rollback is possible. Speaker’s recommendation: end mixed-protocol operation quickly to avoid long-term dual-protocol broker overhead.

Mermaid diagram 2

How
#

The demo uses Docker Compose scaling (commands from live OCR; demo repository URL not independently verified):

docker compose -f docker-compose-applications.yml -p demo up -d
docker compose -f docker-compose-applications.yml -p demo \
  scale consumer-new-consumer-protocol=1
# After verifying group protocol switch, scale to 3 and retire classic instances
docker compose -f docker-compose-applications.yml -p demo \
  scale consumer-new-consumer-protocol=3

Terminal: Scale from 0 to 1 instance and consumer-new-consumer-protocol.

Scale from 0 to 1 and classic / consumer transitions in Group protocol logs.

Common pitfalls
#

  • Changing clients before broker upgrade / cluster upgrade strategy — undefined behavior or startup failure.
  • Treating long-term mixed protocols as steady state — contrary to the KIP and the speaker’s recommendation.

References and further reading
#

Related