Skip to content

Conversation

@dajac
Copy link
Member

@dajac dajac commented Nov 27, 2025

In #20061, we introduced the
CoordinatorMetadataImage and CoordinatorMetadataDelta interfaces to
basically contain/control how metadata is used within the
GroupCoordinatorService, more precisely within the
GroupCoordinatorShard. When we did the change, we directly changed the
GroupCoordinator interface to take implementation of those interfaces,
requiring to wrap the MetadataImage and the MetadataDelta in the
BrokerMetadataPublisher. While looking at it now, I find this not idea
for a couple a reasons:

  1. From a BrokerMetadataPublisher perspective, propagating metadata
    should be standardized. Basically, all the internal components should
    work the same from his point of view. If a component wants to be more
    restrictive within his scope, it is fine but the publisher should not be
    aware of this.
  2. From a GroupCoordinator perspective, requiring
    CoordinatorMetadataImage and CoordinatorMetadataDelta limits what we
    can do. For instance, we could move more functionality such as electing
    shards from the BrokerMetadataPublisher to the
    GroupCoordinatorService to further simplify the metadata publishing
    and improving the encapsulation of the components.
  3. From a ShareCoordinator perspective, the abstraction failed short
    as we require CoordinatorMetadataImage and CoordinatorMetadataDelta
    in the interface but we still require FeaturesImage as well. The
    abstraction fails short here.

This patch is an attempt to change this by moving back to requiring
MetadataImage and MetadataDelta in the GroupCoordinator interface
and to wrap them within the service itself. It does not change the
ShareCoordinator yet. I will do this as a follow-up if people agree
with the general approach.

Reviewers: Sean Quah squah@confluent.io, Lianet Magrans
lmagrans@confluent.io

@github-actions github-actions bot added core Kafka Broker group-coordinator small Small PRs labels Nov 27, 2025
* This is initialised when the {@link GroupCoordinator#onMetadataUpdate(MetadataDelta, MetadataImage)} is called
*/
private CoordinatorMetadataImage metadataImage = null;
private volatile CoordinatorMetadataImage metadataImage = null;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug. The attribute must be volatile because it is accessed from multiple threads.

Comment on lines +2368 to +2369
var wrappedImage = newImage == null ? null : new KRaftCoordinatorMetadataImage(newImage);
var wrappedDelta = delta == null ? null : new KRaftCoordinatorMetadataDelta(delta);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty ugly and it should not be here because the image and the delta can never be null. Unfortunately, we have a couple of tests passing nulls here. I will fix those separately and remove this code in a follow-up. I want the core change to be well scoped.

Comment on lines +6032 to +6034
metadataImage = new MetadataImageBuilder()
.addTopic(TOPIC_ID, TOPIC_NAME, 1)
.build();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is semi-related. It is just easier to build the image rather than mocking it.

@dajac dajac changed the title Minor update gc interface MINOR: Update GroupCoordinator interface to received MetadataImage/Delta directly Nov 27, 2025
Copy link
Contributor

@squah-confluent squah-confluent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor looks good to me

Copy link
Member

@lianetm lianetm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactoring, makes sense to me. Just a nit.

var wrappedImage = newImage == null ? null : new KRaftCoordinatorMetadataImage(newImage);
var wrappedDelta = delta == null ? null : new KRaftCoordinatorMetadataDelta(delta);
metadataImage = wrappedImage;
runtime.onNewMetadataImage(wrappedImage, wrappedDelta);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to rename this on the CoordinatorRuntime to onNewMetadataUpdate for consistency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our parameter order isn't consistent. We have one onNewMetadataUpdate definition take (delta, image) and the other take (image, delta).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed it.

@dajac dajac requested a review from lianetm November 28, 2025 08:19
@github-actions github-actions bot added KIP-932 Queues for Kafka and removed small Small PRs labels Nov 28, 2025
Copy link
Member

@lianetm lianetm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM.

@dajac dajac merged commit e9ff2cc into apache:trunk Nov 30, 2025
23 checks passed
@dajac dajac deleted the minor-update-gc-interface branch November 30, 2025 08:42
Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dajac thanks for this refacor. I have just posted some review comments. Sorry for the delay

throwIfNotActive();
metadataImage = newImage;
runtime.onNewMetadataImage(newImage, delta);
var wrappedImage = newImage == null ? null : new KRaftCoordinatorMetadataImage(newImage);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the image be null? Since a null wrappedImage causes an NPE in CoordinatorRuntime#onMetadataUpdate, if null input is allowed, we should implement a fast-fail

log.debug("Scheduling applying of a new metadata image with version {}.", newImage.version());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for neglecting the previous comment. Sounds good

CoordinatorMetadataImage newImage,
CoordinatorMetadataDelta delta
public void onMetadataUpdate(
CoordinatorMetadataDelta delta,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the ShareCoordinator also be updated with this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the description:

It does not change the
ShareCoordinator yet. I will do this as a follow-up if people agree
with the general approach

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if people agree with the general approach

If the interfaces CoordinatorMetadataImage and CoordinatorMetadataDelta are not referenced by ShareCoordinator or GroupCoordinator, can we remove them by using record class to rewrite them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are still used by other interfaces so I prefer to keep them as interfaces.

chia7712 pushed a commit that referenced this pull request Dec 2, 2025
…#21032)

This patch is a continuation of
#21008. It applies the same naming
convention within the `CoordinatorRuntime`.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
chia7712 pushed a commit that referenced this pull request Dec 2, 2025
…/Delta` directly (#21029)

This patch is a continuation of the work started in
#21008. It updates the
`ShareCoordinator` interface to follow the same pattern.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
TaiJuWu pushed a commit to TaiJuWu/kafka that referenced this pull request Dec 3, 2025
…/Delta` directly (apache#21008)

In apache#20061, we introduced the
`CoordinatorMetadataImage` and `CoordinatorMetadataDelta` interfaces to
basically contain/control how metadata is used within the
`GroupCoordinatorService`, more precisely within the
`GroupCoordinatorShard`. When we did the change, we directly changed the
`GroupCoordinator` interface to take implementation of those interfaces,
requiring to wrap the `MetadataImage` and the `MetadataDelta` in the
`BrokerMetadataPublisher`. While looking at it now, I find this not idea
for a couple a reasons:
1) From a `BrokerMetadataPublisher` perspective, propagating metadata
should be standardized. Basically, all the internal components should
work the same from his point of view. If a component wants to be more
restrictive within his scope, it is fine but the publisher should not be
aware of this.
2) From a `GroupCoordinator` perspective, requiring
`CoordinatorMetadataImage` and `CoordinatorMetadataDelta` limits what we
can do. For instance, we could move more functionality such as electing
shards from the `BrokerMetadataPublisher` to the
`GroupCoordinatorService` to further simplify the metadata publishing
and improving the encapsulation of the components.
3) From a `ShareCoordinator` perspective, the abstraction failed short
as we require `CoordinatorMetadataImage` and `CoordinatorMetadataDelta`
in the interface but we still require `FeaturesImage` as well. The
abstraction fails short here.

This patch is an attempt to change this by moving back to requiring
`MetadataImage` and `MetadataDelta` in the `GroupCoordinator` interface
and to wrap them within the service itself. It does not change the
`ShareCoordinator` yet. I will do this as a follow-up if people agree
with the general approach.

Reviewers: Sean Quah <squah@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Kafka Broker group-coordinator KIP-932 Queues for Kafka

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants