Skip to content

Conversation

@zhijun42
Copy link
Contributor

In functions clusterSendFailoverAuthIfNeeded and clusterProcessPacket, we iterate through every slot bit sequentially in the form of for (int j = 0; j < CLUSTER_SLOTS; j++), performing 16384 checks even when only a few bits were set, and thus causing unnecessary loop overhead.

This is particularly wasteful in function clusterSendFailoverAuthIfNeeded where we need to ensure the sender's claimed slots all have up-to-date config epoch. Usually healthy senders would meet such condition, and thus we normally need to exhaust the for loop of 16384 checks.

The proposed new implementation loads 64 bits (8 byte word) at a time and skips empty words completely, therefore only performing 256 checks.

Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
@dvkashapov
Copy link

Nice one! Did you have a chance to look at tests?

Signed-off-by: Zhijun <dszhijun@gmail.com>
@zhijun42
Copy link
Contributor Author

Nice one! Did you have a chance to look at tests?

Yeah. Turns out I accidentally removed an early return.

@codecov
Copy link

codecov bot commented Oct 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.47%. Comparing base (baf2d57) to head (7011469).
⚠️ Report is 5 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff             @@
##           unstable    #2781     +/-   ##
===========================================
  Coverage     72.47%   72.47%             
===========================================
  Files           128      128             
  Lines         71624    70161   -1463     
===========================================
- Hits          51908    50848   -1060     
+ Misses        19716    19313    -403     
Files with missing lines Coverage Δ
src/cluster_legacy.c 87.25% <100.00%> (+0.15%) ⬆️

... and 103 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Member

@ranshid ranshid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did just an ultra quick review.

should we also handle clusterUpdateSlotsConfigWith?

src/cluster.h Outdated

#define CLUSTER_SLOT_MASK_BITS 14 /* Number of bits used for slot id. */
#define CLUSTER_SLOTS (1 << CLUSTER_SLOT_MASK_BITS) /* Total number of slots in cluster mode, which is 16384. */
#define CLUSTER_SLOT_WORDS (CLUSTER_SLOTS / 64) /* Treating slot bitmaps as 8-byte words */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we can take this define into the c file. I hardly think it worth exposing this as it is mainly used for internal implementation logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. I did remove this but forgot to include the changed file in git commit.

if (bitmapTestBit(hdr->myslots, j)) {
if (server.cluster->slots[j] == sender || isSlotUnclaimed(j)) continue;
if (server.cluster->slots[j]->configEpoch > sender_claimed_config_epoch) {
int found_new_owner = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use bool type instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Signed-off-by: Zhijun <dszhijun@gmail.com>
@zhijun42
Copy link
Contributor Author

Did just an ultra quick review.

should we also handle clusterUpdateSlotsConfigWith?

Nah, we need to handle every single slot there, regardless if that slot bit is set.

memcpy(&word, hdr->myslots + (w << 3), sizeof(word));
while (word) {
/* Get the index of the least-significant set bit, in this 64-bit word */
const unsigned bit = (unsigned)__builtin_ctzll(word);
Copy link
Contributor

@sarthakaggarwal97 sarthakaggarwal97 Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two lines are common whenever we find the slot, should we extract this? Maybe we can move some common part out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Extracted into an inline helper function.

Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants