-
Notifications
You must be signed in to change notification settings - Fork 940
Cluster: Optimize slot bitmap iteration from per-bit to 64-bit word scan #2781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Cluster: Optimize slot bitmap iteration from per-bit to 64-bit word scan #2781
Conversation
Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
|
Nice one! Did you have a chance to look at tests? |
Signed-off-by: Zhijun <dszhijun@gmail.com>
Yeah. Turns out I accidentally removed an early return. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #2781 +/- ##
===========================================
Coverage 72.47% 72.47%
===========================================
Files 128 128
Lines 71624 70161 -1463
===========================================
- Hits 51908 50848 -1060
+ Misses 19716 19313 -403
🚀 New features to boost your workflow:
|
ranshid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did just an ultra quick review.
should we also handle clusterUpdateSlotsConfigWith?
src/cluster.h
Outdated
|
|
||
| #define CLUSTER_SLOT_MASK_BITS 14 /* Number of bits used for slot id. */ | ||
| #define CLUSTER_SLOTS (1 << CLUSTER_SLOT_MASK_BITS) /* Total number of slots in cluster mode, which is 16384. */ | ||
| #define CLUSTER_SLOT_WORDS (CLUSTER_SLOTS / 64) /* Treating slot bitmaps as 8-byte words */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like we can take this define into the c file. I hardly think it worth exposing this as it is mainly used for internal implementation logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. I did remove this but forgot to include the changed file in git commit.
src/cluster_legacy.c
Outdated
| if (bitmapTestBit(hdr->myslots, j)) { | ||
| if (server.cluster->slots[j] == sender || isSlotUnclaimed(j)) continue; | ||
| if (server.cluster->slots[j]->configEpoch > sender_claimed_config_epoch) { | ||
| int found_new_owner = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should use bool type instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
Signed-off-by: Zhijun <dszhijun@gmail.com>
Nah, we need to handle every single slot there, regardless if that slot bit is set. |
src/cluster_legacy.c
Outdated
| memcpy(&word, hdr->myslots + (w << 3), sizeof(word)); | ||
| while (word) { | ||
| /* Get the index of the least-significant set bit, in this 64-bit word */ | ||
| const unsigned bit = (unsigned)__builtin_ctzll(word); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two lines are common whenever we find the slot, should we extract this? Maybe we can move some common part out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Extracted into an inline helper function.
Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Zhijun <dszhijun@gmail.com>
In functions
clusterSendFailoverAuthIfNeededandclusterProcessPacket, we iterate through every slot bit sequentially in the form offor (int j = 0; j < CLUSTER_SLOTS; j++), performing 16384 checks even when only a few bits were set, and thus causing unnecessary loop overhead.This is particularly wasteful in function
clusterSendFailoverAuthIfNeededwhere we need to ensure the sender's claimed slots all have up-to-date config epoch. Usually healthy senders would meet such condition, and thus we normally need to exhaust the for loop of 16384 checks.The proposed new implementation loads 64 bits (8 byte word) at a time and skips empty words completely, therefore only performing 256 checks.