-
Notifications
You must be signed in to change notification settings - Fork 130
Also use compare small Uint8Array by content logic for joins #899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…parison This fixes a bug where joining collections by small Uint8Array keys would fail because the keys were compared by reference instead of by content. The fix normalizes Uint8Array keys (≤128 bytes) to string representations before using them as Map keys, similar to the approach used in PR #779 for value comparison. This ensures that Uint8Array instances with the same byte content are treated as equal, even if they are different objects. Changes: - Added normalizeKey() function to convert small Uint8Arrays to string keys - Updated Index class to normalize keys in all Map operations - Maintained mapping from normalized keys to original keys for iteration - Added test case for joining collections with Uint8Array keys Fixes #896
This refactoring eliminates code duplication by creating shared utilities for Uint8Array normalization that can be used across both db-ivm and db packages. Changes: - Created new module: packages/db-ivm/src/utils/uint8array.ts - Exports UINT8ARRAY_NORMALIZE_THRESHOLD constant (128 bytes) - Exports isUint8Array() helper function - Exports normalizeUint8Array() for byte array to string conversion - Exports normalizeValue() for generic value normalization - Updated packages/db-ivm/src/indexes.ts: - Removed local normalizeKey() function - Now imports and uses normalizeValue from shared utilities - Updated packages/db/src/utils/comparison.ts: - Removed duplicate UINT8ARRAY_NORMALIZE_THRESHOLD constant - Removed duplicate isUint8Array check logic - Now imports and uses shared utilities from @tanstack/db-ivm - Updated packages/db-ivm/src/index.ts: - Added export for utils/uint8array module Benefits: - Single source of truth for Uint8Array normalization logic - Easier maintenance and consistency across packages - Follows DRY principle All tests pass (268 tests in db-ivm, 1699 tests in db).
🦋 Changeset detectedLatest commit: 457a409 The changes in this PR will be included in the next version bump. This PR includes changesets to release 12 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
More templates
@tanstack/angular-db
@tanstack/db
@tanstack/db-ivm
@tanstack/electric-db-collection
@tanstack/offline-transactions
@tanstack/powersync-db-collection
@tanstack/query-db-collection
@tanstack/react-db
@tanstack/rxdb-db-collection
@tanstack/solid-db
@tanstack/svelte-db
@tanstack/trailbase-db-collection
@tanstack/vue-db
commit: |
|
Size Change: -36 B (-0.04%) Total Size: 86.3 kB
ℹ️ View Unchanged
|
|
Size Change: 0 B Total Size: 3.34 kB ℹ️ View Unchanged
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should consider carefully if this is a change we want to make. The performance of joins is important, and currently this implementation makes all joins use a new (slightly) slower code path with the normalisation and tracking of original keys - thats not needed for 99% of joins that use string or numeric keys.
I'm honestly not sure that supporting uint8 arrays as join keys is a good idea or not. We can decide to just push this back on the user and have them normalise a uuid to a string when syncing data - I do wander if that is a better option that adding complexity.
| describe(`Operators`, () => { | ||
| describe(`Join operation`, () => { | ||
| testJoin() | ||
| testUint8ArrayKeyJoin() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we decide to keep this implementation, can we incorporate this into the main testJoin function rather than adding as a new one with just this single test?
| */ | ||
| #inner: IndexMap<TKey, TValue, TPrefix> | ||
| #consolidatedMultiplicity: Map<TKey, number> = new Map() // sum of multiplicities per key | ||
| #originalKeys: Map<TKey, TKey> = new Map() // map from normalized keys to original keys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little nervous of the overhead of adding this additional map and the lookups. We are making joins slower for all cases in order to support an edge case.
Export the helper function from db-ivm for use in the main db package.