Commit 00ad5e2
Optimize binary encoding by directly emitting the null byte (#25610)
Optimize binary encoding by directly emitting the null byte inside the
generated file. The null byte 00h is a valid UTF-8 character:
https://datatracker.ietf.org/doc/html/rfc3629
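
As a quick sanity check of the UTF-8 claim above, here is a minimal Python sketch (not part of this commit) showing that U+0000 encodes to the single byte 00h and round-trips through a UTF-8 decoder. The variable names are illustrative only.

```python
# U+0000 encodes to exactly one byte, 0x00, in UTF-8 (RFC 3629).
nul = "\x00"
encoded = nul.encode("utf-8")
assert encoded == b"\x00"

# Strict decoding accepts the null byte and returns the same code point.
assert encoded.decode("utf-8") == nul

# A string with an embedded null round-trips unchanged as well.
text = "a\x00b"
roundtrip = text.encode("utf-8").decode("utf-8")
assert roundtrip == text
assert len(text.encode("utf-8")) == 3
```

This only shows that the byte is valid UTF-8; it says nothing about how individual editors or toolchains treat it.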
Given that we still have the opt-out setting -sSINGLE_FILE_BINARY_ENCODE=0
for binary encoding, I propose we take the encoding to its maximum
potential and see if we can get away with emitting the null byte as-is.
The benefits of this are two-fold:
a) assuming a uniform distribution of encoded bytes, not emitting nulls
takes +0.39% more space (or, with nulls, the output is 0.26% smaller).
b) by not offsetting the bytes, any strings in the emitted binary data
will be directly human-readable, e.g.:
<img width="2464" height="1451" alt="image"
src="https://github.com/user-attachments/assets/e85edc36-da52-4274-8a43-092405a45850"
/>
So C strings will be directly parseable/searchable in the output. That
is appealing.
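
Both points can be illustrated with a rough Python sketch. It rests on a loudly stated assumption that is not taken from this commit: each input byte is emitted as one UTF-8 code point, and the null-avoiding form shifts every byte by +1. The `encode` helper and the `offset` parameter are hypothetical, and any escaping needed inside a JavaScript string literal is ignored.

```python
def encode(data: bytes, offset: int) -> bytes:
    """Hypothetical model: map byte b to code point b + offset, emit as UTF-8."""
    return "".join(chr(b + offset) for b in data).encode("utf-8")

# Uniform distribution: every byte value occurs exactly once.
uniform = bytes(range(256))

with_nulls = encode(uniform, offset=0)     # 128 * 1 + 128 * 2 = 384 bytes
without_nulls = encode(uniform, offset=1)  # 127 * 1 + 129 * 2 = 385 bytes

# Extra cost of avoiding nulls, relative to the raw input size.
extra_vs_raw = (len(without_nulls) - len(with_nulls)) / len(uniform)
# Saving from emitting nulls, relative to the null-avoiding output size.
saving_vs_encoded = (len(without_nulls) - len(with_nulls)) / len(without_nulls)

print(f"{extra_vs_raw:.2%}")       # 0.39%
print(f"{saving_vs_encoded:.2%}")  # 0.26%

# Without the offset, ASCII strings survive byte-for-byte in the output.
payload = b"\x7f\x00hello world\x00\xff"
found_direct = b"hello world" in encode(payload, offset=0)
found_offset = b"hello world" in encode(payload, offset=1)
assert found_direct and not found_offset
```

Under this model the two ratios come out at roughly 0.39% and 0.26%, which is one plausible reading of the figures quoted in (a); the real encoder may differ in its details.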
I do not currently know of dealbreaking reasons to avoid nulls, other
than a generic FUD: "editors/toolchains might be buggy in handling
the null."
But those are bugs in the editors, and we do have the
`-sSINGLE_FILE_BINARY_ENCODE=0` fallback to avoid this. So emitting
nulls will let us surface any insurmountable issues with null bytes in
the output. We can always revert to the previous form if a difficult
blocker arises (and, as a plus, we will then have learned about that
blocker, concretely telling us why this approach is not feasible).

1 parent b5bf368 commit 00ad5e2
File tree
5 files changed, +43 -11 lines changed:
- src
- test/codesize
- tools