-
-
Notifications
You must be signed in to change notification settings - Fork 399
fix!: store raw commit/tag actor headers and parse lazily (see #2177) #2253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix!: store raw commit/tag actor headers and parse lazily (see #2177) #2253
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for making this happen.
I took a first look and don't like it at all, but also wouldn't know how to achieve round-tripping differently.
If you wouldn't mind, breaking changes should be in a separate commit and prefixed with fix!: probably, and all adjustments to other crates go into a separate commit (not marked as breaking (assuming they aren't breaking).
First of all, the I was wondering… what if this Then one can use So this PR is definitely good at showing that alternative solutions like the one mentioned here a worth exploring. |
2a27780 to
250d531
Compare
|
I'm not sure I can do exactly what you have in mind but I gave it a try. It's much cleaner code-wise at least. Divided it into 2 commits like you asked too, but I'm still not sure this is 100% ready. It's 2AM right now for me so I'll sleep and see tomorrow if I have anything else which might make this cleaner. Thanks for the feedback! |
ddf9121 to
678bba4
Compare
40d5a95 to
114f986
Compare
114f986 to
ddf21a1
Compare
Byron
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the second round!
Yes, using the fields directly and parsing on the fly is the way to go. In general, this parsing is now fallible, and .expect/.unwrap can't be used.
Besides that, it's definitely getting there, thanks again!
gix-object/src/commit/mod.rs
Outdated
| /// Return the author, with whitespace trimmed. | ||
| /// | ||
| /// This is different from the `author` field which may contain whitespace. | ||
| pub fn author(&self) -> gix_actor::SignatureRef<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These must be fallible, panics aren't allowed. It's OK to use the error type returned by SignatureRef::from_bytes().
gix-object/src/commit/write.rs
Outdated
| } | ||
|
|
||
| fn write_signature(mut out: &mut dyn io::Write, field: &[u8], raw: &bstr::BStr) -> io::Result<()> { | ||
| if signature_requires_raw(raw) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this differentiation is still required. In theory, the committer and author are now verbatim, which should always be what's written back.
gix-object/src/object/convert.rs
Outdated
| } = other; | ||
| let tagger = tagger.map(|raw| { | ||
| gix_actor::SignatureRef::from_bytes::<()>(raw.as_ref()) | ||
| .expect("signatures were validated during parsing") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time the signature is parsed it must be fallible.
gix-object/src/tag/write.rs
Outdated
| gix_actor::SignatureRef::from_bytes::<()>(raw.as_ref()).expect("signatures were validated during parsing") | ||
| } | ||
|
|
||
| fn signature_requires_raw(raw: &BStr) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think differentiating between these shoudln't be necessary.
|
I've adjusted the commit and tag APIs so they return fallible results. I’ve also removed the raw/canonical split. You were right, just streamring the stored header bytes and sizing them via Every place that re-parses actors during conversion or utility code should be fallible now. Tell me if you see anything else. |
c423fbe to
f547dcc
Compare
This PR makes signature handling truly lossless for "creative" emails and other info. We now stash the raw name slice on IdentityRef/SignatureRef and fall back to it when rewriting, so even commits with embedded angle brackets round-trip cleanly (might want to expand to other malformed characters before merging? idk). Parsing and serialization honor that flag but still keep strict validation for normal input. I also added regression coverage for these scenarios. Might not be the most elegant solution, as now every ctor/helper that builds signatures or identities explicitly sets raw: None. This is only me taking a stab at it for fun. It is not prod ready but just an idea.
Tries to help with #2177.