-
Notifications
You must be signed in to change notification settings - Fork 14.1k
const-eval: always do mem-to-mem copies if there might be padding involved #148967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
rustbot has assigned @JonathanBrouwer. Use |
|
@bors try |
This comment has been minimized.
This comment has been minimized.
…try> const-eval: always do mem-to-mem copies if there might be padding involved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
c4acb77 to
01194d7
Compare
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (78c81ee): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -3.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -2.7%, secondary -9.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -1.1%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 472.272s -> 472.014s (-0.05%) |
|
Uh okay I guess this is actually good for perf.^^ At least for the benchmarks we have. The copy apparently gets a little cheaper, but we force more things to use the less efficient in-memory representation. The latter just does not seem to matter in our benchmarks.
Just to be safe:
@craterbot check
|
|
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
01194d7 to
472364c
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Most of the performance regressions are from the coercions benchmark. All it does is create an array of a large number of string literals in const. Why did this benchmark's performance regress? There is no padding involved in any of the types. |
|
This does look slightly better, but not by much. The extra check we add is utterly trivial now when the type is a wide pointer, not sure why that would show up at all -- maybe LLVM is just having a harder time optimizing this code now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me with one nit/question.
We need to wait for crater & lang team
4ba01da to
6d3267d
Compare
6d3267d to
4a3e937
Compare
|
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
I just realized this has the interesting consequence of making the following code UB according to Miri: #![feature(core_intrinsics)]
#![feature(custom_mir)]
use std::intrinsics::mir::*;
#[custom_mir(dialect = "runtime", phase = "optimized")]
fn test(x: (i64, i8)) {
mir! {
{
x = x;
Return()
}
}
}
fn main() {
test((0, 1));
}I think that's fine, many such assignments where LHS and RHS overlap are already UB -- but ScalarPair types have been exempt so far. However this means we have to update the docs for which MIR assignments allow the LHS and RHS to overlap, and which do not. (Note that this only affects MIR-level assignments. In the surface language obviously arbitrary overlap is allowed; MIR building introduces copies to avoid UB.) |
|
🎉 Experiment
Footnotes
|
|
2k spurious results seems like a lot |
|
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
In terms of those 6 regressions
So, those all seem spurious too. |
|
The "no resolution for an import" one is #147966 |
|
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
🎉 Experiment
Footnotes
|
Just checking here: is this a typed copy or an untyped copy? My instinct here is that on a typed copy we need to Basically, it's my understanding that this is UB: use std::mem::MaybeUninit;
#[derive(Debug, Copy, Clone)]
struct Demo(u8, u16);
let first = MaybeUninit::<Demo>::zeroed();
let mut second = MaybeUninit::<Demo>::uninit();
*second.as_mut_ptr() = *first.as_ptr(); // <-- HERE
let demo: u32 = std::mem::transmute(second);
dbg!(demo);But is this changing the HERE copy to copy the uninit bytes too? Or am I misunderstanding which copies we're talking about? Oh, maybe a better example because it's a hard error from CTFE right now: let x = const { unsafe {
use std::mem::MaybeUninit;
#[derive(Debug, Copy, Clone)]
struct Demo(u8, u16);
let first = MaybeUninit::<Demo>::zeroed();
let mut second = MaybeUninit::<Demo>::uninit();
if true {
// Typed copy causes hard error
*second.as_mut_ptr() = *first.as_ptr();
} else {
// Untyped copy is fine
second.as_mut_ptr().copy_from_nonoverlapping(first.as_ptr(), 1);
}
let demo: u32 = std::mem::transmute(second);
demo
} };
dbg!(x);So if that typed copy in CTFE changed to a mem-to-mem copy (which I think is essentially an untyped copy?) that'd no longer be a hard error, which seems like potentially not what we'd want? I might be completely misunderstanding this issue, though 😬 |
We are talking about what happens inside the interpreter on a typed copy. For performance reason, those are not implemented exactly -- they are implemented as in a way that is equivalent for non-UB code, but fails to detect some UB. (Specifically, for many but not all types, they are implemented as just an untyped copy.) That was the idea, anyway. It turns out the difference is also observable without UB because, due to LLVM and ultimately system linker limitations, pointer fragments may not occur in the final value of a const. We could fix this by doing proper, full typed copies all the time, but that would be expensive. So what this PR proposed is to instead make behavior consistent in a different way that still prevents leaking details of the layout computation code.
Indeed, under this PR, we no longer detect the UB in that code. We can't have everything... and which const UB we detect is not guaranteed to be stable. |
|
I guess here is another way to frame it: Generally, it is correct for a Rust implementation to implement a typed copy via an untyped copy. That just reduces how much UB there is. The const-eval interpreter, at the moment, performs something very close to proper typed copies for types that happen to get However, for types with padding, due to the "no pointer fragments in final value" check, whether we do a typed copy or not becomes actually observable in UB-free code. This can go both ways: there are cases where doing the typed copy makes more code compile (because we avoid copying pointer fragments in padding that later cause problems), and there are cases where doing the typed copy makes less code compile (because the pointer fragments in padding that we copy can combine with other pointer fragments to form a full pointer which satisfies the "no pointer fragments" check). So to make things behave consistently, I propose we have const-eval always fall back to an untyped copy when there is padding. This means we miss some UB, but it ensures behavior is independent of rustc layout choices. |
|
After some more discussion with @Nadrieril, let me try to phrase this in terms of what we might say in the reference (instead of phrasing this in terms of breaking changes, which is what I did so far). I will assume that we do want to keep supporting pointer fragments (since they make low-level unsafe code during const-eval behave more like at runtime). First, a recap of some relevant facts:
As a consequence, we can't know which bytes in the result are padding, and if we see a pointer fragment, we have to hard error. For the Language Reference, we seem to have two choices:
|
|
At the last t-lang meeting I expressed confusion as to what exactly was being asked of t-lang here. Thanks to @RalfJung answering my questions I now have the following understanding, if you'll allow me to be pedantic about what is and isn't specified: ConstraintsConstraint 1: The memory we get from const-eval is untyped. For root allocations (e.g. a Constraint 2: The memory we get from const-eval is not fully defined by the Rust Abstract Machine (i.e. the Rust runtime specification). The Rust Abstract Machine specifies how a Rust program behaves on execution, but doesn't specify it completely: for example, on a typed copy for a type with padding, the AM does not specify what happens to the padding bytes. A concrete implementation like const-eval can and must make choices for such cases. Constraint 3: Codegen can't give a sensible byte value to AM-defined ptr fragments found in const memory. If the AM says that a byte in memory is a pointer fragment, the only legal value it can take would be the corresponding byte of the final codegenned address. We can't zero it or uninitialize it or anything like that. However rustc never learns the final address (it is handled by linker shenanigans that don't work on pointer fragments), so if we encounter such a byte we must raise a compile error. Lang consequencesPutting all of that together, since we can't know which byte is AM-defined vs implementation-dependent, we must error on any ptr fragment found in const-eval memory. This includes pointer fragments that happen to be there because of implementation choices. Despite the byte "not mattering" (reading it would be UB, so we could replace it with any value), we can't know that it doesn't matter. Hence within-spec implementation choices of the const-eval interpreter necessarily affect which programs compile. Therefore, the least that the language could commit to would be:
I shall take this to be the de-facto current state of the language specification on this topic, even though this isn't in the Reference. T-lang asksThis is where most of my confusion resided: there are two intertwined questions that sound very similar. Question 1: Assuming that t-lang is ok with the Reference only saying "undefined const-eval bytes may cause compilation to fail", the question this PR asks of t-lang is a vibe-check about an implementation detail that affects which programs compile. This may change again in the future if the breakage stays low. Question 2: T-lang may also choose to make the new behavior part of the Reference/Rust spec. In other words, the Reference would say something like "const-eval executes the program according to the given modified Abstract Machine/specification: ..." and the bytes in const-eval memory would then be fully pinned-down by the spec. And to make question 2 more tricky, if we're strengthening the const-eval AM one could argue that the more natural choice is "typed copies uninitialize padding", which I understand to have negative perf consequences. EDIT: Ah I hadn't seen Ralf's comment that already summarized our discussion. We're saying the same thing. |
This is the final piece of the puzzle for #148470: when copying data of a type that has padding, always do a mem-to-mem copy, so that we always preserve the source padding exactly. That prevents rustc implementation choices from leaking into user-visible behavior.
This is technically a breaking change: the example at the top of #148470 no longer compiles with this. However, it seems very unlikely that anyone would have dependent on this. My main concern is not backwards compatibility, it is performance.
Fixes #148470
Originally posted by @RalfJung in #148470
Originally posted by @RalfJung in #148470
Related: