Commit 995eaed
Fix `gemm` example on CUDA 13.0.
`#[address_space(shared)] static mut` is used to model GPU shared
memory. It's a bit weird. In particular, GPU shared memory is
uninitialized, but `static mut` requires an initializer in Rust. `gemm`
uses a zero initializer, but this initializer is ignored by NVVM. At
least, it was in CUDA 12.x, but in CUDA 13.0 the `gemm` example fails
with this error:
```
thread 'rustc' panicked at crates/rustc_codegen_nvvm/src/nvvm.rs:120:9:
Malformed NVVM IR program rejected by libnvvm, dumping verifier log:
error: Error: : Global Variable `_ZN12gemm_kernels10gemm_tiled10gemm_tiled6TILE_A17hc9c66e758c373a7eE':
context: @_ZN12gemm_kernels10gemm_tiled10gemm_tiled6TILE_A17hc9c66e758c373a7eE = internal unnamed_addr addrspace(3) global <{ [1024 x i8] }> zeroinitializer, align 4
Shared variables can't be initialized
```
This memory looks like it's initialized to zero but isn't, and then is
written and read normally. This is incredibly dodgy and very likely UB.
The proper way to deal with uninitialized memory in Rust is with
`MaybeUninit`, and there are strict rules around its use, e.g. writes
must be done with `write`, and `assume_init` must be used on values
after they are written.
This commit changes `gemm` to use `MaybeUninit` for the shared memory.
This fixes the error on CUDA 13.0 and the example runs correctly.
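
For illustration, the `MaybeUninit` pattern described above can be sketched on the host roughly as follows. This is not the code from this commit: `TILE`, `TILE_LEN` and `fill_and_sum` are hypothetical names, and the `#[address_space(shared)]` attribute is left off so that the snippet compiles without rust-cuda.

```rust
use core::mem::MaybeUninit;
use core::ptr::addr_of_mut;

// Hypothetical tile size, chosen only for this sketch.
const TILE_LEN: usize = 16;

// Host-side stand-in for a shared-memory tile. In a kernel this static would
// also carry `#[address_space(shared)]` (omitted here, an assumption of the
// sketch). `MaybeUninit::uninit()` satisfies the initializer requirement
// without claiming the memory holds any particular value.
static mut TILE: MaybeUninit<[f32; TILE_LEN]> = MaybeUninit::uninit();

/// Writes `value` to every element of the tile, then reads the tile back
/// and returns the sum of its elements.
///
/// Safety: the caller must ensure nothing else accesses `TILE` concurrently.
unsafe fn fill_and_sum(value: f32) -> f32 {
    unsafe {
        // Write through raw pointers; no reference to uninitialized data
        // is ever created.
        let base = addr_of_mut!(TILE) as *mut f32;
        for i in 0..TILE_LEN {
            base.add(i).write(value);
        }
        // Every element has been written, so `assume_init` is sound here.
        let tile: [f32; TILE_LEN] = addr_of_mut!(TILE).read().assume_init();
        tile.iter().sum()
    }
}
```

The point of the sketch is the ordering: every element is written before `assume_init` is called, and the static is only ever touched through raw pointers.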
(This is the only executed use of GPU shared memory in rust-cuda. There
is a `shared_array!` macro defined but it's only used in a compiletest
where it is compiled but not run. That macro is extremely dubious but I
will deal with it in a separate PR because it's not necessary to get
CUDA 13.0 working.)

1 parent 2623e21 commit 995eaed
2 files changed: +25 -7 lines