[DirectX] Let data scalarizer pass account for sub-types when updating GEP type #166200

inbelic · 2025-11-03T17:36:30Z

This PR lets the `dxil-data-scalarization` pass account for a GEP whose source element type is a sub-type of the pointer operand's type.

The pass now prepends zero indices to the replacement GEP so that its result type stays the same (modulo the vector -> array transform).

Please see the resolved issue for an annotated example.

Resolves: #165473
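
The zero-index padding described above can be sketched outside of LLVM. This is a minimal model and not the pass's code: a type is represented only by its array dimensions over a common scalar (so `[8 x [4 x i32]]` is `{8, 4}`), and `countMissingDims` and `padIndices` are hypothetical helper names introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: an array type is its list of dimensions, outermost
// first, over a common scalar element. {8, 4} stands for [8 x [4 x i32]],
// {4} for [4 x i32], and {} for plain i32.
using Shape = std::vector<uint32_t>;

// Count how many outer dimensions of Full must be peeled off to reach Sub.
// Returns -1 when Sub is not a trailing sub-shape of Full.
int countMissingDims(const Shape &Full, const Shape &Sub) {
  if (Sub.size() > Full.size())
    return -1;
  const size_t Missing = Full.size() - Sub.size();
  for (size_t I = 0; I < Sub.size(); ++I)
    if (Full[Missing + I] != Sub[I])
      return -1;
  return static_cast<int>(Missing);
}

// Prepend one zero index per missing dimension, then the original indices.
std::vector<int64_t> padIndices(int MissingDims,
                                const std::vector<int64_t> &Indices) {
  std::vector<int64_t> Padded(static_cast<size_t>(MissingDims), 0);
  Padded.insert(Padded.end(), Indices.begin(), Indices.end());
  return Padded;
}
```

In this model, `{8, 4}` against `{4}` reports one missing dimension and `{8, 4}` against `{}` reports two, which corresponds to one and two leading `i32 0` indices on the rewritten GEP.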

llvmbot · 2025-11-03T17:37:03Z

@llvm/pr-subscribers-backend-directx

Author: Finn Plummer (inbelic)

Full diff: https://github.com/llvm/llvm-project/pull/166200.diff

3 Files Affected:

(modified) llvm/lib/Target/DirectX/DXILDataScalarization.cpp (+52-16)
(modified) llvm/test/CodeGen/DirectX/scalarize-alloca.ll (+65)
(added) llvm/test/CodeGen/DirectX/scalarize-global.ll (+70)

diff --git a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
index d507d71b99fc9..88a9b5084cba2 100644
--- a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
+++ b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
@@ -304,40 +304,76 @@ bool DataScalarizerVisitor::visitGetElementPtrInst(GetElementPtrInst &GEPI) {
   GEPOperator *GOp = cast<GEPOperator>(&GEPI);
   Value *PtrOperand = GOp->getPointerOperand();
   Type *NewGEPType = GOp->getSourceElementType();
-  bool NeedsTransform = false;
 
   // Unwrap GEP ConstantExprs to find the base operand and element type
-  while (auto *CE = dyn_cast<ConstantExpr>(PtrOperand)) {
-    if (auto *GEPCE = dyn_cast<GEPOperator>(CE)) {
-      GOp = GEPCE;
-      PtrOperand = GEPCE->getPointerOperand();
-      NewGEPType = GEPCE->getSourceElementType();
-    } else
-      break;
+  while (auto *GEPCE = dyn_cast_or_null<GEPOperator>(
+             dyn_cast<ConstantExpr>(PtrOperand))) {
+    GOp = GEPCE;
+    PtrOperand = GEPCE->getPointerOperand();
+    NewGEPType = GEPCE->getSourceElementType();
   }
 
+  Type *const OrigGEPType = NewGEPType;
+  Value *const OrigOperand = PtrOperand;
+
   if (GlobalVariable *NewGlobal = lookupReplacementGlobal(PtrOperand)) {
     NewGEPType = NewGlobal->getValueType();
     PtrOperand = NewGlobal;
-    NeedsTransform = true;
   } else if (AllocaInst *Alloca = dyn_cast<AllocaInst>(PtrOperand)) {
     Type *AllocatedType = Alloca->getAllocatedType();
     if (isa<ArrayType>(AllocatedType) &&
-        AllocatedType != GOp->getResultElementType()) {
+        AllocatedType != GOp->getResultElementType())
       NewGEPType = AllocatedType;
-      NeedsTransform = true;
+  } else
+    return false; // Only GEPs into an alloca or global variable are considered
+
+  // Defer changing i8 GEP types until dxil-flatten-arrays
+  if (OrigGEPType->isIntegerTy(8))
+    NewGEPType = OrigGEPType;
+
+  // If the original type is a "sub-type" of the new type, then ensure the gep
+  // correctly zero-indexes the extra dimensions to keep the offset calculation
+  // correct.
+  // Eg:
+  //  i32, [4 x i32] and [8 x [4 x i32]] are sub-types of [8 x [4 x i32]], etc.
+  //
+  // So then:
+  //   gep [4 x i32] %idx
+  //     -> gep [8 x [4 x i32]], i32 0, i32 %idx
+  //   gep i32 %idx
+  //     -> gep [8 x [4 x i32]], i32 0, i32 0, i32 %idx
+  uint32_t MissingDims = 0;
+  Type *SubType = NewGEPType;
+
+  // The new type will be in its array version, so match accordingly
+  Type *const GEPArrType = equivalentArrayTypeFromVector(OrigGEPType);
+
+  while (SubType != GEPArrType) {
+    MissingDims++;
+
+    ArrayType *ArrType = dyn_cast<ArrayType>(SubType);
+    if (!ArrType) {
+      assert(SubType == GEPArrType && "GEP uses a strange sub-type of alloca/global variable");
+      break;
     }
+
+    SubType = ArrType->getElementType();
   }
 
+
+  bool NeedsTransform = OrigOperand != PtrOperand ||
+                        OrigGEPType != NewGEPType || MissingDims != 0;
+
   if (!NeedsTransform)
     return false;
 
-  // Keep scalar GEPs scalar; dxil-flatten-arrays will do flattening later
-  if (!isa<ArrayType>(GOp->getSourceElementType()))
-    NewGEPType = GOp->getSourceElementType();
-
   IRBuilder<> Builder(&GEPI);
-  SmallVector<Value *, MaxVecSize> Indices(GOp->indices());
+  SmallVector<Value *, MaxVecSize> Indices;
+
+  for (uint32_t I = 0; I < MissingDims; I++)
+    Indices.push_back(Builder.getInt32(0));
+  llvm::append_range(Indices, GOp->indices());
+
   Value *NewGEP = Builder.CreateGEP(NewGEPType, PtrOperand, Indices,
                                     GOp->getName(), GOp->getNoWrapFlags());
 
diff --git a/llvm/test/CodeGen/DirectX/scalarize-alloca.ll b/llvm/test/CodeGen/DirectX/scalarize-alloca.ll
index a8557e47b0ea6..475935d2eb135 100644
--- a/llvm/test/CodeGen/DirectX/scalarize-alloca.ll
+++ b/llvm/test/CodeGen/DirectX/scalarize-alloca.ll
@@ -42,3 +42,68 @@ define void @alloca_2d_gep_test() {
   %3 = getelementptr inbounds nuw [2 x <2 x i32>], ptr %1, i32 0, i32 %2
   ret void
 }
+
+; CHECK-LABEL: subtype_array_test
+define void @subtype_array_test() {
+  ; SCHECK:  [[alloca_val:%.*]] = alloca [8 x [4 x i32]], align 4
+  ; FCHECK:  [[alloca_val:%.*]] = alloca [32 x i32], align 4
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr [[alloca_val]], i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 4
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr [[alloca_val]], i32 0, i32 [[flatidx]]
+  ; CHECK: ret void
+  %arr = alloca [8 x [4 x i32]], align 4
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw [4 x i32], ptr %arr, i32 %i
+  ret void
+}
+
+; CHECK-LABEL: subtype_vector_test
+define void @subtype_vector_test() {
+  ; SCHECK:  [[alloca_val:%.*]] = alloca [8 x [4 x i32]], align 4
+  ; FCHECK:  [[alloca_val:%.*]] = alloca [32 x i32], align 4
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr [[alloca_val]], i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 4
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr [[alloca_val]], i32 0, i32 [[flatidx]]
+  ; CHECK: ret void
+  %arr = alloca [8 x <4 x i32>], align 4
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw <4 x i32>, ptr %arr, i32 %i
+  ret void
+}
+
+; CHECK-LABEL: subtype_scalar_test
+define void @subtype_scalar_test() {
+  ; SCHECK:  [[alloca_val:%.*]] = alloca [8 x [4 x i32]], align 4
+  ; FCHECK:  [[alloca_val:%.*]] = alloca [32 x i32], align 4
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr [[alloca_val]], i32 0, i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 1
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr [[alloca_val]], i32 0, i32 [[flatidx]]
+  ; CHECK: ret void
+  %arr = alloca [8 x [4 x i32]], align 4
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw i32, ptr %arr, i32 %i
+  ret void
+}
+
+; CHECK-LABEL: subtype_i8_test
+define void @subtype_i8_test() {
+  ; SCHECK:  [[alloca_val:%.*]] = alloca [8 x [4 x i32]], align 4
+  ; FCHECK:  [[alloca_val:%.*]] = alloca [32 x i32], align 4
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw i8, ptr [[alloca_val]], i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 1
+  ; FCHECK: [[flatidx_lshr:%.*]] = lshr i32 [[flatidx_mul]], 2
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_lshr]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr [[alloca_val]], i32 0, i32 [[flatidx]]
+  ; CHECK: ret void
+  %arr = alloca [8 x [4 x i32]], align 4
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw i8, ptr %arr, i32 %i
+  ret void
+}
diff --git a/llvm/test/CodeGen/DirectX/scalarize-global.ll b/llvm/test/CodeGen/DirectX/scalarize-global.ll
new file mode 100644
index 0000000000000..ca10f6ece5a85
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/scalarize-global.ll
@@ -0,0 +1,70 @@
+; RUN: opt -S -passes='dxil-data-scalarization' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s --check-prefixes=SCHECK,CHECK
+; RUN: opt -S -passes='dxil-data-scalarization,dxil-flatten-arrays' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s --check-prefixes=FCHECK,CHECK
+
+@"arrayofVecData" = local_unnamed_addr addrspace(3) global [8 x <4 x i32>] zeroinitializer, align 16
+@"vecData" = external addrspace(3) global <4 x i32>, align 4
+
+; SCHECK: [[arrayofVecData:@arrayofVecData.*]] = local_unnamed_addr addrspace(3) global [8 x [4 x i32]] zeroinitializer, align 16
+; FCHECK: [[arrayofVecData:@arrayofVecData.*]] = local_unnamed_addr addrspace(3) global [32 x i32] zeroinitializer, align 16
+; CHECK: [[vecData:@vecData.*]] = external addrspace(3) global [4 x i32], align 4
+
+; CHECK-LABEL: subtype_array_test
+define <4 x i32> @subtype_array_test() {
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 4
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[flatidx]]
+  ; CHECK: [[x:%.*]] = load <4 x i32>, ptr addrspace(3) [[gep]], align 4
+  ; CHECK: ret <4 x i32> [[x]]
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw [4 x i32], ptr addrspace(3) @"arrayofVecData", i32 %i
+  %x = load <4 x i32>, ptr addrspace(3) %gep, align 4
+  ret <4 x i32> %x
+}
+
+; CHECK-LABEL: subtype_vector_test
+define <4 x i32> @subtype_vector_test() {
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 4
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[flatidx]]
+  ; CHECK: [[x:%.*]] = load <4 x i32>, ptr addrspace(3) [[gep]], align 4
+  ; CHECK: ret <4 x i32> [[x]]
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw <4 x i32>, ptr addrspace(3) @"arrayofVecData", i32 %i
+  %x = load <4 x i32>, ptr addrspace(3) %gep, align 4
+  ret <4 x i32> %x
+}
+
+; CHECK-LABEL: subtype_scalar_test
+define <4 x i32> @subtype_scalar_test() {
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw [8 x [4 x i32]], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 0, i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 1
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_mul]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[flatidx]]
+  ; CHECK: [[x:%.*]] = load <4 x i32>, ptr addrspace(3) [[gep]], align 4
+  ; CHECK: ret <4 x i32> [[x]]
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw i32, ptr addrspace(3) @"arrayofVecData", i32 %i
+  %x = load <4 x i32>, ptr addrspace(3) %gep, align 4
+  ret <4 x i32> %x
+}
+
+; CHECK-LABEL: subtype_i8_test
+define <4 x i32> @subtype_i8_test() {
+  ; CHECK: [[tid:%.*]] = tail call i32 @llvm.dx.thread.id(i32 0)
+  ; SCHECK: [[gep:%.*]] = getelementptr inbounds nuw i8, ptr addrspace(3) [[arrayofVecData]], i32 [[tid]]
+  ; FCHECK: [[flatidx_mul:%.*]] = mul i32 [[tid]], 1
+  ; FCHECK: [[flatidx_lshr:%.*]] = lshr i32 [[flatidx_mul]], 2
+  ; FCHECK: [[flatidx:%.*]] = add i32 0, [[flatidx_lshr]]
+  ; FCHECK: [[gep:%.*]] = getelementptr inbounds nuw [32 x i32], ptr addrspace(3) [[arrayofVecData]], i32 0, i32 [[flatidx]]
+  ; CHECK: [[x:%.*]] = load <4 x i32>, ptr addrspace(3) [[gep]], align 4
+  ; CHECK: ret <4 x i32> [[x]]
+  %i = tail call i32 @llvm.dx.thread.id(i32 0)
+  %gep = getelementptr inbounds nuw i8, ptr addrspace(3) @"arrayofVecData", i32 %i
+  %x = load <4 x i32>, ptr addrspace(3) %gep, align 4
+  ret <4 x i32> %x
+}
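
As a sanity check on the SCHECK lines in these tests, the rewritten GEP should address the same byte as the original one. The sketch below models only that arithmetic, assuming a 4-byte `i32` and tightly packed arrays; `ArrayShape` and `gepOffset` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: Dims lists array dimensions outermost first over a
// scalar of ElemBytes bytes, so {8, 4} with ElemBytes = 4 is [8 x [4 x i32]].
struct ArrayShape {
  std::vector<uint64_t> Dims;
  uint64_t ElemBytes;
};

// Byte offset of a GEP over Shape: the first index strides over the whole
// source type, and each later index strides over one nesting level deeper.
uint64_t gepOffset(const ArrayShape &Shape,
                   const std::vector<uint64_t> &Indices) {
  assert(!Indices.empty() && Indices.size() <= Shape.Dims.size() + 1);
  uint64_t Stride = Shape.ElemBytes;
  for (uint64_t D : Shape.Dims)
    Stride *= D; // size of the whole source type
  uint64_t Offset = 0;
  for (size_t I = 0; I < Indices.size(); ++I) {
    Offset += Indices[I] * Stride;
    if (I < Shape.Dims.size())
      Stride /= Shape.Dims[I]; // descend into the element type
  }
  return Offset;
}
```

Under these assumptions, `gep [4 x i32], ptr, i32 %i` and `gep [8 x [4 x i32]], ptr, i32 0, i32 %i` both land at byte `16 * %i`, and `gep i32, ptr, i32 %i` and `gep [8 x [4 x i32]], ptr, i32 0, i32 0, i32 %i` both land at byte `4 * %i`.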

github-actions · 2025-11-03T17:38:28Z

✅ With the latest revision this PR passed the C/C++ code formatter.

inbelic · 2025-11-03T17:56:32Z

llvm/test/CodeGen/DirectX/scalarize-alloca.ll

@@ -42,3 +42,68 @@ define void @alloca_2d_gep_test() {
  %3 = getelementptr inbounds nuw [2 x <2 x i32>], ptr %1, i32 0, i32 %2
  ret void
 }
+


FYI: SCHECK denotes running only the dxil-data-scalarization pass, and FCHECK denotes running both it and the dxil-flatten-arrays pass.
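
The index arithmetic visible in the FCHECK lines (`mul` by 4 or 1, and an `lshr` by 2 in the i8 tests) is consistent with a row-major flattening of `[8 x [4 x i32]]` into `[32 x i32]`. The sketch below reproduces only that arithmetic; it is not the `dxil-flatten-arrays` implementation, and `flattenIndex` and `byteOffsetToElementIndex` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: collapse indices into a nested array with dimensions
// Dims (outermost first) into a single row-major index into the flat array.
// Indices excludes the leading pointer index and may stop short of the
// innermost dimension, in which case it addresses the start of a sub-array.
uint64_t flattenIndex(const std::vector<uint64_t> &Dims,
                      const std::vector<uint64_t> &Indices) {
  assert(Indices.size() <= Dims.size());
  uint64_t Flat = 0;
  for (size_t I = 0; I < Indices.size(); ++I) {
    uint64_t Stride = 1;
    for (size_t J = I + 1; J < Dims.size(); ++J)
      Stride *= Dims[J]; // elements spanned by one step of index I
    Flat += Indices[I] * Stride;
  }
  return Flat;
}

// Hypothetical helper for the i8 case: a byte offset becomes an element
// index by dividing by the element size (a shift by 2 for a 4-byte i32).
uint64_t byteOffsetToElementIndex(uint64_t ByteOffset) {
  return ByteOffset >> 2;
}
```

With dimensions `{8, 4}`, a single index `tid` flattens to `tid * 4` (the array and vector tests), while the pair `0, tid` flattens to `tid * 1` (the scalar test).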

Icohedron

LGTM!

inbelic added 2 commits October 31, 2025 10:21

drive by clean up

ab1327f

[DirectX] Make data scalarization pass account for GEP as a sub-type

993b51a

llvmbot added the backend:DirectX label Nov 3, 2025

touch up

c11205b

inbelic mentioned this pull request Nov 3, 2025

[DirectX] Data scalarization pass does not account for struct types #166203

Open

inbelic changed the title ~~[DirectX] Make data scalarizer pass account sub-types in GEPs~~ [DirectX] Let data scalarizer pass account for sub-types when updating GEP type Nov 3, 2025

Icohedron self-assigned this Nov 3, 2025

Icohedron approved these changes Nov 3, 2025

damyanp assigned farzonl and unassigned Icohedron Nov 4, 2025

farzonl approved these changes Nov 4, 2025

inbelic and others added 4 commits November 5, 2025 09:19

Merge branch 'main' into inbelic/gep

132af9e

Merge branch 'main' into inbelic/gep

c0b7bd7

Merge branch 'main' into inbelic/gep

87a42d7

review: fix up comments

83bc735

Co-authored-by: Deric C. <cheung.deric@gmail.com>

inbelic merged commit 75c09b7 into llvm:main Nov 6, 2025
11 checks passed

inbelic deleted the inbelic/gep branch November 7, 2025 18:11
