Add support for offheap allocation in multianewarray evaluator on IBM Z platform #22797

ehsankianifar · 2025-10-17T15:26:15Z

@r30shah
This PR add support for offheap 2D array allocation. It is calling helper if the second dim is zero length and allocating offheap. I have a few possible solutions for that.

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

ehsankianifar · 2025-10-17T15:40:52Z

Remove WIP when I finish testing

ehsankianifar · 2025-10-17T16:53:19Z

Local tests pass. I remove the WIP and launching jenkin tests.
The compilation logs:
offHeap.log

r30shah

@VermaSh - Can I request your review on this first for off heap?

r30shah

Enhances the multianewarray evaluator to support offheap allocation on
IMB Z platform. Offheap allocation for arrays with a zero-length second
dimension is not supported; in such cases, a helper call is triggered
to handle the allocation.

Please fix typo (IBM Z not IMB Z).

I have not looked into your code - but can you confirm this statement is true or not.

We did inline zero-length first / second dimension array when off-heap is enabled. But changes in this PR will divert those case to helper (Or only zero length second dimension) and only inlines if both dimension is not zero?

ehsankianifar · 2025-10-20T13:52:02Z

Thanks @r30shah! I fixed the commit message. Yes, we were inlining zero length 2D arrays before, then we disable inlining offheap and with this change we enable offheap except for zero length second dim.
A subsequent PR will fix inlining of zero second length arrays.

r30shah · 2025-10-20T13:55:42Z

Is it that complicated to fix for zero second length that you will do in second PR?

ehsankianifar · 2025-10-20T14:10:41Z

Is it that complicated to fix for zero second length that you will do in second PR?

No I have a few ideas but want to select a plan that has minimum impact on the non zero allocation since I expect non-zero to get called more frequently.

VermaSh · 2025-10-20T15:06:32Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

   dependencies->addPostCondition(dim2SizeReg, TR::RealRegister::AssignAny);
-   cursor = generateRXInstruction(cg, TR::InstOpCode::LTGF, node, dim2SizeReg, generateS390MemoryReference(dimsPtrReg, 0, cg), cursor);
-   iComment("Load 2st dim length.");
   // The size of zero length array is already loaded in size register. Jump over the array size calculation instructions if length is 0.


Since we're no longer writing 0 to dim2SizeReg, this comment can be removed.

Thanks shubham. Comment was removed

ehsankianifar · 2025-10-21T17:38:55Z

@r30shah @VermaSh I pushed the changes to support zero length offheap allocation as well as non-zero length.

ehsankianifar · 2025-10-23T13:42:02Z

@hzongaro pointed that we do not inline constant non-zero second dimension 2D arrays (#22562 (review)). I am trying to see why it was added and will remove the condition if it is not necessary anymore.

VermaSh · 2025-10-27T14:43:30Z

@ehsankianifar Is this ready for review or are you still working on adding changes?

ehsankianifar · 2025-10-27T18:48:40Z

@ehsankianifar Is this ready for review or are you still working on adding changes?

Thanks @VermaSh, Yes it is ready for review.

ehsankianifar · 2025-10-28T14:35:11Z

I upload a sample compilation log with constant zero length here for reference:
zeroLength.log
FYI @VermaSh

VermaSh · 2025-10-28T15:23:34Z

Thank you! @ehsankianifar. Can you please share trace logs for constant and non-constant non-zero 2nd dimension?

ehsankianifar · 2025-10-28T20:25:03Z

Thanks @VermaSh,
The node after post lower trees is always the same regardless of the argument types and values. Here is an example of non constant length compilation log:
jitCompilation.log
The node in Pre Instruction Selection is exactly like the const zero length logs.

VermaSh · 2025-11-05T10:11:44Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+   if (isOffHeapAllocationEnabled)
+      {
+      // Store first element's address in data address field.
+      cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, sizeReg, generateS390MemoryReference(resultReg, fej9->getOffsetOfContiguousDataAddrField(), cg), cursor);


If leaf array size is 0, won't we completely miss initializing dataAddr slot for 1st dimension array?

No this code will be executed if the first dimension length is not zero.

ah ok, I got confused by the comments. All clear now

VermaSh

I think we're leaving the dataAddr slot for the first-dimension array uninitialized when the leaf array has size 0.

ehsankianifar · 2025-11-05T15:36:40Z

I think we're leaving the dataAddr slot for the first-dimension array uninitialized when the leaf array has size 0.

Thanks @VermaSh for your review. We populate the data address field in the the first dimension if there is any leaf array (first dim length is not zero). That part of code has no knowledge of the second dim length. Here is a sample I just allocated and checked the created object with first dim 1 and second dim zero (obj[1][0]). Everything looks correct:

VermaSh · 2025-11-05T19:03:35Z

runtime/compiler/env/VMJ9.cpp


   static bool recreateRoot = feGetEnv("TR_LowerMultiANewArrayRecreateRoot") ? true : false;
-#if defined(TR_HOST_POWER)
+#if defined(TR_HOST_POWER) || defined(TR_HOST_S390)


Just a FYI, these changes are in https://github.com/eclipse-openj9/openj9/pull/22562/files as well so there might be a conflict if that PR goes in first.

VermaSh

LGT from off-heap perspective

ehsankianifar · 2025-11-05T19:08:49Z

LGT from off-heap perspective

Thank you for your review. @r30shah can we merge this PR?

r30shah

Some question and one change. Based on your response to the two question I have posed I will continue review.

r30shah · 2025-11-07T20:00:32Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      // Store first element's address in data address field.
+      cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, sizeReg, generateS390MemoryReference(resultReg, fej9->getOffsetOfContiguousDataAddrField(), cg), cursor);
+      // Subtract the header size from the dim2 size, so we can move to the next leaf by adding this value to the address of the leaf's first element.
+      cursor = generateRILInstruction(cg, TR::InstOpCode::SLFI, node, dim2SizeReg, (int32_t)TR::Compiler->om.contiguousArrayHeaderSizeInBytes(), cursor);


Shouldn't we use SLGFI here ? dim2SizeReg is used as 64-bit at all places. Is this an oversight or deliberate ? I do understand that it would be impossible to have dim2SizeReg to be greater than 32-bit so may not cause any functional issue but this would not even save us any instruction.

Since we know that the size is 32 bit max, I am using the upper 32 bits for storing compressed reference to the leaf array. It is not used in this code path and I can use 64bit instruction here but it may cause confusion and inconsistency along code paths.

r30shah · 2025-11-07T20:06:30Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      cursor = generateRILInstruction(cg, TR::InstOpCode::SLFI, node, dim2SizeReg, (int32_t)TR::Compiler->om.contiguousArrayHeaderSizeInBytes(), cursor);
+      // The length of the second dimension is stored in the upper 32 bits of the scratch register.
+      // Comparing the full 64-bit scratch register with its lower 32 bits sets the condition code to COND_BNE if the second dimension is non-zero.
+      cursor = generateRREInstruction(cg, TR::InstOpCode::CGFR, node, scratchReg, scratchReg, cursor);


Hmm, this was one of the reason originally I asked / recommended not to have values fused together before storing it. Can you explain the check here ? Lower half is zero and comparison would upperhalf with lowerhalf and upperhalf and CC is set ?

it is the most efficient way to check if the second dim is zero without extra loading or using extra registers.

It may be, not questioning that. I want to confirm my understanding of the logic here, so that I can continue review.

Sorry I misunderstood the question earlier.
The scratch register stores the first dimension length in its lower 32 bits and the second dimension length in its upper 32 bits. When using CGFR, we compare the full 64-bit value of the scratch register with its 32-bit lower half. These values match only if the upper half (second dimension length) is zero.
Currently, I don’t have a free register to split the dimension lengths, so I must either reuse the upper half of the scratch register or allocate a new register to hold the value.

r30shah · 2025-11-07T20:09:17Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      // The length of the second dimension is stored in the upper 32 bits of the scratch register.
+      // Comparing the full 64-bit scratch register with its lower 32 bits sets the condition code to COND_BNE if the second dimension is non-zero.
+      cursor = generateRREInstruction(cg, TR::InstOpCode::CGFR, node, scratchReg, scratchReg, cursor);
+      // The condition code has already been set and must remain unchanged during the second dimension allocation loop in this code path.


Can you explain why this ? Can this be avoided, in general we should avoid writing the evalutors where condition code is set lot more instructions before. This becomes difficult to debug and fix in case there comes an issue in the future.

Problem Summary

The goal is to store the data start address only when the second dimension length is non-zero. I evaluated several approaches:

Approaches Considered

Separate allocation loops

One loop for zero-length allocations and another for non-zero-length.

Branching is based on the second dimension length.

Drawbacks: Increases code size. Moving the zero-length loop to OOL may introduce regressions.

Unconditional store + post-zeroing

Store the address regardless of length.

After allocation, loop through leaf arrays and zero the address if the length is zero.

Drawbacks: May cause performance regressions due to unnecessary stores and post-processing.

Conditional store inside loop

Check the length in each iteration and store the address only if non-zero.

Drawbacks: Adds overhead due to repeated condition checks.

Selected approach: Pre-loop check + conditional store

Check the length once before entering the loop.

Use a conditional store inside the loop.

Note: Comments were added to ensure no instructions inside the loop alter the condition code. Future contributors are expected to maintain this constraint.

@ehsankianifar Thanks for the detailed response. I am still skeptical of this approach as I do not think a comment like this is sufficient. You set the condition code before entering a loop and then inside a loop after some n instruction, you use that CC to store.
Though given you have shared 4 approaches you have considered so far, I will take a look at the log file once again and see if I can suggest better approach.

This is the log file #22797 (comment) I am looking at. If that is not the one with latest, please upload latest one.

@ehsankianifar - Is the log I pointed out generates the code in this block ? I do not see CGFR instruction in the log.

Thanks @r30shah,
That log is to compare the tree transformation for constant length array vs non-constant array. Here is a compilation log for offHeap enabled allocation:
OffheapComp.log

r30shah

@ehsankianifar

r30shah · 2025-11-10T18:27:27Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      // The length of the second dimension is stored in the upper 32 bits of the scratch register.
+      // Comparing the full 64-bit scratch register with its lower 32 bits sets the condition code to COND_BNE if the second dimension is non-zero.
+      cursor = generateRREInstruction(cg, TR::InstOpCode::CGFR, node, scratchReg, scratchReg, cursor);
+      // The condition code has already been set and must remain unchanged during the second dimension allocation loop in this code path.


@ehsankianifar - Is the log I pointed out generates the code in this block ? I do not see CGFR instruction in the log.

r30shah

@ehsankianifar Can I request you to rebase this one? I am picking the review on this again.

Enhances the multianewarray evaluator to support offheap allocation on IBM Z platform. Offheap allocation for arrays with a zero-length second dimension is not supported; in such cases, a helper call is triggered to handle the allocation. signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>

Enhances the multianewarray evaluator to support zero-length offheap allocation on IBM Z platform. signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>

Previously, inlining of constant 2D arrays with a non-zero second dimension was disabled on z due to limitations in handling such cases. However, recent improvements have resolved those issues, making it safe and beneficial to inline these arrays. Remove the constraint that prevented inlining of non-zero constant second dimension 2D arrays. signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>

Refactored off-heap 2D array allocation logic to improve clarity and maintainability on IBM Z platform. Previously, STOCG was used to store the data address only when the length was non-zero, relying on condition codes to control flow. This approach introduced hidden dependencies on allocator condition codes, making the code harder to read and prone to future bugs. Updated the implementation to allocate non-zero length off-heap arrays inline, while handling zero-length arrays out of line. This removes reliance on condition codes for allocation decisions, simplifies control flow, and ensures consistent behavior across allocation paths. signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>

ehsankianifar · 2025-12-02T22:21:26Z

I moved the allocation of off-heap zero second dimension length leaf arrays OOL. Tests are passing locally and I started a few jenkin tests.
The compilation logs for off-heap 2D array for both comp refs and no comp refs are provided in the flowing logs files:
OffheapComp.log
OffheapNoComp.log

ehsankianifar changed the title ~~Z: Add support for offheap allocation in multianewarray evaluator~~ Add support for offheap allocation in multianewarray evaluator on IBM Z platform Oct 17, 2025

ehsankianifar commented Oct 17, 2025

View reviewed changes

runtime/compiler/z/codegen/J9TreeEvaluator.cpp Show resolved Hide resolved

ehsankianifar marked this pull request as draft October 17, 2025 15:39

ehsankianifar self-assigned this Oct 17, 2025

ehsankianifar requested a review from r30shah October 17, 2025 16:53

ehsankianifar marked this pull request as ready for review October 17, 2025 16:53

jdmpapin added comp:jit arch:z labels Oct 17, 2025

r30shah reviewed Oct 17, 2025

View reviewed changes

r30shah requested changes Oct 17, 2025

View reviewed changes

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch from d4792d2 to f7b0fe2 Compare October 20, 2025 13:47

VermaSh reviewed Oct 20, 2025

View reviewed changes

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch from f7b0fe2 to deaab6d Compare October 20, 2025 16:17

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch 2 times, most recently from f788654 to fa536c3 Compare October 24, 2025 16:29

ehsankianifar requested a review from dsouzai as a code owner October 27, 2025 15:32

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch from 50f14ea to 1618674 Compare October 27, 2025 18:50

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch from fb38ac1 to 1618674 Compare November 4, 2025 20:59

VermaSh reviewed Nov 5, 2025

View reviewed changes

VermaSh approved these changes Nov 5, 2025

View reviewed changes

r30shah reviewed Nov 7, 2025

View reviewed changes

r30shah requested changes Nov 7, 2025

View reviewed changes

r30shah requested changes Nov 10, 2025

View reviewed changes

r30shah reviewed Nov 25, 2025

View reviewed changes

ehsankianifar added 3 commits December 2, 2025 11:53

Z: Add support for inline zero-length offheap 2D array allocation

9229e51

Enhances the multianewarray evaluator to support zero-length offheap allocation on IBM Z platform. signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch 2 times, most recently from 0adb331 to 425601f Compare December 2, 2025 19:24

ehsankianifar force-pushed the Z_AddOffheapSupportInMultianewarrayEvaluator branch from 425601f to e3ab377 Compare December 2, 2025 19:36

update

6258442

Add support for offheap allocation in multianewarray evaluator on IBM Z platform #22797

Are you sure you want to change the base?

Add support for offheap allocation in multianewarray evaluator on IBM Z platform #22797

Uh oh!

Conversation

ehsankianifar commented Oct 17, 2025

Uh oh!

Uh oh!

ehsankianifar commented Oct 17, 2025

Uh oh!

ehsankianifar commented Oct 17, 2025

Uh oh!

r30shah left a comment

Choose a reason for hiding this comment

Uh oh!

r30shah left a comment

Choose a reason for hiding this comment

Uh oh!

ehsankianifar commented Oct 20, 2025

Uh oh!

r30shah commented Oct 20, 2025

Uh oh!

ehsankianifar commented Oct 20, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ehsankianifar commented Oct 21, 2025

Uh oh!

ehsankianifar commented Oct 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VermaSh commented Oct 27, 2025

Uh oh!

ehsankianifar commented Oct 27, 2025

Uh oh!

ehsankianifar commented Oct 28, 2025

Uh oh!

VermaSh commented Oct 28, 2025

Uh oh!

ehsankianifar commented Oct 28, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

VermaSh left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ehsankianifar commented Nov 5, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

VermaSh left a comment

Choose a reason for hiding this comment

Uh oh!

ehsankianifar commented Nov 5, 2025

Uh oh!

r30shah left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ehsankianifar Nov 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

ehsankianifar commented Oct 23, 2025 •

edited

Loading

VermaSh left a comment •

edited

Loading

ehsankianifar Nov 10, 2025 •

edited

Loading