Initialize dataAddr field only for non-zero length array #20869

VermaSh · 2024-12-23T19:10:20Z

Update the array inline allocation sequence to initialize the dataAddr field only for non-zero length arrays. The field should be left blank (NULL) for zero length arrays. Additionally, this clears the padding field after the size field.

VermaSh · 2024-12-23T19:17:29Z

Off-heap personal build, upstream personal build. My off-heap personal build is looking good so far, only 1 unrelated failure. I'll remove the WIP tag once both builds have passed without off-heap related failures.

cc: @r30shah / @zl-wang / @dmitripivkine

VermaSh · 2024-12-24T18:13:59Z

Launched new non-off-heap personal build to verify PR changes

VermaSh · 2025-01-03T19:19:24Z

Relaunched non off-heap build

VermaSh · 2025-01-07T05:45:29Z

I have verified the sequence for x and z through a unit test. Need to do the same for power and aarch64 before removing WIP tag.

VermaSh · 2025-01-07T22:26:11Z

Launched personal non off-heap build for power.

VermaSh · 2025-01-08T18:22:07Z

Kept only z here as it is ready for review, other platforms require more testing. @r30shah Can I please get a review?

r30shah

Checking out the changes. Skimming through them, I had some initial questions, which I have posted.

r30shah · 2025-01-08T20:03:51Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      if (isOffHeapAllocationEnabled)
+         {
+         TR_ASSERT_FATAL_WITH_NODE(node,
+            TR::InstOpCode::LGF == loadDimLenOpCode,


Can you explain why we have this assert (checking that loadDimLenOpCode is LGF)? That variable is initialized to LGF at line 4891 and I do not see it ever updated.

The dataAddr field should be 0 for 0 size arrays. Since size will already be 0, I figured we can use that to zero out the dataAddr field. The issue is that size is 4 bytes wide whereas dataAddr is 8 bytes wide, so in order to use the same register I need to clear the top 4 bytes of firstDimLenReg. I added the assert to ensure that size is sign extended to 64 bits when being loaded into firstDimLenReg. Same logic for the second LGF assert.

I do get the reason behind using the LGF instruction. Do you mean that the assert safeguards against someone accidentally changing the code to use another instruction to load firstDimLenReg? If so, instead of adding this assert, which I feel is not needed, we should put a comment where you load firstDimLenReg explaining why LGF is used instead of LG, and warn about the potential functional issue.

I have added a comment explaining why LGF is used instead of LG.
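The register-width argument in this thread can be illustrated with a small model. This is only a sketch: `Reg64`, `loadLGF`, `loadL`, and `dataAddrForZeroLengthArray` are hypothetical names that mimic the documented behaviour of the LGF, L, and STG instructions; they are not part of the compiler code under review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit general purpose register.
using Reg64 = std::uint64_t;

// Models LGF: load a signed 32-bit value and sign-extend it to all 64 bits.
inline Reg64 loadLGF(std::int32_t mem)
   {
   return static_cast<Reg64>(static_cast<std::int64_t>(mem));
   }

// Models L: only the low 32 bits are replaced; the high 32 bits keep
// whatever the register held before.
inline Reg64 loadL(Reg64 oldValue, std::int32_t mem)
   {
   return (oldValue & 0xFFFFFFFF00000000ULL) | static_cast<std::uint32_t>(mem);
   }

// Models STG of the length register into the 8-byte dataAddr slot on the
// zero-length path. The result is NULL only if the high half is also zero.
inline std::uint64_t dataAddrForZeroLengthArray(Reg64 lenReg)
   {
   assert(static_cast<std::uint32_t>(lenReg) == 0 && "only valid on the zero-length path");
   return lenReg;
   }
```

With a zero length, `loadLGF(0)` yields a register that is zero in all 64 bits, so reusing it for the 8-byte store writes NULL. A register loaded with `loadL` can still carry stale high bits, which is the functional issue the comment is meant to warn about.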

r30shah · 2025-01-08T20:06:23Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

-      }
+      if (isOffHeapAllocationEnabled)
+         {
+         TR_ASSERT_FATAL_WITH_NODE(node,


Same question as before.

VermaSh · 2025-01-14T15:38:51Z

@r30shah Can I please get another review?

r30shah

Thanks Shubham for making the changes. I think overall things look good to me. I have left some questions, comments, and nitpicks, and would appreciate it if you could address them. Also, can you share an instruction selection log to ensure the ICF is correct?

r30shah · 2025-01-14T15:04:48Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

@@ -4871,7 +4871,6 @@ static TR::Register * generateMultianewArrayWithInlineAllocators(TR::Node *node,

 #if defined(J9VM_GC_SPARSE_HEAP_ALLOCATION)
   bool isOffHeapAllocationEnabled = TR::Compiler->om.isOffHeapAllocationEnabled();


I understand that this part of the code already exists and that this PR does not change it. But since we are changing the code here, can I request moving the initialization of isOffHeapAllocationEnabled closer to where it is used? With this change it is first used at line 4963. This would also help remove the unnecessary defined check here and bring the variable's declaration and initialization closer to its use.

r30shah · 2025-01-14T15:06:37Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

-   // In the mainline, first load the first and second dimensions' lengths into registers.
+   /* In the mainline, first load the first and second dimensions' lengths into registers.
+    *
+    * LGF is used instead of LG so that array size register can be used to NULL dataAddr


Should be L here right? size field is 4 bytes.

Ah yes, you are right, I'll update the code.

r30shah · 2025-01-14T15:19:27Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+   if (TR::Compiler->om.compressObjectReferences())
+      {
+      // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
+      cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, firstDimLenReg, generateS390MemoryReference(targetReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg));


The comment says clear padding, but you are storing firstDimLenReg, which holds the array size. Can you clarify, and clean up the comment if needed?

Also, why is static_cast used inconsistently throughout the code? Maybe remove it from here (there is an implicit cast to int32_t anyway).

r30shah · 2025-01-14T15:19:41Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      if (isOffHeapAllocationEnabled)
+         {
+         cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfDiscontiguousDataAddrField(), cg));
+         iComment("Clear 1st dim dataAddr field.");


Same comment as before.

Because we are dealing with 0 size array at this point, and we used LGF to load dimension size, firstDimLenReg will be 0 so I figured we could use that to clear dataAddr field.

OK. This does feel like an implicit assumption. If the register gets modified in the future, we end up with wrong code, and this is tied to the use of LGF to load the dimension register. I am OK with this change, but make sure that we write this assumption down explicitly (that the dimension register should contain NULL).

Yup, that's why I had the assert initially. I'll update the comment with that assumption.

r30shah · 2025-01-14T15:20:27Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

@@ -5026,6 +5020,33 @@ static TR::Register * generateMultianewArrayWithInlineAllocators(TR::Node *node,
   iComment("Init 1st dim class field.");
   cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfContiguousArraySizeField(), cg));
   iComment("Init 1st dim size field.");
+   if (!TR::Compiler->om.compressObjectReferences())
+      { // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.


Put this comment on a new line, not beside the brace.

r30shah · 2025-01-14T15:24:15Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+   if (TR::Compiler->om.compressObjectReferences())
+      {
+      // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
+      cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, secondDimLenReg, generateS390MemoryReference(temp2Reg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg));


Same question as before with firstDimLenReg.

Because we are dealing with 0 size array at this point, and we used LGF to load dimension size, secondDimLenReg will be 0 so I figured we could use that to clear dataAddr field.

r30shah · 2025-01-14T15:25:28Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

      iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg,
                      generateS390MemoryReference(resReg, fej9->getOffsetOfDiscontiguousArraySizeField(), cg), iCursor);
+      // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
+      iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg, generateS390MemoryReference(resReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg), iCursor);


Can you explain the comment ?

Extra padding is added to the array header for 8 byte boundary alignment; so with that comment I just meant that we add 4 to size field offset to get to the padding. I'll update the comment to be more clear.

OK, does enumReg contain 0?

So this works because if enumReg isn't 0 (i.e. we aren't dealing with 0 size array), whatever we write at getOffsetOfDiscontiguousArraySizeField will be overwritten by the element or dataAddr field (if present). But I think using XC might provide better readability. I'll update the code.
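To make the "+4" in the comment concrete, here is a minimal sketch of a header layout. It assumes a full-reference (non-compressed) contiguous header with an 8-byte class pointer, a 4-byte size, 4 bytes of padding, and an 8-byte dataAddr; `HypotheticalArrayHeader` and `initSizeAndPadding` are illustrative names and not the VM's actual definitions. The helper zeroes the padding explicitly, in the spirit of the XC suggestion, instead of relying on the value of the size register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout, assumed for illustration only.
struct HypotheticalArrayHeader
   {
   std::uint64_t clazz;     // class pointer
   std::uint32_t size;      // number of elements, 4 bytes wide
   std::uint32_t padding;   // keeps dataAddr on an 8-byte boundary
   std::uint64_t dataAddr;  // NULL for zero-length arrays
   };

// The padding starts right after the 4-byte size field.
static_assert(offsetof(HypotheticalArrayHeader, padding) ==
              offsetof(HypotheticalArrayHeader, size) + 4,
              "padding is at size offset + 4");
static_assert(offsetof(HypotheticalArrayHeader, dataAddr) % 8 == 0,
              "dataAddr is 8-byte aligned");

// Stores the size, then clears the padding with an explicit zero fill.
inline void initSizeAndPadding(unsigned char *header, std::uint32_t numElements)
   {
   assert(header != nullptr);
   const std::size_t sizeOffset = offsetof(HypotheticalArrayHeader, size);
   std::memcpy(header + sizeOffset, &numElements, sizeof(numElements));
   std::memset(header + sizeOffset + sizeof(numElements), 0, 4);
   }
```

An explicit clear does not depend on what the size register happens to contain, which is the readability point raised in the reply above.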

r30shah · 2025-01-14T15:40:56Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+               // Load address of first array element
+               iCursor = generateRXInstruction(cg, TR::InstOpCode::LA, node, dataSizeReg, dataAddrMR, iCursor);
+               // Use cc from CFI to load NULL into dataSizeReg for 0 size array
+               iCursor = generateRRFInstruction(cg, TR::InstOpCode::LOCGR, node, dataSizeReg, offsetReg, getMaskForBranchCondition(TR::InstOpCode::COND_BE), true, iCursor);


Can you explain why we use the getMaskForBranchCondition function call here rather than passing the mask value 0x8 directly?

Just thought it was better for overall readability.
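For readers unfamiliar with the mask, the CFI and LOCGR pair can be modelled as follows. This is a sketch under assumptions: the function names are hypothetical, and only the architectural rule that a 4-bit mask selects condition codes (bit 8 for CC0, 4 for CC1, 2 for CC2, 1 for CC3) is taken as given.

```cpp
#include <cassert>
#include <cstdint>

// Condition code after a signed compare:
// 0 = equal, 1 = first operand low, 2 = first operand high.
inline int compareSigned(std::int32_t lhs, std::int32_t rhs)
   {
   if (lhs == rhs)
      return 0;
   return lhs < rhs ? 1 : 2;
   }

// A 4-bit mask selects condition codes: 0x8 -> CC0, 0x4 -> CC1, 0x2 -> CC2, 0x1 -> CC3.
inline bool maskSelects(unsigned mask, int cc)
   {
   assert(cc >= 0 && cc <= 3);
   return (mask & (0x8u >> cc)) != 0;
   }

// The "equal" mask, i.e. the 0x8 value discussed in the review.
constexpr unsigned kMaskEqual = 0x8;

// Models the sequence XGR / CFI / LA / LOCGR from the diff.
inline std::uint64_t computeDataAddr(std::uint64_t arrayBase,
                                     std::uint64_t headerSize,
                                     std::int32_t numElements)
   {
   std::uint64_t zeroReg = 0;                        // XGR offsetReg, offsetReg
   int cc = compareSigned(numElements, 0);           // CFI enumReg, 0
   std::uint64_t dataAddr = arrayBase + headerSize;  // LA does not change the CC
   if (maskSelects(kMaskEqual, cc))                  // LOCGR loads only when CC0
      dataAddr = zeroReg;
   return dataAddr;
   }
```

For a zero-length array the conditional load replaces the computed address with NULL; for any other length the address of the first element is kept.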

r30shah · 2025-01-14T15:46:20Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+               iCursor = generateRRInstruction(cg, TR::InstOpCode::XGR, node, offsetReg, offsetReg, iCursor);
+               iCursor = generateRILInstruction(cg, TR::InstOpCode::CFI, node, enumReg, 0, iCursor);
+
+               dataAddrMR = generateS390MemoryReference(resReg, TR::Compiler->om.contiguousArrayHeaderSizeInBytes(), cg);


Just going through the code, do we even need the memory reference variable? If not, maybe simply call generateS390MemoryReference directly in the instruction where it is used.

r30shah · 2025-01-14T15:49:13Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+                  if (comp->getOption(TR_TraceCG))
+                     traceMsg(comp, "Node (%p): Clean out dataAddr field.\n", node);
+
+                  uint16_t bytesToClear = static_cast<uint16_t>(TR::Compiler->om.discontiguousArrayHeaderSizeInBytes() - fej9->getOffsetOfDiscontiguousDataAddrField()) - 1;


This is rather confusing. I understand the reason behind subtracting 1, but I think the code becomes more readable if you simply subtract 1 where bytesToClear is used in the XC instruction generation.
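The suggestion amounts to keeping the byte count as a plain count and applying the "minus 1" only at the point of encoding. A minimal sketch, assuming the usual SS-format convention that the 8-bit length field of XC holds the number of bytes minus one; both helper names and the sizes in the usage note are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Number of header bytes from the dataAddr field to the end of the header.
inline std::uint16_t bytesFromDataAddrToHeaderEnd(std::uint16_t headerSize,
                                                  std::uint16_t dataAddrOffset)
   {
   assert(headerSize > dataAddrOffset);
   return static_cast<std::uint16_t>(headerSize - dataAddrOffset);
   }

// Converts a byte count into the length code an SS-format instruction expects.
// Valid counts are 1 through 256.
inline std::uint8_t encodeSSLength(std::uint16_t byteCount)
   {
   assert(byteCount >= 1 && byteCount <= 256);
   return static_cast<std::uint8_t>(byteCount - 1);
   }
```

With illustrative sizes, `bytesFromDataAddrToHeaderEnd(24, 16)` gives 8 bytes to clear, and `encodeSSLength(8)` gives the length code 7 at the point where the instruction is generated.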

VermaSh · 2025-01-17T20:43:31Z

@r30shah I have updated the PR based on your suggestions, can I please get another review.

VermaSh · 2025-01-17T20:43:56Z

> Also can you share an instruction selection log to ensure the ICF is correct?

I'll share the instruction selection log shortly.

VermaSh · 2025-01-17T21:12:30Z

Is this what you were looking for?
For variable length non 0 size array:

     0x3fcc3df7452 00000062 [     0x3fcbb162460]                                                   Label L0036: # (Start of internal control flow)
     0x3fcc3df7452 00000062 [     0x3fcbb12cfe0] a7 b4 00 eb                                       BRC     BRNP(0xb), Outlined Label L0032, labelTargetAddr=0x000003FCC3DF7628
     0x3fcc3df7456 00000066 [     0x3fcbb12d0c0] a7 0b 00 17                                       AGHI    GPR0,0x17
     0x3fcc3df745a 0000006a [     0x3fcbb12d1a0] a5 07 ff f8                                       NILL    GPR0,0xfff8
     0x3fcc3df745e 0000006e [     0x3fcbb12d340] e3 00 d0 60 00 08                                 AG      GPR0,#415 96(GPR13)
     0x3fcc3df7464 00000074 [     0x3fcbb12d4d0] e3 00 d0 68 00 21                                 CLG     GPR0,#416 104(GPR13)
     0x3fcc3df746a 0000007a [     0x3fcbb12d5a0] a7 24 00 df                                       BRC     BL(0x2), Outlined Label L0032, labelTargetAddr=0x000003FCC3DF7628
     0x3fcc3df746e 0000007e [     0x3fcbb12d740] e3 90 d0 60 00 04                                 LG      GPR9,#417 96(GPR13)
     0x3fcc3df7474 00000084 [     0x3fcbb12d8d0] e3 00 d0 60 00 24                                 STG     GPR0,#418 96(GPR13)
     0x3fcc3df747a 0000008a [     0x3fcbb161960] c0 88 01 bc 3a 00                                 IIHF    GPR8,29112832
     0x3fcc3df7480 00000090 [     0x3fcbb161b30] e3 80 90 00 00 24                                 STG     GPR8,#419 0(GPR9)
     0x3fcc3df7486 00000096 [     0x3fcbb161c70] b9 82 00 11                                       XGR     GPR1,GPR1
     0x3fcc3df748a 0000009a [     0x3fcbb161d50] c2 8d 00 00 00 00                                 CFI     GPR8,0
     0x3fcc3df7490 000000a0 [     0x3fcbb161fe0] 41 00 90 10                                       LA      GPR0,#420 16(GPR9)
     0x3fcc3df7494 000000a4 [     0x3fcbb1620b0] b9 e2 80 01                                       LOCGR   GPR0,0000000000000008,GPR1
     0x3fcc3df7498 000000a8 [     0x3fcbb162190] e3 00 90 08 00 24                                 STG     GPR0,#421 8(GPR9)
     0x3fcc3df749e 000000ae [     0x3fcbb162320] e3 20 91 00 00 36                                 PFD     2, #422 256(GPR9) # Prefetch for store
     0x3fcc3df74a4 000000b4 [     0x3fcbb162b00]                                                   assocreg
     0x3fcc3df74a4 000000b4 [     0x3fcbb162540]                                                   Label L0033: # (End of internal control flow)

method.log
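One plausible reading of the `AGHI GPR0,0x17` / `NILL GPR0,0xfff8` pair in the log is the usual round-up of the allocation size to an 8-byte boundary. The sketch below assumes that GPR0 holds the data size in bytes at that point and that the header is 16 bytes (suggested by `LA GPR0, 16(GPR9)`); neither is confirmed by the thread.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kHeaderSize = 16; // assumption, read off LA GPR0,16(GPR9)
constexpr std::uint64_t kAlignment = 8;   // object alignment in bytes

// 0x17 is the header size plus (alignment - 1).
static_assert(kHeaderSize + kAlignment - 1 == 0x17, "matches the AGHI immediate");

// Adds the header, then rounds the total up to the next multiple of 8.
inline std::uint64_t alignedAllocationSize(std::uint64_t dataBytes)
   {
   assert(dataBytes <= UINT64_MAX - (kHeaderSize + kAlignment - 1));
   std::uint64_t padded = dataBytes + kHeaderSize + (kAlignment - 1); // AGHI GPR0,0x17
   return padded & ~(kAlignment - 1);                                 // NILL GPR0,0xfff8
   }
```

Under these assumptions, 1 to 8 bytes of data give a 24-byte allocation and 9 bytes give 32.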

r30shah · 2025-01-17T21:25:43Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

+      if (isOffHeapAllocationEnabled)
+         {
+         cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfDiscontiguousDataAddrField(), cg));
+         iComment("Clear 1st dim dataAddr field.");


OK. This does feel like an implicit assumption. If the register gets modified in the future, we end up with wrong code, and this is tied to the use of LGF to load the dimension register. I am OK with this change, but make sure that we write this assumption down explicitly (that the dimension register should contain NULL).

r30shah · 2025-01-17T21:28:26Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

      iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg,
                      generateS390MemoryReference(resReg, fej9->getOffsetOfDiscontiguousArraySizeField(), cg), iCursor);
+      // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
+      iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg, generateS390MemoryReference(resReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg), iCursor);


OK, does enumReg contain 0?

r30shah · 2025-01-17T21:37:11Z

runtime/compiler/z/codegen/J9TreeEvaluator.cpp

-      cursor = generateS390BranchInstruction(cg, TR::InstOpCode::BRC, TR::InstOpCode::COND_BRC, node, populateFirstDimDataAddrSlot);
-      }
-   else
+   bool isOffHeapAllocationEnabled = TR::Compiler->om.isOffHeapAllocationEnabled();


Need proper indentation here.

VermaSh · 2025-01-17T23:59:28Z

@r30shah I have updated the PR based on your suggestions. Can I please get another review?


VermaSh force-pushed the inline_array_allocation_z branch from ff0ba22 to 5938046 Compare December 23, 2024 19:13

VermaSh marked this pull request as draft December 23, 2024 19:17

VermaSh changed the title ~~Initialize dataAddr field only for non-zero length array~~ WIP: Initialize dataAddr field only for non-zero length array Dec 23, 2024

VermaSh force-pushed the inline_array_allocation_z branch from 5938046 to 54bf0a5 Compare December 24, 2024 18:11

VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from da4a877 to 9b31e29 Compare December 27, 2024 16:50

VermaSh force-pushed the inline_array_allocation_z branch 8 times, most recently from 1317bac to c3c566c Compare January 3, 2025 19:10

VermaSh force-pushed the inline_array_allocation_z branch 5 times, most recently from 648f364 to 068425c Compare January 7, 2025 05:42

VermaSh force-pushed the inline_array_allocation_z branch from 068425c to 6d13fde Compare January 7, 2025 22:15

VermaSh force-pushed the inline_array_allocation_z branch 3 times, most recently from 71bbabb to 8f03946 Compare January 8, 2025 18:20

VermaSh changed the title ~~WIP: Initialize dataAddr field only for non-zero length array~~ Initialize dataAddr field only for non-zero length array Jan 8, 2025

VermaSh marked this pull request as ready for review January 8, 2025 18:21

r30shah requested changes Jan 8, 2025


VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from 861e92d to f4ce058 Compare January 13, 2025 15:15

r30shah reviewed Jan 14, 2025


VermaSh force-pushed the inline_array_allocation_z branch from f4ce058 to fe5e34c Compare January 17, 2025 20:41

r30shah requested changes Jan 17, 2025


VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from 961e9c7 to 152a6d8 Compare January 17, 2025 23:57

z: NULL initialize dataAddr field for 0 size arrays

bbd1f85

Update array inline allocation sequence to initialize dataAddr field only for non-zero size arrays. Field should be left blank for zero size arrays. Signed-off-by: Shubham Verma <[email protected]>

VermaSh force-pushed the inline_array_allocation_z branch from 152a6d8 to bbd1f85 Compare January 18, 2025 00:03

