Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intialize dataAddr field only for non-zero length array #20869

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

VermaSh
Copy link
Contributor

@VermaSh VermaSh commented Dec 23, 2024

Update array inline allocation sequence to initialize dataAddr field only for non-zero length arrays. Field should be left blank for zero length arrays. Additionally this also clears padding field after the size field.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch from ff0ba22 to 5938046 Compare December 23, 2024 19:13
@VermaSh
Copy link
Contributor Author

VermaSh commented Dec 23, 2024

Off-heap personal build, upstream personal build. My off-heap personal build is looking good so far, only 1 unrelated failure. I'll remove the WIP tag once both builds have passed without off-heap related failures.

cc: @r30shah / @zl-wang / @dmitripivkine

@VermaSh VermaSh marked this pull request as draft December 23, 2024 19:17
@VermaSh VermaSh changed the title Intialize dataAddr field only for non-zero length array WIP: Intialize dataAddr field only for non-zero length array Dec 23, 2024
@VermaSh VermaSh force-pushed the inline_array_allocation_z branch from 5938046 to 54bf0a5 Compare December 24, 2024 18:11
@VermaSh
Copy link
Contributor Author

VermaSh commented Dec 24, 2024

Launched new non-off-heap personal build to verify PR changes

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from da4a877 to 9b31e29 Compare December 27, 2024 16:50
@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 8 times, most recently from 1317bac to c3c566c Compare January 3, 2025 19:10
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 3, 2025

Relaunched non off-heap build

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 5 times, most recently from 648f364 to 068425c Compare January 7, 2025 05:42
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 7, 2025

I have verified the sequence for x and z through a unit test. Need to do the same for power and aarch64 before removing WIP tag.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch from 068425c to 6d13fde Compare January 7, 2025 22:15
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 7, 2025

Launched personal non off-heap build for power.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 3 times, most recently from 71bbabb to 8f03946 Compare January 8, 2025 18:20
@VermaSh VermaSh changed the title WIP: Intialize dataAddr field only for non-zero length array Intialize dataAddr field only for non-zero length array Jan 8, 2025
@VermaSh VermaSh marked this pull request as ready for review January 8, 2025 18:21
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 8, 2025

Kept only z here as it is ready for review, other platforms require more testing. @r30shah Can I please get a review?

Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking out the changes, by skimming through the change, I had some intiail questions, I have posted.

if (isOffHeapAllocationEnabled)
{
TR_ASSERT_FATAL_WITH_NODE(node,
TR::InstOpCode::LGF == loadDimLenOpCode,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we have this assert (Checking loadDimLenOpCode to LGF). That variable is initialized to LGF at line 4891 and I do not see it ever updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataAddr field should be 0 for 0 size arrays. Since size will already be 0, I figured we can use that to 0-out dataAddr field. The issue is that size is 4 bytes wide whereas dataAddr is 8 bytes wide, in order to use the same register I need to clear top 4 bytes of firstDimLenReg. I added the assert to ensure that size is sign extended to 64 bits when being loaded to firstDimLenReg. Same logic for second LGF assert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do get the reason behind using the LGF instruction. Do you mean to say the logic for the assert is safeguarding someone accidentally changing the code to use other instruction to load firstDimLenReg ? If so instead of putting this assert which I feel is not needed, we should put the comment where you are loading firstDimLenReg on why using LGF instead of LG and warn about the potential functional issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a comment explaining why LGF is used instead of LG.

}
if (isOffHeapAllocationEnabled)
{
TR_ASSERT_FATAL_WITH_NODE(node,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as before.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from 861e92d to f4ce058 Compare January 13, 2025 15:15
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 14, 2025

@r30shah Can I please get another review?

Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Shubham for making changes, I think overall things looks good to me. I have left some question / comments and nitpicks and appreciate if you can address them. Also can you share a instruction selection log to ensure the ICF is correct ?

@@ -4871,7 +4871,6 @@ static TR::Register * generateMultianewArrayWithInlineAllocators(TR::Node *node,

#if defined(J9VM_GC_SPARSE_HEAP_ALLOCATION)
bool isOffHeapAllocationEnabled = TR::Compiler->om.isOffHeapAllocationEnabled();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this part of the code exists and the changes in this PR is not changing that. But as we are here changing the code, can I request to move initialization on isOffHeapAllocationEnabled closer to where it is used as with the change it is first used at line 4963 ? This would also help removing unnecessary defined check here and brings variable declaration and initialization closer.

// In the mainline, first load the first and second dimensions' lengths into registers.
/* In the mainline, first load the first and second dimensions' lengths into registers.
*
* LGF is used instead of LG so that array size register can be used to NULL dataAddr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be L here right? size field is 4 bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, you are right, I'll update the code.

if (TR::Compiler->om.compressObjectReferences())
{
// Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, firstDimLenReg, generateS390MemoryReference(targetReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment says clear padding but you are storing the firstDimLenReg for the Array Size. Can you clarify and eventually clean-up comment if needed.

Also why is the inconsistencies with static_cast throughout the code. May be remove it from here (There is anyway implicit cast to int32_t).

if (isOffHeapAllocationEnabled)
{
cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfDiscontiguousDataAddrField(), cg));
iComment("Clear 1st dim dataAddr field.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before.

Copy link
Contributor Author

@VermaSh VermaSh Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are dealing with 0 size array at this point, and we used LGF to load dimension size, firstDimLenReg will be 0 so I figured we could use that to clear dataAddr field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. This does feel like an implicit assumption. If the register gets modified in future, we end up with wrong code and this is tied with the use of LGF to use the dimension register. I am ok with this change. But make sure that we write this explicit assumption (That dimension register should contain NULL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's why I had the assert initially. I'll update the comment with that assumption.

@@ -5026,6 +5020,33 @@ static TR::Register * generateMultianewArrayWithInlineAllocators(TR::Node *node,
iComment("Init 1st dim class field.");
cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfContiguousArraySizeField(), cg));
iComment("Init 1st dim size field.");
if (!TR::Compiler->om.compressObjectReferences())
{ // Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this comment to new line, not besides braces.

if (TR::Compiler->om.compressObjectReferences())
{
// Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
cursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, secondDimLenReg, generateS390MemoryReference(temp2Reg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as before with firstDimLenReg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are dealing with 0 size array at this point, and we used LGF to load dimension size, secondDimLenReg will be 0 so I figured we could use that to clear dataAddr field.

iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg,
generateS390MemoryReference(resReg, fej9->getOffsetOfDiscontiguousArraySizeField(), cg), iCursor);
// Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg, generateS390MemoryReference(resReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg), iCursor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the comment ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra padding is added to the array header for 8 byte boundary alignment; so with that comment I just meant that we add 4 to size field offset to get to the padding. I'll update the comment to be more clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okk, does enumReg contains 0 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this works because if enumReg isn't 0 (i.e. we aren't dealing with 0 size array), whatever we write at getOffsetOfDiscontiguousArraySizeField will be overwritten by the element or dataAddr field (if present). But I think using XC might provide better readability. I'll update the code.

// Load address of first array element
iCursor = generateRXInstruction(cg, TR::InstOpCode::LA, node, dataSizeReg, dataAddrMR, iCursor);
// Use cc from CFI to load NULL into dataSizeReg for 0 size array
iCursor = generateRRFInstruction(cg, TR::InstOpCode::LOCGR, node, dataSizeReg, offsetReg, getMaskForBranchCondition(TR::InstOpCode::COND_BE), true, iCursor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we used getMaskForBranchCondition function call here for storing 0x8 otherwise ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought it was better for overall readability.

iCursor = generateRRInstruction(cg, TR::InstOpCode::XGR, node, offsetReg, offsetReg, iCursor);
iCursor = generateRILInstruction(cg, TR::InstOpCode::CFI, node, enumReg, 0, iCursor);

dataAddrMR = generateS390MemoryReference(resReg, TR::Compiler->om.contiguousArrayHeaderSizeInBytes(), cg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just going through the code, do we even need the memory reference variable ? If not may be simply use generateS390MemoryReference in the instruction you are using.

if (comp->getOption(TR_TraceCG))
traceMsg(comp, "Node (%p): Clean out dataAddr field.\n", node);

uint16_t bytesToClear = static_cast<uint16_t>(TR::Compiler->om.discontiguousArrayHeaderSizeInBytes() - fej9->getOffsetOfDiscontiguousDataAddrField()) - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rather confusing. I understand the reason behind subtracting 1, but I think code becomes more readable, if you simply subtract 1 when you are using bytesToClear in the XC instruction generation.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch from f4ce058 to fe5e34c Compare January 17, 2025 20:41
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 17, 2025

@r30shah I have updated the PR based on your suggestions, can I please get another review.

@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 17, 2025

Also can you share a instruction selection log to ensure the ICF is correct ?

I'll share the instruction selection log shortly.

@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 17, 2025

Is this what you were looking for?
For variable length non 0 size array:

     0x3fcc3df7452 00000062 [     0x3fcbb162460]                                                   Label L0036: # (Start of internal control flow)
     0x3fcc3df7452 00000062 [     0x3fcbb12cfe0] a7 b4 00 eb                                       BRC     BRNP(0xb), Outlined Label L0032, labelTargetAddr=0x000003FCC3DF7628
     0x3fcc3df7456 00000066 [     0x3fcbb12d0c0] a7 0b 00 17                                       AGHI    GPR0,0x17
     0x3fcc3df745a 0000006a [     0x3fcbb12d1a0] a5 07 ff f8                                       NILL    GPR0,0xfff8
     0x3fcc3df745e 0000006e [     0x3fcbb12d340] e3 00 d0 60 00 08                                 AG      GPR0,#415 96(GPR13)
     0x3fcc3df7464 00000074 [     0x3fcbb12d4d0] e3 00 d0 68 00 21                                 CLG     GPR0,#416 104(GPR13)
     0x3fcc3df746a 0000007a [     0x3fcbb12d5a0] a7 24 00 df                                       BRC     BL(0x2), Outlined Label L0032, labelTargetAddr=0x000003FCC3DF7628
     0x3fcc3df746e 0000007e [     0x3fcbb12d740] e3 90 d0 60 00 04                                 LG      GPR9,#417 96(GPR13)
     0x3fcc3df7474 00000084 [     0x3fcbb12d8d0] e3 00 d0 60 00 24                                 STG     GPR0,#418 96(GPR13)
     0x3fcc3df747a 0000008a [     0x3fcbb161960] c0 88 01 bc 3a 00                                 IIHF    GPR8,29112832
     0x3fcc3df7480 00000090 [     0x3fcbb161b30] e3 80 90 00 00 24                                 STG     GPR8,#419 0(GPR9)
     0x3fcc3df7486 00000096 [     0x3fcbb161c70] b9 82 00 11                                       XGR     GPR1,GPR1
     0x3fcc3df748a 0000009a [     0x3fcbb161d50] c2 8d 00 00 00 00                                 CFI     GPR8,0
     0x3fcc3df7490 000000a0 [     0x3fcbb161fe0] 41 00 90 10                                       LA      GPR0,#420 16(GPR9)
     0x3fcc3df7494 000000a4 [     0x3fcbb1620b0] b9 e2 80 01                                       LOCGR   GPR0,0000000000000008,GPR1
     0x3fcc3df7498 000000a8 [     0x3fcbb162190] e3 00 90 08 00 24                                 STG     GPR0,#421 8(GPR9)
     0x3fcc3df749e 000000ae [     0x3fcbb162320] e3 20 91 00 00 36                                 PFD     2, #422 256(GPR9) # Prefetch for store
     0x3fcc3df74a4 000000b4 [     0x3fcbb162b00]                                                   assocreg
     0x3fcc3df74a4 000000b4 [     0x3fcbb162540]                                                   Label L0033: # (End of internal control flow)

method.log

if (isOffHeapAllocationEnabled)
{
cursor = generateRXInstruction(cg, TR::InstOpCode::STG, node, firstDimLenReg, generateS390MemoryReference(targetReg, fej9->getOffsetOfDiscontiguousDataAddrField(), cg));
iComment("Clear 1st dim dataAddr field.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. This does feel like an implicit assumption. If the register gets modified in future, we end up with wrong code and this is tied with the use of LGF to use the dimension register. I am ok with this change. But make sure that we write this explicit assumption (That dimension register should contain NULL)

iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg,
generateS390MemoryReference(resReg, fej9->getOffsetOfDiscontiguousArraySizeField(), cg), iCursor);
// Clear padding in contiguous array header layout. +4 because size field is 4 bytes wide.
iCursor = generateRXInstruction(cg, TR::InstOpCode::ST, node, eNumReg, generateS390MemoryReference(resReg, static_cast<int32_t>(fej9->getOffsetOfDiscontiguousArraySizeField()) + 4, cg), iCursor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okk, does enumReg contains 0 ?

cursor = generateS390BranchInstruction(cg, TR::InstOpCode::BRC, TR::InstOpCode::COND_BRC, node, populateFirstDimDataAddrSlot);
}
else
bool isOffHeapAllocationEnabled = TR::Compiler->om.isOffHeapAllocationEnabled();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need proper indentation here.

@VermaSh VermaSh force-pushed the inline_array_allocation_z branch 2 times, most recently from 961e9c7 to 152a6d8 Compare January 17, 2025 23:57
@VermaSh
Copy link
Contributor Author

VermaSh commented Jan 17, 2025

@r30shah I have updated the PR based on your suggestions. Can I please get another review?

Update array inline allocation sequence to initialize dataAddr field
only for non-zero size arrays. Field should be left blank for zero
size arrays.

Signed-off-by: Shubham Verma <[email protected]>
@VermaSh VermaSh force-pushed the inline_array_allocation_z branch from 152a6d8 to bbd1f85 Compare January 18, 2025 00:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants