-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pykernel - matmul common reader/writer kernels #2289
base: main
Are you sure you want to change the base?
Conversation
79998c8
to
295da6a
Compare
NocAsyncReadTile | ||
}]; | ||
|
||
let arguments = (ins I32:$id, TTKernel_InterleavedAddrGenFast:$s, I32:$dstLocalL1Addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s
is too short of an identifier to be useful, can we change it to something longer/more self-documenting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to addrGenStruct
@@ -634,6 +652,16 @@ def TTKernel_StoreToL1Op : TTKernel_Op<"store_to_l1"> { | |||
let arguments = (ins I32:$value, TTKernel_L1AddrPtr:$l1_ptr, I32:$offset); | |||
} | |||
|
|||
def TTKernel_GetInterleavedAddrGenFastOp : TTKernel_Op<"get_interleaved_addr_gen_fast"> { | |||
let summary = "GetAddrGenFastConfig"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
summary
doesn't seem to match GetInterleavedAddrGenFastOp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, ty!
rewriter.create<emitc::LoadOp>(op->getLoc(), opaqueStructType, varOp); | ||
|
||
// Replace the original operation with the loaded value so it can be used. | ||
op.replaceAllUsesWith(loadOp.getResult()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think op.replaceAllUsesWith(loadOp.getResult());
is a legal modification when using conversion patterns -- all IR modifications must be done via rewriter
(see https://www.youtube.com/watch?v=xIeihq2WZOU). The rewriter.replaceOp...
line should be enough -- is that not the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm not sure why I left that in there. It works without, thank you!
NocAsyncWriteTilie | ||
}]; | ||
|
||
let arguments = (ins I32:$id, TTKernel_InterleavedAddrGenFast:$s, I32:$srcLocalL1Addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about s
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to addrGenStruct
@@ -556,6 +624,7 @@ class ConvertTTKernelToEmitCPass | |||
return op.getNumArguments() == 0; | |||
}); | |||
target.addLegalOp<func::ReturnOp>(); | |||
target.addIllegalOp<ttkernel::GetInterleavedAddrGenFastOp>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be needed here because the next line declares all of ttkernel
dialect as illegal at the end of this conversion step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this was sloppy, ty!
// CHECK: "emitc.member"(%[[VAR]]) <{member = "data_format"}>{{.*}} | ||
// CHECK: emitc.assign {{.*}} | ||
// CHECK: emitc.assign {{.*}} | ||
// CHECK: emitc.assign {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using {{.*}}
at the end of regexes isn't doing anything useful, because CHECK is already a match for a substring of a line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed ty!
// CHECK: = emitc.load %[[VAR]] : <!emitc.opaque<"InterleavedAddrGenFast<true>">> | ||
%is_dram = arith.constant 1 : i1 | ||
%temp1 = arith.constant 262400 : i32 | ||
%temp2 = arith.constant 32 : i32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asking for two changes here:
- since we are using %temp1 and %temp2 in multiple places later and they aren't just dummy inputs, can we give these SSA vars some meaningful names?
- can we capture the names of these variables by using
%[[CAPITALIZED_SSA_VAR_NAME]] = ...
CHECKs and then verify that these variables are passed intoemitc.call_opaque "noc_async_write_tile"
in the correct arg order instead of just using {{.*}} there? look at other tests in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given them more meaningful names as well as capturing and checking for them in later uses.
"ttkernel.noc_async_write_tile"(%temp2, %s, %temp1) : (i32, !ttkernel.interleaved_addr_gen_fast, i32) -> () | ||
// CHECK: emitc.call_opaque "noc_async_read_tile"{{.*}} | ||
"ttkernel.noc_async_read_tile"(%temp2, %s, %temp1) : (i32, !ttkernel.interleaved_addr_gen_fast, i32) -> () | ||
// CHECK-NEXT: return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this CHECK_NEXT anchor doesn't seem to be doing anything useful here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep.. removed
auto lvalueBankBaseAddr = rewriter.create<emitc::MemberOp>( | ||
op->getLoc(), | ||
emitc::LValueType::get(op.getBankBaseAddress().getType()), | ||
"bank_base_address", varOp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the correct pattern during dialect conversion is to obtain operand types/properties from the adaptor
, not the op
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, 100% agree. I used adaptor for dataformat too, not sure why I was using op
here :(
"bank_base_address", varOp); | ||
auto lvaluePageSize = rewriter.create<emitc::MemberOp>( | ||
op->getLoc(), emitc::LValueType::get(op.getPageSize().getType()), | ||
"page_size", varOp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment, shouldn't this reference adaptor
, not op
?
78de1ce
to
83c1e10
Compare
Ticket
#2254
More specifically:
#2178
#2247
#2249
Problem description
Want to be able to write reader and writer kernels used in matmul programming examples.
What's changed
InterleavedAddrGenFast
cpp struct mapping:get_interleaved_addr_gen_fast
noc_async_read_tile
ttkernel opnoc_async_write_tile
ttkernel opChecklist