
XRTRunner Utility Class & Programming Examples Cleanup #673

Merged Jul 31, 2024 (37 commits)

Commits
2b24c89  Introduce XRTRunner (hunhoffe, Jul 19, 2024)
50e020a  Fixup a few bugs (hunhoffe, Jul 19, 2024)
155fc27  migrate passthrough_channel to xrtrunner (hunhoffe, Jul 19, 2024)
298c0bc  migrate passthrough_kernel to use xrtrunner (hunhoffe, Jul 19, 2024)
7d3f247  Fix up passthrough kernel structure and comments (hunhoffe, Jul 19, 2024)
d51a2cb  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 19, 2024)
d13e14f  clean up other passthrough examples (hunhoffe, Jul 19, 2024)
273ffa9  Continue cleaning up passthrough example (hunhoffe, Jul 19, 2024)
5784a40  Start to clean up matrix scalar add (hunhoffe, Jul 20, 2024)
c1d8893  Continue fixing up matrix scalar add example (hunhoffe, Jul 20, 2024)
41cdb31  Get multi-launch example ported to new format (hunhoffe, Jul 20, 2024)
9c88fd3  Clean up multi-core-dma; use tile_size where appropriate (hunhoffe, Jul 20, 2024)
569a2ac  fixup multi core channel (hunhoffe, Jul 20, 2024)
0585149  Fixing up multi launch example, not working currently (hunhoffe, Jul 20, 2024)
5c522d8  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 24, 2024)
69517a0  Rewrite multi-launch channel in a way that makes more sense; it still… (hunhoffe, Jul 29, 2024)
e48465e  Mark multi launch test to fail (hunhoffe, Jul 29, 2024)
a6ae88b  Clean up shim dma 2d example (hunhoffe, Jul 29, 2024)
14cd7fc  Clean up segment alloc example (hunhoffe, Jul 29, 2024)
59935cc  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 29, 2024)
5091c24  Clean up segment_alloc code a little bit more (hunhoffe, Jul 29, 2024)
505b5c1  update multi-segment dma example (hunhoffe, Jul 29, 2024)
609709b  finish cleaning up multi segment (hunhoffe, Jul 29, 2024)
2c8c3dd  Test different data types with transpose DMA (hunhoffe, Jul 29, 2024)
8cdef30  add mapping for bfloat16 (hunhoffe, Jul 29, 2024)
bb8d107  Clean up makefiles (hunhoffe, Jul 29, 2024)
8e3c4a7  Fix some bugs (hunhoffe, Jul 29, 2024)
ee31b76  Clean up channel herd_to_herd examples (hunhoffe, Jul 29, 2024)
9af812a  revert bad makefile changes (hunhoffe, Jul 29, 2024)
e3706bb  Fix up channel size example (hunhoffe, Jul 29, 2024)
17c1347  Fixup datatype mismatch between uint32 and int32 in programming examples (hunhoffe, Jul 29, 2024)
a69df68  Fix up hierarchical example (hunhoffe, Jul 29, 2024)
f5f9214  clean up worker to worker (hunhoffe, Jul 29, 2024)
959bb30  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 30, 2024)
497a591  update programming example documentation (hunhoffe, Jul 30, 2024)
b48b8fc  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 30, 2024)
982a33a  Merge branch 'main' into xrt-runner-utility (hunhoffe, Jul 31, 2024)
Fix up passthrough kernel structure and comments
hunhoffe committed Jul 19, 2024
commit 7d3f247016abfc642fa65092ddc77b8af869edcb
@@ -8,61 +8,55 @@
 from air.dialects.memref import AllocOp, DeallocOp
 from air.dialects.func import FuncOp
 from air.dialects.scf import for_, yield_
-from air.backend.xrt_runner import XRTRunner
+from air.backend.xrt_runner import XRTRunner, type_mapper

 range_ = for_

 INOUT_DATATYPE = np.uint8
-INOUT_ELEM_SIZE = np.dtype(INOUT_DATATYPE).itemsize


 @module_builder
 def build_module(vector_size, num_subvectors):
     assert vector_size % num_subvectors == 0

-    # chop input in 4 sub-tensors
-    lineWidthInBytes = vector_size // num_subvectors
+    xrt_dtype = type_mapper(INOUT_DATATYPE)

     # Type and method of input/output
-    memrefTyInOut = T.memref(vector_size, T.ui8())
+    memrefTyInOut = T.memref(vector_size, xrt_dtype)
     ChannelOp("ChanIn")
     ChannelOp("ChanOut")

-    # We want to store our data in L1 memory
-    mem_space = IntegerAttr.get(T.i32(), MemorySpace.L1)
+    # The compute core splits input into subvectors for processing
+    lineWidthInBytes = vector_size // num_subvectors

-    # This is the type definition of the image
+    # Memref type definition used by the compute core and external function
+    mem_space = IntegerAttr.get(T.i32(), MemorySpace.L1)
     tensor_type = MemRefType.get(
         shape=[lineWidthInBytes],
-        element_type=T.ui8(),
+        element_type=xrt_dtype,
         memory_space=mem_space,
     )

     # Function definition of the external function we will call
     passThroughLine = external_func(
         "passThroughLine", inputs=[tensor_type, tensor_type, T.i32()]
     )

     # We will send an image worth of data in and out
     @FuncOp.from_py_func(memrefTyInOut, memrefTyInOut)
     def copy(arg0, arg1):

         # The arguments are the input and output
         @launch(operands=[arg0, arg1])
         def launch_body(a, b):
             ChannelPut("ChanIn", a)
             ChannelGet("ChanOut", b)

             # The arguments are still the input and the output
             @segment(name="seg")
             def segment_body():

                 # The herd sizes correspond to the dimensions of the contiguous block of cores we are hoping to get.
                 # We just need one compute core, so we ask for a 1x1 herd
                 @herd(name="copyherd", sizes=[1, 1], link_with="passThrough.cc.o")
-                def herd_body(tx, ty, sx, sy):
+                def herd_body(_tx, _ty, _sx, _sy):

-                    for i in range_(num_subvectors):
-                        # We must allocate a buffer of image size for the input/output
+                    # Process each subvector individually
+                    for _i in range_(num_subvectors):
                         tensor_in = AllocOp(tensor_type, [], [])
                         tensor_out = AllocOp(tensor_type, [], [])
python/air/backend/xrt_runner.py (30 additions, 0 deletions)
@@ -5,8 +5,38 @@

 import numpy as np
 from .xrt import XRTBackend
 from air.dialects.air import *
 import filelock
 from typing import List
+from collections import defaultdict
+
+TYPE_MAP_DICT = defaultdict(
+    lambda: None,
+    {
+        np.uint8: T.ui8,
+        # TODO: add more mappings here
+    },
+)
+
+
+def type_mapper(np_dtype):
+    """
+    Map a numpy data type to the corresponding MLIR type. This function is
+    meant to run within a module context (e.g., within a function wrapped
+    with @module_builder)
+    args:
+        np_dtype: the numpy data type to map
+    return:
+        The data type to run on the NPU
+    """
+    # Look up the type constructor first; calling it before the None check
+    # would raise a TypeError for unmapped dtypes instead of the intended error
+    xrt_dtype_ctor = TYPE_MAP_DICT[np_dtype]
+    if xrt_dtype_ctor is None:
+        raise AirBackendError(f"numpy data type {np_dtype} has no default mapping")
+    xrt_dtype = xrt_dtype_ctor()
+    if xrt_dtype.width / 8 != np.dtype(np_dtype).itemsize:
+        # This is a sanity check on TYPE_MAP_DICT rather than a check on the user input
+        raise AirBackendError(
+            f"MLIR type has width {xrt_dtype.width / 8} but numpy data type"
+            f" {np_dtype} has width {np.dtype(np_dtype).itemsize}"
+        )
+    return xrt_dtype


class XRTRunner:
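The lookup pattern `type_mapper` relies on (a `defaultdict` whose lookup yields a type constructor, or `None` for unmapped dtypes, plus a width sanity check against numpy's `itemsize`) can be exercised outside of an MLIR context with a stand-in type class. `FakeIntType`, `TYPE_MAP`, and `map_dtype` below are hypothetical names for illustration only:

```python
from collections import defaultdict
import numpy as np

class FakeIntType:
    """Hypothetical stand-in for an MLIR integer type; width is in bits."""
    def __init__(self, width: int):
        self.width = width

# Unmapped dtypes fall through to None instead of raising KeyError on lookup
TYPE_MAP = defaultdict(lambda: None, {np.uint8: lambda: FakeIntType(8)})

def map_dtype(np_dtype):
    ctor = TYPE_MAP[np_dtype]
    if ctor is None:
        raise KeyError(f"numpy data type {np_dtype} has no default mapping")
    ty = ctor()
    # Sanity check on the table itself: bit width / 8 must equal numpy's byte size
    if ty.width / 8 != np.dtype(np_dtype).itemsize:
        raise ValueError("mapping width mismatch")
    return ty
```

Storing constructors (`lambda: FakeIntType(8)`) rather than instances matches the real table, where MLIR types such as `T.ui8` must be constructed inside a live module context.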