Neureka #1

Open · wants to merge 271 commits into base: neureka

Commits (271)
8e8febd
Rename folders
lukamac May 19, 2022
e42368f
small modification to read correctly the output of the network in int32
ABurrello May 20, 2022
a72bf94
fixing bugs for Quantlab 8bits networks
ABurrello May 24, 2022
7957819
Remove i_conv
lukamac May 24, 2022
dfdb3d1
modified the code reserved space. It is now in the config file of eve…
ABurrello May 24, 2022
2005dc1
Add to_byte function
lukamac May 25, 2022
5a5da6e
Change pointer type to match output type and bits
lukamac May 25, 2022
c779bdc
Fix wrong check_sum_out
lukamac May 25, 2022
9bfdd04
MV1/MV2 4bits/8bits uint/int with quantlab working
ABurrello May 27, 2022
8a4e64a
Full tests performed on the 8 config_files present, 4 networks from N…
ABurrello May 30, 2022
4e33a98
fixed free of ram and global variables for multiple executions of net…
ABurrello Jun 1, 2022
d576e64
fixed memory issues. layer_generator working only with output unsigned
ABurrello Jun 2, 2022
60e2b66
fixed order of arguments for arguments of add nodes exported by quantlab
ABurrello Jun 9, 2022
5e847e8
added modification to 1d,2d,3d dma copies and all files for one layer…
ABurrello Jun 17, 2022
451403c
added a new target, GAP8_board, to deploy networks on chip without do…
ABurrello Jun 29, 2022
8ae2b6a
updated dory_example pointer
ABurrello Jun 29, 2022
7673e8a
fixed problems with weight dimension not multiple of 4
ABurrello Jul 7, 2022
834bf1f
Diana fixing of addition nodes
ABurrello Jul 8, 2022
0d6bc22
Add GAP8 network tests
lukamac Jul 8, 2022
1bd1a30
Add printing of stderr on build fail
lukamac Jul 8, 2022
c581e30
adding support for Resnet20 deployed on Diana
ABurrello Jul 12, 2022
e36bf80
fixed submodules commit and added tests to Tests
ABurrello Jul 12, 2022
738a455
Removed submodule
ABurrello Jul 12, 2022
2567484
added dory-hal submodule for Diana
ABurrello Jul 12, 2022
47a2229
minors
ABurrello Jul 14, 2022
72676fe
added support for analog operations of Diana
Jul 19, 2022
64d3c2b
analog deployment working for performance of ox_unrolling=1
Jul 20, 2022
897efb7
added tests for GAP8_board
Jul 20, 2022
7697249
Refactor L3-L2 task sched
lukamac Jul 20, 2022
dea8c84
fixing analog execution
Jul 27, 2022
0b927c4
added modifications to extract strings for TVM.
Jul 27, 2022
e012dc9
minor
Jul 28, 2022
745fda9
new backend for DianaTVM
Jul 28, 2022
fcc2a60
changed prototype of layer function
FrancescoConti Jul 28, 2022
c4eb561
Merge pull request #29 from pulp-platform/refactoring-rebased
ABurrello Jul 28, 2022
cf5ecad
Update README.md
FrancescoConti Jul 28, 2022
eb1fa8d
renamed folders to remove -
Jul 28, 2022
6fc01a8
modifying structure to create a package
Jul 28, 2022
4ff3f08
added __init__.py files to create dory library
Jul 29, 2022
81fc21b
minors
Jul 29, 2022
0f9ad4e
Minor changes to integrate correctly with TVM
maartenvds Jul 29, 2022
53c7d7a
Merge pull request #30 from maartenvds/master
ABurrello Jul 29, 2022
34a2bf4
updated README.md
Aug 1, 2022
bd14297
Update README.md
ABurrello Aug 1, 2022
e1bcc69
Add conditional import for dory_examples
JosseVanDelm Aug 1, 2022
61cf3c1
Merge pull request #32 from JosseVanDelm/master
ABurrello Aug 1, 2022
09d7a41
fix mixed-hw deployment
da-gazzi Aug 2, 2022
6165889
Use relative path for git submodules
JosseVanDelm Aug 2, 2022
80e00a7
Merge pull request #33 from JosseVanDelm/master
ABurrello Aug 2, 2022
c9605e2
fix HW_node._compress for signed inputs
da-gazzi Aug 5, 2022
e7d83a6
add support to GAP8_L2 for multiple inputs, signed inputs, sub-8b inputs
da-gazzi Aug 5, 2022
f24775b
GAP network templates: close file descriptors (memory leak)
da-gazzi Aug 10, 2022
5c9f9d8
vectorize HW_node._compress and use it also for weights
da-gazzi Aug 10, 2022
25d6651
use np.ndarray.tofile instead of for-loop writing of hex files
da-gazzi Aug 10, 2022
a6360e3
print memory usage when building
da-gazzi Aug 10, 2022
a1fc884
guard against non-functional n_inputs values in network_generate
da-gazzi Aug 10, 2022
5f2edf7
quit on GAP8_board + mixed-hw combination
da-gazzi Aug 10, 2022
109b6ba
add stub code for multiple inputs to GAP8_board and GAP8_gvsoc targets
da-gazzi Aug 10, 2022
1695a02
add n_test_inputs to every node correctly
da-gazzi Aug 11, 2022
6b79933
accept but ignore Squeeze layers
da-gazzi Aug 11, 2022
2112f3d
conv1d
da-gazzi Aug 11, 2022
649356c
unsigned conv1d w/ dilations working for gap8_gvsoc
da-gazzi Aug 16, 2022
d9071c1
unified outmult and outmul. The different names caused problems
Aug 24, 2022
76447b8
Merge branch 'master' of github.com:pulp-platform/dory
Aug 24, 2022
2c67bb7
added a first version of untested occamy backend
Aug 26, 2022
c97eb6d
Removing mixed kernels for wrong clone
Aug 31, 2022
924a76a
Added back pulp-nn-mixed kernels
Aug 31, 2022
f51eead
added branch for commit
Aug 31, 2022
aad120b
changed position of tests for docker
Aug 31, 2022
1362551
creating main.yml
ABurrello Aug 31, 2022
49ae096
Create docker-image.yml
ABurrello Aug 31, 2022
3c48141
Update docker-image.yml
ABurrello Aug 31, 2022
937dbd9
added Dockerfile
Aug 31, 2022
4389099
Create docker-image.yml
ABurrello Aug 31, 2022
d5a3e06
Update docker-image.yml
ABurrello Aug 31, 2022
4d7c646
Update docker-image.yml
ABurrello Aug 31, 2022
95286db
Update docker-image.yml
ABurrello Aug 31, 2022
ffee584
Update docker-image.yml
ABurrello Aug 31, 2022
6c93edc
Update docker-image.yml
ABurrello Aug 31, 2022
692e1da
minors
Aug 31, 2022
9ee36a2
Merge branch 'master' of github.com:pulp-platform/dory
Aug 31, 2022
5c7793c
Update docker-image.yml
ABurrello Aug 31, 2022
a807aeb
Update docker-image.yml
ABurrello Aug 31, 2022
bfd8523
Update docker-image.yml
ABurrello Aug 31, 2022
e37c3b7
Update docker-image.yml
ABurrello Aug 31, 2022
2d2b978
Update docker-image.yml
ABurrello Aug 31, 2022
fcc674b
Update docker-image.yml
ABurrello Aug 31, 2022
34c39a0
Update docker-image.yml
ABurrello Aug 31, 2022
768f1f9
finalized CI. Testing
Aug 31, 2022
1bff86f
Update docker-image.yml
ABurrello Aug 31, 2022
0fbd9eb
Update docker-image.yml
ABurrello Aug 31, 2022
db23495
Update docker-image.yml
ABurrello Aug 31, 2022
f595e54
Update docker-image.yml
ABurrello Aug 31, 2022
707d1fe
Update docker-image.yml
ABurrello Aug 31, 2022
40a580c
Update docker-image.yml
ABurrello Aug 31, 2022
ec6cb94
Update docker-image.yml
ABurrello Aug 31, 2022
20b45f3
Update docker-image.yml
ABurrello Aug 31, 2022
7ca9269
Update docker-image.yml
ABurrello Aug 31, 2022
f2bfc03
Update docker-image.yml
ABurrello Aug 31, 2022
2f5c979
Update docker-image.yml
ABurrello Sep 1, 2022
5a51b33
Update docker-image.yml
ABurrello Sep 1, 2022
db23ae6
Update docker-image.yml
ABurrello Sep 1, 2022
d3069d1
Update docker-image.yml
ABurrello Sep 1, 2022
4674638
Update docker-image.yml
ABurrello Sep 1, 2022
8c8fc71
Update docker-image.yml
ABurrello Sep 1, 2022
94786d2
testing CI
Sep 1, 2022
8a5801c
testing CI
Sep 1, 2022
6c9fd4c
Update test_GAP8.py
ABurrello Sep 1, 2022
da01cf1
testing CI
Sep 1, 2022
98064d3
Merge branch 'master' of github.com:pulp-platform/dory
Sep 1, 2022
14fbd4b
Update test_GAP8.py
ABurrello Sep 1, 2022
d1ac156
testing CI
Sep 1, 2022
33e9861
Update test_GAP8.py
ABurrello Sep 1, 2022
6a0e54a
Update docker-image.yml
ABurrello Sep 1, 2022
4bb2ca3
checking that CI fails on uncorrect tests
Sep 1, 2022
8e21bc2
checking that CI fails on uncorrect tests
Sep 1, 2022
8501e93
checking that CI fails on uncorrect tests
Sep 1, 2022
8688f0e
fixed error in checking CI
Sep 1, 2022
06b1b91
fixed error in checking CI. Checking the working / not working output
Sep 1, 2022
0988e57
Added support for PULP-SDK for L2-only applications
ccioflan Sep 2, 2022
5d48ee9
conv1d support for mixed-hw
da-gazzi Sep 5, 2022
dea350e
revert l2 size for GAP8_gvsoc
da-gazzi Sep 5, 2022
5a122dc
added bias in 2d tilers. They were missing. Modified the template to …
Sep 23, 2022
23ec6f9
fix layer_generate to "sandwich" layers
da-gazzi Sep 23, 2022
45e987b
fix different-precision adder nodes (mismatches but no crash)
da-gazzi Sep 23, 2022
05f2543
fix bias memory calculation
da-gazzi Oct 3, 2022
3e95875
fix mixed-prec add also for gap8_board/l2
da-gazzi Oct 3, 2022
f3837b1
whitespace
da-gazzi Oct 4, 2022
1518cdd
Merge branch 'georgr/fixes' into georgr/merge_candidate
da-gazzi Oct 4, 2022
4928890
update pulp-nn-mixed
da-gazzi Oct 5, 2022
08c887c
Merge pull request #34 from pulp-platform/l2_pulp_sdk
ABurrello Oct 6, 2022
7c5d2f8
reduced layers of the docker image
Oct 6, 2022
86ce9cf
Merge branch 'master' of github.com:pulp-platform/dory
Oct 6, 2022
6145cb4
Merge branch 'master' into georgr/merge_candidate
ABurrello Oct 6, 2022
70ee1ce
Merge pull request #38 from pulp-platform/georgr/merge_candidate
ABurrello Oct 6, 2022
f3a3902
modified docker. NOT WORKING VERSION OF DORY. NEED TO FIX OUTPUTS
Oct 6, 2022
9257733
modified docker. NOT WORKING VERSION OF DORY. NEED TO FIX OUTPUTS
Oct 6, 2022
ca6166b
fix all broken tests & add new mixed-hw/signed test
da-gazzi Oct 7, 2022
9c1f1a6
Merge pull request #39 from pulp-platform/georgr/fix_breakage
ABurrello Oct 7, 2022
58612e7
pointing to the new commits
Oct 7, 2022
0027a8c
modifying Diana backend to add parameters to disable l2_l1 transfers
Oct 13, 2022
d2419cd
allow manual CI running
da-gazzi Oct 13, 2022
4a66ff8
modifications to support DORI in TVM
Oct 14, 2022
803de25
Merge branch 'master' of github.com:pulp-platform/dory
Oct 14, 2022
2c283a4
testing analog, FC and conv in Diana
Oct 17, 2022
d5c3400
Georgr/pr (#40)
da-gazzi Oct 18, 2022
af9ce57
remove redundant commands from docker setup scripts
da-gazzi Oct 18, 2022
3620e58
testing Diana analog+FC
Oct 18, 2022
8ece309
fixed FC digital on Diana. Analog under test. Stack in HW
Oct 18, 2022
4369168
modifications to integrate FC in TVM
Oct 18, 2022
f48d0c7
[WIP] unify templates to fixed lmacan's versions (works for gap8_gvsoc)
da-gazzi Oct 19, 2022
f06798a
refactoring: forgot actual template files...
da-gazzi Oct 20, 2022
f8e7262
[WIP] fix small type warning
da-gazzi Oct 20, 2022
3d4a6c6
analog, and all digital tests working on Diana
Oct 20, 2022
af1b540
refactor DMA to prettier code & proper 2D transfers
da-gazzi Oct 20, 2022
f8b1313
split DMA functions into .c and .h, fix checksums
da-gazzi Oct 20, 2022
68ddc21
GAP8 refactor for GAP8_board - works w/ pulp-sdk & GVSOC
da-gazzi Oct 21, 2022
05f1e47
forgotten refactoring files
da-gazzi Oct 21, 2022
d5bd343
GAP8_board pooling layer was missing a DMA barrier!
da-gazzi Oct 21, 2022
077d38d
CI: adapt test_GAP8 for refactored template's output
da-gazzi Oct 24, 2022
a058099
unify GAP8 tilers (excluding board_L2)
da-gazzi Oct 24, 2022
1e23267
CI: try to fix docker build
da-gazzi Oct 24, 2022
a664424
refactor GAP8_board_L2 backend
da-gazzi Oct 24, 2022
089b2d7
change ifdefs in dory_dma.h
da-gazzi Oct 24, 2022
74faf5b
fix dory_dma.h - CI w/ GAP_SDK should work now
da-gazzi Oct 24, 2022
95ffadd
fix mem.c for GAP SDK (GAP8) (hopefully)
da-gazzi Oct 24, 2022
f791a50
try to fix GAP SDK build
da-gazzi Oct 24, 2022
7266724
CI: hail mary to get gap_sdk to work - don't install magick in docker
da-gazzi Oct 25, 2022
bb765b3
(hopefully) finally fix GAP8 compatibility!
da-gazzi Oct 26, 2022
4bf0d72
change HW_description.json of GAP targets back to gap_sdk
da-gazzi Oct 26, 2022
5476f7c
minor modifications on Diana tiler
Oct 28, 2022
5ea8ae8
fix refactoring on gapuino - everything working
da-gazzi Nov 1, 2022
51a647b
Patch template of analog Conv2D for Diana TVM backend
Nov 1, 2022
102803b
fix board_L2 pooling template
da-gazzi Nov 1, 2022
771e848
another fix of board_L2 pooling
da-gazzi Nov 2, 2022
03f4ec7
Merge pull request #43 from JosseVanDelm/master
ABurrello Nov 2, 2022
46a2651
fix GAP8_board_L2; add back multiple input option
da-gazzi Nov 2, 2022
5e87ae1
updated op_typ in C parser Diana
Nov 2, 2022
a29470e
fix multi-input with L3
da-gazzi Nov 2, 2022
53c5847
fix all refactored stuff, rename GAP8 backend to PULP
da-gazzi Nov 2, 2022
8499c4e
update dory_examples
da-gazzi Nov 2, 2022
f48b57e
make headers prettier
da-gazzi Nov 2, 2022
530df05
update examples
da-gazzi Nov 2, 2022
40b7b59
move identical layer templates to common
da-gazzi Nov 3, 2022
fa29858
add back TCDM 2D transfer flags to mchan.h
da-gazzi Nov 3, 2022
0b3ce35
update dory_examples
da-gazzi Nov 3, 2022
0e9011e
Merge branch 'master' into georgr/refactor_gap8
da-gazzi Nov 3, 2022
e35dfdb
Merge pull request #42 from pulp-platform/georgr/refactor_gap8
da-gazzi Nov 3, 2022
ca3a6f6
fixed layout FC. FC working with all dimensions
Nov 3, 2022
5b36c91
Merge branch 'master' of github.com:pulp-platform/dory
Nov 3, 2022
8b76844
added bias type in diana_tvm
Nov 4, 2022
20b3d61
minor
Nov 4, 2022
6e35982
updated dory_example
Nov 4, 2022
66cde41
fixed bias and nif <16 for FC on Diana
Nov 4, 2022
212735c
Fix regression in Conv2D template for Diana_TVM
Nov 4, 2022
202710d
Merge pull request #44 from JosseVanDelm/master
ABurrello Nov 4, 2022
a8e28ec
modified template writer to allow TVM to export TVM node
Nov 11, 2022
b4d5eed
further fixes for Add
Nov 11, 2022
9791d6c
removed useless include
Nov 11, 2022
db9ae6f
modified analog weights to fit L2
Nov 12, 2022
afddfa6
changed padding margins for analog
Nov 12, 2022
2fd2cdf
fixed error in weights transfer
Nov 14, 2022
c648174
Fixes ONNX spec compliance in parser
Nov 23, 2022
95ecc2b
Fixes bug where possibly undefined fields in HW_node class are called
Nov 23, 2022
0b5a0d3
Fixes bugs where bitwise operators were used with possibly float type…
Nov 23, 2022
4e4e299
Fixes ONNX compliance of Quantlab Parser
Nov 23, 2022
16c42d0
Merge remote-tracking branch 'origin/master' into neureka
Nov 28, 2022
a6ad6a2
Intermediate commit
Nov 28, 2022
44b31b7
Delete unneeded files
Nov 28, 2022
4266e47
Updated dependencies
Nov 29, 2022
07d8a4c
Merged main branch
Nov 29, 2022
7e3a290
WIP: Working on offloading, MV2 passes on cluster
Dec 12, 2022
5f54cb7
MV1 passes with offloading
Dec 13, 2022
b41095c
Cleanup
Dec 14, 2022
7a2d1fd
fixed layer generate
Dec 14, 2022
ba05976
Fixed dependencies
Dec 14, 2022
2b7075f
Fixed weight memory support in NEUREKA
Dec 15, 2022
54cc080
Fixed tiling skip
Dec 15, 2022
9be5d18
Fixed several N-EUREKA memory issues
Dec 15, 2022
d25484d
Single tile padding works
Dec 21, 2022
1ac6373
Slightly tested padding works
Dec 22, 2022
a9e3ded
Fixed MVNv1, stride==1
Jan 3, 2023
d61ce8f
Fixed geometrical constraint for s>1
Jan 3, 2023
e2ab531
Initial version with single-tile striding working
Jan 4, 2023
92e88f3
Fixed regression
Jan 4, 2023
f9126c9
Fixed striding implementation for tiled feature maps
Jan 5, 2023
d2e93b4
Fixed padding bug, MBNv1 runs
Jan 5, 2023
20ed649
Added offloading for non-unit stride
Jan 5, 2023
ed542ad
Improved reshuffle operation by DMA use
Jan 5, 2023
c89302b
Fixed MVN2 regression
Jan 6, 2023
0b2f224
Fixed signed output behaviour for NEUREKA
Jan 6, 2023
e34ab13
First version of MVN2 working
Jan 6, 2023
36c8f2b
Implemented L2 and mixed
Jan 9, 2023
0e76826
Checking in missing sources
Jan 9, 2023
acbdaf9
WIP: Regressing on real hardware
Jan 10, 2023
435066d
Fixed regression for MVN1 on board
Jan 12, 2023
21c3f73
End-to-end HW deployment flow
Jan 12, 2023
d4b06c3
Further fixes
Jan 12, 2023
e734696
Update for sharing w/ ETH
Jan 31, 2023
70b3c7e
Merge branch 'neureka' of github.com:Scheremo/dory into neureka
Jan 31, 2023
af0c6b2
Link pulp-nn libraries
Jan 31, 2023
8ac25b7
Update HandTracking Deployment parts
Feb 2, 2023
80d0839
Update paths
Feb 3, 2023
0269eff
Update dory submodule paths
Feb 3, 2023
a2aee12
Remove pulp-nn-1d
Feb 3, 2023
02b0470
Update module paths
Feb 3, 2023
ad1bfcd
Fixed sub-byte weight implementation
Feb 7, 2023
018db27
Adds JSON Dump of task before template write out
Feb 28, 2023
26cc7f9
Add RUNTIMEMEASUREMENT flag for measuring runtime on N-EUREKA
Feb 28, 2023
0837f43
Fix broken commit
Mar 1, 2023
Update for sharing w/ ETH
Moritz authored and Moritz committed Jan 31, 2023
commit e7346966b2390cf04f9e718ba431d3d195b8e9c5
2 changes: 1 addition & 1 deletion confs/testconfig_HT_frontEnd.json
@@ -1,7 +1,7 @@
{
"BNRelu_bits": 32,
"onnx_file": "../../quant/doryStimuli_frontEnd/QL_testnet.onnx_ql_integerized.onnx",
"code reserved space": 260000,
"code reserved space": 570000,
"input_bits": 8,
"input_signed": false,
"use_wmem": true,
31 changes: 17 additions & 14 deletions dory/Hardware_targets/Siracusa/Common/C_Parser.py
@@ -31,15 +31,15 @@

class C_Parser_Siracusa(Parser_HW_to_C):
# Used to manage the ONNX files. By now, supported Convolutions (PW and DW), Pooling, Fully Connected and Relu.
def __init__(self, graph, config_file, config_file_dir, verbose_level, perf_layer, precision_library, app_directory, n_inputs=1):
def __init__(self, graph, config_file, config_file_dir, verbose_level, perf_layer, precision_library, app_directory, n_inputs=1, prefix=''):

file_path = self.get_file_path()
with open(os.path.join(file_path, "HW_description.json")) as f:
HW_description = json.load(f)
self.precision_library = precision_library
self.source_Constant_bits_library = config_file["BNRelu_bits"]
self.config_file = config_file
super().__init__(graph, os.path.join(config_file_dir, os.path.dirname(config_file["onnx_file"])), HW_description, verbose_level, perf_layer, "Makefile", app_directory, n_inputs)
super().__init__(graph, os.path.join(config_file_dir, os.path.dirname(config_file["onnx_file"])), HW_description, verbose_level, perf_layer, "Makefile", app_directory, n_inputs, prefix)
self.acc = Neureka()
try:
db = HW_description['double_buffering']
@@ -159,6 +159,7 @@ def mapping_layers_to_C_files(self):
n_memory_levels = self.HW_description['memory']['levels']

for i, node in enumerate(self.HWgraph):
node.prefix = self.prefix
if not hasattr(node, "offloadable") or not node.offloadable:
self.map_layer_to_C_file(node, n_memory_levels, tmpl_dir, out_dir)
else:
@@ -189,10 +190,10 @@ def create_hex_weight(self, node):
weights += bytearray([0] * (4 - len(weights) % 4))

weightstr = ''
weightstr += f"#include \"{node.name}_weights.h\"\r\n"
weightstr += f"#include \"{node.prefix}{node.name}_weights.h\"\r\n"
weightstr += f"#include \"pmsis.h\"\r\n"
weightstr += '__attribute__ ((section(".weightmem_sram"))) '
weightstr += f"unsigned char {node.name}_weights[{len(weights)}] = "
weightstr += f"unsigned char {node.prefix}{node.name}_weights[{len(weights)}] = "
weightstr += "{"
weightstr += ", ".join("0x"+format(x, '02x') for x in weights)
weightstr += "};\r\n"
@@ -201,25 +202,25 @@ def create_hex_weight(self, node):
if const != 0:
val = bytes(getattr(node,const)['value'])
weightstr += 'PI_L2 '
weightstr += f"unsigned char {node.name}_{const}[{len(val)}] = "
weightstr += f"unsigned char {node.prefix}{node.name}_{const}[{len(val)}] = "
weightstr += "{"
weightstr += ", ".join("0x"+format(x, '02x') for x in val)
weightstr += "};\r\n"

weightstr_h = f"#ifndef __INCLUDE_GUARD_{node.name}\r\n"
weightstr_h += f"#define __INCLUDE_GUARD_{node.name}\r\n"
weightstr_h += f"extern unsigned char {node.name}_weights[{len(weights)}];\r\n"
weightstr_h = f"#ifndef __INCLUDE_GUARD_{node.prefix}{node.name}\r\n"
weightstr_h += f"#define __INCLUDE_GUARD_{node.prefix}{node.name}\r\n"
weightstr_h += f"extern unsigned char {node.prefix}{node.name}_weights[{len(weights)}];\r\n"
for const in constants[1:]:
if const != 0:
val = bytes(getattr(node,const)['value'])
weightstr_h += f"extern unsigned char {node.name}_{const}[{len(val)}];\r\n"
weightstr_h += f"extern unsigned char {node.prefix}{node.name}_{const}[{len(val)}];\r\n"
weightstr_h += f"\r\n#endif"

filepath = os.path.join(self.app_directory, 'src', node.name + "_weights.c")
filepath = os.path.join(self.app_directory, 'src', node.prefix + node.name + "_weights.c")
with open(filepath, 'w') as file:
file.write(weightstr)

filepath = os.path.join(self.app_directory, 'inc', node.name + "_weights.h")
filepath = os.path.join(self.app_directory, 'inc', node.prefix + node.name + "_weights.h")
with open(filepath, 'w') as file:
file.write(weightstr_h)
else:
@@ -250,16 +251,17 @@ def create_hex_weight(self, node):
tk['weights_vectors'] = self.weights_vectors
tk['weights_dimensions'] = self.weights_dimensions
tk['DORY_HW_graph'] = self.HWgraph
tk['prefix'] = node.prefix
tk['sdk'] = node.HW_description["software development kit"]["name"]
root = os.path.dirname(__file__)
tmpl = Template(filename=os.path.join(root, "Templates/weights_h_template.h"))
s = tmpl.render(**tk)
save_string = os.path.join(self.inc_dir, 'weights.h')
save_string = os.path.join(self.inc_dir, f'{node.prefix}weights.h')
with open(save_string, "w") as f:
f.write(s)
tmpl = Template(filename=os.path.join(root, "Templates/weights_definition_h_template.h"))
s = tmpl.render(**tk)
save_string = os.path.join(self.inc_dir, 'weights_definition.h')
save_string = os.path.join(self.inc_dir, f'{node.prefix}weights_definition.h')
with open(save_string, "w") as f:
f.write(s)

@@ -294,11 +296,12 @@ def create_hex_input(self):
s += f"{hex(np.uint8(num+256))}, "
tk = OrderedDict([])
tk['input_values'] = s[:-2]
tk['prefix'] = self.prefix
tk['dimension'] = len(x_in)
tk['sdk'] = self.HW_description["software development kit"]["name"]
root = os.path.dirname(__file__)
tmpl = Template(filename=os.path.join(root, "Templates/input_h_template.h"))
s = tmpl.render(**tk)
save_string = os.path.join(self.inc_dir, 'input.h')
save_string = os.path.join(self.inc_dir, f'{self.prefix}input.h')
with open(save_string, "w") as f:
f.write(s)
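The changes above thread a `prefix` string through every generated artifact (weight `.c`/`.h` files, include guards, symbol names) so that several generated networks can coexist in one application. A minimal Python sketch of the prefixed weight-source emission, with `prefix` and `name` standing in for `node.prefix` and `node.name` (the real `create_hex_weight` additionally pads the byte array to a multiple of 4 and emits the matching header):

```python
def render_weight_source(prefix, name, weights):
    """Sketch of the C source C_Parser_Siracusa.create_hex_weight emits
    for a node placed in weight memory. Simplified: no 4-byte padding,
    no extra per-constant arrays, no header generation."""
    # Render each weight byte as a 0x-prefixed hex literal.
    body = ", ".join(f"0x{b:02x}" for b in weights)
    src = f'#include "{prefix}{name}_weights.h"\r\n'
    src += '#include "pmsis.h"\r\n'
    # Place the array in the dedicated weight-memory section.
    src += '__attribute__ ((section(".weightmem_sram"))) '
    src += f"unsigned char {prefix}{name}_weights[{len(weights)}] = {{{body}}};\r\n"
    return src
```

With an empty prefix this degenerates to the old, unprefixed output, which is why the default `prefix=''` keeps existing callers working.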
19 changes: 9 additions & 10 deletions dory/Hardware_targets/Siracusa/Common/HW_Parser.py
@@ -133,29 +133,28 @@ def check_parameters(self):
@staticmethod
def is_offloadable(node: Layer_node) -> bool:
#SCHEREMO: Check if it's an 8-Bit x 8-Bit or lower convolution
try:
memEstimate = (np.prod(node.input_dimensions)*node.input_channels + np.prod(node.output_dimensions)*node.output_channels + np.prod(node.kernel_shape)*node.input_channels*node.output_channels)
# try:
# memEstimate = (np.prod(node.input_dimensions)*node.input_channels + np.prod(node.output_dimensions)*node.output_channels + np.prod(node.kernel_shape)*node.input_channels*node.output_channels)

# SCHEREMO: MVN2 Hack
# if memEstimate > 1500000:
# return False
except:
return False
# # SCHEREMO: MVN2 Hack
# # if memEstimate > 1500000:
# # return False
# except:
# return False

if node.op_type == "BNReluConv" and node.weight_bits == 8 and node.input_activation_bits == 8:
#SCHEREMO: Check if it's a pointwise convolution:
if node.group == 1 and node.kernel_shape == [1,1]:
print("1x1 dense - Offloading to NEUREKA...")
return True
#SCHEREMO: Check if it's a dense 3x3 convolution:
elif node.input_channels == node.output_channels and node.group == 1 and node.kernel_shape == [3,3]:
elif node.group == 1 and node.kernel_shape == [3,3]:
print("3x3 dense - Offloading to NEUREKA...")
return True
elif node.input_channels == node.output_channels and node.group == node.input_channels and node.kernel_shape == [3,3]: #and node.input_dimensions[0] < 8:
#print("Not offloading to NEUREKA...")
print("3x3 dw - Offloading to NEUREKA...")
return True

return False


@@ -166,7 +165,7 @@ def mapping_to_HW_nodes(self):
if 'offload' in self.config_file and self.config_file['offload'] == True:
print("Offloading to N-EUREKA")
for idx, node in enumerate(self.DORY_Graph):
if (self.HW_description['memory']['levels']>2 and idx==0):
if (self.HW_description['memory']['levels'] > 2 and idx==0):
node.offloadable = False
else:
node.offloadable = onnx_manager_Siracusa.is_offloadable(node)
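After this change, the offloading rule accepts any 8-bit × 8-bit `BNReluConv` that is a 1×1 pointwise, a dense 3×3 (the input-channels == output-channels restriction is dropped), or a 3×3 depthwise convolution. A sketch of the revised predicate, simplified to plain arguments instead of a `Layer_node`:

```python
def is_offloadable(op_type, weight_bits, act_bits, group,
                   kernel_shape, in_ch, out_ch):
    """Sketch of HW_Parser.is_offloadable after this commit: N-EUREKA
    gets 8-bit x 8-bit convolutions of three shapes; everything else
    stays on the RISC-V cluster."""
    if op_type != "BNReluConv" or weight_bits != 8 or act_bits != 8:
        return False
    if group == 1 and kernel_shape == [1, 1]:    # 1x1 dense (pointwise)
        return True
    if group == 1 and kernel_shape == [3, 3]:    # 3x3 dense
        return True
    if group == in_ch and in_ch == out_ch and kernel_shape == [3, 3]:
        return True                              # 3x3 depthwise
    return False
```

Note the memory-estimate guard (`memEstimate > 1500000`) is commented out in the diff, so layer size no longer vetoes offloading here.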
4 changes: 2 additions & 2 deletions dory/Hardware_targets/Siracusa/Common/Templates/Makefile.t
@@ -22,8 +22,8 @@ APP = main
APP_SRCS := $(wildcard src/*.c)
# -O2 with -fno-indirect-inlining is just as fast as -O3 and reduces code size considerably
# by not inlining small functions in the management code
APP_CFLAGS += -DNUM_CORES=$(CORE) -Iinc -O2 -fno-indirect-inlining -w -g3
APP_LDFLAGS += -lm -Wl,--print-memory-usage
APP_CFLAGS += -DNUM_CORES=$(CORE) -Iinc -O3 -w -flto
APP_LDFLAGS += -lm -Wl,--print-memory-usage -flto
FLASH_TYPE ?= HYPERFLASH
RAM_TYPE ?= HYPERRAM

dory/Hardware_targets/Siracusa/Common/Templates/input_h_template.h
@@ -25,9 +25,9 @@
#define __INPUT_H__
#include "pmsis.h"
% if sdk == 'gap_sdk':
L2_DATA uint8_t L2_input_h[${dimension}] = {
L2_DATA uint8_t ${prefix}L2_input_h[${dimension}] = {
% else:
PI_L2 uint8_t L2_input_h[${dimension}] = {
PI_L2 uint8_t ${prefix}L2_input_h[${dimension}] = {
% endif
${input_values}};
#endif
8 changes: 4 additions & 4 deletions dory/Hardware_targets/Siracusa/Common/Templates/main.c.t
@@ -22,18 +22,18 @@ n_inputs = DORY_HW_graph[0].n_test_inputs
single_input = n_inputs==1
%>\
% if not l3_supported:
#include "input.h"
#include "${prefix}input.h"
% else:
#include "mem.h"
% endif
#include "network.h"
#include "${prefix}network.h"
#include "siracusa_padctrl.h"
#include "pmsis.h"

% if verbose:
#define VERBOSE 1
% endif

% if sdk == 'pulp-sdk':
unsigned int PMU_set_voltage(unsigned int Voltage, unsigned int CheckFrequencies) {
return 0;
@@ -89,7 +89,7 @@ int main () {
ram_read(l2_buffer, ram_input, l2_input_size);
% endif

network_run(l2_buffer, ${l2_buffer_size}, l2_buffer, ${"0" if single_input else "exec"}${f", L2_input_h{' + exec * l2_input_size' if not single_input else ''}" if not l3_supported else ""});
network_run(l2_buffer, ${l2_buffer_size}, l2_buffer, ${"0" if single_input else "exec"}${f", {prefix}L2_input_h{' + exec * l2_input_size' if not single_input else ''}" if not l3_supported else ""});

% if not single_input:
}
32 changes: 17 additions & 15 deletions dory/Hardware_targets/Siracusa/Common/Templates/network.c.t
@@ -22,15 +22,15 @@ l3_supported = DORY_HW_graph[0].HW_description['memory']['levels'] > 2
%>\
#define DEFINE_CONSTANTS
% if not l3_supported and files_list != ' ':
#include "weights.h"
#include "${prefix}weights.h"
%endif
#include "pmsis.h"
#include "network.h"
#include "${prefix}network.h"
#include "directional_allocator.h"
#include "mem.h"
#include <string.h>
% for layer in list_h:
#include "${layer}"
#include "${prefix}${layer}"
% endfor

% if sdk == 'pulp-sdk':
@@ -39,8 +39,10 @@ l3_supported = DORY_HW_graph[0].HW_description['memory']['levels'] > 2
% endif

% if verbose:
#define VERBOSE 1
#define VERBOSE 0
% endif


static int nb_callback_exec=0;

static void cluster_task_callback(void *arg)
@@ -50,11 +52,11 @@ static void cluster_task_callback(void *arg)

% if 'Yes' in performance or 'Perf_final' in verbose_level:
static void print_perf(const char *name, const int cycles, const int macs) {
float perf = (float) macs / cycles;
int32_t perf = macs / cycles;
printf("\r\n%s performance:\r\n", name);
printf(" - num cycles: %d\r\n", cycles);
printf(" - MACs: %d\r\n", macs );
printf(" - MAC/cycle: %g\r\n", perf);
printf(" - MAC/cycle: %d\r\n", perf);
printf(" - n. of Cores: %d\r\n\r\n", NUM_CORES);
}

@@ -70,11 +72,11 @@ static void checksum(const char *name, const uint8_t *d, size_t size, uint32_t s
printf("OK\r\n");
else{
printf("Failed: true [%u] vs. calculated [%u]\r\n", sum_true, sum);
printf("Got the following:\r\n");
for (int i = 0; i < size; i++){
printf("%u, ", d[i]);
}
printf("\r\n");
/* printf("Got the following:\r\n"); */
/* for (int i = 0; i < size; i++){ */
/* printf("%u, ", d[i]); */
/* } */
/* printf("\r\n"); */
}
}
#endif
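The `checksum()` helper this hunk touches validates layer outputs against golden values baked in at generation time; the commit only silences the verbose byte dump on mismatch. A minimal Python model of the check (illustrative names; the generated C uses a plain running sum over the output buffer):

```python
def golden_checksum(data):
    """Model of the golden value DORY computes at network-generation
    time: the byte sum of the expected output tensor."""
    return sum(data)

def checksum_ok(output_bytes, sum_true):
    """Model of checksum() in network.c.t: sum the produced output
    bytes and compare against the golden value; the C code prints
    'OK' or 'Failed: true [...] vs. calculated [...]'."""
    return sum(output_bytes) == sum_true
```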
@@ -132,9 +134,9 @@ void execute_layer_fork(void *args) {
% for i, node in enumerate(DORY_HW_graph):
case ${i}:
%if hasattr(node, "offloadable") and node.offloadable:
${func_name[i]}(args);
${prefix}${func_name[i]}(args);
%else:
pi_cl_team_fork(NUM_CORES, (void *)${func_name[i]}, args);
pi_cl_team_fork(NUM_CORES, (void *)${prefix}${func_name[i]}, args);
%endif
break;
% endfor
@@ -309,7 +311,7 @@ void network_run(void *l2_buffer, size_t l2_buffer_size, void *l2_final_output,
asm volatile("": : :"memory");

% if 'Yes' in performance:
//print_perf(Layers_name[i], perf_cyc, NODEs_MACS[i]);
print_perf(Layers_name[i], perf_cyc, NODEs_MACS[i]);
% endif

// TODO: What error?
@@ -433,7 +435,7 @@ void network_run(void *l2_buffer, size_t l2_buffer_size, void *l2_final_output,
dir = !dir;
}

memcpy(L2_output, l2_final_output, activations_out_size[${len(DORY_HW_graph)-1}]);
memcpy(l2_final_output, L2_output, activations_out_size[${len(DORY_HW_graph)-1}]);



23 changes: 15 additions & 8 deletions dory/Hardware_targets/Siracusa/Common/Templates/network.h.t
@@ -17,8 +17,15 @@
* limitations under the License.
*/

#ifndef __NETWORK_H__
#define __NETWORK_H__
#ifndef __${prefix}NETWORK_H__
#define __${prefix}NETWORK_H__

% if prefix != "":
// SCHEREMO: Let the preprocessor mangle for us...
#define execute_layer_fork ${prefix}execute_layer_fork
#define network_run ${prefix}network_run
#define layer_args_t ${prefix}layer_args_t
% endif

% if sdk == 'gap_sdk':
#include "pulp.h"
@@ -28,13 +35,13 @@
single_input = n_inputs==1
%>\
% if not l3_supported and files_list != ' ':
#include "weights_definition.h"
#include "${prefix}weights_definition.h"
% endif
#include <stddef.h>

%for node in DORY_HW_graph:
%if hasattr(node, "offloadable") and node.offloadable and hasattr(node, "use_wmem") and node.use_wmem:
#include "${node.name}_weights.h"
#include "${prefix}${node.name}_weights.h"
%endif
%endfor

@@ -78,7 +85,7 @@ static int layers_pointers[${len(DORY_HW_graph)}];
% endif
static char * Layers_name[${len(DORY_HW_graph)}] = {\
% for node in DORY_HW_graph:
"${node.name}"${'' if loop.last else ', '}\
"${prefix}${node.name}"${'' if loop.last else ', '}\
% endfor
};
% if l3_supported:
@@ -114,9 +121,9 @@ static int allocate_layer[${len(DORY_HW_graph)}] = {\
static char *Weights_name[${len(DORY_HW_graph)}] = {\
% for i in range(len(DORY_HW_graph)):
% if (not (hasattr(DORY_HW_graph[i], "offloadable") and DORY_HW_graph[i].offloadable and hasattr(DORY_HW_graph[i], "use_wmem") and DORY_HW_graph[i].use_wmem)) and( 'Conv' in DORY_HW_graph[i].name or 'FullyConnected' in DORY_HW_graph[i].name):
Weights_${DORY_HW_graph[i].name}${'' if loop.last else ', '}\
${prefix}Weights_${DORY_HW_graph[i].name}${'' if loop.last else ', '} \
% elif (hasattr(DORY_HW_graph[i], "offloadable") and DORY_HW_graph[i].offloadable and hasattr(DORY_HW_graph[i], "use_wmem") and DORY_HW_graph[i].use_wmem) and( 'Conv' in DORY_HW_graph[i].name or 'FullyConnected' in DORY_HW_graph[i].name):
${DORY_HW_graph[i].name}_weights${'' if loop.last else ', '}\
${prefix}${DORY_HW_graph[i].name}_weights${'' if loop.last else ', '}\
% else:
"None"${'' if loop.last else ', '}\
% endif
@@ -230,7 +237,7 @@ static int layer_with_weights[${len(DORY_HW_graph)}] = {\
static void* layer_wmem_ptr[${len(DORY_HW_graph)}] = {\
% for node in DORY_HW_graph:
% if hasattr(node, "offloadable") and node.offloadable and hasattr(node, "use_wmem") and node.use_wmem:
${node.name}_weights${'' if loop.last else ', '}\
${prefix}${node.name}_weights${'' if loop.last else ', '}\
% else:
NULL${'' if loop.last else ', '}\
% endif
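The interesting trick in this header is the "let the preprocessor mangle for us" block: when a non-empty prefix is configured, `#define`s rename the public entry points (`network_run`, `execute_layer_fork`, `layer_args_t`) so two generated networks link into one binary without symbol collisions, while callers keep using the unprefixed names. A sketch of what the Mako template renders, mimicked with plain string formatting:

```python
def render_prefixed_guard(prefix):
    """Sketch of the guard + mangling block emitted by network.h.t for
    a given prefix; an empty prefix yields the old unprefixed header."""
    lines = [f"#ifndef __{prefix}NETWORK_H__",
             f"#define __{prefix}NETWORK_H__"]
    if prefix:
        lines.append("// Let the preprocessor mangle for us...")
        for sym in ("execute_layer_fork", "network_run", "layer_args_t"):
            lines.append(f"#define {sym} {prefix}{sym}")
    return "\n".join(lines)
```

The same prefixing also lands in the include guard itself, so two prefixed copies of the header can both be included.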
dory/Hardware_targets/Siracusa/Common/Templates/weights_definition_h_template.h
@@ -22,7 +22,7 @@
#define __WEIGHTS_DEFINITION_H__
% for i in range(len(weights_vectors)):
% if weights_dimensions[i] > 0:
extern uint8_t Weights_${weights_names[i]}[${weights_dimensions[i]}];
extern uint8_t ${prefix}Weights_${weights_names[i]}[${weights_dimensions[i]}];
% endif
% endfor
#endif
dory/Hardware_targets/Siracusa/Common/Templates/weights_h_template.h
@@ -26,9 +26,9 @@
% for i in range(len(weights_vectors)):
% if weights_dimensions[i] > 0:
% if sdk == 'gap_sdk':
L2_DATA uint8_t Weights_${weights_names[i]}[${weights_dimensions[i]}] = {
L2_DATA uint8_t ${prefix}Weights_${weights_names[i]}[${weights_dimensions[i]}] = {
% else:
PI_L2 uint8_t Weights_${weights_names[i]}[${weights_dimensions[i]}] = {
PI_L2 uint8_t ${prefix}Weights_${weights_names[i]}[${weights_dimensions[i]}] = {
% endif
${weights_vectors[i]}};
% endif