
Mismatch result? #27

Closed
xooxit opened this issue Mar 27, 2022 · 15 comments

@xooxit

xooxit commented Mar 27, 2022

Hi, I reproduced the project with the following command lines:

mkdir build
cd build
cmake ../ -DMM_DATA_TYPE=float -DMM_PARALLELISM_N=32 -DMM_PARALLELISM_M=8 -DMM_MEMORY_TILE_SIZE_N=512 -DMM_MEMORY_TILE_SIZE_M=512 -DMM_ADD_RESOURCE=FAddSub_nodsp -DMM_MULT_RESOURCE=FMul_nodsp
make
make hw

Then I ran it like this:

./RunHardware.exe 1024 1024 1024 hw

and got a mismatch result like this:

Verifying result...
Mismatch at (0, 0): 30790.4 vs. 30340

I also tried loosening the threshold for flagging a mismatch (from 1e-03 to 1e-02) and printed out all mismatched results:

Mismatch at (258, 158): 32371.9 vs. 29222.8
Mismatch at (258, 348): 33150 vs. 30126.4
Mismatch at (258, 410): 33577.4 vs. 30521.6
Mismatch at (258, 690): 32677.6 vs. 29691.8
Mismatch at (571, 31): 32113.1 vs. 29157.5
Mismatch at (571, 72): 32030.5 vs. 29066.1
Mismatch at (571, 167): 30717.5 vs. 27857.2
Mismatch at (571, 386): 32130.5 vs. 29166.1
Mismatch at (571, 414): 32495.7 vs. 29537.6
Mismatch at (571, 419): 32113.1 vs. 29166.3
Mismatch at (571, 603): 32465.7 vs. 29466.6
Mismatch at (571, 675): 32653.2 vs. 29656.1
Mismatch at (571, 962): 32408.8 vs. 29393.5
Mismatch at (643, 457): 28775.7 vs. 32123.4
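
For reference, the check being loosened here is an elementwise comparison against the reference result. A minimal sketch of that kind of loop (not the exact code from RunHardware.cpp; the testVal/refVal names come from the discussion below, and the threshold is assumed to act as a relative tolerance, which matches the magnitudes printed above):

// Sketch only: a hypothetical helper, not the repository's actual verification code.
#include <cmath>
#include <cstdio>

void Verify(const float *test, const float *ref, int n, int m, float tol) {
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < m; ++j) {
      const float testVal = test[i * m + j];
      const float refVal = ref[i * m + j];
      // Flag elements whose relative error exceeds the tolerance (e.g. 1e-2).
      if (std::abs(testVal - refVal) > tol * std::abs(refVal)) {
        std::printf("Mismatch at (%d, %d): %g vs. %g\n", i, j, testVal, refVal);
      }
    }
  }
}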

My Vitis version is 2021.2,
my XRT version is 2.12.427, and
my platform is xilinx_u250_gen3x16_xdma_3_1_202020_1.

Btw, I learned a lot from it. Thanks for the nice work.

xooxit changed the title from "HLS code parameter naming" to "Mismatch result?" on Mar 30, 2022
@definelicht
Contributor

Hey there! I have not seen this, but I also haven't tried running the kernel in Vitis 2021.2, because there's a serious performance issue in the memory reader code that suddenly popped up, which I haven't figured out how to fix.

Can you check if your kernel had this elevated II because of aSplit, in case it's related?

Does it pass in simulation? You can run a really small matrix so it doesn't take too long (see the example command at the end of this comment).

Did you try running any other configurations? Did they succeed/fail?

Unfortunately I don't have much time to maintain this these days, since I'm no longer affiliated with the university, so I would appreciate as much help as you can give me to figure out what the issue could be :-)
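
(For the simulation check, the invocation would be something like the line below. The executable name is an assumption on my part, so adjust it to whatever your make run actually produced:)

./RunSimulation.exe 128 128 128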

@xooxit
Author

xooxit commented Mar 30, 2022

Hi - good luck wherever you go!

By the way, I checked aSplit in memory.cpp; every related II is set to 1, but in the v++_MatrixMultiplicationKernel_hw.log file there is an issue similar to #25:

===>The following messages were generated while  performing high-level synthesis for kernel: MatrixMultiplicationKernel Log file: /home/lab/yong/SoonToBeRemoved/gemm_hls/build-wDSP/_x/MatrixMultiplicationKernel_hw/MatrixMultiplicationKernel/vitis_hls.log :
INFO: [v++ 204-61] Pipelining loop 'ReadA_N0_ReadA_K0_ReadA_N1_ReadA_N2'.
INFO: [v++ 200-1470] Pipelining result : Target II = 1, Final II = 16, Depth = 93, loop 'ReadA_N0_ReadA_K0_ReadA_N1_ReadA_N2'

The full log files are below.

I'm going to rebuild the kernel in hardware mode on Vitis 2020.2 with the same configuration as on GitHub:
(cmake ../ -DMM_DATA_TYPE=float -DMM_PARALLELISM_N=32 -DMM_PARALLELISM_M=8 -DMM_MEMORY_TILE_SIZE_N=512 -DMM_MEMORY_TILE_SIZE_M=512)

And I built the kernel in hardware mode, not simulation mode.

I ran various n, m, k combinations from n=m=k=16 up to n=m=k=2048, and the number of mismatched results kept growing.
Something strange is that with the n=m=k=16 input configuration, repeated executions of the command (./RunHardware.exe 1024 1024 1024 hw) make the mismatch differences computed in RunHardware.cpp (std::abs(testVal - refVal)) grow larger.
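
To illustrate what the Pipelining result line above means: the memory reader loop should push one element into the kernel's input stream every cycle (II=1), so a final II of 16 cuts its effective read bandwidth by 16x. A minimal sketch of that pattern (not the actual memory.cpp code; the function and stream names here are illustrative, assuming hlslib is on the include path as it is for this project):

#include "hlslib/xilinx/Stream.h"

// Illustrative reader loop: one element per cycle when the tool achieves II=1.
void ReadA(float const *memory, hlslib::Stream<float> &toKernel, int count) {
ReadA_Loop:
  for (int i = 0; i < count; ++i) {
    #pragma HLS PIPELINE II=1
    toKernel.Push(memory[i]);  // With Final II = 16, a new read only issues every 16 cycles.
  }
}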

@charliechou1001

Hi, I'm also using the gemm_hls project to build my own work. My simulation results based on the GEMM are correct.

The simulation and hardware modes do the same thing, so if a mismatch exists in hardware mode, it may also show up in simulation mode. What data type are you using? Is it floating point?

@definelicht
Contributor

@xooxit Did compiling it in 2020.2 make a difference? I'm curious if the II=16 issue is related to the verification error.

@xooxit
Author

xooxit commented Apr 4, 2022

@definelicht I compiled again in 2021.1 (the previous build was in 2021.2); there is no verification error and also no II=16 issue.

For the verification error above, I built with -DMM_ADD_RESOURCE=FAddSub_nodsp -DMM_MULT_RESOURCE=FMul_nodsp. When I built without the nodsp options on the same 2021.2 version, there was no verification issue, but the II=16 issue remained.

(I edited the build commands in the question at the top accordingly.)

@xooxit
Author

xooxit commented Apr 4, 2022

@charliechou1001 Hi -

I built with the nodsp options and there were verification issues, but without the nodsp options there was no verification error at all.
In simulation mode there was no verification error in either case, and the data type was floating point.

@definelicht
Contributor

Wow, ok. So 2021.2 is slow because of II=16, and nodsp breaks 2021.2 correctness. Does nodsp also break 2021.1 correctness?

I would not recommend using FMul_nodsp; it is very expensive. FMul_fulldsp and FAddSub_nodsp is usually a good combo, since addition doesn't benefit much from DSPs, but multiplication benefits a lot.
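
For example, the build line from the top of this issue with that combination would be (a sketch: identical flags, with only MM_MULT_RESOURCE switched to FMul_fulldsp):

cmake ../ -DMM_DATA_TYPE=float -DMM_PARALLELISM_N=32 -DMM_PARALLELISM_M=8 -DMM_MEMORY_TILE_SIZE_N=512 -DMM_MEMORY_TILE_SIZE_M=512 -DMM_ADD_RESOURCE=FAddSub_nodsp -DMM_MULT_RESOURCE=FMul_fulldsp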

@xooxit
Author

xooxit commented Apr 5, 2022

@definelicht I see. The nodsp options in 2021.1 do not break correctness, and there is no II=16 issue.

The verification error only appears on 2021.2 with nodsp used for both ADD and MULT.

I'm now building with -DMM_ADD_RESOURCE=FAddSub_nodsp in both 2021.2 and 2021.1. BTW, could you let me know which aspect of using FMul_nodsp is expensive?

@definelicht
Contributor

Ok, that's very strange. I suspect this is a bug on Xilinx's side, not in this repo. I think I will put a notice in the README that the accelerator is broken in 2021.2 and see if it improves in future versions, unless any new information comes up.

@charliechou1001

Hi @xooxit, I made the project work on an Alveo U250. The Vitis version I use is 2020.2, the parameters I use in the CMakeLists are the defaults, and I also tried doubling the memory tile sizes m/n to 512; both configurations work for me. Maybe the problem lies in the tool version.

Here is a screenshot of my result:

[screenshot: Screenshot 2022-05-08 225543]

@charliechou1001

From my workmate's experience, different versions of HLS will lead to different synthesis results for the same code, especially in hardware resource consumption; maybe the timing or other factors, such as the mismatch problems, are related to that.

@definelicht
Contributor

> From my workmate's experience, different versions of HLS will lead to different synthesis results for the same code, especially in hardware resource consumption; maybe the timing or other factors, such as the mismatch problems, are related to that.

There is always a difference between different versions of the tools, but it's unfortunate if they even break the code :-(

@yunchenlo

Hi all, I think I am facing the same issue.

The board is a U50 and the Vitis version is 2021.2.

Here is the execution log for your reference. Hope it helps to track down the bug.

[screenshot: Screen Shot 2022-05-23 at 10 10 13 AM]

yclo

@definelicht
Contributor

> Hi all, I think I am facing the same issue.
>
> The board is a U50 and the Vitis version is 2021.2.
>
> Here is the execution log for your reference. Hope it helps to track down the bug.
>
> [screenshot: Screen Shot 2022-05-23 at 10 10 13 AM]
>
> yclo

Did you check if it works when compiled with 2021.1 or older?

@yunchenlo

Yes, I tried 2021.1 and it passes the test!

yclo
