This benchmark should show how many tuples a given number of GPU threads can scan and copy to a send buffer per unit of time. Expected outcome: a plot with the number of threads on the x-axis and tuple throughput on the y-axis.
We should test this with different tuple sizes to see if there are interesting effects when the data to be copied per tuple is larger.
With these results, we could roughly calculate how large the tuples must be and how many threads we need to fill the send buffers fast enough for the shuffle to reach a throughput close to the available bandwidth (BW).
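As a rough sketch of that calculation (not part of the benchmark itself): the function names, the example numbers, and above all the assumption that throughput scales linearly with the number of threads are hypothetical. The benchmark results would tell us whether the linear model holds.

```python
import math


def required_tuple_rate(bandwidth_bytes_per_s, tuple_size_bytes):
    """Tuples per second that must reach the send buffers to saturate the link."""
    if bandwidth_bytes_per_s <= 0 or tuple_size_bytes <= 0:
        raise ValueError("bandwidth and tuple size must be positive")
    return bandwidth_bytes_per_s / tuple_size_bytes


def threads_needed(bandwidth_bytes_per_s, tuple_size_bytes, tuples_per_s_per_thread):
    """Estimate the thread count needed to fill the send buffers at link speed.

    ASSUMPTION: every thread contributes the same measured rate
    (tuples_per_s_per_thread), i.e. throughput scales linearly with threads.
    """
    if tuples_per_s_per_thread <= 0:
        raise ValueError("per-thread rate must be positive")
    rate = required_tuple_rate(bandwidth_bytes_per_s, tuple_size_bytes)
    return math.ceil(rate / tuples_per_s_per_thread)


if __name__ == "__main__":
    # Hypothetical inputs, NOT measured values: 12 GB/s link, 64-byte tuples,
    # 1 million tuples per second per thread.
    print(threads_needed(12e9, 64, 1e6))
```

The per-thread rate would come from the benchmark for one specific tuple size, so the estimate has to be repeated for each tuple size we test.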