Fix stride indexing bugs in `reorg` and `reorg_gradient` functions (CPU & CUDA) and add `add_to` parameter #3012

Conversation
Oh, I will check it in more detail later on. But is it possible that it has worked for me because I've always used this with square images?
The original and corrected output look identical to me.
Hi @arrufat, I discovered this issue while searching for a method to split tensor columns in a 2D matrix mode. On closer examination of the example I provided, it's apparent that the function initially works correctly, but the very last row appears to be derived from memory artifacts rather than from the actual matrix. In other words, the indices were incorrect. While I haven't tested every possible configuration of this class, the new index calculation now appears correct for multi-channel, multi-row, and multi-column scenarios. This correction ensures that the reorganization process accurately maps input tensor elements to their correct positions in the output tensor.

An additional advantage introduced by this update is the optional accumulation of values. This feature makes the functions directly reversible without requiring additional processing. I hope these modifications have effectively corrected these utility functions... assuming they were indeed erroneous in their previous implementation... I'll let you confirm that, please, as I don't have any examples of use for these specific layers.
Upon re-reading the content of the Pull Request, I better understand your inquiry. You're correct: the initial calculation assumed square matrices and didn't cover the general case. Let me provide a more illustrative example that better demonstrates the problem:
Yes, your proposed fix makes sense. I should've noticed this while porting the code from Darknet, which only supports "square" strides, not different ones for rows and columns. That's also why it was working in the tests: we were only testing
* remove using namespace std from headers
* more std::
* more std::
* more std:: on windows stuff
* remove uses of using namespace std::chrono
* do not use C++17 features
* Add Davis suggestion
* revert some more stuff
* revert removing include
* more std::chrono stuff
Sweet, thanks for the bug fix :)
Overview
This Pull Request addresses and resolves identified bugs in the `reorg` and `reorg_gradient` utility functions within the Dlib library, ensuring accurate stride handling in both the CPU and CUDA implementations. Additionally, a new `bool add_to` parameter has been introduced to enhance the flexibility and reversibility of these functions for future layer integrations.

Problem statement

- Stride miscalculations: The original `reorg` and `reorg_gradient` functions contained incorrect stride calculations, particularly in the handling of `row_stride` and `col_stride`. This led to improper index mapping, causing erroneous data reorganization and gradient accumulation.
- Consistency issues: Discrepancies between the CPU and CUDA versions resulted in inconsistent behavior across different execution environments.
Changes made
Example test case:
`num_samples=1, k=1, nr=4, nc=4`