Frames getting retransmitted multiple times with one transmit call #4

Open
mjwells2002 opened this issue Feb 5, 2025 · 7 comments

@mjwells2002
Contributor

This issue presents as what appears to be a frame retransmit as can be seen here with the beacon example

Image

the retransmitted frame typically comes after a CTS frame with either a random DA or an all-zero DA

looking deeper into why this happens, it seems the hardware is transmitting multiple slots when it has been asked to transmit just one. because a pointer to the last frame is left over in the previously used tx slot, and because the beacon example reuses the same buffer for every transmission, we see a repeat of the current beacon frame

you can see that when the incorrect transmit of cts + dual frames happens, the txq_complete_status register indicates that the hardware transmitted two slots; these are always the current slot n and slot n-1

Image

in this example, to catch it, i had the wmac panic whenever the interrupt was raised for multiple slots at once
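The multi-slot check described above could be sketched roughly as follows. This is only an illustration: the slot count, the assumption that bit `n` of the status word maps to slot `n`, and the function names are all hypothetical and not taken from the driver.

```rust
/// Number of hardware TX slots. Assumed value for this sketch only.
const SLOT_COUNT: usize = 5;

/// Returns the indices of all slots flagged in a completion-status word,
/// assuming (hypothetically) that bit `n` corresponds to slot `n`.
fn completed_slots(status: u32) -> Vec<usize> {
    (0..SLOT_COUNT)
        .filter(|&slot| status & (1 << slot) != 0)
        .collect()
}

/// True when more than one slot is reported complete in the same status word,
/// which is the condition used to trigger the panic while debugging.
fn multiple_slots_completed(status: u32) -> bool {
    // Mask off bits that do not belong to any slot before counting.
    let slot_mask = (1u32 << SLOT_COUNT) - 1;
    (status & slot_mask).count_ones() > 1
}
```

With this mapping, a status word with bits 2 and 3 set would report slots 2 and 3 together, matching the "slot n and n-1" pattern seen in the register dump.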

here i modified the interrupt handler to dump the PLCP0 register of each slot whenever the interrupt indicates that multiple slots were transmitted (other registers were checked, but no changes were found)

Image

and you can see that the slots transmitted have had their PLCP0 register modified, i suspect by the hardware

i was able to confirm that the software does not seem to be modifying the PLCP0 register for slots that it's not using

because the hardware behaves differently with the exact same configuration each time, i think this could potentially be a hardware issue

i also think this is the root cause of the issue partially fixed in #3, where tx completion was being signalled for a slot that was not in use, since this bug causes unused slots to be signalled as tx complete

i was able to work around the symptoms of this issue for my own use by resetting plcp0 to 0 and disabling the other tx slots on my own fork here: main...mjwells2002:esp32-wifi-hal-rs:main

however this is not a full fix for the issue, as the interrupt still gets raised signalling that the other slot was transmitted, which would cause issues if all tx slots were enabled. the radio does not actually attempt to transmit the frame, though, as the data pointer is invalid

if this is a hardware issue, it raises the obvious question of how the official driver is working around it, as this does not happen on the official driver

@Frostie314159
Member

I think the next step here is to attempt to reproduce this in the C driver, to verify that this is not just an issue of this implementation. I'll do that in a bit.

@mjwells2002
Contributor Author

ah thanks, i had attempted to reproduce on the c driver but was unable to get it to compile

@mjwells2002
Contributor Author

mjwells2002 commented Feb 5, 2025

some additional notes: reproduction on the beacon example is normally extremely fast (under 10s), but it took as long as 50-60s in my worst case during testing

CTS frames happen most of the time (maybe 95%) but not every single time. the surefire way to tell is to look at inter-frame timings and sequence numbers: if you see the same frame back to back with very little time between, that is probably this issue
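A rough sketch of that heuristic, for anyone checking a capture offline. It assumes the capture has already been filtered down to the frames sent by the device under test and reduced to a timestamp and sequence number; the `CapturedFrame` type, the function name, and the gap threshold are all made up for illustration.

```rust
/// A captured frame reduced to the two fields the heuristic needs.
/// Hypothetical type, not part of any capture library.
#[derive(Debug, Clone, Copy)]
struct CapturedFrame {
    /// Capture timestamp in microseconds.
    timestamp_us: u64,
    /// 802.11 sequence number of the frame.
    sequence_number: u16,
}

/// Returns the indices of frames that repeat the previous frame's sequence
/// number and arrive no more than `max_gap_us` after it.
fn suspected_duplicates(frames: &[CapturedFrame], max_gap_us: u64) -> Vec<usize> {
    frames
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| {
            let gap = pair[1].timestamp_us.saturating_sub(pair[0].timestamp_us);
            pair[0].sequence_number == pair[1].sequence_number && gap <= max_gap_us
        })
        // Report the index of the second frame of each pair (the repeat).
        .map(|(i, _)| i + 1)
        .collect()
}
```

The threshold should be well below the beacon interval so that normal consecutive beacons are never flagged.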

@mjwells2002
Contributor Author

behaviour has changed with the latest commits: cts frames are now never sent, the incorrect frame transmit happens before the correct transmit, and two separate interrupts are raised for it instead of one

Image

resetting plcp0 after tx does still prevent the incorrect frame transmissions (but the interrupts are still raised)

@Frostie314159
Member

I think that this additional interrupt may also be raised in the proprietary stack. As a band-aid solution, we could clear the enable bit after TX.

@mjwells2002
Contributor Author

Clearing just the enable bit or the valid bit does not prevent the transmission of the frame; only clearing the DMA address from plcp0 prevents it
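To make the difference between the two clean-up strategies concrete, here is a small software model of a slot's PLCP0 value. It does not touch hardware, and the bit positions, mask, and names are hypothetical placeholders rather than the real register layout.

```rust
/// Hypothetical position of the enable bit in PLCP0 (illustration only).
const PLCP0_ENABLE_BIT: u32 = 1 << 27;
/// Hypothetical mask of the DMA address field in PLCP0 (illustration only).
const PLCP0_DMA_ADDR_MASK: u32 = 0x000f_ffff;

/// Software stand-in for one TX slot's PLCP0 register.
struct TxSlot {
    plcp0: u32,
}

impl TxSlot {
    /// Load a DMA address and set the enable bit, as done before a transmit.
    fn arm(&mut self, dma_addr: u32) {
        self.plcp0 = (dma_addr & PLCP0_DMA_ADDR_MASK) | PLCP0_ENABLE_BIT;
    }

    /// Band-aid variant: clear only the enable bit, leaving the address behind.
    fn clear_enable_only(&mut self) {
        self.plcp0 &= !PLCP0_ENABLE_BIT;
    }

    /// Workaround variant: reset the whole register to 0 after TX.
    fn reset(&mut self) {
        self.plcp0 = 0;
    }

    /// True if a DMA address is still present in the register.
    fn has_stale_dma_address(&self) -> bool {
        self.plcp0 & PLCP0_DMA_ADDR_MASK != 0
    }
}
```

In this model, `clear_enable_only` leaves a stale address that a spurious transmit of the slot could still pick up, while `reset` leaves nothing to send, which is consistent with the observation above.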

@Frostie314159
Member

The latest commit resets plcp0 after every TX and resolves the channel setting issue.
