USB CDC RX interrupt routine calls blocking functions #167
Comments
Maybe the IRQ can set a flag and we can then service it in a non-blocking way in the second core's loop?
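A minimal sketch of that idea, assuming a bare flag and a hypothetical service_usb_rx() drain routine (neither exists in the current code):

```c
#include <stdbool.h>

static volatile bool usb_rx_pending;

// ISR: do the minimum possible -- no blocking calls in interrupt context.
void usb_rx_isr(void) {
    usb_rx_pending = true;
}

// Second core's loop: service the flag cooperatively, at its own pace.
void core_loop(void) {
    while (true) {
        if (usb_rx_pending) {
            usb_rx_pending = false;  // clear before draining so a new IRQ is not lost
            service_usb_rx();        // hypothetical non-blocking drain routine
        }
        // ... other cooperative work ...
    }
}
```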
This is a non-trivial problem. Yes, you are right ... we generally want to stick it somewhere for deferred processing. That's what the code is already attempting here ... but it's choosing to block, rather than lose terminal input.

In addition, the ISR fires whenever the host sends another packet, which is not under the firmware's control. Thus, the ISR might be re-entered essentially immediately upon exit ... so the received bytes must either be inserted into the queue or discarded. At the same time, we cannot force Core0 to read from the FIFO; we just have to hope that Core0 is not also blocked somewhere. However, I think Core0 has many blocking code paths where it will not read terminal input for some time. For example, NAND erase or write operations appear to take significant time.

So... yeah. Non-trivial. I can see why the choice was made here. Without moving to pre-emptive multitasking, maybe the best thing will just be to ensure that the interrupt is configured to fire only on Core1. That may end up being the right balance among many imperfect choices. I need to give it some more thought...
I will need to dig further, but it is my understanding that there is at least some kind-of-sort-of reliable flow control in USB CDC. These discussions are for other chips, but I recall that it should be possible to stall the next CDC packet until we process and acknowledge the pending one in a cooperative multitasking loop. Whether this is feasible, or would require digging into the USB stack, I do not know at the moment.
We're using hathach's TinyUSB stack ... so the above is the right concept, but it didn't shed light on the implementation. I'll probably start by searching there for how to implement flow control in USB CDC, and if that fails, I'll ask him directly for advice. I've contributed to his repositories before. :)
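For reference, TinyUSB delivers CDC RX through a weak callback rather than a raw IRQ, so a do-almost-nothing handler is one way to defer everything. tud_cdc_rx_cb is the real TinyUSB hook; the flag is our own addition:

```c
#include "tusb.h"

static volatile bool cdc_rx_flag;

// TinyUSB invokes this when CDC data arrives. If we leave the bytes in
// TinyUSB's internal FIFO, the endpoint NAKs further packets once that
// FIFO fills -- which is the flow control discussed above.
void tud_cdc_rx_cb(uint8_t itf) {
    (void)itf;
    cdc_rx_flag = true;
}
```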
If I'm reading this properly, the ISR will fire once. But if we just set a flag, it looks like TinyUSB might properly respond with a NAK when it doesn't have space in its own buffer (thus the host will stall/retry). So, Core1's infinite loop can simply check for more data each iteration? If it does so unconditionally, maybe the ISR can just be a no-op ... then the infinite loop won't be interrupted by the ISR also messing with the queue. Worth testing....
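A sketch of that loop, assuming the standard TinyUSB device API (tud_task, tud_cdc_available, tud_cdc_read) and the pico-sdk pico_util queue; rx_fifo here is a stand-in for the firmware's actual queue:

```c
#include "pico/util/queue.h"
#include "tusb.h"

extern queue_t rx_fifo;  // consumed elsewhere (Core0 in the current design)

void core1_loop(void) {
    while (true) {
        tud_task();  // run the TinyUSB device task

        // Drain TinyUSB's CDC FIFO only while our own queue has room.
        // Bytes left behind cause the endpoint to NAK further OUT
        // packets, so the host stalls/retries -- built-in backpressure.
        while (tud_cdc_available() && !queue_is_full(&rx_fifo)) {
            uint8_t byte;
            if (tud_cdc_read(&byte, 1) == 1) {
                // Should not fail: only this loop adds, and we just checked.
                queue_try_add(&rx_fifo, &byte);
            }
        }
        // ... rest of the Core1 loop ...
    }
}
```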
Notes: To have a chance of fixing this in a less-than-really-complex manner, I want to add some basic rules around the use of the USB FIFO queues:
These rules appear to be generally followed already. Known exceptions include:
It is really not a good idea to call blocking functions in an Interrupt Service Routine. Doing so typically requires waiting on a spinlock (or similar), which means that core will be tied up until something else unblocks the ISR.
Here, the USB RX interrupt handler calls queue2_add_blocking(&rx_fifo ...), which spin-waits for space if the FIFO queue is full. If we're lucky, the ISR is already configured to execute only on Core1. Otherwise, if this ISR runs on Core0, it would cause a deadlock, as Core0 is the only core that removes entries from that FIFO. Even if we're lucky, the use of blocking / spin-wait code in an ISR is generally frowned upon, and can cause intermittent, incredibly difficult-to-track-down bugs (e.g., timeouts, poor interactions with USB hosts, etc.).
BusPirate5-firmware/src/usb_rx.c, lines 62 to 84 at 7f9538e
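For illustration only (the real code is at the reference above), the problematic shape described here is roughly the following; the queue2_t type and usb_hw_read_byte() are hypothetical stand-ins:

```c
#include <stdint.h>

extern queue2_t rx_fifo;                // drained only by Core0
extern uint8_t usb_hw_read_byte(void);  // hypothetical stand-in

// ISR context: re-entered whenever the host delivers another packet.
void usb_rx_handler(void) {
    uint8_t byte = usb_hw_read_byte();
    // Spin-waits while rx_fifo is full. If this ever executes on Core0,
    // it starves the only consumer of rx_fifo, so the wait can never be
    // satisfied: a deadlock.
    queue2_add_blocking(&rx_fifo, &byte);
}
```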
Action Required
Review where blocking / spin-waits are currently used. Consider modifications to minimize or eliminate blocking / spin-wait code paths, at least for Core1.
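One possible direction, sketched with the pico-sdk's pico_util queue as a stand-in for the firmware's queue2 API (queue_try_add and the drop counter are additions for illustration, not existing code):

```c
#include "pico/util/queue.h"

static queue_t rx_fifo;
static volatile uint32_t rx_dropped;  // diagnostics: bytes lost to overflow

// Safe to call from interrupt context: never blocks, never spins.
static void rx_enqueue_from_isr(uint8_t byte) {
    if (!queue_try_add(&rx_fifo, &byte)) {
        rx_dropped++;  // queue full: drop rather than risk a deadlock
    }
}
```

Dropping input is ugly, but an occasional lost byte under overload is far easier to diagnose than an intermittent cross-core deadlock.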