-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faulty DMA Buffer Overflow in adc_trig example #104
Comments
Forgot to mention, I am using a 32L0538DISCOVERY dev board, which is an stm32l0x3 device. In order to get the example to build, I removed the I also had to change the UART peripheral to match the dev board, but I assume these changes shouldn't affect the behavior of the code. The ADC DMA transfers seem to be working, until the example code eventually panics due to the aforementioned BufferOverflow error. |
(Offtopic, but what did you need to adjust with the UART peripheral? Did you specify the |
Thanks for the report! I'll take a look later to see if I can reproduce this with the STM32L0x2 device I developed this with. |
Hi, I pushed my experiments into a new branch and will try to answer your questions: @dbrgn, I have a usb-uart device connected to PB6 and PB7 on my dev board, which requires using USART1 instead of USART2. Just switching the USART from USART2 to USART1 caused a compilation error though:
However, adding the Could this be another issue? I noticed this trait did not need to be imported if I use USART2. |
@hannobraun , Thanks for implementing this, while testing this out tonight, I got a nice picture that shows that the triggered DMA is essentially working correctly: That's a plot of the Power spectral density of the raw ADC readings in the buffer. You can clearly see the AC mains hum at 60Hz, as well as the first harmonic aliased at 80Hz. In order to get this picture, I changed the sample rate to 200Hz instead of 1 Hz, and changed the analog pin to PA5, since on my dev board, PA0 is connected to a button with a pull down. Also, in order to get these examples to build, I had to removed the Unfortunately, I am still seeing the BufferOverrun errors, and especially unfortunately, it seems that the exact behavior of when the BufferOverrrun happens depends on whether or not optimizations are turned on. If |
Interestingly, if you make the
In most use cases, I would expect higher level code to wait for one of these flags to be set true, or wait for an interrupt when these flags are set. I added helper functions to allow higher level code to see the DMA status flags with new pub functions. I modifed You can see an example of this here: adc-dma-transfer-wait. |
Thanks for digging deeper into this, @apgoetz. I haven't had a chance to take a look myself so far, but I hope I can get to it soon.
I added these restrictions because I didn't have an STM32L0x3 device to test with, so I figured I'd just restrict the code to what I did test with instead of giving a false sense of security. Since you confirmed that it basically works on STM32L0x3, a pull request to remove these restrictions would be a good idea (maybe after this bug is resolved?). |
@apgoetz: I can reproduce the error on STM32L0x2 and can also confirm the following part of your analysis:
I've modified The output I get every time is this:
It seems to me the issue is that the write position wraps, but I have yet to explore this further, but it sure looks like we have a race condition in Lines 469 to 471 in b8147f1
First we read the flags (before the write has wrapped), then the write wraps, then we read NDT (which reflects that the write has wrapped). No idea right now how to prevent this. I'll do some thinking and reading, but if anyone has an idea in the meantime, I'd love to hear it. |
@hannobraun , Thanks for the background on the I did some more reading on this topic today. Unfortunately, I think it will be really hard to have a sound interface for any code that accesses a dma buffer while a transfer is ongoing. A lot of the effort so far has been on determining safe traits to use to describe the buffers that DMA acts on. The state of the art on defining these traits seems to be located at ra-kete/dma-poc. Most of this effort focuses on non-repeating dma transfers, where the goal is usually to speed up data transfer through a peripheral. There has been much less effort on circular and double buffered dma transfers, where the end goal is to maintain strict timing requirements for real time systems. I looked at DMA implementations located in other stm32 crates, including stm32f1xx-hal stm32l1xx-hal stm32f7xx-hal stm32l4xx-hal stm32g0xx-hal Most of these hal crates just provide implementations around for single dma transfers. However, stm32f1xx-hal and stm32l4xx-hal do provide implementations that support circular buffers. The basic idea is to split the receive buffer into two halves, and then only allow access to the half of the buffer that does not currently have a transfer taking place. This still does not make the code safe, because there is no guarantee that the DMA transfer won't wrap around and clobber the data in the half that you are holding, but it provides the best timing guarantees, since at least the DMA transfer is in the other half of the buffer when you access it. A detailed explanation of this approach can be seen in teskje/dma-poc#1 Getting back to the topic at hand, I think maybe we should reconsider the API for the ADC-DMA interface. I don't think it really makes sense to expose an iterator over the individual samples in the buffer: Most of the time, you will want to batch up ADC samples and process them all at once. An interface that provides access to a completely filled DMA buffer that is triggered by the half-transfer or transfer complete interrupts might make more sense. It also might make sense to let users choose between a continuous DMA transfer and a one-shot DMA transfer. It is a lot easier to reason about data races if you know the DMA is deactivated when you access the buffer. |
Thanks for doing all this research, @apgoetz. There has been a lot of movement in the space since I worked on the the DMA code in this crate. It would definitely be good to incorporate all this here, although I don't know if now would be a good time, or if waiting for further progress in the working group's focus project would be beneficial. All that said, I'm already busier than I'd like, and I'm not available to work on any larger update of the DMA and/or ADC APIs any time soon. I am interested in fixing this bug though, and I hope it can be done with a focused change to the racy method I quoted above. |
I tried to fix the race condition by reading the two registers in a loop, only returning a result once two reads in a row would be the same. I did this under the assumption/hope that the race condition exists purely in the Rust code, and that the update of the registers by the DMA is atomic. Turns out, that's not the case. It's totally possible to read the registers twice in a row, get the same result each time, and still have read an inconsistent state (i.e. NDT having wrapped, without the flag having been set). I'll think some more about this, but right now it looks to me like the existing API can't be supported by the hardware, and we'll have to change the API, as @apgoetz suggests. |
OK. Too bad the quick fix didn't work. I think i would like a to wait a little bit longer until the best practices around this dma has settled a little bit. Once that has happened, I would be happy to work on a new implementation. |
@hannobraun , I have been working on a new implementation of the dma driver that only supports "oneshot" dma transfers. The idea here is that since the DMA controller can write to the buffer while you are accessing it, it makes sense to have a driver design where a buffer is given to the ADC driver, and only returned once the dma transfer has completed. You can still have continuous transfers by using two different buffers of samples, and the adc_dma.rs example shows how to do that. Can you take a look at the branch and see if this is something that be integrated into the crate? Branch is at apgoetz:adc-dma-testing |
I haven't taken a look at the code yet, but that sounds like it would mesh well with
Can you please submit this as a pull request (mark it as a draft, if you don't think it's ready). That makes it easier for me to add comments and track changes you make in response to them. Also, to clarify, I'm not a maintainer of this crate (although it seems I can merge pull requests, which I wasn't aware of?). In any case, @arkorobotics makes the decisions around here. |
Correction: Turns out I was a maintainer all along, just nobody told me :-) |
@apgoetz I took a look at it and my new working theory is that we've got an issue with non atomic reads, however we can work around this by using a stored position instead. Pull request created With this I get absolutely ZERO improper overruns even right up to the limit where the hot loop has trouble keeping up with the ADC. In my testing it also properly generates overruns when the sampling rate is slightly faster than the loop so this doesn't loose any overruns it would have otherwise generated. |
Fixed in #127. |
Hi,
While experimenting with the ADC DMA examples, I discovered a bug in the implementation. I haven't completely root caused it, but I wanted to file an issue to keep track of it.
I assume this bug is part of the DMA implementation in #86
Currently, the DMA peripheral is set up in circular buffer mode to repeatedly transfer ADC samples into a fixed buffer. Sometimes, when the DMA peripheral reaches the end of this buffer, it causes a spurious
BufferOverflow
error when using theread_available()
method.This only seems to occur when the DMA transfer has reached the end of the buffer space and is about to wrap around to the start. The example adc_trig.rs uses a buffer size of 256 and a sample rate of 1Hz, so it takes a while to surface this bug. If you speed up the sample rate or decrease the buffer size, you can reproduce this issue much more quickly.
The issue seems to be caused by the function
check_overrun_inner()
which is responsible for checking, based on the current DMA status, if the DMA peripheral has overwritten data that has not been read yet.It seems that if the DMA write pointer is located at position 0, but the
transfer_state.complete
flag is not set, this causes a spurious buffer overflow.The DMA write pointer is calculated from the
DMA_CNDTRx.NDT
register field, which is supposed to indicate the number of remaining items to transfer. It seems that if this counter has restarted, and is equal to the length of the buffer, but the transfer complete flag is not set, this error condition is triggered.I am not certain how this can happen: I assume that the transfer complete flag gets triggered by hardware pretty close to when the
NDT
register reloads its value, but somehow the example code is triggering this behaviour.The text was updated successfully, but these errors were encountered: