This repository has been archived by the owner on Jun 22, 2024. It is now read-only.

TinyUSB - Chance of RP2040 crashing when Serial communications are happening while input data is sent #3

Open
SeongGino opened this issue Nov 11, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@SeongGino
Owner

Blocking issue for mamehook support, but not due to code added from that - it's an issue I'd discovered early on with the original source but never paid it much mind until now.

It may be more likely to happen on the RP2040, but perhaps isn't exclusive to it. Must investigate.

@SeongGino SeongGino pinned this issue Nov 11, 2023
@SeongGino
Owner Author

SeongGino commented Nov 12, 2023

It very much seems to come down to a race condition, and was further exacerbated by use of dual cores; hopefully 47a7b7c mitigates this issue, if not fixing outright.

If it doesn't, test with DUAL_CORE turned off - if it doesn't crash with that off, then parallelization may be causing more problems than it solves, at least when serial connections are involved.

Yeah, still ran into issues. So now 8bb26ee just adds delays.

Interestingly, the TinyUSB library does allow changing the polling rate. Would we benefit from raising it by setting the poll interval to 1ms?

@SeongGino
Owner Author

SeongGino commented Nov 12, 2023

For future reference, a list of things tried/to try:

  • Disable DUAL_CORE on its own: Crashes
  • usbHid.setPollInterval(1);: Crashes
  • delay(1);s after serialMode input processing: Crashes
  • usbHid.setPollInterval(1); & delay(1); after camera processing block: Seemed to work at first, but nope, still crashes
  • delay(1); after camera processing block only: Crashes
  • DUAL_CORE w/ usbHid.setPollInterval(1); & delay(1); after camera processing block: Crashes

Main reason I'm hesitant on changing the USB polling rate to 1(ms) is processing overhead on original Cortex-M0's, if they're still a desired target - it works fine on an RP2040 in single threaded mode, which means it's also likely to work on a Cortex-M4. Is this what GUN4IR does?

@SeongGino
Owner Author

The commonality across all the tests is that the device only freezes on the frame where an input is sent and the solenoid is commanded to pull, but before it's allowed to rest, causing it to stick (testing this with Haunted Museum, for the record).

If that's the case, the only way I know to solve it is to run the Serial Mode-specific button handling only when irPosUpdateTick is set and the serial line isn't busy. So essentially, in serial mode, we limit the button polling from "as fast as the board fires" to the fixed camera update rate of ~209Hz.

At least a max polling rate of 209hz is still sending inputs every ~4.8ms. It's a good thing for my sanity that serial handoff has us only handling the buttons. (^^;
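To make the gating idea above concrete, here is a minimal sketch in plain C++ (not the actual sketch code). Only irPosUpdateTick is a name from this thread; serialBusy, shouldHandleButtons and periodMicros are hypothetical names used for illustration, and how "the serial line is busy" gets detected is left as an assumption.

```cpp
#include <cstdint>

// Hypothetical gate: serial-mode button handling is allowed only on a pass
// where a camera update is due (irPosUpdateTick) and no serial traffic is
// pending (serialBusy is an assumed flag, not a name from the firmware).
bool shouldHandleButtons(bool irPosUpdateTick, bool serialBusy) {
    return irPosUpdateTick && !serialBusy;
}

// Period in microseconds for a given rate in Hz (integer division, truncated).
// For the ~209Hz camera rate this comes out to 4784us, i.e. roughly 4.8ms.
constexpr uint32_t periodMicros(uint32_t hz) {
    return 1000000u / hz;
}
```

In a loop() this would be checked once per pass, with the button handling skipped whenever the function returns false; inputs then go out at most once per camera update.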

@SeongGino
Owner Author

SeongGino commented Nov 13, 2023

Not pushed to the repo, but even with an internal test where the button states are polled separately from the button responses, and we only send the updated button responses once alongside every camera update... it's still crashing.
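For reference, the shape of that internal test can be sketched roughly as below. This is a plain C++ illustration under my own assumptions, not the unpushed code: ButtonLatch, poll and flush are hypothetical names, and the button state is assumed to fit in a 32-bit mask. As noted above, this arrangement did not stop the lockups.

```cpp
#include <cstdint>

// Hypothetical latch that separates polling from reporting.
struct ButtonLatch {
    uint32_t current = 0;   // most recently polled button bitmask
    uint32_t lastSent = 0;  // bitmask last handed off for sending

    // Called as often as the board can manage; only records state.
    void poll(uint32_t rawMask) { current = rawMask; }

    // Called once per camera update. Returns true and fills `report`
    // only when the state differs from what was last sent.
    bool flush(uint32_t &report) {
        if (current == lastSent) {
            return false;
        }
        lastSent = current;
        report = current;
        return true;
    }
};
```

Several poll() calls between two camera updates collapse into a single report, so at most one button update is sent per camera tick.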

At this point, I'm at a loss.
Either there's an additional place where a delay needs to be placed, there's a fatal bug in the RP2040 core that's somehow gone unreported, or we'll have to can Mamehook support as a headlining feature and put it behind a forever experimental state (which I REALLY don't want to have to do, but figuring this out on my lonesome has been taking its toll and I want to have a new build pushed sooner rather than later so I can just take a break already...).

For what it's worth, we should also check other optimization levels (we use -O3 by default, by Prow's original suggestion), to make sure that isn't somehow causing issues. No change in behavior here either.

@SeongGino
Owner Author

SeongGino commented Nov 15, 2023

I think we might be getting somewhere with 385c900 - alternated between a stage and a half of Operation G.H.O.S.T., two stages of Virtua Cop (Model 2), and a full playthrough (again) of Haunted Museum for good measure in the same session, and the gun has yet to crash... so far.

It may very well just be linked to the trigger button/mouse click signals. If so, and if a 2ms delay isn't noticeable in games that (if we're only talking about MAMEHOOK-compatible games) refresh every 16.67ms at best anyway, this may well be the needed solution!

Or at least, it's a hack that I'm okay with doing if it finishes this damn feature lol.

NOTE: this is with single core, haven't tested dual core yet. Likely that this does fix it there too (since the change is at a subroutine level that both loop()s access) but should check regardless.

@SeongGino
Owner Author

None of the delays help; maybe I was naive to think they would do anything substantial...

Because the issue is the device "sticking" (wherein it is doing SOMETHING, but simply not changing, so sols or motors remain powered - which is very distinctly not a crash), that would mean there's some point where it gets stuck in an infinite loop.

A quick once-over shows that potential failure points are either in the DFRobotIRPositionEx library:

// length mismatch, flush the read buffer
while(wire.available()) {
    wire.read();
}

Or in the serial reading:

#ifdef MAMEHOOKER
    // For stability, we only parse the serial line on the single core, as it seems to be clashing with the camera/mouse feed.
    while(Serial.available()) {   // Have we received serial input? (This is cleared after we've read from it in full.)
        SerialProcessing();       // Run through the serial processing method (repeatedly, if there's leftover bits)
    }
#endif // MAMEHOOKER

I think I'll try experimenting with these next to remove any more chances of being left in a stalling state.
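One way to rule those two loops out is to cap how much work each one may do per pass, so that neither can spin forever even if available() never drops to zero. The following is a minimal sketch in plain C++, assuming a stream-like object with available() and read(); FakeStream and drainBounded are hypothetical names for illustration, not part of DFRobotIRPositionEx or the Arduino core.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for an Arduino-style stream, used so the sketch
// can be compiled and exercised off-device.
struct FakeStream {
    std::deque<int> buf;

    int available() const { return static_cast<int>(buf.size()); }

    int read() {
        if (buf.empty()) {
            return -1;  // mirrors the "nothing to read" convention
        }
        int b = buf.front();
        buf.pop_front();
        return b;
    }
};

// Reads and discards at most maxBytes from the stream, then returns the
// number of bytes actually consumed. Anything left over stays queued for
// the next pass instead of holding the caller in the loop.
template <typename StreamT>
std::size_t drainBounded(StreamT &s, std::size_t maxBytes) {
    std::size_t consumed = 0;
    while (consumed < maxBytes && s.available() > 0) {
        s.read();
        ++consumed;
    }
    return consumed;
}
```

The same cap could be applied to the serial loop by counting calls to the processing routine instead of bytes; the limit itself is a tuning choice and would need testing on hardware.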

@SeongGino
Owner Author

Investigated the above.

still. locks up.

@SeongGino SeongGino changed the title Chance of Arduino crashing when Serial communications are happening while camera is tracking Chance of Arduino crashing when Serial communications are happening while input data is sent Nov 18, 2023
@SeongGino
Owner Author

At least I've narrowed it down: camera movement doesn't contribute to the problem, so it isn't I2C. Just spamming the trigger while the camera isn't seeing anything still causes it.

So it is down to the timing of inputs being sent vs serial data being read/sent.

@SeongGino
Owner Author

Even with the serial and data lines taking turns, it continues to crash.

And judging by adafruit/Adafruit_TinyUSB_Arduino#293, it does in fact seem to be a fault of the USB stack and not the code itself! (thank goodness)

I've done a test with both patches in that issue thread, and it seems to resolve the fault. If it really is down to a deadlock in TinyUSB, then we could either wait until it is resolved upstream, or provide instructions for users to install the "fixed" RP2040 core.

Obviously, tampering with Arduino cores directly like this is another challenge for the less technically adept; but if this issue's shown anything, I have tried literally everything I could on the sketch side to no avail, so this will be what it comes down to.

@SeongGino SeongGino added the bug Something isn't working label Dec 25, 2023
@SeongGino SeongGino changed the title Chance of Arduino crashing when Serial communications are happening while input data is sent TinyUSB - Chance of RP2040 crashing when Serial communications are happening while input data is sent Dec 25, 2023
@SeongGino SeongGino unpinned this issue Dec 25, 2023