CP mode: Sequence number is incremented when repeating (retrying) a command #214

Open
probatom opened this issue Jan 10, 2025 · 0 comments


Describe the bug
I have a setup where our application, using osdp in CP mode, can start more quickly than the PD hardware (a HID Signo 40 badge reader in this case) boots up; the first command goes out within milliseconds of the reader being powered. The reader seems to miss the first command sent to it by libosdp, so there is no response. libosdp then does not recover cleanly:

L897 [WARN ] No response in 200ms; probing (1)
L489 [ERROR] Packet sequence mismatch (1/0)
L1210 [ERROR] Going offline for 300 seconds; Was in 'ID-Request' state
L1220 [DEBUG] StateChange: [ID-Request] -> [Offline] (SC-Inactive)

The CP sends a new ID-Request command with an incremented sequence number (1), but the PD responds with sequence number 0, which according to the standard is indeed the number to use when starting a new communication session (e.g. after boot). The standard is not very elaborate on how this should work in corner cases like this one (retries).

Expected behavior
The second ID-Request attempt (the retry) should use the same initial sequence number (0) as the first attempt.
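To make the expected rule concrete, here is a minimal sketch of how a CP could pick the sequence number for each transmission. It is not libosdp code: `next_seq` and its `is_retry` parameter are hypothetical names, and the sketch assumes the usual OSDP convention that 0 opens a session and later packets cycle through 1, 2, 3.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of libosdp).
 * prev_seq: sequence number of the last packet sent, or -1 if nothing
 *           has been sent in this session yet.
 * is_retry: true when the same command is being repeated because no
 *           reply arrived.
 * Returns the sequence number to put in the next packet.
 */
int next_seq(int prev_seq, bool is_retry)
{
    assert(prev_seq >= -1 && prev_seq <= 3);

    if (prev_seq < 0)
        return 0;                 /* first packet of a session */
    if (is_retry)
        return prev_seq;          /* a repeat keeps its number */
    return (prev_seq == 3) ? 1 : prev_seq + 1;  /* 1 -> 2 -> 3 -> 1 */
}
```

With this rule, the unanswered ID-Request from the log above would go out again with sequence number 0, which is what the freshly booted PD answers with.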

Observed behavior
See above, CP reports "Packet sequence mismatch".

Additional context
Library version: 3.0.8

I've patched the libosdp code locally, by adding pd->seq_number -= 1; in cp_phy_state_update, right before LOG_WRN("No response in 200ms; probing (%d)", pd->phy_retry_count);. Then the scenario appears to work fine (with a retried command):

L900 [WARN ] No response in 200ms; probing (1)
L680 [DEBUG] CMD: ID(61) REPLY: PDID(45)
L1223 [DEBUG] StateChange: [ID-Request] -> [Cap-Detect] (SC-Inactive)
...etc...
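The effect of that one-line patch can be checked on a toy model. The sketch below is not the libosdp source: `struct toy_pd`, `toy_next_seq` and `toy_undo_seq` are made-up names, and it assumes the counter starts at -1 and is advanced by an increment that wraps from 3 back to 1. Under that assumption, decrementing before the retry reproduces the previous number for every starting value, including the wrap-around.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-PD state; -1 means nothing sent yet. */
struct toy_pd {
    int seq_number;
};

/* Assumed advance rule: increment, wrapping from 3 back to 1. */
int toy_next_seq(struct toy_pd *pd)
{
    assert(pd->seq_number >= -1 && pd->seq_number <= 3);
    pd->seq_number += 1;
    if (pd->seq_number > 3)
        pd->seq_number = 1;
    return pd->seq_number;
}

/* Models the local patch: undo the last advance before retrying. */
void toy_undo_seq(struct toy_pd *pd)
{
    pd->seq_number -= 1;
}

/* Send once, time out, retry: do both attempts use the same number? */
bool toy_retry_keeps_seq(int start)
{
    struct toy_pd pd = { .seq_number = start };
    int first = toy_next_seq(&pd);
    toy_undo_seq(&pd);
    int second = toy_next_seq(&pd);
    return first == second;
}
```

Note that after a wrap (3 to 1) the undo leaves the counter at 0 rather than 3, but the following advance still yields 1, so the retry matches.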

If I trigger lost packets (manually disconnect RS485 wires) after the PD came online (active communication session with SC enabled) the retries using this modified code seem to work fine. A few "No response in 200ms" messages are logged, but the session remains intact.

I suspect that the same change would be an improvement for the other case that returns OSDP_CP_ERR_CAN_YIELD: receiving an osdp_BUSY reply. Repeating the command with the same sequence number should allow the PD to complete the command it was already processing, while (if I understand the specs correctly) sending the command again with an incremented sequence number should be interpreted by the PD as a fresh request, so it would start over. Both should work, but not restarting the action seems the nicer behavior to me. I have no scenario for triggering osdp_BUSY responses, though.

Also, I'm not familiar enough with libosdp internals to know whether, after an OSDP_CP_ERR_CAN_YIELD return, the next command going to the PD will always be a repeat of the previous command (or a full session reset). If something else can come in between, my proposed code change ("undoing" the sequence number increase at that location) will be incorrect.
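One way to sidestep that uncertainty would be to decide at send time whether a packet is a repeat, instead of undoing the increment at the timeout site. The following is only a sketch of that idea under my own assumptions, not a proposal for the actual libosdp structures: `struct toy_tx`, `toy_pick_seq` and `toy_reply_received` are hypothetical, and comparing command codes alone is a simplification (a real check would probably also compare the payload).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transmit bookkeeping, not a libosdp type. */
struct toy_tx {
    int last_seq;         /* -1 until something was sent */
    int last_cmd_id;      /* command code of the last transmission */
    bool awaiting_reply;  /* last command has not been answered yet */
};

/* Pick the sequence number for cmd_id and record the transmission. */
int toy_pick_seq(struct toy_tx *tx, int cmd_id)
{
    assert(tx->last_seq >= -1 && tx->last_seq <= 3);

    /* Only an unanswered, identical command counts as a retry. */
    bool is_retry = tx->awaiting_reply && tx->last_seq >= 0 &&
                    cmd_id == tx->last_cmd_id;

    if (!is_retry) {
        if (tx->last_seq < 0)
            tx->last_seq = 0;              /* session start */
        else
            tx->last_seq = (tx->last_seq == 3) ? 1 : tx->last_seq + 1;
    }
    tx->last_cmd_id = cmd_id;
    tx->awaiting_reply = true;
    return tx->last_seq;
}

/* Call when a valid reply to the last command has been received. */
void toy_reply_received(struct toy_tx *tx)
{
    tx->awaiting_reply = false;
}
```

In this model a different command that slips in between simply advances the counter as usual, so the sequence number is only reused when the repeat really is a repeat.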
