Prevent stale command subscription entries in Device Connection Service #1858

Closed
calohmn opened this issue Mar 26, 2020 · 23 comments
Labels
C&C Command and Control Device Connection issues regarding the storage of device connection information

@calohmn
Contributor

calohmn commented Mar 26, 2020

Scenario:
A device that normally interacts only via a gateway has sent an HTTP request with a ttd to the Hono HTTP protocol adapter in order to receive a command message.
After the command message has been received, an error occurs when the protocol adapter invokes the removeCommandHandlingAdapterInstance method of the Device Connection Service while removing the command subscription. This leaves a stale commandHandlingAdapterInstance entry for the device in the Device Connection Service.

Now, for all subsequent command subscriptions only done by the gateway of the device, there is the issue that the stale commandHandlingAdapterInstance entry will effectively prevent commands from getting delivered to the gateway. This is due to the device-specific commandHandlingAdapterInstance entry getting precedence.

Only a subsequently created and removed command subscription for the specific device (not the gateway), where the removeCommandHandlingAdapterInstance method succeeds, will mitigate such a situation.
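
To illustrate the precedence problem described in this scenario, here is a minimal in-memory sketch. It is not the Hono implementation; the class name, the map-based storage and the resolveAdapterInstance helper are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model only; not the Hono Device Connection Service.
final class CommandRoutingSketch {

    // device or gateway ID -> ID of the adapter instance holding the command subscription
    private final Map<String, String> commandHandlingAdapterInstances = new HashMap<>();

    void setCommandHandlingAdapterInstance(String deviceOrGatewayId, String adapterInstanceId) {
        commandHandlingAdapterInstances.put(deviceOrGatewayId, adapterInstanceId);
    }

    void removeCommandHandlingAdapterInstance(String deviceOrGatewayId) {
        commandHandlingAdapterInstances.remove(deviceOrGatewayId);
    }

    // Hypothetical lookup: a device-specific entry takes precedence,
    // the gateway entry is only used as a fallback.
    Optional<String> resolveAdapterInstance(String deviceId, String gatewayId) {
        String direct = commandHandlingAdapterInstances.get(deviceId);
        if (direct != null) {
            return Optional.of(direct);
        }
        return Optional.ofNullable(commandHandlingAdapterInstances.get(gatewayId));
    }
}
```

As long as the device-specific entry is left behind, the lookup keeps returning the adapter instance of the old direct subscription, and the gateway's subscription is never selected.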


To prevent or automatically resolve such a situation, this looks like a straightforward solution:

  • Implement a retry mechanism for the removeCommandHandlingAdapterInstance invocation for as long as it fails.

That is easy to implement but doesn't necessarily prevent the above problem if the adapter is restarted while the commandHandlingAdapterInstance entry hasn't been successfully removed yet.

Therefore it looks like we need to implement the following instead (or as well):

  • Let commandHandlingAdapterInstance entries expire after a certain time and implement periodic refresh requests from the protocol adapter while a subscription is still active.
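
The expiry-plus-refresh idea can be sketched as follows. This is a simplified in-memory model under assumptions: the class and method names are hypothetical, the current time is passed in explicitly so the behaviour is easy to follow, and a refresh is modelled as simply setting the entry again.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model only; a real service would delegate expiry to its data store.
final class ExpiringMappingSketch {

    private static final class Entry {
        final String adapterInstanceId;
        final Instant expiresAt;

        Entry(String adapterInstanceId, Instant expiresAt) {
            this.adapterInstanceId = adapterInstanceId;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Creates or refreshes the entry; each call restarts the lifespan.
    void set(String deviceId, String adapterInstanceId, Duration lifespan, Instant now) {
        entries.put(deviceId, new Entry(adapterInstanceId, now.plus(lifespan)));
    }

    // Returns the adapter instance unless the entry is missing or has expired.
    Optional<String> get(String deviceId, Instant now) {
        Entry entry = entries.get(deviceId);
        if (entry == null) {
            return Optional.empty();
        }
        if (!now.isBefore(entry.expiresAt)) {
            entries.remove(deviceId); // expired: drop the stale entry
            return Optional.empty();
        }
        return Optional.of(entry.adapterInstanceId);
    }
}
```

An adapter that is restarted before it could remove its entries simply stops refreshing them, so they disappear once the lifespan has elapsed.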
@calohmn calohmn added C&C Command and Control Device Connection issues regarding the storage of device connection information labels Mar 26, 2020
@calohmn calohmn added this to the 1.3.0 milestone Mar 26, 2020
@ctron
Contributor

ctron commented Mar 26, 2020

I think the TTD defines the timeout of such a mapping. Why not hand over the responsibility to the device connection service, so that it removes the entry at this point in time (now + ttd) at the latest?

Infinispan does have a TTL mechanism, and I guess for every SQL based backend you could easily implement a cleanup job like this.

@calohmn
Contributor Author

calohmn commented Mar 27, 2020

@ctron yes, letting Infinispan do the work of actually deleting the entries would be the idea, and the TTD would define the lifespan of the entries - if there is a TTD.

In the case of MQTT where there is no TTD, we could use a predefined expiry value for the device connection service entries, defined so that the automatic re-creation interval for entries of still valid command subscriptions is less than that.

calohmn added a commit to bosch-io/hono that referenced this issue Apr 16, 2020
This makes it possible to restrict the lifespan of
entries set via the setCommandHandlingAdapterInstance
method of the Device Connection API.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Apr 21, 2020
This makes it possible to restrict the lifespan of
entries set via the setCommandHandlingAdapterInstance
method of the Device Connection API.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Apr 22, 2020
@calohmn
Contributor Author

calohmn commented Apr 22, 2020

PR #1916 has been created for using the ttd as the lifespan of the mapping entry in the device connection service.

For the cases where no ttd is given and a periodic refresh of the mapping entry shall be done, a new Device Connection API method is needed for the refresh operation, performing a conditional update.
It has to be ensured that the refresh operation doesn't overwrite a different adapter instance value, set in the meantime by a command subscription from another adapter instance.
For that, a new replaceCommandHandlingAdapterInstance operation, with parameters in analogy to Map.replace(key,oldValue,newValue), should be introduced in the Device Connection API.
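
A minimal sketch of such a conditional refresh, using the three-argument ConcurrentMap.replace as the analogy suggests. The class is hypothetical and not part of the Device Connection API; lifespan handling is left out to keep the focus on the compare-and-replace step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative model only; names and storage are assumptions.
final class ConditionalRefreshSketch {

    private final ConcurrentMap<String, String> mappings = new ConcurrentHashMap<>();

    void setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId) {
        mappings.put(deviceId, adapterInstanceId);
    }

    // Hypothetical operation mirroring Map.replace(key, oldValue, newValue):
    // the entry is only written if it still holds the expected adapter instance.
    boolean replaceCommandHandlingAdapterInstance(String deviceId,
            String expectedAdapterInstanceId, String newAdapterInstanceId) {
        return mappings.replace(deviceId, expectedAdapterInstanceId, newAdapterInstanceId);
    }

    String getCommandHandlingAdapterInstance(String deviceId) {
        return mappings.get(deviceId);
    }
}
```

If another adapter instance has registered a subscription in the meantime, the refresh from the first instance returns false and leaves the newer value untouched.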

@sophokles73
Contributor

sophokles73 commented Apr 22, 2020

Can't we simply add an (optional) flag to the setCommandHandlingAdapterInstance operation indicating whether the request is to be handled as an update only?

@calohmn
Contributor Author

calohmn commented Apr 22, 2020

I had thought about this, using replaceExisting as the name of the flag. replaceExisting=false (which should be the same as omitting the flag) could be confusing in what it actually does. Therefore I thought it best to use a separate method.

If we name the flag updateOnly however, things would be clear. So, yes, that should probably work.
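
For comparison, the flag-based variant might look roughly like this. The exact semantics of updateOnly are an assumption here: the sketch treats it as "refresh only if an entry with the same adapter instance still exists". The class is hypothetical and omits the lifespan that a real refresh would reset.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative model only; not the actual Device Connection API.
final class UpdateOnlyFlagSketch {

    private final ConcurrentMap<String, String> mappings = new ConcurrentHashMap<>();

    // Returns true if the entry was created or refreshed.
    boolean setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId,
            boolean updateOnly) {
        if (updateOnly) {
            // Assumed semantics: succeed only if the same adapter instance is still registered.
            return mappings.replace(deviceId, adapterInstanceId, adapterInstanceId);
        }
        mappings.put(deviceId, adapterInstanceId);
        return true;
    }

    String getCommandHandlingAdapterInstance(String deviceId) {
        return mappings.get(deviceId);
    }
}
```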

calohmn added a commit to bosch-io/hono that referenced this issue Apr 22, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Apr 23, 2020
Also change the type of the lifespan parameter
from int to Duration in the methods of the
DeviceConnectionInfo, DeviceConnectionClient and
DeviceConnectionService interfaces.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Apr 23, 2020
Also change the type of the lifespan parameter
from int to Duration in the methods of the
DeviceConnectionInfo, DeviceConnectionClient and
DeviceConnectionService interfaces.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Apr 23, 2020
…red.

With the 'lifespan' parameter of the 'setCommandHandlingAdapterInstance'
operation being mandatory to implement now for Device Connection
API implementations, the explicit removal of a command handling
adapter instance entry can be skipped if the entry has expired.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Apr 27, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Apr 28, 2020
…as error.

Command handling adapter instance entries may expire.
Therefore, getting a NOT FOUND result when invoking
removeCommandHandlingAdapterInstance() shouldn't be treated
as an error.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Apr 28, 2020
Command handling adapter instance entries may expire.
Therefore, getting a NOT FOUND result when invoking
removeCommandHandlingAdapterInstance() shouldn't be treated
as an error.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue May 4, 2020
calohmn added a commit to bosch-io/hono that referenced this issue May 4, 2020
calohmn added a commit to bosch-io/hono that referenced this issue May 4, 2020
calohmn added a commit to bosch-io/hono that referenced this issue May 5, 2020
calohmn added a commit to bosch-io/hono that referenced this issue May 5, 2020
calohmn added a commit to bosch-io/hono that referenced this issue May 6, 2020
calohmn added a commit that referenced this issue May 6, 2020
@calohmn
Contributor Author

calohmn commented May 11, 2020

In the Hono community call last Thursday, the question came up whether we can replace the periodic refresh call for the adapter instance mapping entry (with its associated overhead) with an approach where the stale mapping entry gets deleted when forwarding a command to that adapter instance fails.

Here's a closer look at the scenario:
If a protocol adapter instance gets killed, so that the corresponding adapter instance mapping entries in the Device Connection service become stale, this leads to the following scenario:
[Figure: Stale_Instance_Entry_1]
The adapter instance that forwards a command message will get a "no credits" error when trying to send the message (more specifically, the first such message would get "released" with an accompanying "drain=true" "flow" frame). This can be interpreted as meaning the target adapter instance doesn't exist anymore so that the adapter instance mapping entry in the Device Connection service can be removed (here by protocol adapter #2).

But, consider another scenario (featuring a qdrouter mesh):
Here, the adapter instance #1 is still alive, only the qdrouter instance that it was connected to has been killed. A new connection to another qdrouter instance hasn't been established yet.
[Figure: Stale_Instance_Entry_2]

Removing the adapter instance mapping entry would be a mistake here - all following commands would get released without the device getting a chance to reestablish the command subscription.

To mitigate this, an increased delay with additional checks for new credits before actually removing the mapping entry could be used. Accidentally removed mapping entries can't be prevented 100% in this case though.

Another solution would be the following:
Upon encountering a "no credits" error when trying to forward a command message, a protocol adapter instance will invoke a new setAdapterInstanceOffline(adapterinstance, offline=true) Device Connection API method, containing the target adapter instance. The getCommandHandlingAdapterInstances method in the Device Connection service will skip mapping entries with "offline" adapter instances.
In the event that the target adapter instance is still alive but has just lost the connection to the qdrouter, the target adapter instance will invoke setAdapterInstanceOffline(adapterinstance, offline=false) each time it successfully reestablishes a connection and the adapter instance command consumer link. In the device connection service this will mean that the "offline" marker gets removed. (In addition, removing an "offline" marker could also be done implicitly by letting setAdapterInstanceOffline(adapterinstance, offline=false) get invoked internally with each set/removeCommandHandlingAdapterInstance invocation.)

When a protocol adapter gets shut down, it could call setAdapterInstanceOffline(adapterinstance, offline=true) for itself, meaning the "no credits" errors above usually wouldn't occur in the first place.

An "offline" marker created by setAdapterInstanceOffline(adapterinstance, offline=true) could get a timestamp, so that a periodic (e.g. daily) cleanup task could remove old "offline" markers and associated stale mapping entries. So, stale mapping entries would first be ignored via the association with an offline adapter instance and eventually get removed via such a cleanup task.
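
The "offline" marker idea can be sketched like this. It is a simplified model under assumptions: the class is hypothetical, the lookup returns a single adapter instance per device, and the timestamp-based cleanup task is omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative model only; not the Hono Device Connection Service.
final class OfflineMarkerSketch {

    private final Map<String, String> mappings = new HashMap<>();
    private final Set<String> offlineAdapterInstances = new HashSet<>();

    void setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId) {
        mappings.put(deviceId, adapterInstanceId);
    }

    // A single call deactivates or reactivates all entries of one adapter instance.
    void setAdapterInstanceOffline(String adapterInstanceId, boolean offline) {
        if (offline) {
            offlineAdapterInstances.add(adapterInstanceId);
        } else {
            offlineAdapterInstances.remove(adapterInstanceId);
        }
    }

    // Entries that point to an "offline" adapter instance are skipped, not deleted.
    Optional<String> getCommandHandlingAdapterInstance(String deviceId) {
        String adapterInstanceId = mappings.get(deviceId);
        if (adapterInstanceId == null || offlineAdapterInstances.contains(adapterInstanceId)) {
            return Optional.empty();
        }
        return Optional.of(adapterInstanceId);
    }
}
```

Because the mapping entries themselves stay in place, an adapter instance that merely lost its router connection gets all of its entries back by clearing the marker.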

I would have a better feeling using that 2nd solution with the offline markers, preventing accidental removals of mapping entries (however rare these might occur).
@dejanb , @sophokles73 WDYT?

EDIT: The periodic cleanup task could be omitted if we implement it slightly differently.

Instead of "setAdapterInstanceOffline", we add "setAdapterInstanceStatus" with the possible states "online", "offline" and "suspected", implemented as a cache with key "adapter instance" and "online"/"suspected" as value. "Suspected" entries get deleted after an associated lifespan and thereby become "offline".
Each adapter periodically calls "setAdapterInstanceStatus(online)" for itself.
A "no credit" error causes "setAdapterInstanceStatus(suspected)" to be called.

A call to "getCommandHandlingAdapterInstances" will exclude entries with "suspected" state from the result and will delete an entry with an "offline" (i.e. non-existent) state.
An adapter that lost its Qdrouter connection will call "setAdapterInstanceStatus(online)" on reconnect.

The only downside I see here is the added adapter instance status check on every getCommandHandlingAdapterInstances invocation, which means an additional cache access on every received command message. For our HotRod-client based implementation, we could define a separate near cache for this, minimizing access times. However, our current implementation isn't made for using multiple caches for the device connection service (yet).
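
A rough sketch of the status-cache variant from the EDIT section. Several details are assumptions made for illustration: the class and method names are hypothetical, "online" entries are stored without an expiry, the lookup returns a single adapter instance per device, and the current time is passed in explicitly.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model only; a missing status entry stands for "offline".
final class AdapterStatusSketch {

    private static final class StatusEntry {
        final boolean suspected;
        final Instant expiresAt; // null for "online" entries (assumption: no expiry)

        StatusEntry(boolean suspected, Instant expiresAt) {
            this.suspected = suspected;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, StatusEntry> statusCache = new HashMap<>();
    private final Map<String, String> mappings = new HashMap<>();

    void setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId) {
        mappings.put(deviceId, adapterInstanceId);
    }

    void setOnline(String adapterInstanceId) {
        statusCache.put(adapterInstanceId, new StatusEntry(false, null));
    }

    void setSuspected(String adapterInstanceId, Duration lifespan, Instant now) {
        statusCache.put(adapterInstanceId, new StatusEntry(true, now.plus(lifespan)));
    }

    boolean hasMapping(String deviceId) {
        return mappings.containsKey(deviceId);
    }

    Optional<String> getCommandHandlingAdapterInstance(String deviceId, Instant now) {
        String adapterInstanceId = mappings.get(deviceId);
        if (adapterInstanceId == null) {
            return Optional.empty();
        }
        StatusEntry status = statusCache.get(adapterInstanceId);
        if (status != null && status.suspected && !now.isBefore(status.expiresAt)) {
            statusCache.remove(adapterInstanceId); // "suspected" expired: now "offline"
            status = null;
        }
        if (status == null) {
            mappings.remove(deviceId); // "offline": delete the stale mapping entry
            return Optional.empty();
        }
        if (status.suspected) {
            return Optional.empty(); // excluded from the result, but kept for now
        }
        return Optional.of(adapterInstanceId);
    }
}
```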

@sophokles73
Contributor

@calohmn thanks for the detailed analysis. I didn't think about the case where another router instance in a mesh would crash. I agree that we cannot distinguish these cases properly.
In your 2nd solution, I do not quite understand if you still would want to attach a lifespan to the command handling adapter instance ID mappings or not. If you don't, how would stale records be removed? If you do, how would the information be updated after expiry? From what I see, the overall goal of getting rid of the periodic updates cannot be achieved that way, or am I mistaken?

@calohmn
Contributor Author

calohmn commented May 11, 2020

how would stale records be removed?

Stale records would first be excluded from getCommandHandlingAdapterInstances results via the association with an offline adapter instance (the device connection service implementation has to exclude these entries) and eventually get removed via a periodic cleanup task inside the device connection service implementation (remove "offline" markers older than, say, 24hrs and remove all command handling adapter instance mapping entries containing the adapter instance of the "offline" marker).
(I've also edited the comment above to make this a bit clearer.)

That means, that for command subscriptions without a ttd, no lifespan and no periodic updates/refresh operations (triggered by the protocol adapter that initiated the subscription) are needed.
For command subscriptions with a ttd, I think having the lifespan attached to the mapping entry can still be useful as an additional means to achieve consistency in valid mapping entries.

Basically, the advantage of the 2nd solution is that with one device connection api request, all mapping entries gone stale because of a killed adapter instance can get deactivated. Each stale mapping entry doesn't have to get removed individually by a protocol adapter that gets a "no credit" error.
And if entries aren't actually stale because there was just a glitch in the AMQP messaging network, all the relevant mapping entries can get reactivated with just one device connection api request (removing the "offline" marker).

@sophokles73
Contributor

Understood. Sounds good to me. So, we would undo the addition of the lifespan parameter to the setCommandHandlingAdapterInstance operation and instead introduce a new (mandatory) operation setAdapterInstanceStatus to the Device Connection API, right?

@calohmn
Contributor Author

calohmn commented May 11, 2020

As I wrote above, I think the lifespan parameter for setCommandHandlingAdapterInstance can remain for command subscriptions with a ttd parameter.

What could be undone here would be the addition of the updateOnly parameter of setCommandHandlingAdapterInstance.

introduce a new (mandatory) operation setAdapterInstanceStatus to the Device Connection API, right?

Yes. Implementors wouldn't have to store the "online" state though, only a set of "offline" adapter instances is needed (I thought "setAdapterInstanceOffline" would make this more clear, but that distinction is probably not needed from an API user viewpoint).

@sophokles73
Contributor

@dejanb @ctron WDYT?

@dejanb
Contributor

dejanb commented May 11, 2020

@calohmn @sophokles73 I think the proposed solution would work, but it introduces quite a bit of complexity and I'm not sure the problem it solves is that big.

With having only the existing methods we can do the following:

If adapter is not connected to the AMQP network, devices connected to it are unreachable. I think it's OK in that situation to disconnect all connection based devices (MQTT for example) anyways. I would also think that it'd be OK to let non-connection based devices be unreachable for the rest of their TTD (this shouldn't be a long period anyway). Next time they connect, they will be remapped or the connection will be refused (if the AMQP network is still unreachable). If making these devices "offline" for the rest of their TTD is a problem, an adapter could maybe cache this information and remap them on reconnect.

Anyways, I just wanted to see your opinions on all this before proceeding with the proposed solution.

@calohmn
Contributor Author

calohmn commented May 11, 2020

If adapter is not connected to the AMQP network, devices connected to it are unreachable. I think it's OK in that situation to disconnect all connection based devices (MQTT for example) anyways.

I see these pros and cons here:

  • (+) no extra device connection service method needed
  • (+) no device connection service periodic cleanup task needed
  • (-) Qdrouter restarts that would otherwise have gone unnoticed by the devices, now would trigger potentially many parallel reconnect attempts and device connection api requests (related to the new subscription requests)
  • (-) The precondition for this to work in the case described above would be that the newly recreated MQTT command subscriptions happen after the Qdrouter connection has been reestablished; but this is not guaranteed here, subscription requests happening before that would have to fail (this is currently not done; when the qdrouter connection fails, it is getting reconnected under the hood; any concurrent "createCommandConsumer" requests are not influenced by that in the current implementation)
  • (-) "createCommandConsumer" needs a new handler param to trigger the MQTT disconnects

All in all, I would rather not do this.

If we want to choose an approach without the lifespan and periodic mapping entry updates, and without the added "adapter instance offline" marker, the following solution would avoid the disconnects from the devices in most cases:
When a protocol adapter gets a "no credits" error when trying to forward a command, it removes the adapter instance mapping entry (after having done a delayed 2nd check).
For the case that the target adapter instance had just lost the qdrouter connection, the target adapter instance will call getCommandHandlingAdapterInstances for each of its registered consumers when it has successfully reestablished the qdrouter connection and the adapter instance consumer link. If the result of that operation is empty (=> entry removed because of "no credits") or contains another adapter instance, the device connection will be disconnected (allowing the device to recreate the command subscription).

This will cause fewer device disconnects. Still, there are of course potentially many device connection API requests done here every time the connection to the qdrouter is lost. That's something I would rather want to avoid.

So, overall, I still consider the solution with the "adapter instance offline" marker to be the cleaner and preferable one.

@calohmn
Contributor Author

calohmn commented May 12, 2020

Ok, here's an updated "adapter instance online/offline status" approach with reduced complexity (no cleanup job with potentially expensive traversal over all tenants and devices, no influence on getCommandHandlingAdapterInstances):

  • new Device Connection API method setAdapterInstanceStatus(String instanceId, boolean online, Duration lifespan), returns previous online state. (Values have no influence on getCommandHandlingAdapterInstances result.)
  • each adapter instance calls that method periodically with "online=true" (say, once a day)
  • When there is a "no credit" error when forwarding a command, the adapter that got the error will:
    => invoke setAdapterInstanceStatus with target adapter instance and "online=false"
    => remove the (probably stale) mapping entry

If the "no credit" error was due to a Qdrouter outage:
When an adapter was disconnected and then has reconnected to the Qdrouter, it calls setAdapterInstanceStatus with "online=true"; if the returned old value is "false", the adapter instance will validate its adapter instance mapping entries. It will:

  • call getCommandHandlingAdapterInstances for each of its registered consumers
  • if the result of that operation is empty (=> entry removed because of "no credits") or contains another adapter instance, the device connection will be disconnected (allowing the device to recreate the command subscription)

With that approach, there's only trivial implementation effort for implementors of the device connection API. Potentially expensive operations (validating adapter instance mapping entries for many consumers), or operations noticed by the devices (disconnecting the device connections), are only done in presumably rare cases.
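
The flow of this updated approach could be sketched as follows. This is a simplified model under assumptions: the class and the onNoCredit/onReconnected helpers are hypothetical, the lifespan parameter is left out, an adapter instance without a stored status is treated as "online", and the method returns the devices to disconnect instead of disconnecting them.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; service state and adapter logic are merged for brevity.
final class ReconnectValidationSketch {

    private final Map<String, Boolean> adapterOnline = new HashMap<>();
    private final Map<String, String> mappings = new HashMap<>();

    void setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId) {
        mappings.put(deviceId, adapterInstanceId);
    }

    // Returns the previous online state; an unknown instance counts as online (assumption).
    boolean setAdapterInstanceStatus(String instanceId, boolean online) {
        Boolean previous = adapterOnline.put(instanceId, online);
        return previous == null || previous;
    }

    // Invoked by the adapter that got a "no credit" error when forwarding a command.
    void onNoCredit(String targetAdapterInstanceId, String deviceId) {
        setAdapterInstanceStatus(targetAdapterInstanceId, false);
        mappings.remove(deviceId, targetAdapterInstanceId); // remove the probably stale entry
    }

    // Invoked by an adapter after it has reconnected to the router.
    // Returns the devices whose connections should be closed.
    List<String> onReconnected(String instanceId, Collection<String> subscribedDeviceIds) {
        List<String> toDisconnect = new ArrayList<>();
        boolean wasOnline = setAdapterInstanceStatus(instanceId, true);
        if (!wasOnline) {
            // Someone marked this instance offline in the meantime: validate the entries.
            for (String deviceId : subscribedDeviceIds) {
                if (!instanceId.equals(mappings.get(deviceId))) {
                    toDisconnect.add(deviceId);
                }
            }
        }
        return toDisconnect;
    }
}
```

The per-device validation only runs when the returned previous state is "offline", which is what keeps the expensive path limited to the presumably rare cases.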

@dejanb
Contributor

dejanb commented May 12, 2020

@calohmn I'm generally OK with this, but here are a couple more thoughts to consider.

I still see relatively limited value in setAdapterInstanceStatus. The only place where that information is used is when an adapter is reconnecting, to let it know whether someone tried to send a command to it in the meantime. This has value in case re-syncing the adapter state is costly and abrupt toward the devices.

Maybe a better optimization would be to introduce Future<List<DeviceConnectionResult>> getConnectedDevices(String adapterId) which would allow stores to efficiently return the known state for this adapter. The adapter then could sync with the real state (by adding missing entries or disconnecting devices). By doing this, it might not be so costly and abrupt to do this on every reconnect.

Of course, we can do both if it makes sense.

@calohmn
Contributor Author

calohmn commented May 13, 2020

@dejanb With setAdapterInstanceStatus and its return value, I wanted to be able to skip the re-syncing, if for the time it takes the adapter to reconnect to an(other) Qdrouter instance (a few seconds at most?) there is no incoming command message for that adapter (so that the adapter status didn't get changed). But yeah, for the kind of message load we are developing for, this might not be the case often enough to justify the extra method here.

About getConnectedDevices(String adapterId): If we consider a case of 1.5 million devices, with 20k connected to each adapter, that would make it a rather expensive operation. The result size would be at least 500 kB. For the Hotrod implementation, it looks like we would either have to iterate over all entries (i.e. transfer all of them over the network), or store the entries differently. (Using the Hotrod remote API with custom filters doesn't seem practical, since we would have to deploy the filter via a jar to the infinispan server.)
This solution still is of course way more efficient than letting the adapter validate each consumer one by one (as mentioned in my previous comment), but anyway, I tend to think we should aim for a solution with less of an impact in such a scenario.

An alternative idea:
If we let the "no credit" error be followed by a removeCommandHandlingAdapterInstance operation that we might need to revert, we might want to add a flag to that operation with which the entry is moved to the recycle bin or attic, figuratively speaking, from which it can be restored later.
So that would mean:

  • the removeCommandHandlingAdapterInstance method having an added atticLifespan parameter. This will remove the mapping entry and create/update a cache entry for the adapter instance with a list of devices.
  • and an additional restoreCommandConsumingDevicesFromAttic(adapterInstance) method which will remove the above added cache entry and restore the adapter instance mapping entries from it. It will return a list of entries for which that has failed (so that the adapter can disconnect the corresponding devices).
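
A minimal sketch of the "attic" idea, under assumptions: the class and the removeToAttic name are hypothetical, the atticLifespan expiry is omitted, and a restore is considered failed when another adapter instance has registered the device in the meantime.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model only; not the Hono Device Connection Service.
final class AtticSketch {

    private final Map<String, String> mappings = new HashMap<>();
    private final Map<String, Set<String>> attic = new HashMap<>(); // adapter instance -> devices

    void setCommandHandlingAdapterInstance(String deviceId, String adapterInstanceId) {
        mappings.put(deviceId, adapterInstanceId);
    }

    String getCommandHandlingAdapterInstance(String deviceId) {
        return mappings.get(deviceId);
    }

    // Removes the mapping entry and parks the device under its adapter instance.
    boolean removeToAttic(String deviceId, String adapterInstanceId) {
        if (!mappings.remove(deviceId, adapterInstanceId)) {
            return false;
        }
        attic.computeIfAbsent(adapterInstanceId, k -> new LinkedHashSet<>()).add(deviceId);
        return true;
    }

    // Restores the parked entries and returns the devices for which that failed.
    List<String> restoreCommandConsumingDevicesFromAttic(String adapterInstanceId) {
        List<String> failed = new ArrayList<>();
        Set<String> parked = attic.remove(adapterInstanceId);
        if (parked == null) {
            return failed;
        }
        for (String deviceId : parked) {
            String existing = mappings.putIfAbsent(deviceId, adapterInstanceId);
            if (existing != null && !existing.equals(adapterInstanceId)) {
                failed.add(deviceId); // another adapter instance took over meanwhile
            }
        }
        return failed;
    }
}
```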

I admit this doesn't look like a particularly pretty solution (other suggestions concerning the names are welcome :) ). But overall, I see this solution as having the least performance impact.

The cleaner solution I still think is the one above where offline/suspected adapter instances get excluded from the getCommandHandlingAdapterInstances result - see #1858 (comment) above with the added EDIT section at the bottom. The minor added complexity (and extra cache access) in getCommandHandlingAdapterInstances being the downside there.

@dejanb
Contributor

dejanb commented May 13, 2020

@calohmn I started looking at all this from the beginning today and I would propose holding a meeting to discuss it, as we'll probably get to a solution faster that way. As some folks are offline this week, how about doing it early next week (Monday)?

Some things on my mind right now are:

I'm not sure we're supposed to deal with this problem at all at this level. Adapter having an intermittent connection with either AMQP network or device connection service is expected. And things work properly right now and are "eventually consistent" as they should be. For example, the original problem of having an issue of removing the subscription from the service. The future commands to this device will fail anyways as the device is disconnected anyways. So the fact we temporarily have a "stale entry" doesn't affect the functioning of the system. When device reconnects, this will be fixed.

The only issue is that device can be deleted in the meantime, so that this entry needs to be cleaned up somehow eventually, but I wouldn't put a burden of that job on the adapters in any way. That could also happen when the device is deleted from the registry but still connected. So maybe that's something we need to handle.

I think the bigger issue that we currently have (I might be missing something) is that we don't have a way to deal with the situation when an adapter is killed unexpectedly. That will leave a bunch of stale data in the service. One solution to that would be to introduce adapter status updates as proposed. But to me it seems like the job of the platform, and we should be able to either provide a hook to be notified of such an event or do checks internally periodically.

With regard to all this, I don't think we need to "prevent the service from returning stale entries" as that does not change the behaviour of the system in any way. I would focus instead on providing a way to enable other services and/or the platform to notify this service when data needs to be updated because of external events. Or provide a way for the service to do this periodically on its own.

The other common theme in all these solutions is that we need to somehow notify the service that it has stale information after we fail to send a command. I think if we solve the first one, we don't need this as it should be a "normal system state" from time to time. It could help however if we want to use it to trigger some kind of a cleanup or a check within the service (if we find it necessary). So, the equivalent of the "atticLifespan" idea would be to just have commandFailed(adapterId, deviceId) and let the service figure it out.

Anyhow, I hope this makes at least some sense :) As I said, I think it'd be very beneficial to talk through it as we'll figure it out faster. Let me know what you think.

@calohmn
Contributor Author

calohmn commented May 14, 2020

@calohmn I started looking at all this from the beginning today and I would propose holding a meeting to discuss it, as we'll probably get to a solution faster that way. As some folks are offline this week, how about doing it early next week (Monday)?

Yes, sure. Monday would be fine. Let's discuss details on Gitter.

For example, the original problem of having an issue of removing the subscription from the service. The future commands to this device will fail anyways as the device is disconnected anyways. So the fact we temporarily have a "stale entry" doesn't affect the functioning of the system. When device reconnects, this will be fixed.

The original imagined scenario that triggered this issue here was one, where things didn't get fixed:
Customer tests out device by connecting it directly, also subscribing for commands. Unfortunately, when unsubscribing, there is an error removing the mapping entry. From now on, customer uses device only via a gateway, i.e. the gateway is intended to receive commands for that device. However, the stale mapping entry (having precedence because it relates to the device directly, not the gateway) prevents any commands from reaching the gateway. This doesn't resolve itself automatically.
Customer would have to connect the device directly one time and subscribe for commands to resolve this, but that's not at all obvious.
One could say this scenario is somewhat theoretical (however, we had a similar scenario with a customer with Hono 1.0 (then related to a stale "lastKnownGateway" entry) once). Therefore fixing this is not critical, it was just meant to make Command & Control handling more robust and solve an edge case.

Good points raised in your comment, how to address stale entries in general. Let's discuss them in the meeting then.

@dejanb
Contributor

dejanb commented May 14, 2020

@calohmn Thanks for the description of the scenario.

That was really helpful. One question to further clarify all requirements:

So the problem exists for the devices that can connect both directly and using gateways. I would assume however that they can't be connected both ways at the same time? Or is that an expected scenario as well? I don't see anything that would prevent that in the current APIs.

If that's not expected, one solution could be that the setLastKnownGateway clears any existing direct adapter instance mapping.

Otherwise, that's not a solution for this particular scenario.

@calohmn
Contributor Author

calohmn commented May 14, 2020

So the problem exists for the devices that can connect both directly and using gateways. I would assume however that they can't be connected both ways at the same time? Or is that an expected scenario as well? I don't see anything that would prevent that in the current APIs.

A device can indeed be connected both ways at the same time. It could send telemetry/events via its gateway and at the same time receive commands on a direct connection to the adapter.
That behaviour is defined by the fact that when choosing the command target device or gateway, the device-specific command subscription is given precedence over the last-known-gateway.
That means, sending a command to the device directly is preferred over using the gateway as an extra hop.

If that's not expected, one solution could be that the setLastKnownGateway clears any existing direct adapter instance mapping.

Even if we change the above described behaviour, clearing the mapping entry would mean that we would also somehow have to inform the adapter, which created the direct adapter instance mapping, to remove the command subscription link to the device then. How to do that?

calohmn added a commit to bosch-io/hono that referenced this issue Jun 2, 2020
…apterInstance.

This parameter won't be needed anymore.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Jun 3, 2020
This parameter won't be needed anymore.

Signed-off-by: Carsten Lohmann <[email protected]>
@calohmn
Contributor Author

calohmn commented Jun 11, 2020

The outcome of the community call discussion on this has been put into the new issues #2028 and #2029. Therefore I'm closing this issue here.
