Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce CommandRouting service/component (replacing DeviceConnectionService) #2029

Closed
calohmn opened this issue Jun 11, 2020 · 9 comments
Closed
Assignees
Labels
C&C Command and Control enhancement
Milestone

Comments

@calohmn
Copy link
Contributor

calohmn commented Jun 11, 2020

The current Command & Control implementation uses tenant scoped command/[tenantId] links to initially receive commands from a downstream application. After having received a command message on that link, the protocol adapter instance that the command target device is connected to is being identified and the command is sent/routed to that adapter instance.

Currently, each adapter supports these tenant-scoped links and the corresponding routing logic.

This means added complexity for protocol adapter implementations and makes command processing also somewhat unintuitive (a command may be received first by an AMQP adapter and then routed to an MQTT adapter for example).
It also means that potential issues in a custom protocol adapter could have consequences for commands directed at one of the standard Hono protocol adapters (if these command messages are received by the custom protocol adapter first).

This all leads to the idea of introducing a separate component that will first receive all command messages and then route them to the appropriate protocol adapter instance.
Such a CommandRouting component could then include also the implementation of the current DeviceConnectionService. That means that the CommandRouting implementation has the service to identify the target protocol adapter instance of a command message right inside its own component. Thinking further along this line, we can just as well replace the DeviceConnectionService with the CommandRouting service.

Operations of the new CommandRouting service (to be used by the protocol adapters):

  • setLastKnownGatewayForDevice(tenant, device, gateway)
  • registerCommandConsumer(tenant, device, consumerId, lifespan)
  • unregisterCommandConsumer(tenant, device, consumerId)

(consumerId here means the same as what was called adapterInstanceId in the DeviceConnectionAPI; adapterInstanceId was a bit misleading, as there are usually multiple such consumers per adapter instance, corresponding to the number of vert.x verticle instances EDIT: using adapterInstanceId is probably more appropriate since we want to keep the number of related resources (e.g. Kafka topics) low. That requires mapping incoming commands to the vert.x verticles (via vert.x event bus), though.)

For #1276, this method will probably be needed as well:

  • registerCommandConsumer(tenant, device, correlationId, consumerId, lifespan)

The AdapterInstancesLivenessService (#2028) will then also be incorporated into the CommandRouting service. For that, an additional, optional operation could be added:

  • ping(consumerId)

This would then be used by custom protocol adapters that are not in the same kubernetes cluster as the other adapters and the CommandRouting component.

@calohmn
Copy link
Contributor Author

calohmn commented Jun 12, 2020

The architecture and command message flow would look like this then:
CommandRoutingService
(1) Device subscribes as a command consumer at a protocol adapter.
(2) Protocol adapter invokes registerCommandConsumer with device/tenant id and its adapter instance id on the CommandRouting service; the CommandRouting service stores this mapping from device id to adapter instance id in some (potentially external) key-value datastore (making sure the data can be retrieved from all CommandRouting component instances); the CommandRouting service makes sure there is a receiver link with the AMQP network on command/$tenant.
(3) Northbound application sends command.
(4) CommandRouting service receives command, identifies adapter instance id for the device of the command message (by querying key-value datastore mentioned above).
(5) CommandRouting service sends command on command_internal/$adapterInstance
(6) Protocol adapter receives command on that link and sends command to device/gateway


Variation:

If we use Infinispan anyway here (as storage for the mapping data currently still set by setCommandHandlingAdapterInstance) and if we let the CommandRouting component instances themselves make up the Infinispan cluster (i.e. use an embedded cluster), we could skip routing the command messages via the AMQP messaging network to the adapter instance and instead let the adapter instances open the command_internal/$adapterInstance receiver link with the CommandRouting component. When using multiple CommandRouting instances, a command may then be routed by means of the inter-cluster transport (e.g. vert.x clustered event bus) from one CommandRouting instance to the other.
This would look like this:
CommandRoutingService_InterCom
The extra step (5) here would be transferring the command message to the actual CommandRouting instance to which the target protocol adapter instance has opened its consumer link.

Advantage of this variation would be that the command routing would remain inside the Hono Kubernetes cluster (if the AMQP messaging network is external). Disadvantage would maybe be some added complexity for the communication between the CommandRouting instances, but when using the vert.x cluster features, this could be quite straightforward.


Timeline:

Introducing and using the CommandRouting service involves the following steps:

  1. deprecate the HotrodBasedDeviceConnectionClient that lets the protocol adapters connect directly to the Infinispan cluster
  2. deprecate the DeviceConnection API
  3. introduce the CommandRouting API and a server/client implementation. The server implementation could also still include the DeviceConnection API implementation so it stays backward compatible.
  4. Remove HotrodBasedDeviceConnectionClient
  5. Remove DeviceConnection API definition and client/server implementation.

I think we could do 1. and 2. with Hono 1.3 already, then integrate the new functionality (3) in Hono 1.4. If we stay with the current definition of the Device Connection API as still "subject to change" (ie. not following the strict versioning rules, I think we defined that in a recent call), we could also do steps 4. and 5. in Hono 1.4. (Otherwise we could keep Hono 1.4 compatible with older adapters and remove things in 2.0.)

@ctron, @dejanb, @sophokles73 What do you think (about architecture/variation and timeline) ?

@calohmn
Copy link
Contributor Author

calohmn commented Oct 29, 2020

Based on the discussion in #2273, here is an updated design.

With that, command routing on the command_internal/$adapterInstance address isn't done via the AMQP messaging network anymore. Instead, protocol adapters connect directly via AMQP with the CommandRouter.

As one protocol adapter shouldn't have to open connections with each CommandRouter instance, it has to be ensured that an incoming command arrives at the very CommandRouter instance that the command target protocol adapter has connected to.

This is achieved by simply letting all commands be sent via multicast on the command/$tenant address.
An incoming command in the CommandRouter will trigger a check with the stored "device id → consumer id" mapping to find the target adapter instance consumer. Then: Either the CommandRouter instance has a link to that adapter instance consumer, so that it will forward the command on that link, or it doesn't have such a link, ignoring the command message and relying on the fact that another CommandRouter instance (the one with the matching link) will have forwarded the command message.

CommandsAMQP_CommandRouter2

EDIT (2020-11-24):
As there are potential issues with such a multicast approach for the Kafka implementation (see #2273 (comment)), the plan is now to stick with the 1st option above, forwarding commands on the adapter instance address via the AMQP messaging network.

@dejanb
Copy link
Contributor

dejanb commented Oct 29, 2020

Looks good. Can we consider here switching "device connection management API" to HTTP at some point? I find it a better fit especially in k8s environment. It's not the necessary for the first phase of course.

@sophokles73
Copy link
Contributor

Note that based on http://qpid.apache.org/releases/qpid-dispatch-1.14.0/user-guide/index.html#configuring-routing-qdr an unsettled command message sent by an application will be settled with final state accepted if any of the receivers has accepted the message and none have rejected it.

IMHO this means that if a command routing service instance which does not have a matching link MUST NOT accept such a message (and ignore it) but should instead release the command message in this case.

You were probably already aware of this, I just wanted to mention it explicitly here for transparency.

@calohmn
Copy link
Contributor Author

calohmn commented Nov 3, 2020

IMHO this means that if a command routing service instance which does not have a matching link MUST NOT accept such a message (and ignore it) but should instead release the command message in this case.

Yes.

Can we consider here switching "device connection management API" to HTTP at some point?

Well, if the goal is to have an HTTP API, I think we can define the new "command routing API" as an HTTP API from the start.
(I would prefer the term "command routing API" to "device connection management API" here.)

What I'm still thinking about is whether we should put a setLastKnownGatewayForDevice method in there.
The lastKnownGateway is currently used for deciding to which gateway to forward a command to in case multiple gateways are a possible target. However, if a device only randomly sends telemetry/events, thereby updating the lastKnownGateway information, and if the device switches from one gateway to the other more often, then this information isn't necessarily useful and wouldn't prevent the command from potentially reaching a gateway that the device isn't connected to at that moment.
I think an alternative could be to just forward a command to all available (i.e. connected and subscribed) target gateways.
The decision on which disposition (accepted, released, rejected) to update the command delivery with, could then be done in the same way as the Qpid dispatch router does it. EDIT: See the newly created #2295 for that.

@calohmn
Copy link
Contributor Author

calohmn commented Nov 3, 2020

Can we consider here switching "device connection management API" to HTTP at some point?

Well, if the goal is to have an HTTP API, I think we can define the new "command routing API" as an HTTP API from the start.

But, the protocol adapters need an AMQP connection to the CommandRouting component anyway (for receiving the commands). So, this connection (and the corresponding configuration) could be used for the API methods as well, not requiring the adapters to use 2 CommandRouting component related endpoint configurations (HTTP and AMQP). Therefore I think I would rather use AMQP here.

calohmn added a commit to bosch-io/hono that referenced this issue Nov 4, 2020
@calohmn
Copy link
Contributor Author

calohmn commented Nov 4, 2020

I've added a PR for a AMQP based API in #2280.
(Even though we might skip the direct AMQP connection from protocol adapter to CommandRouter in favor of the hop back to the AMQP messaging network (see discussion in #2273), I still think it would be good to start out with an AMQP API. It's in line with the other Hono internal APIs between adapters and device registry. For HTTP, we would also first have to implement the authentication mechanism in connection to Hono Auth (then token based, I guess).)

calohmn added a commit to bosch-io/hono that referenced this issue Nov 10, 2020
@calohmn
Copy link
Contributor Author

calohmn commented Nov 10, 2020

Just to reiterate the next steps here:
In the next Hono 1.5 release, we would still support the current way commands get routed (first reaching a protocol adapter and then potentially being forwarded to another adapter). For that, the adapter would still use the configuration to access the DeviceConnection service and would use the ProtocolAdapterCommandConsumerFactoryImpl (perhaps renamed to "LegacyProtocolAdapterCommandConsumerFactoryImpl").

To use the new Command Router component, the adapters can be configured with a set of commandRouter configuration options (instead of deviceConnection), leading to a new CommandConsumerFactory implementation class to be used (by enabling a corresponding bean) - one that targets the Command Router component.

The Command Router component might internally still use the Device Connection API, but that's just an implementation detail. In a future Hono version we can then remove the Device Connection API.


One downside of this approach would be that the protocol adapters can't use the HotRod based client to store the "device→adapter instance" mappings in a very efficient manner anymore - instead these mappings get stored via the Command Router component. That would also apply to the "lastKnownGateway" entries. But, IMHO the added latency isn't that important when registering consumers. Only the "lastKnownGateway" information gets set frequently, so this could add extra load on the Command Router component. But I don't know if that would be a reason to keep the Device Connection API with just the "lastKnownGateway" methods. We can still consider removing the "lastKnownGateway" usage altogether (see comment above).

calohmn added a commit to bosch-io/hono that referenced this issue Nov 11, 2020
calohmn added a commit that referenced this issue Nov 12, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 12, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 12, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 12, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 13, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 13, 2020
calohmn added a commit that referenced this issue Nov 13, 2020
Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 13, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 13, 2020
calohmn added a commit that referenced this issue Nov 16, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 17, 2020
calohmn added a commit that referenced this issue Nov 17, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 20, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 20, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 20, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 20, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 20, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Nov 23, 2020
…sts.

This adds a org.eclipse.hono.adapter.client.command.CommandConsumerFactory
implementation that uses the new Command Router component.
For the integration tests, a new maven profile 'command-router' is added to
let the tests run using the Command Router component.
The GitHub action workflow has been adapted to use that profile in the
test-run that uses the jdbc device registry (so that the other test-runs
still use the old command routing mechanism).

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 23, 2020
…sts.

This adds a org.eclipse.hono.adapter.client.command.CommandConsumerFactory
implementation that uses the new Command Router component.
For the integration tests, a new maven profile 'command-router' is added to
let the tests run using the Command Router component.
The GitHub action workflow has been adapted to use that profile in the
test-run that uses the jdbc device registry (so that the other test-runs
still use the old command routing mechanism).

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 23, 2020
…ng used.

When processing Command Router API requests, the original vert.x
context of the CommandRouterAmqpServer needs to be restored at
the end, because the commandConsumerFactory has switched to the
context in which the downstream AMQP connection is used.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 23, 2020
…sts.

This adds a org.eclipse.hono.adapter.client.command.CommandConsumerFactory
implementation that uses the new Command Router component.
For the integration tests, a new maven profile 'command-router' is added to
let the tests run using the Command Router component.
The GitHub action workflow has been adapted to use that profile in the
test-run that uses the jdbc device registry (so that the other test-runs
still use the old command routing mechanism).

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 25, 2020
…ng used.

The CommandRouterServiceImpl is no Verticle anymore, therefore
there is no issue anymore with CommandRouterAmqpServer requests
ending up being handled on the wrong vert.x context.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Nov 26, 2020
The CommandRouterServiceImpl is no Verticle anymore, therefore
there is no issue anymore with CommandRouterAmqpServer requests
ending up being handled on the wrong vert.x context.

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Nov 26, 2020
…sts.

This adds a org.eclipse.hono.adapter.client.command.CommandConsumerFactory
implementation that uses the new Command Router component.
For the integration tests, a new maven profile 'command-router' is added to
let the tests run using the Command Router component.
The GitHub action workflow has been adapted to use that profile in the
test-run that uses the jdbc device registry (so that the other test-runs
still use the old command routing mechanism).

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit that referenced this issue Nov 26, 2020
This adds a org.eclipse.hono.adapter.client.command.CommandConsumerFactory
implementation that uses the new Command Router component.
For the integration tests, a new maven profile 'command-router' is added to
let the tests run using the Command Router component.
The GitHub action workflow has been adapted to use that profile in the
test-run that uses the jdbc device registry (so that the other test-runs
still use the old command routing mechanism).

Signed-off-by: Carsten Lohmann <[email protected]>
calohmn added a commit to bosch-io/hono that referenced this issue Dec 1, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Dec 2, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Dec 3, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Dec 3, 2020
calohmn added a commit to bosch-io/hono that referenced this issue Dec 3, 2020
calohmn added a commit that referenced this issue Dec 4, 2020
calohmn added a commit that referenced this issue Dec 4, 2020
calohmn added a commit that referenced this issue Dec 4, 2020
@calohmn
Copy link
Contributor Author

calohmn commented Dec 4, 2020

Corresponding changes have been committed now.

@calohmn calohmn closed this as completed Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C&C Command and Control enhancement
Projects
None yet
Development

No branches or pull requests

3 participants