-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The sensor value override feature is obscure and needs visibility #32
Comments
Three bools gives 8 states. Are they all valid? I expect it would be better to use a "choice" style string-type property that behaves as an enumeration (and given its OpenBMC-related, needs to be defined in PDI and use the sdbusplus enum support). |
This was an intentional design decision, the expectation is that when completed running mocks/testing, the user by policy should reboot the bmc when the test completes. Doing anything else is a half-measure that very likely could lead to things not being reset properly, and us debugging code that isn't useful to a product. With that said, if you wanted to add support for something like this and document that it's best effort, feel free.
The compile time flag is to allow opting out of the feature for platforms that don't want it, or might consider it a potential DOS issue if an attacker were to get access to dbus. Why is returning an error message that a feature is not supported a problem in practice?
A dbus interface somewhat of defeats the purpose of having a mock, because now other services can code to "is this being mocked" and the behavior of the code could be different when in test mode versus not. I could certainly see the argument that we should log these things to the journal when they're mocked and when they're not, but a formal interface implies that we expect software to use it, which by design, we don't want to have happen.
What does this mean? There are absolutely test harnesses that run this test during a production build. In fact, that's a lot of the reason for having it, being able to troubleshoot software on a production signed build. For the logging portion, I think good old journal logs would get us where we need to be in short order, logging when sensors enter and exit mocking mode. |
Good point. There are less than 8 valid states, so an enum might be preferable. Here's the states that would be valid:
I still think using bools would be clearer than having to come up with strings to describe each of these. Another reason to use bools, instead of an enum, is to make it impossible to have to deal with the case of an invalid string, and also, to make it self-documenting what choices are available (instead of forcing the user to guess, or read the source, to learn which strings are valid). |
Interesting. Although inconvenient, it does reduce the attack surface for bugs.
Because it's a one-way thing, as you mentioned above, no way to undo it without restarting the sensor services (or rebooting the BMC). If I wanted to probe for the existence of sensor mocking, there's no way to do this safely, as if I try to write a value, even the same value that the sensor is already set to, it will either generate the error message or activate the sensor mocking feature, even if I don't want to activate it (I just wanted to see if it was enabled or not).
What would be the use case of a "stealth" mocking feature that would be enabled but not having any way for the user to know if it is enabled or not? As for the interface, we could do the same compromise that was done for
Agreed, and one of the tests I wanted to add in our product was a test to properly verify that the sensor mocking feature was disabled and inactive during a production build. Given the above, I can't do this test without inadvertently activating the sensor mocking feature (which is a test failure, but causes me to have to reboot the BMC, because of this). So, this test has to be classified as a disruptive test instead of a passive test, which, among other things, means I can't run it in parallel with other tests (in our particular test framework).
This would be good to add, as a low-hanging fruit. As for entering mocking mode, this would take place upon the first D-Bus write received (unless the sensor is a special one with its own write handlers, such as the fan PWM sensors, which are designed to push sensor writes down to the underlying hardware, to set the fan speed during normal running conditions). As for leaving mocking mode, there is currently no way to do this without a restart, as you mentioned, which already has fairly loud logging. |
Currently, the
dbus-sensors
sensor daemons support the idea of sensor value override, that is, the ability to receive a written value from a D-Bus incoming command, instead of reading the actual value from the underlying sensor hardware. This feature is also known as sensor mocking.This feature is very useful for debugging, but it is obscure. It has the following limitations:
Enabling or disabling sensor mocking is done at compile time. At runtime, there is no way to see if the sensor mocking feature is available or not, short of trying it to see if you get an error message or not.
The override is placed into effect when the first D-Bus write command is received, and then, it will remain in effect until the sensor daemon itself is forcefully restarted. There is no way to gracefully cancel the sensor override feature from being in effect.
There is no way, at runtime, to see if the current sensor value was placed there by use of sensor mocking, or by use of actually reading from the underlying hardware. This would be good to have, for troubleshooting.
So, to summarize, let's make an interface with three bools, and add it to each sensor.
Is the sensor mocking (sensor value override) feature allowed to be enabled at all for this sensor, or not? This decision was made at compile time. This is read-only at runtime.
Is the sensor mocking feature currently in effect? If allowed, it could be turned on or off, by writing to this boolean.
Is the current value of this sensor something that came from mocking, instead of being a value that came from the underlying sensor hardware? This would cover cases in which the mocking feature was enabled but the user has not written a value yet, or in which the mocking feature was disabled but the sensor hardware has not provided us with a new reading yet.
This is closely related to the Mutability feature we talked about a while ago. The difference is that of layering: sensors that are mutable are designed to have their values typically externally changeable by end users, such as fan speed settings in manual mode, and such. The sensor mocking feature is more of a development and test feature, however, and typically will not be allowed during production.
So, any thoughts on this?
The text was updated successfully, but these errors were encountered: