Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolution of issue "Non slewing time adjustments #203" V2 #354

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

PaulMartinsen
Copy link
Collaborator

📑 Description

Following extensive discussion in PR #238 and during SDPi Workshop #4 this PR builds on that draft PR with:

  • a new use-case for non-slewing time adjustments,
  • an extension to support versioning MDIB timestamps.

The original draft PR went wonky so this one rebuilds the key changes on a recent master.

☑ Mandatory Tasks

The following aspects have been respected by the pull request assignee and at least one reviewer:

  • Changelog update (necessity checked and entry added or not added respectively)
    • Pull Request Assignee
    • Reviewer

…solve non-slewing adjustments with a new sequence id.

(cherry picked from commit 1956dcd; resolved conflicts: asciidoc/volume1/use-cases/tf1-ch-c-use-case-stad.adoc)
Added new use-case for non-slewing time adjustments
Moved R1522 into new use-case.
@PaulMartinsen PaulMartinsen added Volume 1 Volume 1 contents Volume 3 Volume 3 contents Spec An proposal related to the content or organization of the specification labels Dec 11, 2024
@PaulMartinsen PaulMartinsen added this to the SDPi 2.0 Review milestone Dec 11, 2024
@JavierEspina JavierEspina linked an issue Dec 11, 2024 that may be closed by this pull request
.R1569
[sdpi_requirement#r1569,sdpi_req_level=may]
****
A <<vol1_spec_sdpi_p_actor_somds_participant>> may enter a fault state by, for example, setting the `MdsState/@ActivationState` to `Fail` upon detecting a non-slewing time adjustment that it otherwise cannot recover from.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want this to be even more strict:

While a provider cannot verify coherency of timestamps in the MDIB, for every MDS that is affected, the provider shall set pm:Clock/@ActivationState = Fail.

NOTE: If for technical reasons or risk related measures a provider cannot set a new sequence id after a non-slewing time adjustment ensued, the provider sets the clock into fail mode such that consumers know they must not trust any timestamps that are read from the provider's MDIB.


That includes setting the clock to fail even if epoch versions are used. This allows for the consumers that do not understand epoch versions to act as if the clock would not be reliable anymore. From that it follows that the epoch versioning extension is required to be uniquely identifiable by a consumer as a means to recover from a failed clock.


If a provider does not want to set its clock to Fail, it may produce a new MDIB sequence at any time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure I completely follow this one and its interaction with other points. I think it depends on whether we put mustUnderstand="true" on some extension in the clock descriptor, and whether we are trying to deal with participants outside SDPi. This is where I got to:

If we put mustUnderstand="true" on in the clock's descriptor for an versioning extension:

  • consumers have to understand whatever we come up with in SDPi otherwise they must not to process messages.
  • the 3rd option in 1522 provides more nuance on which timestamps are reliable and which are not:
    • sdpi:Epochs/@Version changing signals a abrupt time adjustment,
    • pm:ClockState/@LastSet signals when that occurred (in the current time-reference frame),
    • no timestamps before @LastSet can be relied on even if the consumer understands epoch versioning and the timestamps are versioned because:
      • the reason for the abrupt timestamp can't be determined by the participants, and
      • the timing for the abrupt time adjustment can't be determined by any participant to a resolution any higher than the sync timing.
  • I don't think there's really any downside to setting pm:ClockState/@ActivationState != "On" but in the above context it isn't necessary.
  • We could reserve Fail for clock failures (e.g., hardware fault) and use Stndby for synchronization failures; maybe we don't need to separate these things though.
  • I think the strict part is in 1522 (providers must chose one of the options); 1569 is more about consumers understanding of what the fault state means; I assume providers can go into a fault state whenever they like.
  • Perhaps a pm:AbstractDescriptor/@SafetyClassification="MedA" (display only) might have a relaxed attitude to abrupt time jumps happy to just convey the information while a MedC (clinical function) provider might fault the whole device.
  • I better make 1569 refer to the provider as consumers don't have a activation state to set.
  • Is world time always a vital part of system function contribution? Perhaps a heart-lung machine would be more interested in continuing oxygenation than time-stamping exactly when a time-discontinuity occurred?

If mustUnderstand isn't true, I don't think there is anything about how a consumer should interpret ClockState/@ActivationState != "On". It might surprise them but SDPi conformance testing could validate they do something sensible. Although consumers that are on-board with SDPi should know what do with mustUnderstand=true. Are we worried about non SDPi compliant consumers? This just makes me think we should have mustUnderstand=true.

In conclusion, I'm not sure what to do next here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think there is a lot of confusion going on - let my try to settle things:

  • The requirement I listed above is needed to determine the time frame in which the clock is not trustworthy - it is not sufficient to formulate something along the lines of R1522, because it does only determine how to act at the time when there is a non-slewing adjustment. However, there was no restriction on the duration of keeping the activation state in StdBy
  • Eventually, things need to be integratable in a way that consumers who do not support epoch versions can seemlessly rely on the StdBy to express incoherent timestamps - epoch versions is an extra feature consumers may use to adjust otherwise broken timestamps.

Now that I am getting depper into the topic, I think we can make the extension mustUnderstand=false, because we signify "something is being wrong" by using the StdBy flag and add epoch information to allow for consumers to calculate correct times based on epoch versions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This time business is much more complicated than it looks! I intended the latest R1522 to do all the things you mentioned:


When the <<vol1_spec_sdpi_p_actor_somds_provider>> detects an abrupt time adjustment of a system clock, used in making its System Function Contribution (<<acronym_sfc>>), the <<vol1_spec_sdpi_p_actor_somds_provider>> shall either:

  1. initiate a new MDIB sequence by assigning a new MDIB sequence identifier, or
  2. set pm:ClockState/@ActivationState to StndBy when any timestamp in a <<acronym_mdib>> version was not obtained from the time-reference frame of the active clock in the same version, or
  3. set pm:ClockState/@LastSet to the earliest time that is unambiguously in the current epoch and increment sdpi:Epochs/@Version.

To break it down:

  • (1) is the sledgehammer, providing no information about the timeframe for invalid clock but may be a useful option for some providers.
  • (2) signals to the consumer, using ClockState/@ActivationState, that there are one or more timestamps from a different time-reference frame in the mdib. They could be anywhere (metrics, components or extensions etc) and there is no way to tell which timestamps are trustworthy and which are not.
  • (3) does two things:
    • it provides equivalent information to consumers as (2), though it does it using sdpi:Epochs/@Version changing and pm:ClockState/@LastSet. That is a consumer can compare every timestamp (whether metric, component or extension) to the value of pm:ClockState/@LastSet; any that are strictly less are untrustworthy. That's the case at the time of the adjustment or six months later.
    • it lets consumers figure out which timestamps are untrustworthy (by comparing with pm:ClockState/@LastSet).

Note

the provider may generate timestamps from the current time-reference frame that are less than pm:ClockState/@LastSet. This can happen when a non-slewing adjustment moves the clock backwards in time and the MDIB previously contained timestamps from the "future".

The downside of (3) not setting the @ActivationState is that the consumer has to:

  1. understand sdpi:Epochs/@Version, and
  2. check every timestamp for validity.

One approach I can see providers taking with option 3 is leaving timestamps to update of their own accord. Notably, contexts may have untrustworthy timestamps until they are validated again, perhaps in a week or two when the provider is used on a new patient. It seems consumers might have a hard time figuring out what to do if @ActivationState="StndBy" for a week or two. However, if only setting @ActivationState="StndBy" is one of the approaches we want to have here, then I think it would make sense for the versioned one to also do that.

So, I'll update the text to:


When the <<vol1_spec_sdpi_p_actor_somds_provider>> detects an abrupt time adjustment of a system clock, used in making its System Function Contribution (<<acronym_sfc>>), the <<vol1_spec_sdpi_p_actor_somds_provider>> shall either:

  1. initiate a new MDIB sequence by assigning a new MDIB sequence identifier, or
  2. set pm:ClockState/@ActivationState to StndBy when any timestamp in a <<acronym_mdib>> version was not obtained from the time-reference frame of the active clock in the same version, or
  3. set pm:ClockState/@LastSet to the earliest time that is unambiguously in the current epoch and increment sdpi:Epochs/@Version and set pm:ClockState/@ActivationState to StndBy while any timestamp in a <<acronym_mdib>> version is less than pm:ClockState/@LastSet.

I worded it slightly differently for greater clarity with the relationship to pm:ClockState/@LastSet but the behaviour of pm:ClockState/@ActivationState is identical for 2 & 3.

@@ -0,0 +1,86 @@
[#vol3_clause_timestamp_versioning]
====== Timestamp versioning
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some sort of a formula to provide guidance as to how a consumer may calculate actual timestamps based on the epoch version and a base/reference timestamp.

So far I did not read the whole text, so the following questions came up:

  • Is the offset based on the previous epoch or based on a reference timestamp?
  • How does this design take into consideration that there may be timestamps affected by non-slewing adjustments in extensions?
  • Do we need the epoch version extension to be mustUnderstand? If yes, we need to show that somewhere in the descriptor as otherwise consumers may bump into a mustUnderstand extension after inspection of the description (perhaps something attached to the decriptor to signify support of epoch versions)

Copy link
Collaborator Author

@PaulMartinsen PaulMartinsen Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I don't think it is possible for a consumer to calculate actual timestamps following an abrupt change in time. Here's an attempt to show why:

Let's say T0 is the current time obtained from a stratum 0 clock, the most accurate timekeeper on the planet, an atomic clock for example. And TP is the time from a provider's clock. Then: $$T_P = T_0 ( 1 + f(T_0)) + \Delta + \eta$$.

Here:

  • $$\Delta$$ is the average difference in time between the reference and provider's clocks,
  • $$\eta$$ is a random number, probably from a normal distribution, that reflects the uncertainty in transferring time from reference clocks through to the provider's clock. I think this would be quite small under the right (or even sensibly typical) conditions; maybe a few milliseconds or less.
  • $$f(T_0)$$ is a scaling function that represents the behaviour of the providers time service client tyring to keep the times in sync. Since non-slewing adjustments (loosely) occur when the difference between reference and provider clocks exceed a bit over four minutes, one possibility for $$f(T_0)$$ might be:
    $$f(T_0) = 5 minutes * T_0$$, although if it was then the provider's clock discipline algorithm should compensate for that (with a slewing adjustment)
    another might be:
    $$f(T_0) = \delta(T_0 - t_{adjust})*2$$, which could be a 2 hour forward adjustment of the reference clock at some point $$t_{adjust}$$; the provider doesn't know exactly when that happens because it is probably checking in with the reference clock every 10 minutes or 2 hours or something.

$$f(T_0)$$ might also be much more complicated. I'd actually expect something more like one of these examples because the clock-discipline algorithm seems to be a closed loop controller so it probably oscillates towards an equilibrium, or maybe (probably) its well tuned and just decays to a small error. Left graph illustrates time from provider plotted against time from precision source; right graph is the difference between them:
image

Note, these don't illustrate the non-slewing adjustment; this is just the slewing behaviour. And they should be considered a novice guess. I can't recall enough of the algorithm to know how it behaves following a non-slewing adjustment; possibly a step change followed by damped oscillations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we can do is be very clear about how the sdpi:Epoch adjustments must be implemented. This is covered inthe schema and 3:8.3.2.10.8.1 Model, but I've added a diagram too since it is critical every implementation does it the same way.

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think there should be an extension in the clock descriptor with a must-understand attribute. I think this would prevent consumers that aren't fully up to date with SDPi requirements being surprised by epoch versions as if they don't understand they should cease any further communication with the provider.

I'll add something in the next couple of days to the schema as a starting point. Do you think this would be a place to declare how the provider handles non-slewing adjustments (e.g., new sequence id vs setting activation state), or is this just a guard against surprises?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a sdpi:EpochSupport element to advise support for the extension inside a pm:ClockDescriptor and updated requirements. It includes an optional MustUnderstand and a Version that defaults to 1. Not sure if this is the right approach, but a start.

After more consideration, I don't think it makes sense provide more detail on how non-slewing adjustments would be handled both (1) to reduce complexity and (2) a provider might choose different options in different scenarios. For example, it may switch from timestamp versions to starting a new sequence id if it decides to abandon a time server following a lot of non-slewing adjustments.

asciidoc/volume1/use-cases/tf1-ch-c-use-case-stad.adoc Outdated Show resolved Hide resolved
asciidoc/volume1/use-cases/tf1-ch-c-use-case-stad.adoc Outdated Show resolved Hide resolved
asciidoc/volume1/use-cases/tf1-ch-c-use-case-stad.adoc Outdated Show resolved Hide resolved
@d-gregorczyk
Copy link
Collaborator

General remark: shouldn't we deliver options 1 and 2 and only then continue crafting option 3? I fear it may take some time until we are able to merge option 3 to master.

* Fixed actor in R1560
* Added option to make the activation state of the clock standby while there were timestamps from a different time-reference frame present in the mdib.
* Updated R1521 to express intent and build on established methods.
* added a glossary, temporarily in the use case section
@PaulMartinsen
Copy link
Collaborator Author

General remark: shouldn't we deliver options 1 and 2 and only then continue crafting option 3? I fear it may take some time until we are able to merge option 3 to master.

I don't know. My thoughts FWIW:

  • It seems to me the most useful approach involves an extension because it isn't enough to just set ClockState/@LastSet when the provider isn't using NTP. I'll see if I can think of a way to avoid that.
  • Using only activation state doesn't seem as good when there are long lived things like location and patient contexts. However, since non-slewing adjustments should never happen in practice, maybe it isn't so bad; still making progress?
  • If we need part of an extension we might as well go the whole way.
  • I think we are close to something that, while not perfect, is a big step forward.
  • Perfection is the barrier to progress and change isn't impossible.
  • I'd probably lean towards including something that can be implemented and tested (particularly if there will be a must-understand element) both because the problem isn't going away and folk want to deploy implementations. Maybe we could start with the must-understand as false to avoid breaking changes as a first step?
  • Getting everything finalized by the 14th deadline looks challenging.

But I'll leave the decision to wiser heads and see if I can place some asciidoc ifdefs to give us useful options over the next couple of days.

@ToddCooper
Copy link
Collaborator

SDPi Friday 2024.12.13 -

Still too many open issues. PR pushed to 2.1

.R0601
[sdpi_requirement#r0601,sdpi_req_level=shall]
****
A <<vol1_spec_sdpi_p_actor_somds_provider>> shall reset all versioned timestamps when it assigns a new MDIB sequence identifier (`pm:MdibVersionGroup/@SequenceId`).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Reset" is a problematic term as there is no definiton of "reset" in XML instances. There needs to be explicit instruction to what has to be deleted, inserted, or updated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I didn't know how to describe so I followed R0047 in 10207.

How about:

A <<vol1_spec_sdpi_p_actor_somds_provider>> shall set the version of all versioned timestamps to 0 when it assigns a new MDIB sequence identifier (pm:MdibVersionGroup/@SequenceId).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Spec An proposal related to the content or organization of the specification Volume 1 Volume 1 contents Volume 3 Volume 3 contents
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Non slewing time adjustments
4 participants