# Infinite Processing of Device API Messages

# Overview

The Ability Platform now supports a retry strategy for transient HTTP request failures that allows an unlimited number of retries, together with transient error handling for all Azure-dependent components. As a result, no transient_http_error response code is generated when an acknowledgement is requested.

Device API messages now allow specifying a timeout for request / action processing.

# Timeout Default Setting

For all DeviceAPIv1 and DeviceAPIv2 messages, a new parameter called timeout is introduced. Note that:

  • this value is optional

  • this value is of integer type (0 denotes an infinite timeout)

  • the following default values apply:

    • for DeviceAPI v1 actions - set to 86400 seconds (to indicate 24 hours for processing)

    • for DeviceAPI v2 actions - set to 300 seconds (to indicate 5 minutes for processing)
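The defaults above can be sketched as a small lookup. This is a minimal illustration only: the function name, the constant, and the `"v1"` / `"v2"` keys are hypothetical, and rejecting negative values is an assumption, since the text does not say how they are handled.

```python
import math
from typing import Optional

# Default values taken from the list above; the names are hypothetical.
DEFAULT_TIMEOUT_SECONDS = {"v1": 86400, "v2": 300}


def effective_timeout(api_version: str, timeout: Optional[int] = None) -> float:
    """Return the processing budget in seconds; math.inf means no limit."""
    if timeout is None:
        # The parameter is optional, so fall back to the per-version default.
        return float(DEFAULT_TIMEOUT_SECONDS[api_version])
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout < 0:
        # Assumption: negative or non-integer values are rejected.
        raise ValueError("timeout must be a non-negative integer")
    # 0 denotes an infinite timeout.
    return math.inf if timeout == 0 else float(timeout)
```

For instance, `effective_timeout("v2")` yields the 5-minute default, while `effective_timeout("v2", 0)` yields an unlimited budget.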

The timeout parameter specifies the maximum number of seconds available for processing the event before the timeout occurs, where:

  • the start of processing is counted when the DCS-PlatformEventProcessor receives the message from the input D2C queue and internally starts processing the message

  • the end of processing is counted as the internal end of processing of the D2C message by the DCS-PlatformEventProcessor

NOTE

C2D message processing is not included in the timeout time range, i.e., the confirmation of the C2D message sent and the confirmation of delivery are not part of the processing which can "time out".

Once the timeout occurs, the processing of the D2C message should be terminated. If requested, an acknowledgement with the code timeout_exceeded should be delivered to the device.

NOTE

The timeout affects DCS processing only. If a message is held up in another component, e.g., IoT Hub, the time spent there is not subtracted from the original timeout value.
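The timeout behaviour described in this section can be sketched as a retry loop with a deadline. This is not the DCS-PlatformEventProcessor implementation: the function, its parameters, and the `"success"` acknowledgement code are hypothetical placeholders, and only `timeout_exceeded` comes from the text. The clock starts when processing starts, so time spent earlier in other components is not counted.

```python
import math
from typing import Callable, Optional


def process_with_timeout(
    attempt: Callable[[], bool],
    timeout_seconds: float,
    clock: Callable[[], float],
    ack_requested: bool = True,
) -> Optional[str]:
    """Retry `attempt` until it succeeds or the processing budget is spent.

    Returns an acknowledgement code, or None if no acknowledgement was requested.
    """
    started = clock()  # counting begins when the processor picks up the message
    while True:
        if attempt():
            # "success" is a placeholder code, not taken from the documentation.
            return "success" if ack_requested else None
        # With timeout_seconds == math.inf this check never fires,
        # so transient failures are retried indefinitely.
        if clock() - started >= timeout_seconds:
            return "timeout_exceeded" if ack_requested else None
```

Passing the clock as an argument keeps the sketch testable; in real use it could be `time.monotonic`.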

# Example Use Case 1

Note that only an infinite timeout allows the user to preserve the processing order of D2C messages.

For example, assume that one device is sending several consecutive messages:

  • If the messages have an infinite timeout, they will be processed in FIFO order.

  • If, however, the messages have a finite timeout, a timeout of any one message means that the messages following it are processed ahead of it, i.e., out of sequence. If the timed-out message is resent by the user, it will be processed last.
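The ordering effect can be illustrated with a toy simulation of a single queue. Everything here is an assumption made for illustration: the tuple layout, the function name, and the simplification that a resent message succeeds within one second because the transient fault has cleared by then.

```python
import math
from collections import deque


def completion_order(messages, resend_timed_out=True):
    """Return message names in the order their processing completes.

    messages: list of (name, seconds_needed, timeout_seconds),
    where a timeout of 0 denotes an infinite timeout.
    """
    queue = deque(messages)
    done = []
    while queue:
        name, needed, timeout = queue.popleft()
        budget = math.inf if timeout == 0 else timeout
        if needed <= budget:
            done.append(name)
        elif resend_timed_out:
            # Hypothetical: the user resends the message, which joins the back
            # of the queue and now needs only 1 second to process.
            queue.append((name, 1, timeout))
    return done


# Infinite timeout: FIFO order is preserved.
print(completion_order([("m1", 500, 0), ("m2", 1, 0), ("m3", 1, 0)]))
# Finite timeout: m1 times out, is resent, and completes last.
print(completion_order([("m1", 500, 300), ("m2", 1, 300), ("m3", 1, 300)]))
```

The first call prints `['m1', 'm2', 'm3']` and the second prints `['m2', 'm3', 'm1']`.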

# Example Use Case 2

Note that this feature can cause bottlenecks.

For example, assume two devices are assigned to one EventHub partition:

  • one device with infinite timeout on the message

  • the second device with default timeout on the message

This timeout asymmetry can break the Ability Platform / DCS processing capabilities: messages are processed in FIFO order, so the first message, with its infinite timeout, can hold up the second device's message behind it.

If we assume this to be an undesired outcome, we must accept that the Ability Platform Operations team would have to intervene.

In this use case, if we reach a state where the first message is processed for a very long time, this means that there are transient errors happening consistently in the platform.

As such, if processing were finite for this first message, the message would time out; however, transient errors would still occur for the second and all subsequent messages.

This contingency would require intervention from the Operations team regardless of the timeout values, because the root cause is a fault in the platform.
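The reasoning in this use case can be checked with a rough model of one partition during a platform-wide outage. The model is an assumption, not a description of DCS internals: messages are processed one after another, every attempt fails until the outage ends, and processing is instantaneous once it has ended. All names are hypothetical.

```python
import math


def partition_timeline(outage_seconds, timeouts):
    """Simulate sequential processing on one partition during an outage.

    timeouts: per-message timeout in seconds (0 denotes infinite).
    Returns a list of (start, end, outcome) tuples, one per message.
    """
    now = 0.0
    timeline = []
    for timeout in timeouts:
        budget = math.inf if timeout == 0 else float(timeout)
        remaining_outage = max(0.0, outage_seconds - now)
        start = now
        if remaining_outage <= budget:
            # Retries succeed as soon as the outage is over.
            now = start + remaining_outage
            outcome = "processed"
        else:
            # The budget runs out while the outage is still ongoing.
            now = start + budget
            outcome = "timeout_exceeded"
        timeline.append((start, now, outcome))
    return timeline


# First message infinite, second default (300 s): both are processed at t=1000.
print(partition_timeline(1000, [0, 300]))
# Both finite (300 s): both time out, so the finite timeout did not help.
print(partition_timeline(1000, [300, 300]))
```

Under these assumptions, giving the first message a finite timeout only moves the failure to the next message while the outage lasts, which matches the conclusion that the Operations team has to resolve the underlying fault either way.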

Resending the same messages to the platform could actually put more stress on the D2C queue.

Last updated: 9/6/2021, 1:25:50 PM