# 20.10.2 Platform Update

This release replaces the Cloud 20.10.1 (Chicago) package and delivers changes only in the Automation code, resolving 3 critical deployment issues. The issues were caused by unexpected changes in the underlying technology, which made the previous release undeployable. No client-side functionality has been introduced in this update, as it corrects internal deployment issues only.

# 20.10.1 (Chicago) Release Notes for Cloud

# Resolved Issues

# [76861] - Querying for multiple objectIds in a filter and condition result in the same filter

DA was not able to split large conditions from AuthZ because of limitations of the split method in QueryParsingLibrary. The following changes have improved the query filter:

  1. Change nested "OR" conditions on the same variable into an array to shorten the filter.
  2. Split array conditions when splitting by OR in all cases. The maximum number of elements in an array is currently hardcoded to 5 in DA; if no maximum is passed to the method, it defaults to 10.
  3. Split array conditions by AND for simple conditions.
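
As an illustration of item 2, splitting an array condition amounts to cutting its value list into chunks no longer than the maximum element count. The sketch below is not the QueryParsingLibrary implementation: `split_array_condition` and its parameter names are hypothetical, and only the two limits (5 as passed by DA, 10 as the fallback) come from these notes.

```python
from typing import List, Optional, Sequence

DEFAULT_MAX_ELEMENTS = 10  # fallback when no maximum is passed (per these notes)
DA_MAX_ELEMENTS = 5        # value currently hardcoded in DA (per these notes)


def split_array_condition(values: Sequence[str],
                          max_elements: Optional[int] = None) -> List[List[str]]:
    """Cut one array condition into chunks of at most max_elements values.

    Hypothetical helper for illustration only.
    """
    limit = DEFAULT_MAX_ELEMENTS if max_elements is None else max_elements
    if limit < 1:
        raise ValueError("max_elements must be at least 1")
    return [list(values[i:i + limit]) for i in range(0, len(values), limit)]


object_ids = [f"id-{n}" for n in range(12)]
print([len(c) for c in split_array_condition(object_ids, DA_MAX_ELEMENTS)])  # [5, 5, 2]
print([len(c) for c in split_array_condition(object_ids)])                   # [10, 2]
```

A smaller maximum yields shorter individual filters but more of them, which matters for the per-topic rule limit described next.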

This query optimization supports larger filters, but we still recommend keeping filters as small as possible, because there is a limit of 2000 rules per SB topic. The current SB topics are:

  • timeseries
  • alarm
  • event
  • platformevent
  • devicefeedback
  • c2d-messages-to-send

Example: if a user creates 500 subscriptions for alarms, each with a filter that is split into four SB rules, then we hit the SB limit (500 × 4 = 2000 rules) and errors occur.
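
The arithmetic behind this example can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, and the only figure taken from these notes is the limit of 2000 rules per SB topic.

```python
SB_RULE_LIMIT_PER_TOPIC = 2000  # limit stated in these notes


def rules_needed(subscriptions: int, rules_per_subscription: int) -> int:
    """Total SB rules consumed on one topic (hypothetical helper)."""
    return subscriptions * rules_per_subscription


def remaining_rules(subscriptions: int, rules_per_subscription: int) -> int:
    """Rules left on the topic; zero or less means the limit is reached."""
    return SB_RULE_LIMIT_PER_TOPIC - rules_needed(subscriptions, rules_per_subscription)


print(rules_needed(500, 4))     # 2000
print(remaining_rules(500, 4))  # 0
```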

# [77980] - AuthZ had a spike of 700 "500" errors and slow response times

The slow response times and 500 errors were caused by too many calls to read certificates from the Key Vault. The issue was fixed by upgrading the third-party TLS library to version Stable12.

# [78079] - Last value falls behind at 16,000/s and 6000/s

Fixed performance issue with Last Known value, where latency was above 5 seconds.

# [78080] - DCS tracing is enabled by default

A global discussion about the diagnostic log levels was initiated for the Ability Platform. These logs are used by developers and testers when troubleshooting the platform (not to be mistaken for Audit Logging). This fix adjusts the log level of the Ability Platform Device Configuration Service (DCS) component to a new default of "Warning". More detailed logging for further troubleshooting should be configured at the instance level, case by case.

# [78921] - Warm data api failure while accessing Variables

The problem with accessing data from Tsiv1 has been fixed. It was caused by worker thread exhaustion in Data Access.

# [79325] - EDGE node deviceDelete time required has been improved

The Device API responsiveness (C2D direction) is determined by a few factors, such as the Azure IoT Hub pricing tier, the number of IoT Hub units, and whether the device is connected to the platform. A C2D message can be sent to the device immediately or, when the device is busy, its delivery is rescheduled 30 seconds into the future. This patch makes the behavior more stable by removing the retry mechanism used when sending a C2D message to the device. Previously, the operation was retried for a disconnected device; now the C2D message is rescheduled immediately. This behavior is subject to further analysis and optimization in the future.

# [79472] - Data Processing Pipeline default configuration quality additions by default is set to off

Data Processing Pipeline is formed of the following steps:

  • decompression
  • authorization
  • validation
  • data quality decoration

Until now, the default configuration was set up so that DPP used the authorization, validation and data quality decoration steps.

Data quality decoration adds a substantial amount of data to telemetry messages, which caused a considerable increase in size of data being sent to TSI.

Default configuration has been updated and now uses authorization and validation only.

The data quality decoration and decompression steps are still part of the delivery and can be turned on for clients on demand (on a per-Platform-instance basis).

# [79646] - Data Access 500/504 errors returned

Fixed a problem with querying TSIv1 with more than the allowed number of concurrent requests, which could lead to 500/504 errors from Data Access.

# [80415] - DCS - timeout_exceeded returned despite of value of timeout property

When using the DeviceAPI "timeout" header, under certain conditions (for long-running d2c actions) the business logic behaved incorrectly, resulting in false-positive results in acknowledgements. This fix corrects the handling of the "timeout" header.

# Known Issues and Limitations

# [84304] - Max limit of properties in TSI

Max limit of properties (columns) in TSI v1 is equal to 600 on S1 and 800 on S2 SKU.

# [81401] - Ingress limit limitation

On TSI v1 S1, the ingress limit is 720 events per minute. On TSI v1 S2, the ingress limit is 7200 events per minute.

# [81547] - Downloading files requires a filter change adding 'dt' to the filter string

An optimization for filter parsing was added to all Data Access components. It improved system stability and reliability, and, due to a library update, introduced the following change in the API filter format:

Before this change, Data Access accepted filters in the format "timestamp > '2019-01-01T00:00:00Z'". After this change, all filters must prefix the date-time literal with dt: "timestamp > dt'2019-01-01T00:00:00Z'".

Affected endpoints:

  • POST request for variables data: /api/v1/data/variables
  • POST request for events data: /api/v1/data/events
  • POST request for alarms data: /api/v1/data/alarms
  • object storage search for files: /api/v1/storage/object/files/search
  • global storage search for files: /api/v1/storage/global/files/search
  • create subscription for variables: /api/v1/subscriptions/variables
  • create subscription for events: /api/v1/subscriptions/events
  • create subscription for alarms: /api/v1/subscriptions/alarms
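
Client applications that still build filters in the old format could adapt them before calling the affected endpoints. The sketch below is one possible approach, not part of the platform: `add_dt_prefix` is a hypothetical helper, and it assumes that date-time literals are single-quoted ISO 8601 UTC timestamps like the one in the example above.

```python
import re

# Single-quoted ISO 8601 UTC timestamp that is not already prefixed with "dt".
# Assumption: timestamps in filters look like '2019-01-01T00:00:00Z',
# optionally with fractional seconds.
_TIMESTAMP_LITERAL = re.compile(
    r"(?<!dt)'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z)'"
)


def add_dt_prefix(filter_text: str) -> str:
    """Rewrite '<timestamp>' literals as dt'<timestamp>' (hypothetical helper)."""
    return _TIMESTAMP_LITERAL.sub(r"dt'\1'", filter_text)


old_filter = "timestamp > '2019-01-01T00:00:00Z'"
print(add_dt_prefix(old_filter))  # timestamp > dt'2019-01-01T00:00:00Z'
```

Literals that already carry the dt prefix are left untouched, so the helper can be applied more than once without changing the result.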

# [73963] - Latency issue causes newly created app in principal manager to be created without secrets

Known Issue: Occasionally, a newly created application in the principal manager service is created without secrets, which makes the app unusable because a bearer token cannot be obtained.

Workaround: Try creating the application again after about 60 seconds.

# [77948] - Data Process Pipeline may store additional duplicate values

Known Issue: The data processing pipeline may generate additional values for hot path, warm path, and cold path storage. The values can be recognized by an identical timestamp. This behavior may have existed in prior versions, but is more likely to be seen in the current version because of the enhancements in the processing pipeline (new validation and data quality decoration functionality, revamped hosting infrastructure).

Workaround: User applications must apply a filter before processing the data in the application. There is no workaround when aggregates are processed. For example, if the aggregates count, sum or avg are used, the duplicate values will be included in the results.
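
One way an application might filter out such duplicates before processing is to keep only the first record seen for each timestamp. This is a minimal sketch under stated assumptions: the field names `objectId`, `variable` and `timestamp` are hypothetical and must be adapted to the actual record shape the application receives.

```python
from typing import Dict, Iterable, Iterator


def drop_duplicates(records: Iterable[Dict[str, object]]) -> Iterator[Dict[str, object]]:
    """Yield each record once per (objectId, variable, timestamp) key.

    Field names are assumptions for illustration, not a documented schema.
    """
    seen = set()
    for record in records:
        key = (record.get("objectId"), record.get("variable"), record.get("timestamp"))
        if key in seen:
            continue  # identical timestamp for the same variable: skip duplicate
        seen.add(key)
        yield record


rows = [
    {"objectId": "a", "variable": "temp", "timestamp": "2021-01-01T00:00:00Z", "value": 1},
    {"objectId": "a", "variable": "temp", "timestamp": "2021-01-01T00:00:00Z", "value": 1},
    {"objectId": "a", "variable": "temp", "timestamp": "2021-01-01T00:00:01Z", "value": 2},
]
print(len(list(drop_duplicates(rows))))  # 2
```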

# [58452] - Device Registration Function (DPS) fails to respond to device requests when 15 Edges are starting simultaneously

Known Issue: Any attempt to register more than 8-10 devices in parallel will result in device registration failures.

Workaround: Limit the parallelism to a maximum of 10 devices to overcome this issue.
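
A provisioning script could enforce this cap with a bounded worker pool. The sketch below is an assumption about how a client might do so: `register_all` is a hypothetical helper, and `register_device` stands for whatever callable the client uses to register a single device.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

MAX_PARALLEL_REGISTRATIONS = 10  # cap taken from the workaround above


def register_all(device_ids: Iterable[str],
                 register_device: Callable[[str], object]) -> List[object]:
    """Register devices with at most MAX_PARALLEL_REGISTRATIONS running at once.

    register_device is supplied by the caller; it is not a platform API.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_REGISTRATIONS) as pool:
        # pool.map preserves input order and never runs more than
        # max_workers calls concurrently.
        return list(pool.map(register_device, device_ids))


# Example with a stand-in registration function.
print(register_all(["edge-1", "edge-2"], lambda device_id: f"registered {device_id}"))
```

Because the known issue is reported from 8 devices upward, a lower cap may be safer in practice.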

# [77522] - AuditLog events Count Mismatch for Device Created, Updated and Deleted operations

The body of platform events is stored in Audit Logging storage. However, in some cases this body contains a JSON object which exceeds Azure Table Storage column limitations.

In this case, when a platform event body is longer than 16k characters, Audit Logging saves the following warning information into the "data" column: {"auditLogInformation": "Event body too long"}.

The original body of the event is not saved, however the user can still navigate to actual changes by using the event correlationId.
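
A client reading audit log entries can detect this placeholder and fall back to the correlationId. The following is a minimal sketch: `body_was_dropped` is a hypothetical helper, and it assumes the "data" column is delivered to the client as a JSON string.

```python
import json

# Placeholder written by Audit Logging when the event body is too long
# (taken verbatim from these notes).
TRUNCATION_MARKER = {"auditLogInformation": "Event body too long"}


def body_was_dropped(data_column: str) -> bool:
    """Return True when the "data" column holds only the placeholder."""
    try:
        return json.loads(data_column) == TRUNCATION_MARKER
    except (TypeError, ValueError):
        # Not a JSON string: treat it as an ordinary (non-placeholder) value.
        return False


print(body_was_dropped('{"auditLogInformation": "Event body too long"}'))  # True
print(body_was_dropped('{"objectId": "abc"}'))                             # False
```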

Limitations source:

# [69975] - Few devices are missing while searching devices for tenants

Known Issue: For performance reasons, the Principal Manager APIs currently return a maximum of 1000 results. Because of this limitation, when the number of objects under a resource exceeds 1000, the requesting client will not see all the objects in the database.

Workaround: The only option currently available is to apply filters and parameters in the request to narrow the number of returned results to fewer than 1000 items.

# [73453] - A Solution is currently not able to register more than 40K devices

There is no workaround for this except to create a duplicate solution to add devices beyond the 40K limit.

The limitation is caused by the maximum document size of 2MB in CosmosDb.

# [73073] - Bad Request on variable subscription for long request not including objectIDs

Known Issue: When creating a subscription with a user token, filters that do not contain an objectId may result in an error (the HTTP response code may be 4xx) that prevents the subscription from being created. This is due to a service bus filter limitation of 1024 characters. The limit is usually hit when a large ability-condition header cannot be broken down into small enough filters for service bus.

Workaround: Include an objectId in the filter property of the Data Access request. Up to 40 objectIds can be included in the filter.

Background tokens do not have tenancy therefore the authorization service does not optimize the ability-condition based on the given objectId in the filter.

# [76007] - DSL query escape sequence handling for backward slash ("\") in property value filter is not consistent

Known Issue: When backslashes ("\") are used in object model properties and the properties are then queried using DSL, the user cannot match them with a single escape character ("\\"), although that is the expected behavior.

Workaround: Use double escaping in the DSL query ("\\").

For example, having property:

{
    "browseName": {
        "value": "some\\path"
    }
}

one needs to use the DSL:

models(...).hasProperty("browseName", "some\\\\\\\\path")

# [75339] - Sorting functionality which has been implemented as part of Pagination & Searching feature in Principal Manager APIs, is case sensitive

Sorting functionality which has been implemented as part of Pagination & Searching feature in Principal Manager APIs, is case sensitive.

For example, when sorting the set of tenants {ABB01, Robotics01, abb02, Volvo01, robotics02, volvo02} in ascending order, the returned result is {ABB01, Robotics01, Volvo01, abb02, robotics02, volvo02}.
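
The same ordering can be reproduced with Python's default string sort, which compares code points and therefore places all uppercase letters before lowercase ones. The sketch below also shows a client-side re-sort that ignores case; whether the Principal Manager APIs use exactly this comparison is an assumption, the example only mirrors the result listed above.

```python
tenants = ["ABB01", "Robotics01", "abb02", "Volvo01", "robotics02", "volvo02"]

# Case-sensitive (code point) ordering, matching the result described above.
case_sensitive = sorted(tenants)

# Client-side alternative: ignore case when ordering the returned names.
case_insensitive = sorted(tenants, key=str.casefold)

print(case_sensitive)
# ['ABB01', 'Robotics01', 'Volvo01', 'abb02', 'robotics02', 'volvo02']
print(case_insensitive)
# ['ABB01', 'abb02', 'Robotics01', 'robotics02', 'Volvo01', 'volvo02']
```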

# [75168] - Instance API return - error code 502 Bad Gateway

Known Issue: Occasionally, a 502 Bad Gateway error can be returned when calling any API endpoint on the Instance stamp.

Workaround: Contact the Ability Operations team, indicate that a 502 Bad Gateway error has been received, reference this release note, and indicate that AuthZ and the Principal Manager should be restarted.

# [62908] - Principal Manager API fails to remove tenants - BadGateway

The problem can occur based on concurrent requests to the principal manager API. The Principal Manager APIs are using Azure B2C services to create Applications for business entities, e.g. Application, Solution, etc. The workflow in the PM is sequential and dependent on the result of the B2C operation. After a successful result from the B2C operation, the request is further processed to provide the respective response to the caller. For any B2C related request, some buffer time needs to be provided so that the action can be completed.

It is advised to maintain a gap of 60 seconds between two requests.

# [59246] - Requests for bearer token for new apps lead to Bad Request

Known Issue: When concurrent requests to get a bearer token are sent, a client can receive a Bad Request response.

Workaround: The recommendation from Microsoft is that we should wait for a few seconds before trying to get the token for the application that has been created. According to Microsoft, it takes a maximum of 60 seconds to replicate the Application Settings across Azure regions.
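
A client can wrap the token request in a simple wait-and-retry loop. The sketch below is one possible shape, not a platform API: `retry_with_wait` is a hypothetical helper, and `operation` stands for the caller's own token request, assumed to return None when the request fails. With the default values, the three waits of 20 seconds add up to the 60-second replication window mentioned above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_wait(operation: Callable[[], Optional[T]],
                    attempts: int = 4,
                    wait_seconds: float = 20.0,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call operation until it returns a non-None result or attempts run out.

    operation is supplied by the caller (for example, a token request that
    returns None on a Bad Request response); it is an assumption of this sketch.
    """
    for attempt in range(attempts):
        result = operation()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(wait_seconds)  # give the application settings time to replicate
    return None
```

The `sleep` parameter exists only so that the wait can be replaced in tests; production callers can leave it at its default.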

# [55864] - Solution create audit log shows wrong event for audit log

The audit log entry for creating a solution is incorrectly shown as an update instead of a create.

# [77345] - When creating a solution or resource, principal manager service sporadically returns a 400 Bad Gateway response code

Known Issue: When creating a solution (or possibly another resource), Microsoft Graph API sporadically returns a 400 Bad Gateway response code with the message, "One or more of your reply urls is not valid". As a result, the Solution is not created.

Workaround: The end user will need to resubmit the request.

# [68309] - Unable to search for a file using user token after upload

Known Issue: When searching for files uploaded via Edge, requests using a user token fail when the number of objects exceeds 500.

Workaround: When querying, the objectId, along with the path, can be passed in QEL format to overcome this limitation.

# [74065] - User cannot access applications when his "read" permission is limited to "user" delegation

Known Issue: When querying for applications using the "Query apps" endpoint, passing 'user' instead of 'User' for the delegation parameter returns empty results. Similarly, when using the "Get apps" endpoint, passing 'User' instead of 'user' for the delegation parameter returns empty results.

Workaround: When querying for applications limited to user delegation using the "Query apps" or "Get apps" endpoint, pass (delegation='user' OR delegation='User') for the delegation parameter to get the expected results.

Last updated: 9/6/2021, 1:25:50 PM