
Troubleshooting Logs

The Connectivity Platform offers live log monitoring capabilities, enabling real-time inspection of device and solution data and status, as well as error and warning messages.

Device Logs

The real-time device connectivity logs provide insight into the traffic incoming to the connectivity component and enable troubleshooting of device misbehavior.

Connectivity logs are JSON formatted with the following structure.

| Key | Description |
| --- | --- |
| connectionId | Identifier associated with the active connection. Used to build the tracking_id for internal events (see Solution Logs). |
| direction | INBOUND or OUTBOUND |
| domainId | Public sub-domain (empty if protocol is "api") |
| event | Event type (see the Event Types table) |
| id | Device identity |
| ipAddress | IP address of the client (empty if protocol is "api") |
| key | The path targeted by the event (e.g. topic, resource, or signal ID) |
| message | The description of the event |
| protocol | Protocol used: http, mqtt, or api (from scripting) |
| time | Time of the event in date-time format |
| timestamp | Time of the event in microseconds |
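For illustration, a single connectivity log entry could look like the following minimal sketch; all identifiers and values below are hypothetical:

```json
{
  "connectionId": "f0a1b2c3d4e5",
  "direction": "INBOUND",
  "domainId": "example-domain",
  "event": "data_in",
  "id": "MYDEVICE01",
  "ipAddress": "203.0.113.10",
  "key": "data_in",
  "message": "Data message received from the device",
  "protocol": "mqtt",
  "time": "2024-01-01T12:00:00Z",
  "timestamp": 1704110400000000
}
```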

Event Types

| Filter | Type | Description |
| --- | --- | --- |
| Data | data_in | Messages received from the device. |
| Device Status | deleted / expired / provisioned | Device state changes. |
| Connection Status | connect / disconnect | Connection or disconnection events. A disconnection also includes a disconnect_reason. |
| Error / Warning | debug | Unexpected events happening during connection, authentication, or message parsing. |
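As an example of a connection status event, a disconnection entry also carries the disconnect reason. The sketch below assumes the reason is surfaced as an additional key of the log entry; the field name, placement, and values are illustrative only:

```json
{
  "connectionId": "f0a1b2c3d4e5",
  "direction": "INBOUND",
  "event": "disconnect",
  "id": "MYDEVICE01",
  "protocol": "mqtt",
  "message": "Device disconnected",
  "disconnect_reason": "keepalive timeout",
  "time": "2024-01-01T12:05:00Z",
  "timestamp": 1704110700000000
}
```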

Error Logs

Connectivity error logs help you understand why a client fails to connect or to publish data to the platform. The most notable errors are detailed below:

| Message | Description | What to do |
| --- | --- | --- |
| MQTT permission denied | The device does not have enough permission to publish | Grant the permission on the device page in the UI |
| MQTT subscription denied | The device does not have enough permission to subscribe | Set the permission on the device page |
| [ScriptAsync] Dispatcher publish failed | The script trigger failed | No action needed; it keeps retrying in the background |
| Message persist_status: timestamp too far in the future | The timestamp is too far in the future | Do not publish future data; the timestamp is in microseconds |
| Message persist_status: timestamp too far in the past | The timestamp is too far in the past | Do not publish data older than 10 years; the timestamp is in microseconds |
| Error: values should be map | When publishing batch data, values must be a map | Use a map for values |
| Error: timestamp is not number | When publishing batch data, timestamp must be a valid number | Use a valid timestamp |
| Error: timestamp and values are required. | When publishing batch data, both keys must be provided | Follow the expected data format when publishing |
| Error: signal not exists | The signal was not created before publishing data | Publish to config_io to define the signal first |
| Error: resource not found | The resource was not created before publishing data | Create the resource in the UI first |
| Error: invalid resource value | The resource value does not match its definition | Match the resource definition: number, boolean, or string |
| Error: invalid data_in | data_in is not a valid map after JSON parsing | Publish a valid JSON map |
| Error: invalid data_in json parsing error | data_in is not a valid JSON string | Publish a valid JSON string |
| Error: invalid signal value | data_in is not in a valid format | data_in should contain key-value pairs, where the key is the signal name and the value is the signal value |
| Error: invalid config_io | config_io is not a valid JSON map | Publish a valid config_io |
| Error: invalid config_io json parsing error | config_io is not a valid JSON string | Publish a valid config_io |
| Message process: Signal definition should be a map. | The signal definition is not a valid map | Publish a map definition |
| Error: invalid config_io value | The signal definition is not valid | The value needs to be a JSON map |
| MQTT save_signal_error payload too large | The published payload is too large | Send a smaller payload (less than 1 MB) |
| MQTT rate limit error | The number of in-flight messages exceeds the maximum | Send at a slower pace |
| save data to db has an error | Saving data to the database failed | Try again later |
| MQTT bad_payload_format | The payload is not in a valid format | Publish a valid JSON map |
| MQTT invalid_format | The resource value format does not match the defined resource type | Publish the expected data type |
| MQTT invalid payload not valid json | The payload is not a valid JSON array | A batch publish should be a JSON array |
| MQTT invalid topic | The publish topic is invalid | Publish to a valid topic |
| MQTT publisher_failed | The event trigger failed | Try again later |
| MQTT Connection failed | The connection reached a rate limit | Try again later |
| unsupported_topic | The topic is not valid | Publish to supported topics |
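Several of the batch-publish errors above relate to the expected payload shape: a JSON array of entries, each providing a numeric timestamp in microseconds and a values map keyed by signal name. A minimal payload satisfying those checks could look like the following sketch (the signal names and values are illustrative):

```json
[
  { "timestamp": 1704110400000000, "values": { "temperature": 21.5, "humidity": 48 } },
  { "timestamp": 1704110460000000, "values": { "temperature": 21.7, "humidity": 47 } }
]
```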

Solution Logs

Note

Currently, solution logs are only available in the logging system (see Historical Logs). A real-time view in the portal will be added in a later version.

Solution logs are JSON formatted with the following structure.

| Key | Description |
| --- | --- |
| Solution_id | The solution namespace ID in scope of the log event |
| Severity | A numerical value matching the Syslog severity standard, from 0 (Emergency) to 7 (Debug) |
| Timestamp | Unix timestamp of the event in milliseconds |
| Type | Type of the event, defining the content of the Data field (see below) |
| Service | The IoT service ID |
| Service_type | A single character referencing the type of service |
| Event | The event ID |
| Tracking_id | ID of the operation causing the event log |
| Message | Human-readable textual information about the event |
| Data | Additional contextual information in JSON format |
| Code | A numerical status code following the HTTP standard, reflecting the execution result of the operation |
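As an illustration, a solution log entry following this structure could resemble the sketch below. The Service, Service_type, Event, Tracking_id, and Data values are hypothetical, and the exact key casing in emitted logs may differ:

```json
{
  "Solution_id": "k972nv8vb8hk0000",
  "Severity": 6,
  "Timestamp": 1704110400000,
  "Type": "event",
  "Service": "device2",
  "Service_type": "c",
  "Event": "data_in",
  "Tracking_id": "f0a1b2c3d4e5",
  "Message": "Incoming device message transmitted to scripting",
  "Data": { "device_id": "MYDEVICE01" },
  "Code": 200
}
```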

Solution Log types

Found in the "type" key of the log entries.

| Type | Description | Example |
| --- | --- | --- |
| config | Configuration events | Adding a service configuration to a solution. |
| event | Runtime event trigger | An incoming message from a device is transmitted to scripting for processing. |
| call | Service operation request from scripting | A solution script calls the connectivity service to list devices. |
| script | Logs emitted by the solution script using the "print" function | |

Configuration Logs

Every modification of a solution object is captured in a structured log to provide troubleshooting and audit capability on the system. Configuration logs follow the same structure as the other types of solution logs, in JSON format (see the Solution Logs structure table).

The 'Data' key carries a different definition depending on the log type:

  • Data: Additional contextual information

The log is then emitted on the system standard output interface. This structure enables the logging system to index solution events and to perform monitoring, statistics, and alerting based on user requirements. See the Configuration Specification document for additional information.
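For example, a configuration event such as adding a service configuration to a solution could produce an entry resembling the sketch below; the Event name and Data contents are hypothetical:

```json
{
  "Solution_id": "k972nv8vb8hk0000",
  "Severity": 6,
  "Timestamp": 1704110400000,
  "Type": "config",
  "Service": "device2",
  "Event": "service_configured",
  "Message": "Service configuration added to the solution",
  "Data": { "changes": { "example_parameter": true } },
  "Code": 200
}
```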


Routing Logs

To help users troubleshoot IoT solutions, logs are emitted to reflect the runtime executions in the system for both event handling and service calls. These logs follow a structure consistent with the configuration logs (see Configuration Logs) and the scripting logs (see Scripting Logs), as follows:

The 'Data' key carries a different definition depending on the log type:

  • Data: Additional contextual information, as follows (an illustrative example appears after this list):
    • Elapsed: The duration in microseconds required for the request execution,
    • Processing_time: The amount of processing time in microseconds required for the request execution,
      • The processing of the request could be put on hold and therefore use less computing time than the elapsed time. Conversely, a request could be executed by parallel processes and use more processing time than the elapsed wall-clock time.
    • Data_in: The payload size in bytes of the request,
    • Data_out: The payload size in bytes of the response,
    • Memory_usage: The size in bytes of memory used by the scripting execution. For event triggers only.
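As a sketch of how these metrics might appear, the Data object of a routing log entry could look like the following; all numbers are hypothetical, durations are in microseconds, and sizes are in bytes:

```json
{
  "Elapsed": 1250,
  "Processing_time": 980,
  "Data_in": 512,
  "Data_out": 128,
  "Memory_usage": 2097152
}
```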

This information enables statistical reporting, monitoring, and alerting for the IoT solution. Runtime execution is processing heavy, and a typical instance of the Service API is expected to handle up to 10,000 executions per second. Producing a dedicated log for each execution would not be viable from a resource-cost perspective. To limit this issue, the produced logs are aggregated and buffered before being emitted.


Scripting Logs

To enable troubleshooting, script developers use the Lua native “print” function, which would traditionally write its output to the application standard output. Because scripts run in a virtual machine, their output is handled by the scripting engine, and log entries emitted by a script must be relayed to the infrastructure logging system. Because the log throughput can be very high (a user can create a loop of “print” calls in a script, and thousands of scripts can run in parallel), logs are not forwarded to the engine standard-output channel. Instead, they are written into dedicated log files following the same approach as solution routing logs (see Routing Logs), where a file storage folder location is configured for the logging agent to read the files directly. Log files are limited to 100 MB and a maximum of 10 files are kept. Each log entry follows the common structure (see the Solution Logs structure table).

The 'Data' key carries a different definition depending on the log type:

  • Data: Additional parameters provided by the script log function (see the illustrative entry below)
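For instance, a script-type entry could resemble the minimal sketch below, showing only a subset of the common fields. It assumes the printed text is carried in Message and any extra parameters passed to the log call are surfaced in Data; all values are hypothetical:

```json
{
  "Solution_id": "k972nv8vb8hk0000",
  "Severity": 7,
  "Timestamp": 1704110400000,
  "Type": "script",
  "Tracking_id": "f0a1b2c3d4e5",
  "Message": "processing device payload",
  "Data": { "device_id": "MYDEVICE01" }
}
```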

Because the engine cannot aggregate logs produced by a user script without potentially losing their meaning, these logs are not aggregated. This can cause a large number of logs to be emitted, which can strain system resources. To prevent this issue, by default only logs of severity “Error” or above are emitted and lower-severity entries are discarded. To enable granular troubleshooting of low-severity entries, the Script Engine exposes an API operation to temporarily enable the emission of all logs for a parameterized duration of at most 2 hours.


Historical Logs

Logs are emitted by the platform to the infrastructure logs. Therefore, if configured, you should be able to access them through your infrastructure log system (a link is integrated in the platform user interface). A typical log system is Elasticsearch or OpenSearch with Kibana.

In the log system, you can search and find solution logs and device error logs (data messages are not emitted to the log system).

Query desired log

You can now refine your log search by filtering for specific criteria, such as device ID, over a defined period.

Note

Keep in mind that selecting a broad time range may return a substantial number of log hits.

Narrow down your filter search as follows:

  1. The desired infrastructure namespace
  2. The desired time window
  3. The desired solution namespace, by its solution ID (application or connector)
  4. The desired platform component (see the Components of interest table), using the key kubernetes.container_name (this may differ depending on the log system)

Example: searching for ingest information about device MYDEVICE01 in solution ID k972nv8vb8hk0000.

Query: ("k972nv8vb8hk0000.m2" AND kubernetes.container_name: haproxy) OR (kubernetes.container_name: connectivity_hub AND (k972nv8vb8hk0000 OR MYDEVICE01))

Components of interest

The following notable component names can be used to search for relevant logs:

| Component ID | Description | What logs |
| --- | --- | --- |
| pegasus_api, pegasus_dispatcher, pegasus_script_engine | Core components of the platform | Solution logs |
| bizapi | The administration backend component | Logs relevant to platform administration |
| sphinx_api | The WebServices gateway component | Logs relevant to Solution APIs |
| connectivity_hub | The connectivity component (Device2) | Device logs |
| haproxy | Ingress reverse proxy | Connection logs coming from the Admin portal, devices, and clients of the web services. Each connection ending is registered; a normal closure log includes "--". See the HAProxy documentation for other closure codes. |