...

The infrastructure layer includes all AWS resources used by the application. These resources are typically controlled through Infrastructure as Code, as described in Assets and Services(3.0).

...

All deployed Kubernetes pods define Liveness and Readiness probes according to https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
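As an illustration of what such probe definitions look like, the following minimal sketch builds a container spec with the standard Kubernetes `livenessProbe` and `readinessProbe` fields. The container name, image, port, endpoint paths (`/healthz`, `/ready`), and timing values are assumptions chosen for the example; they are not taken from the actual deployment.

```python
import json


def http_probe(path, port, initial_delay=10, period=10, failure_threshold=3):
    """Build an HTTP probe in the shape used by a Kubernetes container spec."""
    return {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": initial_delay,
        "periodSeconds": period,
        "failureThreshold": failure_threshold,
    }


def container_with_probes(name, image, port):
    """Return a container spec that defines both probe types.

    The endpoint paths below are hypothetical placeholders.
    """
    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Liveness: the kubelet restarts the container when this probe fails.
        "livenessProbe": http_probe("/healthz", port, initial_delay=30),
        # Readiness: the pod receives no Service traffic while this probe fails.
        "readinessProbe": http_probe("/ready", port, initial_delay=5, period=5),
    }


if __name__ == "__main__":
    spec = container_with_probes("example", "example/image:tag", 8080)
    print(json.dumps(spec, indent=2))
```

The printed JSON can be compared against the output of `kubectl get pod <pod> -o json` to check which probes a running pod actually defines.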

...

Kubernetes Logs

All processes in a system produce log data that is collected by the Kubernetes logging function and can easily be transferred to a common log target. The centralized System Log can also be forwarded to the same log target. The Fluentd, Elasticsearch, and Kibana stack can be used to collect, store, and visualize log data. See Configure Log Collection, Target, and Visualization - AWS for a description of how to set this up.

...

To monitor error conditions in the system layer, a flexible Event Notifier feature is available, with targets such as an AWS SNS Topic and several others. See Event Notifications(3.0) for information on how to configure this.

System and Process Logs

Events from the entire system are logged in the central System Log. See System Log(3.0).

Metrics 

All metrics are exposed on a REST interface in a format that can be scraped by Prometheus. This means that if Prometheus is installed according to Setting up Prometheus(3.0), it automatically starts scraping metrics from all system resources.
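To show what a scrapeable format means in practice, here is a minimal sketch that parses samples in the Prometheus text exposition format. The metric names in the usage example are hypothetical, and the parser is deliberately simplified: it assumes label values contain no `}` character and it ignores optional timestamps.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a sample line this simplified parser understands.
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    example = (
        "# HELP example_workflow_errors_total Hypothetical counter.\n"
        "# TYPE example_workflow_errors_total counter\n"
        'example_workflow_errors_total{workflow="wf1"} 3\n'
        "example_queue_depth 12.5\n"
    )
    for name, labels, value in parse_metrics(example):
        print(name, labels, value)
```

A quick check like this against the text returned by a metrics endpoint can help confirm that the endpoint serves valid samples before Prometheus is pointed at it.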

...

For workflow troubleshooting, the Workflow Monitor can be used to view detailed debug information. See Workflow Monitor on the Web Interface(3.0).

Execution Layer Metrics

Extensive monitoring capabilities are available for troubleshooting workflow scheduling and execution. See Metrics(3.0) for more information on available metrics. See Reading JMX Metrics, MIMs and Prometheus Agent Metrics from Execution Context Endpoint(3.0) for information on how to expose the execution layer metrics in Prometheus and/or Grafana.

...

When something unexpected happens while processing payload data, such as decoding errors or unexpected type codes, a subsystem called Data Veracity helps resolve the error condition. See Data Veracity(3.0) for information on how to configure this.