SaltStack Anomaly Detection and Root Cause Analysis

The DevOps team at Domotz works constantly to keep the company’s cloud infrastructure reliable, which is a crucial part of the service quality our customers experience. This requires the ability to find and remediate production issues promptly.

We monitor system parameters such as CPU, memory, number of processes, and network latency. The large volume of data collected, and the historical trends it produces, can be noisy. As a result, simple threshold-based alerts may trigger many false positives.

We’ve started experimenting with anomaly detection algorithms that identify patterns differing from the expected system behavior, and with integrating them into our infrastructure.
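To make the contrast concrete, here is a minimal sketch on synthetic data, assuming a CPU series in which a scheduled job legitimately pushes load high once per cycle. The function names, the cycle length, and the tolerance are all hypothetical, and the baseline comparison is a deliberately simple stand-in for a real anomaly detection algorithm, not the one we run in production.

```python
import statistics

PERIOD = 12  # hypothetical: number of samples in one recurring cycle


def static_alerts(series, limit):
    """Indexes where the value exceeds a fixed limit."""
    return [i for i, v in enumerate(series) if v > limit]


def seasonal_alerts(series, period=PERIOD, cycles=3, tolerance=15.0):
    """Indexes where the value departs from the same phase of earlier cycles."""
    alerts = []
    for i in range(period * cycles, len(series)):
        # Values observed at the same point of the previous `cycles` cycles.
        history = [series[i - period * k] for k in range(1, cycles + 1)]
        expected = statistics.median(history)
        if abs(series[i] - expected) > tolerance:
            alerts.append(i)
    return alerts


# Synthetic CPU%: a scheduled job pushes load to 85 on every 12th sample.
cpu = [85.0 if i % PERIOD == 0 else 40.0 for i in range(60)]
cpu[55] = 75.0  # unusual for this phase, yet below the static limit

static_hits = static_alerts(cpu, limit=80.0)
seasonal_hits = seasonal_alerts(cpu)
```

On this series the fixed limit fires on the five scheduled peaks and misses the sample at index 55, while the comparison against expected behavior flags only index 55.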

SaltStack implementations of Anomaly Detection

At Domotz we use SaltStack to manage our cloud infrastructure. In particular, operations such as configuring new virtual machines and deploying and configuring services in production are performed via SaltStack. We have also started leveraging its event-driven system for monitoring and managing that infrastructure.

SaltStack offers an event-driven infrastructure that raises events for the system parameters you choose to monitor.

We extended this event-driven infrastructure to perform advanced anomaly detection with machine learning models.

We have followed two different approaches with SaltStack:

Approach 1: minion-oriented

Implementation of custom SaltStack ‘beacons’ that run anomaly detection algorithms on monitored resources. Our time series anomaly detection is based on the Luminol Python library by LinkedIn.
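As a rough illustration of the shape such a beacon can take, the sketch below follows the module interface Salt expects from a beacon (`__virtual__`, `validate`, `beacon`). Everything else is an assumption: the beacon name `load_anomaly`, the `threshold` option, and the monitored metric are hypothetical, and a plain z-score over a sliding window stands in for the Luminol detector so that the example needs only the standard library.

```python
"""Hypothetical custom beacon module, e.g. synced from salt://_beacons/."""
import os
import statistics
from collections import deque

__virtualname__ = "load_anomaly"  # hypothetical beacon name

# Samples kept in memory between successive beacon() calls.
_HISTORY = deque(maxlen=360)


def __virtual__():
    # os.getloadavg is only available on Unix-like systems.
    return __virtualname__ if hasattr(os, "getloadavg") else False


def _merge(config):
    """Salt passes beacon configuration as a list of dicts; flatten it."""
    merged = {}
    for item in config:
        merged.update(item)
    return merged


def validate(config):
    """Return (is_valid, message), as Salt expects from a beacon."""
    if not isinstance(config, list):
        return False, "Configuration for load_anomaly must be a list."
    if float(_merge(config).get("threshold", 3.0)) <= 0:
        return False, "threshold must be positive."
    return True, "Valid beacon configuration"


def score(history, value):
    """Z-score of value against history (0.0 while history is too short)."""
    if len(history) < 10:
        return 0.0
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history) or 1e-9  # avoid division by zero
    return abs(value - mean) / stdev


def beacon(config):
    """Sample the 1-minute load average; emit an event if it looks anomalous."""
    threshold = float(_merge(config).get("threshold", 3.0))
    value = os.getloadavg()[0]
    z = score(_HISTORY, value)
    _HISTORY.append(value)
    if z > threshold:
        # Each returned dict becomes one event on the Salt event bus.
        return [{"tag": "anomaly", "load_1m": value, "score": round(z, 2)}]
    return []
```

Keeping the detection on the minion means only anomalies travel over the event bus, not every raw sample.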

Approach 2: master-oriented

Adoption of the SaltStack Umbra project to define machine learning pipelines for monitored resources. Umbra leverages the PyOD Python library, which offers several state-of-the-art outlier detection algorithms.
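The sketch below only illustrates the general idea of a centrally run pipeline that scores data gathered from many minions. It does not use Umbra's or PyOD's API: the stage names, the event payload shape (`id`, `data`, `cpu`), and the cutoff are assumptions, and a median/MAD-based modified z-score stands in for a PyOD outlier model.

```python
import statistics


def ingest(events):
    """Group raw event payloads by minion id -> list of CPU samples."""
    samples = {}
    for event in events:
        samples.setdefault(event["id"], []).append(event["data"]["cpu"])
    return samples


def featurize(samples):
    """Reduce each minion's samples to a single feature: the mean."""
    return {minion: statistics.fmean(values) for minion, values in samples.items()}


def score(features):
    """Modified z-score (median/MAD based) of each minion against its peers."""
    values = list(features.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values]) or 1e-9
    return {m: 0.6745 * abs(v - median) / mad for m, v in features.items()}


def run_pipeline(events, cutoff=3.5):
    """Chain the stages and return the minions whose score exceeds the cutoff."""
    scores = score(featurize(ingest(events)))
    return sorted(m for m, s in scores.items() if s > cutoff)


# Hypothetical payloads, loosely shaped like events collected on the master.
readings = {"web1": [20, 22], "web2": [21, 23], "web3": [19, 21],
            "web4": [22, 24], "web5": [90, 94]}
events = [{"id": minion, "data": {"cpu": cpu}}
          for minion, cpus in readings.items() for cpu in cpus]

outliers = run_pipeline(events)
```

Because the scoring happens in one place, a minion is judged against its peers rather than only against its own history; here `web5` is the only one flagged.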

See how Domotz implemented AIOps

At SaltConf19, Giancarlo Fanelli, our CTO, and Massimiliano Cuzzoli, our Head of Cloud and System Engineering, led a breakout session on how Domotz uses SaltStack to deploy features commonly associated with AIOps (Artificial Intelligence for IT Operations), specifically anomaly detection and root cause analysis. A video recording of the session accompanies this post.
