
Solutions Monitoring Cheat Sheet

Concepts

TSDBs (Time Series DBs)

Wide-range TSDB Comparison

  • M3 (Prometheus, etcd, replication; scale at Uber: 500 million/s, billions in storage)
  • Thanos (Prometheus, federation)
  • Grafana Mimir (Prometheus, scales up to 1 billion active time series)
  • InfluxDB (commercial, replication, good scale)
  • eXtremeDB (commercial)
  • TimescaleDB (Postgres, replication)
  • Graphite/Whisper (no replication)
  • Prometheus
  • DalmatinerDB
  • Riak-TS

Alarming / Paging / SMS Notification

All SaaS

  • PagerDuty
  • VictorOps
  • BigPanda
  • OpsGenie
  • AlertOps
  • iLert

DNS, Ping

Network Mapping

Mapping Solutions

Network Forensics

Host-based Service Monitoring

Self-hosted:

  • Nagios
  • Icinga 2
  • check_mk
  • Shinken
  • Splunk
  • Sensu
  • GroundWork

SaaS APMs:

  • NewRelic
  • AppDynamics
  • DataDog
  • Dynatrace
  • Stackify Retrace
  • Ruxit
  • Sysdig Cloud
  • Instana
  • SignalFX
  • SemaText (metrics and logs combined with correlation; InfluxDB API for metrics, Elasticsearch API for logs)

Docker/Kubernetes

See also this review

  • Prometheus
  • Hawkular
  • DataDog (SaaS)
  • Sensu
  • Scout
  • Sysdig Cloud

External Website Monitoring

Status Page Hosting

Event Correlation