Why the TICK stack probably isn't for me
Recently, I've been experimenting with upgrading my monitoring system. The TICK stack consists of a number of different elements:
Together, these 4 programs provide everything you need to monitor your infrastructure, generate graphs, and send alerts via many different channels when things go wrong. This works reasonably well - and to give the developers credit, the level of integration present is pretty awesome. Telegraf seamlessly inserts metrics into InfluxDB, and Chronograf is designed to integrate with the metrics generated by Telegraf.
I haven't tried Kapacitor much yet, but it has an impressive list of integrations. For reference, I've been testing the TICK stack on an old Raspberry Pi 2 that I had lying around. I also tried Grafana too, which I'll talk about later.
The problems start when we talk about the system I've been using up until now (and am continuing to use). I've got a Collectd setup going - with Collectd Graph Panel (CGP) as a web interface, which is backed by RRD databases.
CGP, while it has it's flaws, is pretty cool. Unlike Chronograf, it doesn't require manual configuration when you enable new metric types - it generates graphs automatically. For a small personal home network, I don't really want to be spending hours manually specifying what all the graphs should look like for all the metrics I'm collecting. It's seriously helpful to have it done automatically.
Grafana also stumbles here. Before I installed the CK part of the TICK stack, I tried Grafana. After some initial installation issues (the Raspberry Pi 2's CPU only supports up to ARMv6, and Grafana uses ARMv7l instructions, causing some awkward and unfortunate issues that were somewhat difficult to track down). While it has an incredible array of different graphs and visualisations you can configure, like Chronograf it doesn't generate any of these graphs for you automatically.
Both solutions do have an import / export system for dashboards, which allows you to share prebuilt dashboards - but this isn't the same as automatic graph generation.
The other issue with the TICK stack is how heavy it is. Spoiler: it's very heavy indeed - especially InfluxDB. It managed to max out my poor Raspberry Pi 2's CPU - and ate all my RAM too! It look quite a bit of tuning to configure it such that it didn't eat all of my RAM for breakfast and knock my SSH session offline.
I'm sure that in a business setting you'd have heaps of resources just waiting to be dedicated to monitoring everything from your mission-critical servers to your cat's lunch - but in a home setting it takes up more resources passively when it isn't even doing anything than everything else I'm monitoring..... combined!
It's for these reasons that I'm probably not going to end up using the TICK (or TIG, for that matter) stack. For the reasons I've explained above, while it's great - it's just not for me. What I'm going to use instead though, I'm not sure. Development on CGP ceased in 2017 (or probably before that) - and I've got a growing list of features I'd like to add to it - including (but not limited to) fixing the SMART metrics display, reconfiguring the length of time metrics are stored for, and fixing a super annoying bug that makes the graphs go nuts when you scroll on them on a touchpad with precise scrolling enabled.
Got a suggestion for another different system I could try? Comment below!