Grafana

Positive Internet set up Grafana for us so that we have full control over our metric and log data collection. We have Grafana dashboards for monitoring performance and buffers on systems such as Conductor (command), DPS (Segmenter/Visscore), DCS and SEOPT.

Grafana Dashboard Examples

Grafana command Dashboard

Grafana DCS Dashboard

Grafana DPS Segmenter Dashboard

How It Works

An InfluxDB instance stores all the metric data, and Loki collects log lines from files.

Positive set up Telegraf agents that talk to specific endpoints. For example, on the DCS and DPS they query /@server, which returns a JSON document describing the state of the server (including workers, buffers, etc.). This endpoint is scraped every 15 seconds, and the scraped data is what the dashboards report on.
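As a rough illustration of what happens to each scraped /@server response, the sketch below flattens a nested JSON state document into dotted numeric fields, the kind of shape a time-series store can hold. The payload (worker and buffer keys, the server name) is invented for the example, and this is not the Telegraf configuration Positive actually runs.

```python
import json

def flatten(state, prefix=""):
    """Flatten nested JSON into {"a.b": number} pairs, dropping non-numeric values."""
    fields = {}
    for key, value in state.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, name))
        elif isinstance(value, bool):
            # bool is a subclass of int, so handle it first and store 0/1
            fields[name] = int(value)
        elif isinstance(value, (int, float)):
            fields[name] = value
    return fields

# Hypothetical /@server payload; the real keys will differ.
SAMPLE = '{"workers": {"active": 3, "idle": 1}, "buffers": {"queued": 42}, "name": "dcs-1"}'

fields = flatten(json.loads(SAMPLE))
print(fields)
```

String values such as the server name are dropped here because they behave more like tags than like measurements.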

Loki is configured to point at files on the servers; it reads them line by line in real time and keeps itself up to date. It does not index the content of the logs (only the labels attached to them), so it leaves a small data footprint. The developers wanted it to behave much like grep on Linux, which might sound scary to some, but once you have the basics you will find it easy enough to trawl through logs on any box.

Dashboards

We have already configured dashboards for most of the important parts of our system, such as DCS, DPS, Conductor (command) and SEOPT. In the Grafana side menu, click Dashboards and then Browse to reach our custom-built dashboards.

Adding graphics

To build Influx graphs into existing dashboards, use the Add panel -> Add a new panel button (top right). The drop-downs are context sensitive: if you pick environment = production in the first drop-down, the following drop-downs only offer items that are available under that environment. If you then pick service name = dcs, the choices narrow to just the files Loki is watching on the DCS in production. A neat trick is to copy the Sample Query code from another graphic and amend it to your needs.

Grafana Adding Graphs

Filtering on Logs

To view and filter logs, use Explore. In the Grafana side menu, click the compass icon (Explore), then use the drop-down at the top left, next to the Explore title, to pick a data source. I pick Loki, then start "filtering" for what I want.
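The filtering itself is done with a LogQL query: a label selector in braces, followed by grep-like line filters. The sketch below assembles such a query string so the shape is easy to see. The label keys `environment` and `service_name` are assumptions based on the drop-down names; check the label browser in Explore for the real ones.

```python
def build_logql(labels, contains=None, excludes=None):
    """Build a LogQL query: a label selector plus optional line filters."""
    selector = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    query = "{" + selector + "}"
    if contains:
        query += f' |= "{contains}"'   # keep lines containing this text
    if excludes:
        query += f' != "{excludes}"'   # drop lines containing this text
    return query

# Label keys and values here are assumed, not taken from our Loki setup.
query = build_logql(
    {"environment": "production", "service_name": "dcs"},
    contains="error",
)
print(query)
```

The printed string can be pasted into the query box in Explore. The helper does no escaping, so it is only suitable for plain search terms without quotes or backslashes.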

Advancements

We've recently updated our 'informational' endpoints on the DPS/DCS, such as @health, @accounts, @workers and @server. The idea is that you can send a GET request to any of them and it returns the state of a different part of the box. If we were scraping /@accounts on each DCS, for example, we could then create a table in Grafana that reads from that data. We would need to work with Positive on assigning the correct bucket_name, so that we know which data source to read from when setting this up.
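To make the table idea concrete, here is a minimal sketch that turns per-server responses from an endpoint such as @accounts into flat rows, one per account per host. The host names and the payload shape (a list of objects with `account` and `active` keys) are hypothetical, and `fetch_state` is only an illustration of a plain GET; in practice the scraping would be done by the Telegraf agents, not by a script like this.

```python
import json
from urllib.request import urlopen

def fetch_state(base_url, endpoint, timeout=5):
    """GET one informational endpoint (e.g. "@accounts") and parse the JSON body."""
    with urlopen(f"{base_url}/{endpoint}", timeout=timeout) as resp:
        return json.load(resp)

def to_rows(host, accounts):
    """Tag every account record with the host it came from."""
    return [{"host": host, **account} for account in accounts]

# Hypothetical responses, keyed by host; real payloads will look different.
sample = {
    "dcs-1": [{"account": "alpha", "active": True}],
    "dcs-2": [{"account": "alpha", "active": False},
              {"account": "beta", "active": True}],
}

rows = [row for host, accounts in sample.items() for row in to_rows(host, accounts)]
for row in rows:
    print(row)
```

Each row carries its host, which is the column a Grafana table would most likely group or filter on.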