Server Layout

Each server has roughly the same layout, described in the sections below.

systemd services

We use systemd to manage all of our services, so that they are restarted in the event of a crash or system restart.

Here are the common services you might find running on the VMs:

Service Name	Description	Config Path
backend	Django/Dashboard	`/etc/systemd/system/backend.service`
pydcs	DCS	`/etc/systemd/system/pydcs.service`
pydps	DPS	`/etc/systemd/system/pydps.service`
promtail	Promtail agent	`/etc/systemd/system/promtail.service`
telegraf	Telegraf agent	`/etc/systemd/system/telegraf.service`
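
As a rough illustration, the services listed above could be checked from a script. The sketch below assumes Python 3 is available on the VM and relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active. The helper names (`build_command`, `is_active`, `service_report`) are hypothetical and not part of our tooling.

```python
import subprocess

# Service names taken from the table above.
SERVICES = ["backend", "pydcs", "pydps", "promtail", "telegraf"]


def build_command(service):
    """Return the systemctl invocation that checks one unit."""
    return ["systemctl", "is-active", "--quiet", service]


def is_active(service):
    """Return True if systemd reports the unit as active."""
    try:
        result = subprocess.run(build_command(service), check=False)
    except FileNotFoundError:
        # systemctl is not installed (e.g. running off-server).
        return False
    return result.returncode == 0


def service_report(services=SERVICES):
    """Map each service name to whether it is currently active."""
    return {name: is_active(name) for name in services}
```

Calling `service_report()` on a VM returns a dictionary keyed by service name. Not every server necessarily runs every service, so an inactive entry is not always a fault.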

Promtail

Promtail is an agent that ships the contents of local log files (e.g. /var/log/attrib/backend.error.log) to a Grafana Loki instance.

Configuration

Promtail's configuration is stored at /etc/promtail/config.yaml. An example might look like:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /data/promtail/positions.yaml

clients:
  - url: http://loki.cubed.internal:3100/loki/api/v1/push

scrape_configs:
  - job_name: dcs
    static_configs:
      - targets:
          - localhost
        labels:
          hostname: cubed-stge-dcs-a
          environment: stge
          service: dcs
          __path__: /var/log/pydcs/*.log

Note that this file is where the labels are applied. You can later filter by these labels in Loki/Grafana, using a query such as {service="dcs", environment="stge"}.
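
To make the link between labels and queries concrete, here is a minimal sketch that turns a set of labels into a LogQL stream selector and a URL for Loki's `/loki/api/v1/query_range` HTTP endpoint. The base URL is taken from the `clients` entry in the example config. The function names and the default `limit` are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Base URL taken from the `clients` entry in the example config above.
LOKI_BASE_URL = "http://loki.cubed.internal:3100"


def build_selector(labels):
    """Build a LogQL stream selector such as {service="dcs"}."""
    parts = []
    for name, value in labels.items():
        # Escape backslashes and double quotes inside label values.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    return "{" + ", ".join(parts) + "}"


def query_range_url(labels, limit=100, base_url=LOKI_BASE_URL):
    """Return a URL for Loki's query_range HTTP endpoint."""
    params = urlencode({"query": build_selector(labels), "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"
```

For example, `build_selector({"service": "dcs", "environment": "stge"})` produces the same selector as the query shown above, and `query_range_url` wraps it in a URL that can be fetched with any HTTP client that can reach the Loki instance.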

Telegraf

Telegraf is an agent that ships system metrics (such as CPU and memory usage) as well as custom metrics to InfluxDB. Most metrics used by the DCS dashboards in Grafana are captured using Telegraf and the DCS /@server endpoints.

Telegraf is configured remotely, using InfluxDB's web interface.

NFS Mounts

The code for each project is not stored on the server directly; instead, it is mounted over NFS from a central location (usually cubed-nfs-a/b or cubed-stge-nfs). See the utility servers documentation on NFS mounts for more information.
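
When debugging a server, it can help to confirm which NFS mounts are actually present. The following is a small sketch, assuming a Linux host where the mount table is readable at /proc/mounts. The function names are hypothetical, and no particular export or mount path is assumed.

```python
def nfs_mounts(mounts_text):
    """Return (source, mount_point) pairs for NFS entries.

    Expects text in the /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mount_point, fs_type = fields[:3]
        if fs_type in ("nfs", "nfs4"):
            found.append((source, mount_point))
    return found


def current_nfs_mounts(path="/proc/mounts"):
    """Read the live mount table (Linux only)."""
    with open(path) as handle:
        return nfs_mounts(handle.read())
```

Calling `current_nfs_mounts()` on a server lists each NFS source alongside its local mount point. An empty list suggests that the code mount is missing.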