
Server Layout

Each server has roughly the same layout:

[Diagram: server layout]

systemd services

We use systemd to manage all of our services, so they are restarted in the event of a crash or system restart.

Here are the common services you might find running on the VMs (an example unit file is sketched below the table):

Service Name   Description         Config Path
backend        Django/Dashboard    /etc/systemd/system/backend.service
pydcs          DCS                 /etc/systemd/system/pydcs.service
pydps          DPS                 /etc/systemd/system/pydps.service
promtail       Promtail agent      /etc/systemd/system/promtail.service
telegraf       Telegraf agent      /etc/systemd/system/telegraf.service
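
As an illustration, a unit file for one of these services might look roughly like the following. This is a minimal sketch: the user, working directory, and ExecStart command are assumptions and will differ per service.

[Unit]
Description=Backend (Django/Dashboard)
After=network.target

[Service]
# Illustrative values only; the real unit will differ
User=backend
WorkingDirectory=/srv/backend
ExecStart=/srv/backend/venv/bin/gunicorn backend.wsgi:application
# Restart automatically if the process crashes
Restart=on-failure

[Install]
WantedBy=multi-user.target

After changing a unit file, run systemctl daemon-reload followed by systemctl restart backend; journalctl -u backend -f follows its logs.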

Promtail

Promtail is an agent that ships the contents of local log files (e.g. /var/log/attrib/backend.error.log) to a Grafana Loki instance.

Configuration

Promtail's configuration is stored at /etc/promtail/config.yaml. An example might look like:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /data/promtail/positions.yaml

clients:
  - url: http://loki.cubed.internal:3100/loki/api/v1/push

scrape_configs:
  - job_name: dcs
    static_configs:
      - targets:
          - localhost
        labels:
          hostname: cubed-stge-dcs-a
          environment: stge
          service: dcs
          __path__: /var/log/pydcs/*.log

Note that this is where the labels are applied, which you can later filter by in Loki/Grafana using a query such as {service="dcs", environment="stge"}.
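
For example, assuming the label values from the config above, queries like these narrow down the logs:

{service="dcs", environment="stge"}
{service="dcs", environment="stge"} |= "ERROR"
{hostname="cubed-stge-dcs-a", service="dcs"}

The first selects all staging DCS logs, the second keeps only lines containing "ERROR", and the third filters by host instead.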

Telegraf

Telegraf is an agent that ships system metrics (CPU, memory, etc.) as well as custom metrics to InfluxDB. Most of the metrics used by the DCS dashboards in Grafana are captured by Telegraf via the DCS /@server endpoints.

Telegraf is configured remotely, using InfluxDB's web interface.
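
As a rough sketch only (the configuration actually lives in InfluxDB, and the InfluxDB URL, organisation, bucket, and DCS endpoint below are assumptions), the agent effectively ends up running something like:

[[outputs.influxdb_v2]]
  # Assumed InfluxDB host, org, and bucket
  urls = ["http://influxdb.cubed.internal:8086"]
  token = "$INFLUX_TOKEN"
  organization = "cubed"
  bucket = "telegraf"

# Host metrics (CPU, memory)
[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[inputs.mem]]

# Custom metrics scraped from the DCS /@server endpoint (URL assumed)
[[inputs.http]]
  urls = ["http://localhost:8000/@server"]
  data_format = "json"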

NFS Mounts

The code for each project is not stored on the server directly; instead, it is mounted over NFS from a central location (usually cubed-nfs-a/b or cubed-stge-nfs). See the utility servers documentation on NFS mounts for more information.
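
As an illustration (the export path and mount point are assumptions; the utility servers documentation has the real values), the corresponding /etc/fstab entry might look like:

# NFS mount for project code (paths are illustrative)
cubed-nfs-a:/exports/projects  /opt/projects  nfs4  defaults,_netdev  0  0

Running findmnt -t nfs4 shows which NFS mounts are currently active on a server.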