Debugging Command

Enable/Disable Accounts

The COMMAND_ALLOWED_ACCOUNTS environment variable controls which accounts are enabled. It is a comma-separated list of account tokens, e.g.

COMMAND_ALLOWED_ACCOUNTS=c-a-yard-uk,c-a-visscore

This variable is configured in Bitbucket under the deployment settings.

Warning

You must set the same value for both Production-Command and Production-Control, as both use this setting to decide whether they should run commands for an account.

You must also deploy both of these environments when making a change.
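
As a rough illustration of how a comma-separated allow-list like COMMAND_ALLOWED_ACCOUNTS could be read, here is a minimal Python sketch. The function names are hypothetical, and the whitespace trimming and handling of empty entries are assumptions; the real services may parse the variable differently.

```python
import os


def parse_allowed_accounts(raw):
    """Split a comma-separated list of account tokens into a set."""
    # Assumption: surrounding whitespace and empty entries are ignored.
    return {token.strip() for token in raw.split(",") if token.strip()}


def is_account_enabled(token, env=os.environ):
    """Return True if the token appears in COMMAND_ALLOWED_ACCOUNTS."""
    allowed = parse_allowed_accounts(env.get("COMMAND_ALLOWED_ACCOUNTS", ""))
    return token in allowed


if __name__ == "__main__":
    example_env = {"COMMAND_ALLOWED_ACCOUNTS": "c-a-yard-uk,c-a-visscore"}
    print(is_account_enabled("c-a-visscore", example_env))
```

Passing the environment as a parameter keeps the check easy to try locally without touching the deployment settings.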

Connecting to the Command database

The command system runs on its own dedicated database.

| Environment | URI |
| --- | --- |
| Production | cubed-command.cevhomzj8can.eu-west-1.rds.amazonaws.com |
| Staging | cubed-command-staging.cevhomzj8can.eu-west-1.rds.amazonaws.com |

For users created by Terraform, you can use your usual username/password to connect. Otherwise, you can find the root credentials in AWS Secrets Manager under Staging/RDS/Command or Prod/RDS/Command.

Debugging Queries

Recent Activity

SELECT
b.name as definition_name,
c.token as account,
CONCAT(a.start_date, " to ", a.end_date) as date_range,
(
    SELECT
    GROUP_CONCAT(
        CASE
            WHEN (x.status = 0) THEN "Waiting"
            WHEN (x.status = 1) THEN "Running"
            WHEN (x.status = 2) THEN "Timeout"
            WHEN (x.status = 3) THEN "Failed"
            WHEN (x.status = 4) THEN "Success"
            ELSE "Unknown"
        END
        ORDER BY x.waiting_at ASC
    )
    FROM command_run x
    WHERE x.slot_id = a.id
) AS timeline,
(
    SELECT
    SUM(
        CASE
        WHEN (x.status = 2) THEN 1
        WHEN (x.status = 3) THEN 1
        ELSE 0
        END
    )
    FROM command_run x
    WHERE x.slot_id = a.id
    ORDER BY x.waiting_at DESC
    LIMIT 10
) = 10 as is_dead,
(
    SELECT
    (
        CASE
        WHEN (x.status = 0) THEN "Waiting"
        WHEN (x.status = 1) THEN "Running"
        WHEN (x.status = 2) THEN "Timeout"
        WHEN (x.status = 3) THEN "Failed"
        WHEN (x.status = 4) THEN "Success"
        ELSE "Unknown"
        END
    )
    FROM command_run x
    WHERE x.slot_id = a.id
    ORDER BY x.waiting_at DESC
    LIMIT 1
) as last_status,
(
    SELECT
    COALESCE(
        success_at,
        failed_at,
        timeout_at,
        running_at,
        waiting_at
    )
    FROM command_run x
    WHERE x.slot_id = a.id
    ORDER BY x.waiting_at DESC
    LIMIT 1
) as last_changed_at,
(
    SELECT x.id
    FROM command_run x
    WHERE x.slot_id = a.id
    ORDER BY x.waiting_at DESC
    LIMIT 1
) as last_command_run_id
FROM command_slot a
JOIN command_definition b ON a.definition_id = b.id
LEFT JOIN command_account c ON a.account_id = c.id
ORDER BY last_changed_at DESC
LIMIT 10
Filtering by account

You can filter by account by adding a WHERE clause similar to this to the above query, after the JOIN clauses and before ORDER BY:

WHERE c.token = "c-a-yardstore-uk"
Filtering by status

You can filter by status by adding a WHERE clause similar to this to the above query, after the JOIN clauses and before ORDER BY:

WHERE (
    SELECT x.status
    FROM command_run x
    WHERE x.slot_id = a.id
    ORDER BY x.waiting_at DESC
    LIMIT 1
) = @CommandRunStatus

Where @CommandRunStatus is one of the values below:

| Friendly Name | Enum Value | Description |
| --- | --- | --- |
| Waiting | 0 | Run is waiting to be executed |
| Running | 1 | Run is being executed |
| Timeout | 2 | Run has not reported it has finished or succeeded before a timeout |
| Failed | 3 | Run reported that it failed |
| Success | 4 | Run reported that it finished successfully |
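
If you post-process query results in a script, the status values above can be mirrored as an enum. This is only a sketch: the class and function names are hypothetical and are not taken from the Command codebase. The "Unknown" fallback matches the ELSE branch of the CASE expressions in the query.

```python
from enum import IntEnum


class CommandRunStatus(IntEnum):
    """Status values as listed in the table above."""
    WAITING = 0
    RUNNING = 1
    TIMEOUT = 2
    FAILED = 3
    SUCCESS = 4


def friendly_name(value):
    """Map a raw status integer to its friendly name."""
    try:
        return CommandRunStatus(value).name.capitalize()
    except ValueError:
        # Same fallback as the ELSE branch of the SQL CASE expression.
        return "Unknown"


if __name__ == "__main__":
    print(friendly_name(3))
```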
Filtering by date

If you want to see which commands should be run on a particular date, remove the limit (LIMIT 10) and add a WHERE clause similar to the one below, before ORDER BY:

WHERE (NOT a.start_date >= '2022-03-03 23:59:59')
AND (NOT a.end_date <= '2022-03-03 00:00:00')

Often you want to check what is happening with slots that were supposed to have been run yesterday.

WHERE (NOT a.start_date >= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 DAY),'%Y-%m-%d 23:59:59'))
AND (NOT a.end_date <= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 DAY) , '%Y-%m-%d 00:00:00'))

The hours/minutes/seconds are important and should not be changed.
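
The date filter is an interval-overlap test: a slot matches when it starts before the end of the day and ends after the start of the day. The Python sketch below restates that logic so it can be checked against sample values. The function names are hypothetical, and the slot used in the example is the date_range example from this page.

```python
from datetime import date, datetime, time, timedelta


def slot_overlaps_day(start, end, day):
    """Mirror the SQL date filter for a single slot."""
    day_start = datetime.combine(day, time(0, 0, 0))
    day_end = datetime.combine(day, time(23, 59, 59))
    # Equivalent to: NOT start >= day_end AND NOT end <= day_start
    return start < day_end and end > day_start


def yesterday(now):
    """The calendar day before 'now', as in DATE_SUB(NOW(), INTERVAL 1 DAY)."""
    return (now - timedelta(days=1)).date()


if __name__ == "__main__":
    slot_start = datetime(2022, 3, 2, 0, 30, 49)
    slot_end = datetime(2022, 3, 3, 0, 30, 48)
    print(slot_overlaps_day(slot_start, slot_end, date(2022, 3, 3)))
```

Changing the 23:59:59 and 00:00:00 boundaries would shrink or shift the day being tested, which is why they should stay as they are.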

After running the query, you should see a result with the following columns:

| Column Name | Description | Example |
| --- | --- | --- |
| definition_name | Definition name (see command_definition table). Usually the name of a cron. | update_fabric_seogd_traffic |
| account | The account token the command was executed for | c-a-yardstore-uk |
| date_range | The time slot the command was executed for, see command_slot | 2022-03-02 00:30:49 to 2022-03-03 00:30:48 |
| timeline | The history of executions for the current time slot, oldest first | Failed,Timeout,Failed,Failed,Success |
| is_dead | 1 when the slot's count of Timeout and Failed runs equals 10, as computed by the query | 0 |
| last_status | The status of the last command_run for this slot | Success |
| last_changed_at | The timestamp of the last command_run for this slot | 2022-03-03 03:10:14 |
| last_command_run_id | The id of the last command_run for this slot | b6afe06ef11146558fd6969df75ad999 |
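
When scanning many rows, it can help to summarise the timeline column in a script. The helper below is a hypothetical sketch, not part of Command. It assumes the timeline is the comma-separated string produced by GROUP_CONCAT in the query. Note that MySQL truncates GROUP_CONCAT output at group_concat_max_len (1024 by default), so very long timelines may be cut short.

```python
def summarise_timeline(timeline):
    """Summarise a comma-separated timeline such as 'Failed,Success'."""
    statuses = [s for s in timeline.split(",") if s]
    return {
        # Total number of runs recorded for the slot.
        "attempts": len(statuses),
        # Runs that ended in either failure state.
        "failures": sum(1 for s in statuses if s in ("Failed", "Timeout")),
        # Most recent run, since the timeline is ordered oldest first.
        "last_status": statuses[-1] if statuses else None,
    }


if __name__ == "__main__":
    print(summarise_timeline("Failed,Timeout,Failed,Failed,Success"))
```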

Debugging failures

If you have found a failed command and know its command_run_id (the last_command_run_id column above), you can find the logs via:

SELECT e.name, a.text, a.created 
FROM attrib_command_log a
JOIN attrib_command_log_lookup b ON a.id = b.command_log_id
JOIN attrib_command_item c ON b.command_item_id = c.id
JOIN attrib_command_run_lookup d ON d.command_item_id = c.id
JOIN attrib_log_type e ON a.print_type_id = e.id
WHERE d.command_run_id = @CommandRunId
ORDER BY a.order ASC

You can find the stack trace via:

SELECT d.failed, d.running, c.exception, c.stacktrace
FROM attrib_command_item a
JOIN attrib_command_run_lookup b ON b.command_item_id = a.id
LEFT JOIN attrib_command_exception c ON c.command_item_id = a.id
JOIN attrib_command_state d ON d.command_item_id = a.id
WHERE b.command_run_id = @CommandRunId

Note

Run these two queries on the cubed-config databases, not on the Command database.

Debugging Agents/Services

Currently everything related to Command runs on a single EC2 instance per environment. The instances can be found at the following locations:

| Environment | URI |
| --- | --- |
| Production | command-control.withcubed.com |
| Staging | command-control.staging.withcubed.com |

There are three services that run continuously on these instances:

| Service Name | Description |
| --- | --- |
| command-agent-engine | The brains of the operation, in charge of updating all the tables (command_definition, command_slot etc.) and scheduling the execution of commands |
| command-agent-cron | In charge of actually executing "cron" style commands. It runs commands in parallel, with up to 5 running at a time (although this is configurable) |
| command-agent-metrics | Publishes metrics to Datadog |

These agents are set up as standard systemd services, so they can be debugged with the usual tools (systemctl, journalctl).
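
As a small sketch of what "the usual tools" might look like when scripted, the Python below builds systemctl and journalctl invocations for each agent. It assumes the systemd unit names match the service names in the table above, which this page does not confirm. The helper names are hypothetical, and the commands are meant to be run on the instance itself.

```python
import subprocess

# Assumption: unit names are identical to the service names listed above.
SERVICES = (
    "command-agent-engine",
    "command-agent-cron",
    "command-agent-metrics",
)


def status_command(service):
    """argv for showing the current state of a unit."""
    return ["systemctl", "status", service, "--no-pager"]


def logs_command(service, since="1 hour ago"):
    """argv for showing recent journal entries of a unit."""
    return ["journalctl", "-u", service, "--since", since, "--no-pager"]


def run(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    for name in SERVICES:
        print(" ".join(status_command(name)))
        print(" ".join(logs_command(name)))
```

Building the argument lists separately from running them makes it easy to print and review the commands before executing anything on a production instance.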