
Missing Data

To check for data issues, either open dash.withcubed.com and check an account, or log into control.withcubed.com/admin and then go to the main page, control.withcubed.com/. The main page displays all accounts, with a data point for each day that had a successful cron job.

If an account shows no data for more than 1 day, the cron has failed. Our crons do not fully run if the previous day didn't finish.
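The rule above can be sketched as a small check. This is only an illustration: the function names are hypothetical, and it assumes that the newest day which can have data is yesterday and that "more than 1 day" means more than one missing day counted back from yesterday.

```python
from datetime import date, timedelta


def missing_days(successful_days, today):
    """Return the days after the last successful run, up to yesterday.

    successful_days: the dates that show a data point in Control.
    Assumption: the cron processes the previous day, so the newest day
    that can have data is yesterday.
    """
    if not successful_days:
        return []
    last_ok = max(successful_days)
    yesterday = today - timedelta(days=1)
    gap = (yesterday - last_ok).days
    return [last_ok + timedelta(days=n) for n in range(1, gap + 1)]


def looks_failed(successful_days, today):
    # Assumption: more than one missing day is treated as a failed cron.
    return len(missing_days(successful_days, today)) > 1


if __name__ == "__main__":
    ok = {date(2021, 8, 10), date(2021, 8, 11), date(2021, 8, 12)}
    print(missing_days(ok, date(2021, 8, 16)))
    print(looks_failed(ok, date(2021, 8, 16)))
```

The first missing day in the returned list is the one to investigate first, since later days did not fully run because of it.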

Note

Some accounts in this list are on the live DB but exist only for testing different parts of our system. This means they may show "blank" more often (as they have no data).

These are: Adobe Test, BGB, Cubed AI, Cubed Brand Website, Facebook Dev, PyDCS Test

Example

Check Control

Looking at the image below, I can hover over the last known day that ran successfully.

Control - missing data

This means the cron(s) did not reach the command update_report_account_usage, which runs after the report command block.

Check the Database

Log into the live RDS and switch to the attrib db. We now need to know when and where a command failed, and what caused it.

Cron Parent Command

We can check the state it got up to by running the following SQL:

use attrib;
select
    d.token, 
    a.id, a.name, a.created, e.running, e.updated,
    b.message, b.exception, b.stacktrace, e.duration
from attrib_command_item a
left join attrib_command_exception b on b.command_item_id = a.id
join attrib_command_account c on c.command_item_id = a.id
join attrib_account d on c.account_id = d.id
join attrib_command_state e on e.command_item_id = a.id
where a.created between '2021-08-14 00:00:00' and '2021-08-14 23:59:59'
and d.token in ('c-a-yard-uk')
and a.name = 'Batch agg and pct update commands';

This will give you the top-level parent command for this account on this cron job.

Note

If you are missing data for the 1st of the month, the command that failed started on the 2nd, because each cron run processes the previous day.

The above SQL gives you top-level information about the parent cron command. The useful fields here are the id, created, duration, and whether there is an exception.
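As a sketch of how the date window relates to the missing day, the helper below builds the parameters for the parent command query. The function name is hypothetical, and the `%s` placeholder style is an assumption (it is the style used by Django's database cursor and common MySQL drivers); adjust it to whatever client you use.

```python
from datetime import date, datetime, time, timedelta

PARENT_NAME = "Batch agg and pct update commands"

# Same query as above, with the literals replaced by placeholders.
PARENT_QUERY = """
select d.token, a.id, a.name, a.created, e.running, e.updated,
       b.message, b.exception, b.stacktrace, e.duration
from attrib_command_item a
left join attrib_command_exception b on b.command_item_id = a.id
join attrib_command_account c on c.command_item_id = a.id
join attrib_account d on c.account_id = d.id
join attrib_command_state e on e.command_item_id = a.id
where a.created between %s and %s
  and d.token = %s
  and a.name = %s
"""


def parent_command_params(missing_day, token):
    """Return (start, end, token, name) for the run that should have filled missing_day.

    The cron that processes a given day starts on the following day,
    so the created window is shifted forward by one day.
    """
    run_day = missing_day + timedelta(days=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    start = datetime.combine(run_day, time.min).strftime(fmt)
    end = datetime.combine(run_day, time(23, 59, 59)).strftime(fmt)
    return (start, end, token, PARENT_NAME)


if __name__ == "__main__":
    # Data missing for 13 Aug, so look at commands created on 14 Aug.
    print(parent_command_params(date(2021, 8, 13), "c-a-yard-uk"))
```

Passing the values as parameters rather than editing the SQL by hand avoids accidentally querying the missing day itself instead of the day the command ran.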

With the command ID from above you can go to the following URL: https://control.withcubed.com/command/{command-id}. Here you will see the standard output for all commands that ran, and any error that occurred.

You can use this same URL for any command ID you find if you want to see more information about it - this includes whether it spawned other commands and all of its print logs.

If there was an error you can now re-run the commands (see Re-run Commands).

Cron Children Commands

The SQL below is another way to see all the commands that ran, excluding the parent command:

use attrib;
select
    d.token, 
    a.id, a.name, a.created, e.running, e.updated,
    b.message, b.exception, b.stacktrace, e.duration
from attrib_command_item a
left join attrib_command_exception b on b.command_item_id = a.id
join attrib_command_account c on c.command_item_id = a.id
join attrib_account d on c.account_id = d.id
join attrib_command_state e on e.command_item_id = a.id
where a.created between '2021-08-14 00:00:00' and '2021-08-14 23:59:59'
and d.token in ('c-a-yard-uk')
and a.name <> 'Batch agg and pct update commands';

This will now list the cron jobs individually. If there was an error it should be the last row, and contain an exception with a stacktrace.

If commands suddenly stop, and there is no exception, chances are the box ran out of memory and killed everything. In this case - usually - a handful of accounts will have been affected. You can now jump to the next step.

A dead giveaway for this "Out of memory" scenario is when you know a command isn't running but the running flag is True. You can confirm this in one of two ways: either check the DB to see if anything is running, or dial into the control ec2 and check htop. Read here for more information.
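The check described above can be sketched against the rows returned by the children query. This is a rough illustration only: the function name and the one-hour idle threshold are hypothetical, and it assumes `updated` reflects the last time the command state changed. A match is a hint, not proof, so confirm with htop before re-running anything.

```python
from datetime import datetime, timedelta


def possibly_killed(rows, now, idle_for=timedelta(hours=1)):
    """Return rows that are flagged as running, have no exception,
    and have not been updated for longer than idle_for.

    rows: dicts using the column names from the query above
          (at least "running", "exception" and "updated").
    idle_for: hypothetical threshold, tune it to your command durations.
    """
    return [
        row
        for row in rows
        if row["running"]
        and row["exception"] is None
        and now - row["updated"] > idle_for
    ]


if __name__ == "__main__":
    now = datetime(2021, 8, 14, 12, 0, 0)
    rows = [
        # Made-up rows for illustration only.
        {"id": 1, "running": True, "exception": None, "updated": datetime(2021, 8, 14, 3, 0, 0)},
        {"id": 2, "running": False, "exception": None, "updated": datetime(2021, 8, 14, 2, 0, 0)},
    ]
    for row in possibly_killed(rows, now):
        print(row["id"])
```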

Re-run Commands

Once you have found the command that failed, check update_agg_pct_batch and find the command block it was in. The next step is to log into the control ec2 and create a screen to run your command in. This is important: if there's a network issue and we lose connection to the box, the command will keep running.

The main screen commands to use are:

  • screen -ls lists all screens. Screens marked (Detached) are probably not in use.
  • screen -S {name} creates a new screen with that name.
  • screen -r {name} reconnects to a named screen.
  • screen -d -r {name} detaches whoever is currently on the named screen, then lets you reconnect to it.

Once you are in a screen, navigate to the Django manage folder:

cd /srv/attrib-backend/backend

Then you can simply run the cron command:

sudo python3 manage.py update_agg_pct_batch

Please see the command file for other arguments that can be passed in.

| Parameter | Description | Default | Example |
|---|---|---|---|
| account_token | An account token found in the DB. | all | --account_token="c-a-client-uk" |
| account_list | A comma-separated list of account tokens (no spaces). | none | --account_list="c-a-client-uk,c-a-client-de,c-a-client-fr" |
| startdate | The date to start running the command from. Must be in YYYY-MM-DD HH:MM:SS format. | yesterday 00:00:00 | --startdate="2021-06-01 00:00:00" |
| enddate | The date to stop running the command at. Must be in YYYY-MM-DD HH:MM:SS format. | yesterday 23:59:59 | --enddate="2021-06-01 23:59:59" |
| date_list | A comma-separated list of dates to run for. Must be in YYYY-MM-DD format. Note: this can be a quicker way to run a single day in the past. | none | --date_list="2021-06-01, 2021-06-10" |
| parallel | Force the commands to run in parallel, not serially. | none | --parallel |
| daily | Force the commands to run day by day, not over the whole time period specified. | none | --daily |
| ignore_post_pre | Ignore the pre and post blocks of commands. | False | --ignore_post_pre |

sudo python3 manage.py update_agg_pct_batch --startdate="2021-06-01 00:00:00" --enddate="2021-06-03 00:00:00" --account_list="client1,client2" --parallel --daily
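When re-running for several accounts or date ranges, it can help to assemble the command line programmatically so that quoting stays consistent. The helper below is a hypothetical sketch, not part of the codebase; it only uses flags listed in the table above, and it quotes each argument for the shell with `shlex.join` (Python 3.8+).

```python
import shlex


def build_rerun_command(startdate=None, enddate=None, account_list=None,
                        parallel=False, daily=False, ignore_post_pre=False):
    """Build the update_agg_pct_batch command line as a shell-safe string.

    Dates are expected as "YYYY-MM-DD HH:MM:SS" strings; account_list is a
    list of account tokens, joined with commas and no spaces.
    """
    args = ["sudo", "python3", "manage.py", "update_agg_pct_batch"]
    if startdate:
        args.append(f"--startdate={startdate}")
    if enddate:
        args.append(f"--enddate={enddate}")
    if account_list:
        args.append("--account_list=" + ",".join(account_list))
    if parallel:
        args.append("--parallel")
    if daily:
        args.append("--daily")
    if ignore_post_pre:
        args.append("--ignore_post_pre")
    # shlex.join quotes any argument containing spaces.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_rerun_command(
        startdate="2021-06-01 00:00:00",
        enddate="2021-06-03 00:00:00",
        account_list=["client1", "client2"],
        parallel=True,
        daily=True,
    ))
```

The printed string can be pasted into the screen session. The quoting differs cosmetically from the hand-written example (the whole argument is quoted rather than just the value), but the shell parses both the same way.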

Note

Only pass ignore_post_pre if you know the pre command block ran successfully in the original cron. This is usually the case, but it is important that it DID run before you try again.

Htop

Htop is a terminal version of the Windows Task Manager. Once you've opened it, press F4 (Filter) and type "manage.py" or "python" - this filters all the noise down to just our commands.
Press F5 to toggle "tree view". This can help to show which commands are children of another.