
Overview

The concept of the Replayer is to "replay" any visit data back through the DCS. The first step was to handle a .csv file with specific columns filled in - some are mandatory, some are not.
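For illustration only (the column names below are placeholders, except transactionID, which is mentioned later in this doc), a quick mandatory-column check over such a file might look like this:

```python
import csv

# Placeholder column names for illustration; transactionID is the only one
# referenced elsewhere in this doc.
MANDATORY_COLUMNS = {"transactionID"}
OPTIONAL_COLUMNS = {"value", "currency"}

with open("clients/example/example.csv", newline="") as fh:
    for row in csv.DictReader(fh):
        missing = {col for col in MANDATORY_COLUMNS if not row.get(col)}
        if missing:
            print(f"row is missing mandatory columns: {missing}")
```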

This is done from the DCS, but it could be, and should be, ported to the Attrib Backend project. There are a few potential issues there, but we can discuss those later.

The Replayer is structured around a folder per client, holding the files locally (NOT to be committed to the repo), along with an output folder that holds all of the files generated while the command runs.

In the root directory is the run_replayer.py file which holds the Replayer class.

Replayer

The Replayer class accepts a config (see ReplayerConfig below), a request_url to fire to, and a couple of debug flags.

The class uses the config to build a request per row in the .csv and fires them all at the given endpoint.

dump_data - generates files into the output directory.
fire_tag - fires the tag to the request_url.
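A minimal sketch of how the class might hang together, assuming these signatures; the real code in run_replayer.py may differ, and the URL building shown here is only a placeholder:

```python
import csv
import urllib.parse
import urllib.request


class Replayer:
    """Sketch only: builds one request per CSV row via the config and fires it."""

    def __init__(self, config, request_url, name, dump_data=False, fire_tag=False):
        self.config = config              # a ReplayerConfig instance
        self.request_url = request_url    # DCS endpoint to fire at
        self.name = name                  # used when naming files in output/
        self.dump_data = dump_data        # write requests/responses/etc. to output/
        self.fire_tag = fire_tag          # actually send each request

    def load_file(self, path):
        # Build (and optionally fire) one request per row in the .csv
        with open(path, newline="") as fh:
            return [self.replay_row(row) for row in csv.DictReader(fh)]

    def replay_row(self, row):
        data = self.config.format_data(row)  # json-ready dict for the DCS
        url = f"{self.request_url}?{urllib.parse.urlencode(data)}"
        if self.fire_tag:
            with urllib.request.urlopen(url) as resp:  # fire the tag
                return resp.read()
        return url
```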

ReplayerConfig

This is where we parse/format every piece of data from the .csv file and prepare it to be sent to the DCS. There is a handler function for essentially every column, each named handle_<column_name>.

format_data is then called and returns a JSON-ready dictionary to be sent to the DCS.
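A rough sketch of the idea, with a hypothetical handler name and mapping shape; the real class will differ in its details:

```python
class ReplayerConfig:
    """Sketch only: one handle_<column_name> method per CSV column, plus format_data."""

    def __init__(self, mapping):
        # "mapping" pairs CSV columns with DCS field names; its exact shape is an assumption
        self.mapping = mapping

    def handle_transactionID(self, value):
        # Hypothetical handler: strip whitespace so duplicate IDs compare cleanly
        return (value or "").strip()

    def format_data(self, row):
        # Run each mapped column through its handler (if one exists) and
        # return a json-ready dict to send to the DCS
        data = {}
        for column, field in self.mapping.items():
            handler = getattr(self, f"handle_{column}", None)
            value = row.get(column)
            data[field] = handler(value) if handler else value
        return data
```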

/run_replayer.py

The important part here is the "mapping" dictionary passed into the ReplayerConfig() class, which is then passed to the Replayer class before load_file() is called. The name param is used when generating files in the output directory.
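Roughly, the wiring could look like the snippet below; the mapping keys, URL and exact signatures are assumptions, not the real values:

```python
# Hypothetical wiring only: the column names in "mapping" and the exact
# signatures are assumptions; see run_replayer.py for the real ones.
from run_replayer import Replayer, ReplayerConfig

mapping = {
    "transactionID": "transaction_id",   # CSV column -> DCS field (placeholder)
    "value": "value",
}

config = ReplayerConfig(mapping)
replayer = Replayer(config, request_url="https://dcs.example.com/collect",
                    name="c-a-example-uk", dump_data=True, fire_tag=False)
replayer.load_file("clients/example/example.csv")
```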

output

There are potentially 5 files in the output directory, all following the same formatting: {name}-{type}-{token}-{date}.txt.

The name is taken from the name passed into the Replayer, the token is auto-generated to allow multiple runs for the same client on the same date, the date is the day it ran, and the type indicates which data dump is in the file.
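For reference, the filename pattern could be produced like this; the token and date formats shown are assumptions, only the overall pattern comes from above:

```python
import datetime
import uuid


def output_filename(name, file_type, token=None):
    """Sketch of the {name}-{type}-{token}-{date}.txt pattern.

    The token and date formats here are assumptions; only the overall
    pattern is taken from this doc.
    """
    token = token or uuid.uuid4().hex[:8]        # unique per run
    date = datetime.date.today().isoformat()     # the day it ran
    return f"{name}-{file_type}-{token}-{date}.txt"


# e.g. output_filename("c-a-norton-uk", "requests")
# -> "c-a-norton-uk-requests-3f2a9c1d-2024-01-31.txt"
```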

The file types are: requests, responses, requests-validated, warnings and exceptions.

| type | description |
| --- | --- |
| requests | A list of URL strings ready to be fired. Do not accidentally click one of these, as it will open in a browser and the request will have essentially "fired". |
| responses | A list of all responses from the endpoint. |
| requests-validated | The same requests, but in JSON form, making it easier to read what has been generated. Useful for checking before sending to the DCS. |
| warnings | All warnings that occurred while generating requests. A warning alone is not enough to stop a row from being fired. |
| exceptions | Any issue(s) that occurred while generating requests and meant a row was not fired. |

How to run

To run this command, cd into the replayer/ directory and run python3 run_replayer.py with the --name and --load-file flags, for example: sudo python3 run_replayer.py --name='c-a-norton-uk' --load-file='clients/norton/norton.csv'.

We have added functionality that handles duplicate transactionIDs in the CSVs (duplicate transactionIDs cause false inflation of sales in our system, so we skip a row if we see a duplicate). This requires you to add the client account_token and the connection string to your local attrib database, in the attrib_account table.

Please check the /etc/pydcs/config.yml file to find where your local DB is, as it could be 127.0.0.1 rather than our staging DBs hosted by positive. If so, use sudo mysql in your vagrant to spin up your local DB and add the connection details.
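A minimal sketch of the duplicate-skip idea, de-duplicating within the file being replayed; the real command also needs the attrib_account connection details mentioned above, which this sketch does not cover:

```python
def skip_duplicate_rows(rows, id_column="transactionID"):
    """Yield rows whose transactionID has not been seen before in this run.

    Sketch only: de-duplicates within the file so a repeated sale is not
    fired twice; the real implementation may differ.
    """
    seen = set()
    for row in rows:
        transaction_id = (row.get(id_column) or "").strip()
        if transaction_id and transaction_id in seen:
            continue  # duplicate transactionID: skip to avoid inflating sales
        seen.add(transaction_id)
        yield row
```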

Note

Please please please check the file you are about to run.


migrate to attrib-backend

It would be good if someone could just upload their CSV via the frontend, then we replay the data. There would need to be some frontend validation while uploading, and then it would need to be stored in the DB, in a table where it could be deleted once the command has finished. The command should also do an important thing that is not done here, and can not be done here. It should check if the transaction_id is already in attrib_event_item, and if it is - it should not fire. This shouldnt be part of "validating" the data, we dont care there, we only care that we shouldn't duplicate the same sale if we've already seen it before.