## Overview
The concept of the Replayer is to "replay" any visit data back through the DCS. The first step was to handle a `.csv` file with specific columns filled in; some are mandatory, some are not.

This is done from the DCS, but it could be, and should be, ported to the Attrib Backend project. There are a few potential issues there, but we can discuss those later.

The structure of the Replayer is to have a folder per client to hold the files locally (these are NOT to be committed to the repo), along with an `output` folder which holds all the generated files used during the command's process. In the root directory is the `run_replayer.py` file, which holds the `Replayer` class.
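As a rough illustration, the layout looks something like this (the `norton` folder name is taken from the run example further down; the rest of the tree is an assumption):

```text
replayer/
├── run_replayer.py        # entry point, holds the Replayer class
├── clients/               # one folder per client, NOT committed to the repo
│   └── norton/
│       └── norton.csv
└── output/                # generated files from each run
```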
## Replayer
The `Replayer` class accepts a `config` (see `ReplayerConfig` below), a `request_url` to fire to, and a couple of debug flags. The class uses the `config` to build one request per row in the `.csv` and fires them all at the given endpoint.

- `dump_data` - generates files into the `output` directory.
- `fire_tag` - fires the tag to the `request_url`.
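As a rough sketch of how those flags are used (the constructor shape is an assumption; only `config`, `request_url`, `dump_data` and `fire_tag` are named above):

```python
# Dry run: build every request and dump the output/ files, but do not fire.
replayer = Replayer(
    config=config,                                  # a ReplayerConfig (see below)
    request_url="https://dcs.example.com/collect",  # hypothetical endpoint
    dump_data=True,   # write requests/responses/etc. into output/
    fire_tag=False,   # flip to True to actually fire each row at request_url
)
```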
## ReplayerConfig
This is where we parse/format every piece of data from the `.csv` file and prepare it to be sent to the DCS. There is a function for basically each column, all called `handle_<column_name>`. `format_data` is called and returns a JSON-ready dictionary to be sent to the DCS.
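An illustrative shape for the class, assuming one handler per column; only `handle_<column_name>` and `format_data` are named above, so the column names and internals here are invented for illustration:

```python
class ReplayerConfig:
    def __init__(self, mapping):
        self.mapping = mapping  # request field -> CSV column mapping

    def handle_transaction_id(self, value):
        # One handler per column: clean/validate the value before it is sent.
        return value.strip()

    def handle_revenue(self, value):
        return f"{float(value):.2f}"

    def format_data(self, row):
        # Build the JSON-ready dictionary for one CSV row.
        data = {}
        for column, value in row.items():
            handler = getattr(self, f"handle_{column}", None)
            data[column] = handler(value) if handler is not None else value
        return data
```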
## /run_replayer.py
The important part here is the "mapping" dictionary passed into the `ReplayerConfig()` class, which is then passed to the `Replayer` class before `load_file()` is called. The `name` param is used when generating files in the `output` directory.
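Roughly, the wiring looks like this (a sketch only; argument names other than `mapping`, `name`, `request_url` and `load_file()` are assumptions):

```python
# Map request fields to the CSV column headers for this client (hypothetical columns).
mapping = {
    "transaction_id": "Transaction ID",
    "revenue": "Order Value",
}

config = ReplayerConfig(mapping)
replayer = Replayer(
    config=config,
    request_url="https://dcs.example.com/collect",  # hypothetical endpoint
    name="c-a-norton-uk",  # used when naming files in the output directory
)
replayer.load_file("clients/norton/norton.csv")
```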
## output
There are potentially 5 files in the `output` directory, all following the same format: `{name}-{type}-{token}-{date}.txt`.

The `name` is taken from the `name` passed into the `Replayer`, `token` is auto-generated to allow for multiple runs for the same client on the same date, `date` is the day it ran, and `type` is which data dump is in the file.

The file types are: `requests`, `responses`, `requests-validated`, `warnings` and `exceptions`.
| type | description |
|---|---|
| `requests` | A list of URL strings ready to be fired. Do not accidentally click one of these, as it will open in a browser and will have essentially "fired". |
| `responses` | A list of all responses from the endpoint. |
| `requests-validated` | The same as `requests`, but as JSON, making it easier to read what has been generated. Useful for checking before sending to the DCS. |
| `warnings` | A file of all warnings that occurred while generating requests. These are not severe enough to stop a row from being fired. |
| `exceptions` | Any issue(s) that occurred while generating requests which meant a row was not fired. |
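For example, a run with `--name='c-a-norton-uk'` might produce files along these lines (the token and date formats shown are assumptions):

```text
c-a-norton-uk-requests-8f3a1c-2024-05-14.txt
c-a-norton-uk-responses-8f3a1c-2024-05-14.txt
c-a-norton-uk-requests-validated-8f3a1c-2024-05-14.txt
c-a-norton-uk-warnings-8f3a1c-2024-05-14.txt
c-a-norton-uk-exceptions-8f3a1c-2024-05-14.txt
```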
## How to run
To run this command you just need to `cd` into the `replayer/` directory and run `python3 run_replayer.py`, passing the `--name` and `--load-file` flags. For example: `sudo python3 run_replayer.py --name='c-a-norton-uk' --load-file='clients/norton/norton.csv'`.

We have added functionality which handles duplication of transaction IDs in the CSVs (duplicate transaction IDs cause false inflation of sales in our system, so we skip a row if we see a duplicate). This requires you to add the client `account_token` and the `connection` string to your local `attrib` database, in the `attrib_account` table. Please check the `/etc/pydcs/config.yml` file to find where your local DB is, as it could be 127.0.0.1 rather than our staging DBs hosted by positive. If so, please use `sudo mysql` in your vagrant to spin up your local DB and add the connection details.
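For illustration only, the duplicate skip described above amounts to something like this; this is not the actual implementation, and how it uses the `attrib_account` connection details is not covered here:

```python
# Skip any CSV row whose transaction ID has already been seen in this run.
seen_transaction_ids = set()

def should_fire(row):
    tid = row.get("transaction_id")
    if tid in seen_transaction_ids:
        return False  # duplicate: firing it again would inflate sales
    seen_transaction_ids.add(tid)
    return True
```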
**Note:** Please, please, please check the file you are about to run.
## migrate to attrib-backend
It would be good if someone could just upload their CSV via the frontend, and then we replay the data. There would need to be some frontend validation while uploading, and then the file would need to be stored in the DB, in a table where it could be deleted once the command has finished.

The command should also do an important thing that is not done here, and cannot be done here: it should check whether the `transaction_id` is already in `attrib_event_item`, and if it is, it should not fire. This shouldn't be part of "validating" the data (we don't care about it there); we only care that we shouldn't duplicate the same sale if we've already seen it before.
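A hedged sketch of that pre-fire check; only the `attrib_event_item` table and `transaction_id` column are named above, so the connection details and the firing call are assumptions:

```python
import pymysql

# Hypothetical local connection; credentials/host would come from your own setup.
connection = pymysql.connect(host="127.0.0.1", user="root", database="attrib")

def already_seen(connection, transaction_id):
    # True if this transaction_id has already been recorded in attrib_event_item.
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT 1 FROM attrib_event_item WHERE transaction_id = %s LIMIT 1",
            (transaction_id,),
        )
        return cursor.fetchone() is not None

def maybe_fire(replayer, connection, row):
    if already_seen(connection, row["transaction_id"]):
        return  # sale already seen before: do not fire it again
    replayer.fire_row(row)  # hypothetical firing method
```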