Overview

All functions are static methods on the Segmenter Rules class. They all take the same params and must return a bool (True or False).
A function should know exactly what to expect from the "function" parameter and how to interpret it. The function object holds the user-configured variables, including the operator and value.

@staticmethod
def custom_function(worker, conn, visit, function, *args, **kwargs):
    return True  # or False; the result must always be a bool

Params

Here is a breakdown of the params passed in:

| Name | Type | Description |
| --- | --- | --- |
| worker | BaseWorker | The Worker instance that called this function. It holds many helper functions relating to the current thread and connection object, including SQL fetching and dictionary conversion. |
| conn | Umysql.Connection | Umysql connection object used to make DB queries. |
| visit | SegmenterVisit | The Visit class specific to Segmenter. It holds the basic information needed to look up Visitor/Visit data. By this point the class has both visit_id as .id and visitor_id as .visitor_id. |
| function | AccountFunction | Holds the relevant information for this specific function as configured by a cubed customer in our db. |
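
As a minimal sketch of the calling convention, the block below defines a rule with the required signature and calls it with stand-in objects. `RulesSketch`, `has_session_token` and the `SimpleNamespace` stand-ins are hypothetical; only the parameter list and the bool return value come from the description above.

```python
from types import SimpleNamespace


class RulesSketch:
    """Hypothetical stand-in for the Segmenter Rules class."""

    @staticmethod
    def has_session_token(worker, conn, visit, function, *args, **kwargs):
        # Every rule receives the same four params and must return a bool.
        return bool(visit.session_token)


# Stand-ins for the real SegmenterVisit and AccountFunction objects.
visit = SimpleNamespace(id=1, visitor_id=10, visitor_token="abc", session_token="xyz")
function = SimpleNamespace(data_type=None, category=None, value=None, operator=None, has=True)

# worker and conn are unused by this toy rule, so None is enough here.
matched = RulesSketch.has_session_token(None, None, visit, function)
```

A real rule would normally use worker and conn to query the database before deciding on its result.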

Worker

For more information about the Worker class, please see its helper functions.

SegmenterVisit

def __init__(self):
    self.id = None
    self.visitor_id = None
    self.visitor_token = None
    self.session_token = ""
    self.created = arrow.utcnow()

| Name | Type | Description |
| --- | --- | --- |
| id | Int | This is the PK Id from attrib_visit |
| visitor_id | Int | This is the PK Id from attrib_visitor |
| visitor_token | String | This is the token from attrib_visitor |
| session_token | String | This is the token from attrib_visit |

You should only need id and visitor_id to get any data needed for your function to work. The tokens are provided if needed.

AccountFunction

def __init__(self):
    self.data_type = None
    self.category = None
    self.value = None
    self.operator = None
    self.has = None

| Name | Type | Description |
| --- | --- | --- |
| data_type | String | Holds the data type this function should interpret itself as. Mostly used by the Dashboard frontend, but it might be needed at this processing level. |
| category | String | Tells the function whether it is at the Visit or Visitor level. This is set from the frontend but may be needed during processing if the function should operate at one level specifically. |
| value | String | All values are stored as String to keep things simple. You can use the data_type if you are unsure how a value should be interpreted, though a function should by its design know exactly what it is being passed and behave very specifically. |
| operator | String | Please see operators for further information. |
| has | Bool | Set in the frontend and used here to decide whether the function should treat itself as the opposite, similar to returning not success. |
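
Because values are always stored as strings, a function usually converts function.value before comparing it. The sketch below shows one way to do that from data_type. The `coerce_value` helper and the data_type labels ("int", "float", "bool") are assumptions made for illustration; the labels actually stored in the db may differ.

```python
from types import SimpleNamespace


def coerce_value(function):
    """Convert function.value (always a string) using function.data_type.

    The data_type labels below are assumed for illustration only.
    """
    casters = {
        "int": int,
        "float": float,
        "bool": lambda raw: raw.strip().lower() in ("1", "true", "yes"),
    }
    # Unknown or missing data types fall back to the raw string.
    caster = casters.get(function.data_type, str)
    return caster(function.value)


# Stand-in for an AccountFunction configured with a number of seconds.
seconds_rule = SimpleNamespace(data_type="int", value="172800", operator=">", has=True)
threshold = coerce_value(seconds_rule)
```

A function that already knows what it expects can skip the lookup and cast directly, for example with int(function.value).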

Example

time_since_last_visit

This function finds the time between a Visitor's current visit and their previous one, and compares it against the configured value. The operators allowed in the frontend include "==", ">", "<", etc., as it is just comparing a number. We store time-based data as seconds where possible.
So when function is passed in, we can expect the following:
function.operator will be something like ">"
function.value will be a number of seconds stored as a String, like "172800"

If we were to receive these 2 values, we're saying in our code:
"if the visitor's current visit is more than 172800 seconds after their previous visit, return True".

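To make the arithmetic concrete: 172800 seconds is two days. The sketch below uses the standard library's datetime rather than arrow, with made-up timestamps, and also shows why the total elapsed seconds should be compared rather than the seconds attribute of the difference.

```python
from datetime import datetime, timezone

# Made-up timestamps: the two visits are exactly three days apart.
previous = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = datetime(2024, 1, 4, 12, 0, tzinfo=timezone.utc)

gap = recent - previous

# total_seconds() covers the whole gap, days included.
elapsed = int(gap.total_seconds())

# .seconds is only the leftover part of the final day (0 to 86399),
# so it can never exceed a threshold such as 172800.
leftover = gap.seconds

threshold = 172800  # two days, as stored in function.value
is_match = elapsed > threshold
```

Here elapsed is three days' worth of seconds and the comparison succeeds, while leftover is zero because the gap is a whole number of days.
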
@staticmethod
def time_since_last_visit(worker, conn, visit, function, *args, **kwargs):
    '''
    Calculate if the time since this visitor's current visit
    and previous visit is within the time set
    '''
    query = '''
        select a.first_visit, a.last_visit from attrib_visit a
        where a.visitor_id = %s
        order by a.first_visit desc
        limit 2
    '''
    rows = worker.find_single_full_row(conn, query, (visit.visitor_id, ))
    if len(rows) < 2:
        return False

    recent = arrow.get(rows[0])
    previous = arrow.get(rows[1])

    # total_seconds() includes whole days; .seconds alone wraps at 86400
    seconds = int((recent - previous).total_seconds())

    condition = prepare_expression(seconds, function)
    result = eval(condition)

    logging.debug('time_since_last_visit: {}\n:{}'.format(condition, result))

    return result

A quick breakdown of this function into steps:

  1. Query only the data needed for this calculation.
  2. Use the helper function on the Worker class to extract the rows returned.
  3. Convert the values into arrow objects for easy DateTime calculations.
  4. Get the difference in seconds.
  5. Use the prepare_expression helper to combine the value we calculated from the DB with the operator and value set by the customer on the function object.
  6. Pass the prepared expression into Python's eval() and return the bool result.
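
The internals of prepare_expression are not shown here, so the following is only a sketch of an alternative to steps 5 and 6 that avoids eval() by looking the operator up in a table. The `compare` helper and the full set of supported operators are assumptions; the inversion for has=False follows the behaviour described for the "has" variable.

```python
import operator
from types import SimpleNamespace

# Assumed operator set; the text only names "==", ">" and "<" explicitly.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def compare(calculated, function):
    """Compare a calculated number with the customer's configured value."""
    try:
        op = OPERATORS[function.operator]
    except KeyError:
        raise ValueError("unsupported operator: {!r}".format(function.operator))
    result = op(calculated, int(function.value))
    # has=False means the rule should match when the comparison fails.
    return result if function.has else not result


# Stand-in for an AccountFunction: "more than 172800 seconds".
rule = SimpleNamespace(operator=">", value="172800", has=True)
```

Calling compare(seconds, rule) then returns the bool directly, and an operator string outside the table raises an error instead of being evaluated as code.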

Note

If the "has" variable is set to False, we return the OPPOSITE of what the function computes here. We could write this ourselves:

    if not function.has:
        return not result

But this is already handled inside prepare_expression.

Tests

We use pytest to run tests against all of our functions within rules.py. To run the tests, navigate to the test directory with cd /srv/pydps/tests/segmenter, then run pytest * to check all functions or pytest {file}.py to check a specific file.

All test functions are located in tests/segmenter/functions/ and are named after the function they test, prefixed with test_. When creating a test, you will need to use the visit-generator package to generate a visit that is stored through your database connection; this lets us create specific conditions when testing our functions. Additionally, helpers.py contains many helper classes with generic paths, events, and products that can be used to generate these visits.
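
A test for a rule might be shaped as below. This sketch is deliberately self-contained: it replaces the visit-generator package and the database with plain stand-in objects, and `visited_before`, `FakeWorker` and its `find_rows` method are all hypothetical. Treat it as an illustration of the test_ naming and the arrange/act/assert layout, not of the real helpers.

```python
from types import SimpleNamespace


class FakeWorker:
    """Hypothetical stand-in that returns canned rows instead of querying."""

    def __init__(self, rows):
        self.rows = rows

    def find_rows(self, conn, query, params):
        # Invented method name; the real Worker helpers are documented elsewhere.
        return self.rows


def visited_before(worker, conn, visit, function, *args, **kwargs):
    """Hypothetical rule: True when the visitor has more than one visit."""
    query = "select id from attrib_visit where visitor_id = %s"
    rows = worker.find_rows(conn, query, (visit.visitor_id,))
    return len(rows) > 1


def test_visited_before_returning_visitor():
    # Arrange: two canned visit rows for the same visitor.
    visit = SimpleNamespace(id=2, visitor_id=10)
    function = SimpleNamespace(operator="==", value="1", has=True)
    # Act and assert.
    assert visited_before(FakeWorker([(1,), (2,)]), None, visit, function) is True


def test_visited_before_new_visitor():
    # Arrange: only the current visit exists.
    visit = SimpleNamespace(id=1, visitor_id=10)
    function = SimpleNamespace(operator="==", value="1", has=True)
    assert visited_before(FakeWorker([(1,)]), None, visit, function) is False
```

In the real suite the visit rows would come from visit-generator and the condition object from condition_builder.py instead of these stand-ins.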

Note

When creating your condition object, please ensure you use the correct Condition<Type> class found within condition_builder.py; otherwise you may see incorrect data-type failures when testing your function(s).