API Development¶

This is a brief rundown of the steps you will likely need to take when integrating the Cubed platform with an external API. These are not exhaustive steps, and you will of course need to tailor the development to your own task, but these steps should provide a framework.

Make an account¶

Most techs we want to integrate with require a development account, so the first step in any API work is to create an account for the tech you are integrating with. As these will often be advertising techs, it is worth checking with Yard/your line manager at the outset to see if we already have an advertising account. If so, it is likely to be easier to add a development option to that account than to make a fresh one.

The requirements for creating a development account can range from simply signing up by email, to sending a video of our site to describe how we will be using the data we pull from the API. Each tech will be distinct, so you will have to work through whatever their requirements are.

Once you've successfully created an account, you will need to find out how to log in to the development console for your chosen tech - it is likely we will need this in order to configure our API access later.

Obtain development credentials¶

In the development console you should be able to obtain credentials you can use to test integration with the API. This will most likely be a token that needs to be passed with each request to the API, along with a username or other account identifier.

Simple API request¶

APIs are very specific in how they accept requests for data. As a next step, it can be helpful to make a test request to the API to ensure that our credentials are correct and allow us to pull the data we're looking for. Once we're getting a response from the API that isn't an authorisation error (typically a 401 or 403), we can probably consider this step complete. Some things to check, as well as getting our authorisation correct:

our tech account is configured to allow access to the resource we're requesting
there is data associated with the resource we're requesting
whether we are using a 'sandbox' token - these sometimes work differently to developer or production tokens

You can make a sample API request however you feel most comfortable: curl, Postman, the Python requests library, or a simple Python script or Django command. The important thing is to confirm that we are authorised to use the API with the credentials and setup you have chosen.

The API may have a client library and Python examples you can use to retrieve data. It is worth looking into these examples (if they exist) as they can be a huge help when designing your API requests.
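
If no client library is available, a test request might look something like the sketch below, which uses only the Python standard library. The base URL, the `/accounts/<id>/campaigns` path and the bearer-token header are all assumptions for illustration; the tech you are integrating with will document its own endpoint and authentication scheme.

```python
import json
import urllib.error
import urllib.request

# Hypothetical value: replace with the endpoint from your tech's documentation.
BASE_URL = "https://api.example.com/v1"


def build_request(token, account_id):
    """Build a GET request for one resource, passing the token as a bearer header."""
    url = f"{BASE_URL}/accounts/{account_id}/campaigns"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def describe_status(status):
    """Translate an HTTP status code into the checks listed in this step."""
    if status in (401, 403):
        return "not authorised: check the token and the account's resource permissions"
    if status == 404:
        return "resource not found: check the account identifier and endpoint"
    if 200 <= status < 300:
        return "authorised"
    return f"unexpected status {status}"


def fetch(token, account_id):
    """Send the request and return (status, parsed JSON body or None)."""
    try:
        with urllib.request.urlopen(build_request(token, account_id), timeout=30) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return err.code, None
```

Calling `fetch(token, account_id)` and passing the returned status to `describe_status` gives a quick read on whether the credentials work. A successful response with an empty body usually points at the second check in the list: there may simply be no data for that resource yet.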

Simple command wrapper¶

At this point it is worth creating a Django command to call the simple function you have created above, so that it can be easily called using manage.py. You will need to use one of our custom classes, DateClientPromptCommand, to keep in line with the other commands we use for retrieving and processing data. It is worth familiarising yourself with this class to better understand the options that can be passed when running a command built on it. Remember that you will need to override the handle_client_prompt method in your command in order for it to run.
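
The override pattern can be sketched roughly as follows. The `DateClientPromptCommand` class defined here is only a stand-in so that the example runs on its own; the real class lives in the Cubed codebase, and its arguments and option names may well differ. `pull_sample` and the `client`, `start_date` and `end_date` options are likewise hypothetical.

```python
class DateClientPromptCommand:
    """Stand-in for the in-house base class, for illustration only.

    The real class handles prompting for a client and date range;
    its actual signature may differ from this sketch.
    """

    def handle(self, *args, **options):
        return self.handle_client_prompt(*args, **options)

    def handle_client_prompt(self, *args, **options):
        raise NotImplementedError("subclasses must override handle_client_prompt")


def pull_sample(client, start_date, end_date):
    """Placeholder for the simple API request written in the previous step."""
    return f"pulled {client} from {start_date} to {end_date}"


class Command(DateClientPromptCommand):
    help = "Pull sample data from a (hypothetical) Example Ads API"

    def handle_client_prompt(self, *args, **options):
        # Option names are assumptions; use whatever the real base class passes in.
        return pull_sample(
            options["client"], options["start_date"], options["end_date"]
        )
```

In the real project the class would sit in a `management/commands/` module, which is where Django discovers commands for manage.py.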

Rough idea of the data we need¶

Now that we are retrieving data from the API, we can work with our sales team, client relationship managers and/or our data science team to figure out what data we need to retrieve. A good process for this is probably to schedule a meeting with whoever requested the API work, show them the data we have available and narrow down exactly what we need to retrieve.

Adding database tables¶

Once we have an idea of the data we want to get back from the API, we can create database tables to store it. This should be done using Django models and migrations. We will likely need techname_connection and techname_account tables to store the details of client accounts integrated with us, and the connection details we use for the API. The Bing Ads API is a good example of this. We may also need to support the concept of an MCC or manager account - discuss this in more detail with your line manager or the ticket creator if this arises.

We will also need to create a table to store the data we pull from the API. We should create one table per resource - for example, we pull five data types from the GoogleAds API, and the data we pull is inserted into five distinct tables. The data science team should already have considered how they will join these tables during the step above. Your table will need to have a unique key so that if your command is run twice for the same date range there is no duplication of data.
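
To illustrate why the unique key matters, here is a minimal sketch using an in-memory SQLite database rather than Django, so that it runs on its own. The table and column names are invented for the example. In a Django model the equivalent would typically be a `UniqueConstraint` listed in the model's `Meta.constraints`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE example_campaign_stats (
        account_id  TEXT NOT NULL,
        campaign_id TEXT NOT NULL,
        stat_date   TEXT NOT NULL,
        clicks      INTEGER NOT NULL,
        UNIQUE (account_id, campaign_id, stat_date)
    )
    """
)


def store(rows):
    """Insert rows, replacing any existing row that has the same unique key."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO example_campaign_stats VALUES (?, ?, ?, ?)",
            rows,
        )


rows = [("acc1", "c1", "2024-01-01", 10), ("acc1", "c2", "2024-01-01", 5)]
store(rows)
store(rows)  # second run over the same date range

row_count = conn.execute(
    "SELECT COUNT(*) FROM example_campaign_stats"
).fetchone()[0]
```

Because the key covers account, campaign and date, the second call overwrites the existing rows instead of adding new ones. Which columns make up the key depends on the resource, so it is worth agreeing this with the data science team.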

Complex command wrapper¶

At this point we can update our command wrapper to take into account some of the things we have added above. Your command needs to run per advertising account associated with a given Cubed account, so we will almost certainly need to retrieve a full list of accounts and run our API pull per account. A simple for loop is fine for this purpose. You will then want to consider adding functions that retrieve the data from the API, process the data into a format we can insert, and then insert the data into your chosen table. This is to ensure that your command is readable for future developers, and will help you with debugging. Again, the Bing Ads API pull (bing-ads.py) is a decent example of this.
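
The retrieve, process and insert split described above might be laid out along these lines. Every name here is hypothetical, the API call is replaced by canned data, and the insert simply appends to a list; the point is only the shape of the loop, not the Bing Ads implementation.

```python
def fetch_rows(account, start_date, end_date):
    """Hypothetical API call; returns raw rows for one advertising account."""
    return [{"campaignId": "c1", "date": start_date, "clicks": "10"}]


def process_rows(account, raw_rows):
    """Convert raw API rows into tuples ready for insertion."""
    return [
        (account["id"], row["campaignId"], row["date"], int(row["clicks"]))
        for row in raw_rows
    ]


def insert_rows(rows, sink):
    """Placeholder insert: appends to a list instead of writing to a table."""
    sink.extend(rows)
    return len(rows)


def run_pull(accounts, start_date, end_date, sink):
    """Run fetch -> process -> insert once per advertising account."""
    inserted = 0
    for account in accounts:
        raw = fetch_rows(account, start_date, end_date)
        inserted += insert_rows(process_rows(account, raw), sink)
    return inserted
```

Keeping the three stages as separate functions also makes each one straightforward to test with sample data, without needing to reach the API.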

When inserting the data, we must have an eye on performance. Using Django's ORM is the most sensible and safeguarded way of inserting data, but it is also very much the slowest, so it might be that we have to look at using raw SQL queries to ensure that our command is performant.
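
One common way to keep inserts fast is to send rows in batches rather than one at a time. The sketch below shows the idea with the standard library's sqlite3 module and an invented `example_stats` table; it is not taken from an existing Cubed command. Within the ORM, Django's `bulk_create` (which accepts a `batch_size` argument) may be worth trying before dropping to raw SQL.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def bulk_insert(conn, rows, batch_size=500):
    """Insert rows using one executemany call and one transaction per batch."""
    total = 0
    for batch in batched(rows, batch_size):
        with conn:
            conn.executemany(
                "INSERT INTO example_stats (day, clicks) VALUES (?, ?)", batch
            )
        total += len(batch)
    return total
```

The right batch size depends on the database and row width, so it is best treated as a tunable rather than a fixed number.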

Refine data requirements¶

This step can really be done at any stage, but before we can call the work complete we need to be happy that we are pulling all of the data required by the data science/client-facing teams. You will also need to adjust your database tables accordingly - you will likely need to recreate your migrations to do this.

Testing¶

It is vital to test all of the above parts of our API request. At the very least, we should be able to pass sample data to our insert function and confirm that the data appears in the database as we expect. Ideally we would also mock out the API response and run our commands against the mock to test them end to end, but that is an advanced topic that is out of scope here. You should also consider testing the API init and callback URLs discussed further below.
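
Although full end-to-end mocking is out of scope, the basic idea can be shown with `unittest.mock` from the standard library. `ExampleClient` and `pull_clicks` are invented for this sketch; in practice you would patch whichever client call your command makes, and would probably use Django's own `TestCase` so that database assertions are available.

```python
import unittest
from unittest import mock


class ExampleClient:
    """Hypothetical API client; get_report would normally make an HTTP call."""

    def get_report(self, account_id):
        raise RuntimeError("network access is not available in tests")


def pull_clicks(client, account_id):
    """Function under test: fetch a report and total its clicks."""
    report = client.get_report(account_id)
    return sum(int(row["clicks"]) for row in report["rows"])


class PullClicksTest(unittest.TestCase):
    def test_totals_clicks_from_mocked_response(self):
        fake = {"rows": [{"clicks": "3"}, {"clicks": "4"}]}
        # Replace the network call with a canned response for this test only.
        with mock.patch.object(
            ExampleClient, "get_report", return_value=fake
        ) as patched:
            self.assertEqual(pull_clicks(ExampleClient(), "acc1"), 7)
        patched.assert_called_once_with("acc1")
```

A good source for the canned response is a real payload captured during the simple API request step, with any client-identifying values removed.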

OAuth2¶

OAuth2 is a protocol that allows clients to grant us access to their API accounts without ever sharing their login credentials with us. They log in normally to their API console when Cubed requests that they do so, and in doing so we receive various tokens that authorise us to pull their data.

In the web console for the tech we are integrating with we will likely need to create an app for our project. This basically gives us a portal within the API tech for clients to integrate with - and when they grant us access by logging in, our app will be authorised to pull their data.

In Cubed, we will need to add (at least) two URLs backed by views. You can see an example of this in api/external/bing/views.py in Cubed. The first of these will send our app information to the API, along with a request to allow our client to log in. The second will wait to receive a response from our API app with the access tokens we will use to request data from the API. To achieve this, in the API console we will need to set a redirect_uri on the API app pointing back to the callback URL we defined in Cubed.

As you will see from the example, the bing_init view instantiates various classes from the bingads client library, and using those finds a URL to send the client to. The bing_callback view uses the data returned to pull out an access token and refresh token, as well as some metadata about the account. This is then stored in our database, and will be used when calling the command we wrote earlier.
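
Where a tech has no client library, the two views can build and read the URLs themselves. The sketch below follows the standard OAuth2 authorisation-code flow, in which the callback receives a short-lived code that the view then exchanges for tokens at the provider's token endpoint. The endpoint, client ID, redirect URI and scope are all placeholders, and this is not how the bingads library is implemented.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values: the real ones come from the app created in the API console.
AUTHORIZE_URL = "https://auth.example.com/oauth2/authorize"
CLIENT_ID = "our-app-client-id"
REDIRECT_URI = "https://cubed.example.com/api/external/example/callback"


def build_authorize_url(scope, state=None):
    """Return (url, state) for the init view to redirect the client to."""
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}", state


def read_callback(callback_url, expected_state):
    """Extract the authorisation code from the callback, checking state first."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    if "error" in params:
        raise ValueError(f"authorisation failed: {params['error'][0]}")
    return params["code"][0]
```

The `state` value would normally be stored in the user's session by the init view and compared in the callback view, which guards against forged callbacks. The `redirect_uri` sent here has to match the one registered in the API console exactly.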

Access and refresh tokens¶

Access tokens allow us access to a client's data, and usually expire after a short time. Refresh tokens are exchanged for fresh access tokens. When writing your complex command, you will need to ensure that you refresh your access token each time the command is called, and save both updated tokens.
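
The refresh step can be sketched as below, assuming the tech follows the standard OAuth2 `refresh_token` grant. The shape of the stored credentials dictionary is invented for the example, and some providers expect the client credentials in an HTTP Basic header rather than in the request body, so check the tech's documentation.

```python
import time
from urllib.parse import urlencode


def build_refresh_body(refresh_token, client_id, client_secret):
    """Form-encoded body for a standard OAuth2 refresh_token grant."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })


def apply_token_response(stored, response, now=None):
    """Return updated stored credentials from a token endpoint response.

    Some providers issue a new refresh token on every refresh and others
    return none at all, so fall back to the existing value when it is absent.
    """
    now = time.time() if now is None else now
    return {
        **stored,
        "access_token": response["access_token"],
        "refresh_token": response.get("refresh_token", stored["refresh_token"]),
        "expires_at": now + int(response.get("expires_in", 0)),
    }
```

The body from `build_refresh_body` would be POSTed to the provider's token endpoint, and the result of `apply_token_response` written back to the connection table before any data is pulled, so that a failure later in the command does not lose a newly issued refresh token.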