Further Explanation¶
To understand what our `CubedReportResource` class is doing, it's best to understand the underlying data and how aggregation works.
Let's say we have the following Model:
```python
class ReportMyData(models.Model):
    date = models.DateTimeField(db_index=True)
    referer = models.ForeignKey('Referer', on_delete=models.PROTECT)
    visits = models.IntegerField()
    sales = models.IntegerField()
    revenue = models.DecimalField(max_digits=10, decimal_places=2)

    class Meta:
        db_table = 'report_my_table'
        unique_together = (('date', 'referer'),)
```
And now in our DB we have a table of data like this:
| date | referer_id | visits | sales | revenue |
|---|---|---|---|---|
| 01/01/2023 | 1 | 4 | 3 | 100 |
| 01/01/2023 | 2 | 4 | 2 | 50 |
| 01/01/2023 | 3 | 4 | 1 | 40 |
| 02/01/2023 | 1 | 8 | 5 | 75 |
| 02/01/2023 | 2 | 8 | 4 | 80 |
| 02/01/2023 | 3 | 8 | 2 | 35 |
| 03/01/2023 | 1 | 2 | 7 | 55 |
| 03/01/2023 | 2 | 2 | 3 | 25 |
| 03/01/2023 | 3 | 2 | 2 | 15 |
| 04/01/2023 | 1 | 6 | 4 | 35 |
| 04/01/2023 | 2 | 6 | 1 | 10 |
| 04/01/2023 | 3 | 6 | 0 | 0 |
Let's now create a Tastypie resource using our custom `CubedReportResource`:

```python
class MyDataReportResource(CubedReportResource):
    referer = fields.ForeignKey(RefererLiteResource, 'referer', full=True)

    class Meta(ReportMeta):
        queryset = ReportMyData.objects.prefetch_related('referer')
        resource_name = "report-my-data"
        config_fields = BaseFields(show_totals=True, full_ordering=True)
        config_fields.install('referer', DictField(), filtering=ALL_WITH_RELATIONS, groupby=True)
        config_fields.install('date', DateField(), filtering=ALL_WITH_RELATIONS, groupby=True)
        config_fields.install('visits', IntegerField())
        config_fields.install('sales', IntegerField())
        config_fields.install('revenue', MoneyField())
```
The important bits to note here: we use `install` for the base metrics on the table. If we wanted to add a custom metric we'd do so via `install_additional_metric`. Both functions are passed a Cubed `Field` type so we know how to format the returned values.
Also note that the 2 dimension fields on the table are flagged with `groupby=True`, as we want to be able to group by only these columns.
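For illustration, a custom metric might be added like this. This is a hypothetical sketch: the signature of `install_additional_metric` isn't shown in this document, so it's assumed here to mirror `install`, and `average_order_value` is a made-up metric name:

```python
# Hypothetical -- assumes install_additional_metric mirrors install's signature;
# the real API may differ.
config_fields.install_additional_metric('average_order_value', MoneyField())
```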
Note
You can inherit `Meta` from `BaseReportMCSMeta`, which comes with a few of the "base" fields already defined and ready to be used on `config_fields`.
There are 3 main data sets we can return; here they are with their matching query params:

| Data | Query params | Uses group_by |
|---|---|---|
| Table | `totals=false&graph=false` | yes |
| Totals | `totals=true&graph=false` | no |
| Graph | `totals=false&graph=true` | yes |
| None | `totals=true&graph=true` | n/a |
"Totals" does not need to `group by` a dimension, as we're going to get the complete totals of the entire aggregated data set.
Examples¶
Let's now look at some examples based on the query params sent up, and see what SQL would be generated and thus what data would be returned.
Table Data¶
```
/report-my-data/?account=1&date__gte=2023-01-01&date__lte=2023-01-04&referer__in=1,2,3&group_by=referer
```

```sql
select
    b.name,
    sum(a.sales) as `sales`,
    sum(a.revenue) as `revenue`,
    sum(a.visits) as `visits`
from report_my_table a
join attrib_referer b on a.referer_id = b.id
where a.date between '2023-01-01 00:00:00' and '2023-01-04 23:59:59'
group by b.name
```
This will return to the frontend:
| Referer | Sales | Revenue | Visits |
|---|---|---|---|
| First | 19 | 265 | 20 |
| Second | 10 | 165 | 20 |
| Third | 5 | 90 | 20 |
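The aggregation itself is easy to verify outside Django. Here is a minimal sketch using `sqlite3` (standing in for MySQL; the date filter is dropped since every sample row falls inside the range) that loads the sample rows and runs the same grouped query:

```python
import sqlite3

# Build the sample tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table report_my_table (date text, referer_id int, visits int, sales int, revenue int);
    create table attrib_referer (id int, name text);
""")
conn.executemany("insert into attrib_referer values (?, ?)",
                 [(1, "First"), (2, "Second"), (3, "Third")])
rows = [
    ("2023-01-01", 1, 4, 3, 100), ("2023-01-01", 2, 4, 2, 50), ("2023-01-01", 3, 4, 1, 40),
    ("2023-01-02", 1, 8, 5, 75),  ("2023-01-02", 2, 8, 4, 80), ("2023-01-02", 3, 8, 2, 35),
    ("2023-01-03", 1, 2, 7, 55),  ("2023-01-03", 2, 2, 3, 25), ("2023-01-03", 3, 2, 2, 15),
    ("2023-01-04", 1, 6, 4, 35),  ("2023-01-04", 2, 6, 1, 10), ("2023-01-04", 3, 6, 0, 0),
]
conn.executemany("insert into report_my_table values (?, ?, ?, ?, ?)", rows)

# Same shape as the generated query: sum the metrics per referer name.
table_data = conn.execute("""
    select b.name, sum(a.sales), sum(a.revenue), sum(a.visits)
    from report_my_table a
    join attrib_referer b on a.referer_id = b.id
    group by b.name
    order by b.name
""").fetchall()
print(table_data)
# [('First', 19, 265, 20), ('Second', 10, 165, 20), ('Third', 5, 90, 20)]
```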
Table Totals¶
```
/report-my-data/?account=1&date__gte=2023-01-01&date__lte=2023-01-04&referer__in=1,2,3&totals=true
```

```sql
select
    sum(x.sales),
    sum(x.revenue),
    sum(x.visits)
from (
    select
        b.name,
        sum(a.sales) as `sales`,
        sum(a.revenue) as `revenue`,
        sum(a.visits) as `visits`
    from report_my_table a
    join attrib_referer b on a.referer_id = b.id
    where a.date between '2023-01-01 00:00:00' and '2023-01-04 23:59:59'
    group by b.name
) x
```
This will return to the frontend:
| Sales | Revenue | Visits |
|---|---|---|
| 34 | 520 | 60 |
Graph Data¶
```
/report-my-data/?account=1&date__gte=2023-01-01&date__lte=2023-01-04&referer__in=1,2,3&graph=true&graph_group_by=date&graph_group_by_type=date__day&graph_metrics=sales
```

```sql
select
    date(x.date),
    sum(x.sales)
from (
    select
        date(a.date) as `date`,
        b.name,
        sum(a.sales) as `sales`,
        sum(a.revenue) as `revenue`,
        sum(a.visits) as `visits`
    from report_my_table a
    join attrib_referer b on a.referer_id = b.id
    where a.date between '2023-01-01 00:00:00' and '2023-01-04 23:59:59'
    group by date(a.date), b.name
) x
group by date(x.date)
```
This will return:
| Date | Sales |
|---|---|
| 01/01/2023 | 6 |
| 02/01/2023 | 11 |
| 03/01/2023 | 12 |
| 04/01/2023 | 5 |
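The two-level grouping can be checked the same way. A minimal `sqlite3` sketch (standing in for MySQL; the referer join is omitted since only sales per day are needed, and the inner query groups per (date, referer) just as the generated SQL does):

```python
import sqlite3

# Load the sample rows into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("create table report_my_table (date text, referer_id int, visits int, sales int, revenue int)")
rows = [
    ("2023-01-01", 1, 4, 3, 100), ("2023-01-01", 2, 4, 2, 50), ("2023-01-01", 3, 4, 1, 40),
    ("2023-01-02", 1, 8, 5, 75),  ("2023-01-02", 2, 8, 4, 80), ("2023-01-02", 3, 8, 2, 35),
    ("2023-01-03", 1, 2, 7, 55),  ("2023-01-03", 2, 2, 3, 25), ("2023-01-03", 3, 2, 2, 15),
    ("2023-01-04", 1, 6, 4, 35),  ("2023-01-04", 2, 6, 1, 10), ("2023-01-04", 3, 6, 0, 0),
]
conn.executemany("insert into report_my_table values (?, ?, ?, ?, ?)", rows)

# Inner query aggregates per (date, referer); outer query re-groups by day only.
graph_data = conn.execute("""
    select x.date, sum(x.sales)
    from (
        select a.date as date, a.referer_id, sum(a.sales) as sales
        from report_my_table a
        group by a.date, a.referer_id
    ) x
    group by x.date
    order by x.date
""").fetchall()
print(graph_data)
# [('2023-01-01', 6), ('2023-01-02', 11), ('2023-01-03', 12), ('2023-01-04', 5)]
```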
Having¶
The reason we have to build the "base query set" and then wrap it in another select is to allow users to filter on the data they see in the front end. Given the examples above, let's say we're looking at this table of data:
| Referer | Sales | Revenue | Visits |
|---|---|---|---|
| First | 19 | 265 | 20 |
| Second | 10 | 165 | 20 |
| Third | 5 | 90 | 20 |
If we were to filter on sales where `sales > 5`, then without the nested subquery we would generate this:

```sql
select
    b.name,
    sum(a.sales) as `sales`,
    sum(a.revenue) as `revenue`,
    sum(a.visits) as `visits`
from report_my_table a
join attrib_referer b on a.referer_id = b.id
where a.date between '2023-01-01 00:00:00' and '2023-01-04 23:59:59'
    and a.sales > 5
group by b.name
```
Which would return:
| Referer | Sales | Revenue | Visits |
|---|---|---|---|
| First | 7 | 55 | 2 |
This is because we're filtering on the individual rows in the table, rather than on the aggregated data our user has actually seen.
However, because we apply the `having` filtering to our outer query, we instead get this:
```sql
select
    x.name,
    sum(x.sales),
    sum(x.revenue),
    sum(x.visits)
from (
    select
        b.name,
        sum(a.sales) as `sales`,
        sum(a.revenue) as `revenue`,
        sum(a.visits) as `visits`
    from report_my_table a
    join attrib_referer b on a.referer_id = b.id
    where a.date between '2023-01-01 00:00:00' and '2023-01-04 23:59:59'
    group by b.name
) x
group by x.name
having sum(x.sales) > 5
```
Which will return:
| Referer | Sales | Revenue | Visits |
|---|---|---|---|
| First | 19 | 265 | 20 |
| Second | 10 | 165 | 20 |
Each query generated for table, totals, and graph will apply `having` in the same way, so they will all always reflect the correct, aggregated data set.
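The difference between the two approaches is easy to demonstrate end to end. A self-contained `sqlite3` sketch (standing in for MySQL; an `order by` is added so the result order is deterministic, and the date filter is dropped since all sample rows fall inside the range):

```python
import sqlite3

# Sample data from the examples above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table report_my_table (date text, referer_id int, visits int, sales int, revenue int);
    create table attrib_referer (id int, name text);
""")
conn.executemany("insert into attrib_referer values (?, ?)",
                 [(1, "First"), (2, "Second"), (3, "Third")])
rows = [
    ("2023-01-01", 1, 4, 3, 100), ("2023-01-01", 2, 4, 2, 50), ("2023-01-01", 3, 4, 1, 40),
    ("2023-01-02", 1, 8, 5, 75),  ("2023-01-02", 2, 8, 4, 80), ("2023-01-02", 3, 8, 2, 35),
    ("2023-01-03", 1, 2, 7, 55),  ("2023-01-03", 2, 2, 3, 25), ("2023-01-03", 3, 2, 2, 15),
    ("2023-01-04", 1, 6, 4, 35),  ("2023-01-04", 2, 6, 1, 10), ("2023-01-04", 3, 6, 0, 0),
]
conn.executemany("insert into report_my_table values (?, ?, ?, ?, ?)", rows)

# Filtering the raw rows: only the single row with sales > 5 survives,
# so the aggregates no longer match what the user saw.
where_data = conn.execute("""
    select b.name, sum(a.sales), sum(a.revenue), sum(a.visits)
    from report_my_table a
    join attrib_referer b on a.referer_id = b.id
    where a.sales > 5
    group by b.name
""").fetchall()
print(where_data)
# [('First', 7, 55, 2)]

# Filtering the aggregated subquery with having: matches the table the user saw.
having_data = conn.execute("""
    select x.name, sum(x.sales), sum(x.revenue), sum(x.visits)
    from (
        select b.name, sum(a.sales) as sales, sum(a.revenue) as revenue, sum(a.visits) as visits
        from report_my_table a
        join attrib_referer b on a.referer_id = b.id
        group by b.name
    ) x
    group by x.name
    having sum(x.sales) > 5
    order by x.name
""").fetchall()
print(having_data)
# [('First', 19, 265, 20), ('Second', 10, 165, 20)]
```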