Customer Retention

Customer retention for a subscription service or website is the % of customers that purchased or logged on the previous time period that also purchased or logged on in the current time period. Depending on the application, the time period might be measured in days, weeks or months.

customer_retention = num_customers[t] / num_customers[t-1]

When doing this calculation, you need to be careful to filter out new customers in the current time period.

Suppose we have a table with four columns, ['date', 'customer_id', 'order_num', 'order_amount_dollars'].

date customer_id order_num order_amount_dollars
1 2019-08-01 1 1 30
2 2019-08-05 2 2 40
3 2019-09-01 1 3 30
4 2019-09-07 3 4 50

We are interested in calculating the monthly customer retention between the most recent month and the month prior.

Write a function customer_retention(df) which takes a dataframe as above and returns a single number representing the most recent monthly customer retention.

Example Input

Code to generate input

df = pd.DataFrame({'date': pd.to_datetime(['2019-01-01', '2019-01-05', '2019-02-01', '2019-02-07']),
                   'customer_id': [1, 2, 1, 3],
                   'order_num': [709871, 709872, 709874, 709875],
                   'order_amount_dollars': [30, 20, 30, 80]})


Table generated

date customer_id order_num order_amount_dollars
0 2019-01-01 00:00:00 1 709871 30
1 2019-01-05 00:00:00 2 709872 20
2 2019-02-01 00:00:00 1 709874 30
3 2019-02-07 00:00:00 3 709875 80

Example Output

0.5