Classifying buyers into different groups based on purchase dates

qoefvg9y  于 2023-02-28  发布在  其他
关注(0)|答案(2)|浏览(100)

I would like to assign customers into one of these five groups, based on their historical purchases. Here are the definitions of those five groups.

  1. Lost - purchased in the prior 12 month period, but not current 12 month period
  2. New - purchased in current 12 month period and has no purchases prior to that
  3. Restarted (1 year) - purchased in current 12 month period, did not purchase in the 12 month period before that (13-24 months ago), but did purchase 25-36 months ago
  4. Restarted (other) - purchased in current 12 month period, did not purchase in the 12 month period before that (13-24 months ago), but did purchase at some other point historically
  5. Continuing - purchased in the current 12 month period and the previous 12 month period
    | Account_no | Group |
    | ------------ | ------------ |
    | 123 | New |
    | 124 | Lapsed |
    | 544 | Returning_1yr |

My current code gets a list of accounts for each group into their own temp table. I thought I might be able to use a 'case when' within select to create the 'Group' field that I'm trying to create, but that's where I got stuck. I'm sure there's a more elegant solution than what I'm doing here, so certainly open to suggestions. Thank you!

/*Group customers who bought in current 12 month window*/
drop table if exists #current12
select distinct txn.account_no
into #current12
from transactions txn
where tran_date between dateadd(day,-365,getdate()) and getdate()

/*Group customers who bought in previous 12 month window (13 - 24 months ago)*/
drop table if exists #previous12
select distinct txn.account_no
into #previous12
from transactions txn
where tran_date between dateadd(day,-730,getdate()) and dateadd(day,-366,getdate())

/*Group customers who bought in prior 12 month window (25 - 36 months ago)*/
drop table if exists #prior12
select distinct txn.account_no
into #prior12
from transactions txn
where tran_date between dateadd(day,-1095,getdate()) and dateadd(day,-731,getdate())

/*Group customers who bought at any time before current 12 month window (13+ months ago)*/
drop table if exists #beforecurrent12
select distinct txn.account_no
into #beforecurrent12
from transactions txn
where tran_date < dateadd(day,-365,getdate())

/*Create table of lost customers*/
drop table if exists #lost
select distinct txn.account_no
into #lost
from transactions txn
where txn.account_no IN (select account_no from #previous12)
and txn.account_no NOT IN (select account_no from #current12)

/*Create table of new customers*/
drop table if exists #new
select distinct txn.account_no
into #new
from transactions txn
where txn.account_no IN (select account_no from #current12)
and txn.account_no NOT IN (select distinct account_no from client.dbo.acct_sales_trans where tran_date < dateadd(day,-366,getdate()))

/*Create table of restarted 1 yr customers*/
drop table if exists #restarted1yr
select distinct txn.account_no
into #restarted1yr
from transactions txn
where txn.account_no IN (select account_no from #current12)
and txn.account_no NOT IN (select account_no from #previous12)
and txn.account_no IN (select account_no from #prior12)

/*Create table of restarted other customers*/
drop table if exists #restartedother
select distinct txn.account_no
into #restartedother
from transactions txn
where txn.account_no IN (select account_no from #current12)
and txn.account_no NOT IN (select account_no from #previous12)
and txn.account_no NOT IN (select account_no from #prior12)
and txn.account_no IN (select distinct account_no from transactions where tran_date < dateadd(day,-1095,getdate()))

/*Create table of returning customers*/
drop table if exists #returning
select distinct txn.account_no
into #returning
from transactions txn
where txn.account_no IN (select account_no from #previous12)
and txn.account_no IN (select account_no from #current12)
fcg9iug3

fcg9iug31#

Rather than trying to define each Grouping as a new temp table I'd be tempted to do something more like for each account flagging if they made purchase in each year. Then using these flags to identify which group they should be in.

In this code I've left the individual flags as well as the CustomerGroup. I would use the flags to sanity check the logic on the grouping CASE.

SELECT DISTINCT txn.account_no,
                CASE WHEN txn1.account_no IS NOT NULL THEN 'Y' ELSE 'N' END AS Purchase1Year,
                CASE WHEN txn2.account_no IS NOT NULL THEN 'Y' ELSE 'N' END AS Purchase2Year,
                CASE WHEN txn3.account_no IS NOT NULL THEN 'Y' ELSE 'N' END AS Purchase3Year,
                CASE WHEN txn4.account_no IS NOT NULL THEN 'Y' ELSE 'N' END AS PurchaseOlder,
                CASE
                    WHEN txn1.account_no IS NULL AND txn2.account_no IS NOT NULL THEN 'Lost'
                    WHEN txn1.account_no IS NOT NULL AND txn2.account_no IS NULL AND txn3.account_no IS NULL AND txn4.account_no IS NULL THEN 'New'
                    WHEN txn1.account_no IS NOT NULL AND txn2.account_no IS NULL AND txn3.account_no IS NOT NULL THEN 'Restarted (1 year)'
                    WHEN txn1.account_no IS NOT NULL AND txn2.account_no IS NULL AND (txn3.account_no IS NOT NULL OR txn4.account_no IS NOT NULL) THEN 'Restarted (other)'
                    WHEN txn1.account_no IS NOT NULL AND txn2.account_no IS NOT NULL THEN 'Continuing'
                    ELSE 'other not captured'
                END AS CustomerGroup
FROM transactions AS txn
     LEFT JOIN transactions AS txn1 ON txn.account_no = txn.account_no AND txn1.tran_date BETWEEN DATEADD(DAY, -365, GETDATE()) AND GETDATE()
     LEFT JOIN transactions AS txn2 ON txn.account_no = txn.account_no AND txn2.tran_date BETWEEN DATEADD(DAY, -730, GETDATE()) AND DATEADD(DAY, -366, GETDATE())
     LEFT JOIN transactions AS txn3 ON txn.account_no = txn.account_no AND txn3.tran_date BETWEEN DATEADD(DAY, -1095, GETDATE()) AND DATEADD(DAY, -731, GETDATE())
     LEFT JOIN transactions AS txn4 ON txn.account_no = txn.account_no AND txn4.tran_date < DATEADD(DAY, -1095, GETDATE());

There is a customer group for other that under the current definition will include anyone who has not purchased in year 1 nor 2, but has made an older purchase. You may want to include this filter

WHERE txn1.account_no IS NOT NULL OR txn2.account_no IS NOT NULL
ryoqjall

ryoqjall2#

It's really helpful for questions like this to include sample DDL and DML. Based on your description I put together this as example data. I've used table variables because there's less to clean up, but that's just a personal preference.

DECLARE @Orders TABLE (OrderID BIGINT PRIMARY KEY IDENTITY, CustomerID BIGINT, OrderDate DATETIME);
DECLARE @Customers TABLE (CustomerID BIGINT PRIMARY KEY IDENTITY, FirstName NVARCHAR(20), LastName NVARCHAR(20));

INSERT INTO @Orders (CustomerID, OrderDate) VALUES
(1, '2023-02-01'),(1, '2022-02-01'),(1, '2021-02-01'),
(2, '2023-02-01'),
(3, '2023-02-01'),(3, '2021-06-01'),
(4, '2023-02-01'),(4, '2020-06-01'),
(5, '2015-06-01'),(5, '2023-02-01'),
(6, '2020-06-01');

INSERT INTO @Customers (FirstName, LastName) VALUES
('Pike','Trickfoot'),('Percival','de Rolo'),('Vax','Vessar'),('Vex','Vessar'),('Grog','Strongjaw'),('Scanlan','Shorthalt')

Using that, I then crafted a case expression to look over several LEFT OUTER joins to the orders which capture the specifics you talked about:

SELECT c.CustomerID, C.FirstName, C.LastName, COUNT(o36.OrderID) AS o36, COUNT(o24.OrderID) AS o24, COUNT(o1223.OrderID) AS o1223, COUNT(o012.OrderID) AS o012,
CASE WHEN COUNT(o36.OrderID) = 0 AND COUNT(o24.OrderID) > 0 AND COUNT(o1223.OrderID) = 0 AND COUNT(o012.OrderID) = 0 THEN 'Lost'
     WHEN COUNT(o36.OrderID) = 0 AND COUNT(o24.OrderID) = 0 AND COUNT(o1223.OrderID) = 0 AND COUNT(o012.OrderID) > 0 THEN 'New'
     WHEN COUNT(o36.OrderID) > 0 AND COUNT(o24.OrderID) = 0 AND COUNT(o1223.OrderID) = 0 AND COUNT(o012.OrderID) > 0 THEN 'Restarted (Other)'
     WHEN COUNT(o36.OrderID) = 0 AND COUNT(o24.OrderID) > 0 AND COUNT(o1223.OrderID) = 0 AND COUNT(o012.OrderID) > 0 THEN 'Restarted (1 Year)'
     WHEN COUNT(o1223.OrderID) > 0 AND COUNT(o012.OrderID) > 0 THEN 'Continuing'
END AS CustomerStatus
  FROM @Customers c
    LEFT OUTER JOIN @Orders o36
      ON c.CustomerID = o36.CustomerID
      AND o36.OrderDate < DATEADD(MONTH,-36,CURRENT_TIMESTAMP)
    LEFT OUTER JOIN @Orders o24
      ON c.CustomerID = o24.CustomerID
      AND o24.OrderDate BETWEEN DATEADD(MONTH,-36,CURRENT_TIMESTAMP) AND DATEADD(MONTH,-24,CURRENT_TIMESTAMP)
    LEFT OUTER JOIN @Orders o1223
      ON c.CustomerID = o1223.CustomerID
      AND o1223.OrderDate BETWEEN DATEADD(MONTH,-24,CURRENT_TIMESTAMP) AND DATEADD(MONTH,-12,CURRENT_TIMESTAMP)
    LEFT OUTER JOIN @Orders o012
      ON c.CustomerID = o012.CustomerID
      AND o012.OrderDate > DATEADD(MONTH,-12,CURRENT_TIMESTAMP) 
 GROUP BY c.CustomerID, C.FirstName, C.LastName

We're joining to the orders table several times for the periods you mention, and then just doing a simple count for each of the periods. The case expression then returns the status dependent on what's found.

CustomerIDFirstNameLastNameo36o24o1223o012CustomerStatus
1PikeTrickfoot0111Continuing
2Percivalde Rolo0001New
3VaxVessar0011Continuing
4VexVessar0101Restarted (1 Year)
5GrogStrongjaw1001Restarted (Other)
6ScanlanShorthalt0100Lost

You may want to tweak exactly how the joins work to tweak it to better fit your needs.

相关问题