I am scanning licence plates at my car wash company. Every scan is written to a database with its plate and a datetime.
I would now like to know whether a customer is here for the first time or has already been here in the last 12 months.
With this query I get some first results:
WITH thisMonth AS (
    SELECT DISTINCT pl.plate
    FROM POS.dbo.plate AS pl
    WHERE pl.datetime >= DATEFROMPARTS(YEAR(CURRENT_TIMESTAMP), MONTH(CURRENT_TIMESTAMP), 1)
),
past12Months AS (
    SELECT pl.plate
    FROM POS.dbo.plate AS pl
    WHERE pl.datetime >=
        CAST(DATEADD(MONTH, -1, DATEFROMPARTS(YEAR(CURRENT_TIMESTAMP) - 1, MONTH(CURRENT_TIMESTAMP), 1)) AS DATE)
)
SELECT COUNT(tm.plate) AS frequency, tm.plate AS plate
FROM thisMonth AS tm
RIGHT JOIN past12Months AS pm ON tm.plate = pm.plate
GROUP BY tm.plate
| frequency | plate |
| --- | --- |
| 5 | TUT_D_7-DEU |
| 1 | TUT_BM_224-DEU |
| 3 | TUT_AI_789-DEU |
| 1 | TUT_RS_261-DEU |
| 3 | TUT_EG_123-DEU |
| 7 | TUT_V_741-DEU |
My goal is to know how many plates have a frequency = 1 and how many have a frequency > 1, ideally as percentages.
Sample data could look like this. The last 4 rows are plates which were there before, so they are existing customers; the other plates are there only once, so they are new customers.
| plate_id | plate | datetime | updated_at |
| ------------ | ------------ | ------------ | ------------ |
| 15736 | TUT_UK_70-DEU | 2023-02-09 14:09:59.0000000 | 2023-02-21 12:50:19.400 |
| 15737 | TUT_MW_941-DEU | 2023-02-09 14:11:09.0000000 | 2023-02-21 12:50:19.403 |
| 15738 | TUT_RP_88-DEU | 2023-02-09 14:13:11.0000000 | 2023-02-21 12:50:19.417 |
| 15739 | TUT_KB_666-DEU | 2023-02-09 14:14:06.0000000 | 2023-02-21 12:50:19.417 |
| 15740 | TUT_IP_22-DEU | 2023-02-09 14:16:23.0000000 | 2023-02-21 12:50:19.417 |
| 15741 | SIG_FR_608-DEU | 2023-02-09 14:17:19.0000000 | 2023-02-21 12:50:19.420 |
| 15742 | TUT_EA_39-DEU | 2023-02-09 14:21:00.0000000 | 2023-02-21 12:50:19.420 |
| 15743 | TUT_F_1976-DEU | 2023-02-09 14:23:05.0000000 | 2023-02-21 12:50:19.420 |
| 15744 | TUT_UH_11-DEU | 2023-02-09 14:25:52.0000000 | 2023-02-21 12:50:19.420 |
| 15745 | TUT_KF_509-DEU | 2023-02-09 14:26:41.0000000 | 2023-02-21 12:50:19.420 |
| 15746 | TUT_BZ_999-DEU | 2023-02-09 14:27:30.0000000 | 2023-02-21 12:50:19.420 |
| 15747 | TUT_M_995-DEU | 2023-02-09 14:29:54.0000000 | 2023-02-21 12:50:19.420 |
| 15748 | TUT_JR_247-DEU | 2023-02-09 14:33:38.0000000 | 2023-02-21 12:50:19.423 |
| 15749 | TUT_KF_509-DEU | 2023-02-09 15:26:41.0000000 | 2023-02-21 12:50:19.420 |
| 15750 | TUT_BZ_999-DEU | 2023-02-09 15:27:30.0000000 | 2023-02-21 12:50:19.420 |
| 15751 | TUT_M_995-DEU | 2023-02-09 15:29:54.0000000 | 2023-02-21 12:50:19.420 |
| 15752 | TUT_JR_247-DEU | 2023-02-09 15:33:38.0000000 | 2023-02-21 12:50:19.423 |
In total there are 17 plates: 13 with frequency = 1 and 4 with frequency > 1.
The result should look like this:
New Customers: 76% Returning Customers: 24%
3 Answers

Answer 1 (1tuwyuhd):
1. Only look at plates that have been here in the last year.
2. Count their frequency of visits in that year.
3. Keep only those that visited in the last month.
4. Divide "visited more than once in the year" by "all".
Answer 2 (gab6jxml):
The maths in your expected results appears to be wrong, or your sample data is wrong; I don't know which. You state that you have 17 plates in your sample data, 13 that appear once and 4 that appear more than once, but there are only 17 rows in your sample data. If 13 rows each contain a plate that appears only once, that leaves only 4 rows; if those are duplicates, they can represent at most 2 different plates, not 4.
Looking at your sample data, 4 plates do appear twice, but only 9 appear once; I don't know where you got the other 4 from. If we use those numbers, you get ~69% and ~31%, not 76% and 24%.
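For completeness, a quick way to check those counts against the sample rows (a sketch, assuming the same POS.dbo.plate table holds only that sample data):

WITH perPlate AS (
    SELECT plate, COUNT(*) AS visits
    FROM POS.dbo.plate
    GROUP BY plate
)
SELECT
    SUM(CASE WHEN visits = 1 THEN 1 ELSE 0 END) AS plates_once,            -- 9 for the sample rows
    SUM(CASE WHEN visits > 1 THEN 1 ELSE 0 END) AS plates_more_than_once   -- 4 for the sample rows
FROM perPlate;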
Answer 3 (oaxa6hgo):
I am guessing that the column [datetime] is the date the customer came for a car wash, so you could do something like this.
What I did is find the number of plates that came more than once and the number that came exactly once, and at the end I calculated the percentages.
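A minimal sketch of that approach, assuming SQL Server, the POS.dbo.plate table from the question, and a 12-month window:

WITH visitsPerPlate AS (
    -- one row per plate with its number of visits in the last 12 months
    SELECT pl.plate, COUNT(*) AS visits
    FROM POS.dbo.plate AS pl
    WHERE pl.datetime >= DATEADD(MONTH, -12, CAST(CURRENT_TIMESTAMP AS DATE))
    GROUP BY pl.plate
),
buckets AS (
    -- split plates into "came exactly once" and "came more than once"
    SELECT
        SUM(CASE WHEN visits = 1 THEN 1 ELSE 0 END) AS new_customers,
        SUM(CASE WHEN visits > 1 THEN 1 ELSE 0 END) AS returning_customers,
        COUNT(*) AS total_plates
    FROM visitsPerPlate
)
SELECT
    new_customers,
    returning_customers,
    CAST(100.0 * new_customers / NULLIF(total_plates, 0) AS DECIMAL(5, 2)) AS new_pct,
    CAST(100.0 * returning_customers / NULLIF(total_plates, 0) AS DECIMAL(5, 2)) AS returning_pct
FROM buckets;

Applied to the 17 sample rows this gives 9 new and 4 returning plates, i.e. roughly 69% and 31%.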