I need to calculate the median transaction amount for each customer in the past 52 weeks, but percentile_cont returns NULL if there's no transaction for a particular customer. In such a case, I have to replace NULL with zero, I acheived this by using a CASE clause in sql, however I am using PERCENTILE_CONT twice for this purpose which makes the query slow for a huge list of customers to process. is there a better way to use the PERCENTILE_CONT only once inside the CASE clause?
SELECT DISTINCT customer,
CASE WHEN
PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY
transamt) OVER
(PARTITION BY
customer) IS NOT NULL THEN
PERCENTILE_CONT(0.5) WITHIN
GROUP (ORDER BY transamt)
OVER (PARTITION BY
customer)
ELSE 0
END AS median_amt
FROM trans_table
WHERE trans_date BETWEEN DATEADD(WEEK, -52, GETDATE() ) AND GETDATE()
2条答案
按热度按时间eaf3rand1#
I tried COALESCE() function as JHH suggested and didn't see much difference performance wise. it seems COALESCE() is internally a CASE statement.
However when I switched to:
the computation time dropped by a factor of 2.
I don't know if this can be generalized to any scenarios or is it merely my particular query.
rjee0c152#
According to "...percentile_cont returns NULL if there's no transaction for a particular customer...", it could be a data issue or by design there could be entries for a customer without any transactions could have
trans_date
in the past 52 weeks buttransamt
is null. If that's the case, maybe this work for you by changingorder by transamt
to the following:Second guess: if percentile_cont() return NULL then show 0. Using coalesce().