I've got a table in my MSSQL server, lets call it blogPost
. I've also got two tag tables, lets call them fooTag
and barTag
. The tag tables are used to tag the blogPost
table which are identically structured.
blogPost
| postId | title | body |
+--------+---------------------+-------------+
| 1 | The life on a query | lorem ipsum |
+--------+---------------------+-------------+
fooTag and barTag
| postId | tagName |
+--------+--------------+
| 1 | sql |
| 1 | query |
| 1 | select-query |
+--------+--------------+
I want to get a single blogpost along with all it's tags in a single row so then STRING_AGG()
feels suitable to make a query like this:
SELECT blogPost.*, STRING_AGG(fooTag.tagName, ';') as [fooTags], STRING_AGG(barTag.tagName, ';') as [barTags]
FROM blogPost
LEFT JOIN fooTag ON blogPost.postId = fooTag.postId
LEFT JOIN barTag ON blogPost.postId = barTag.postId
WHERE postId = 1
GROUP BY blogPost.postId, title, body
When making this query I'd expect to get the result
| postId | title | body | fooTags | barTags |
+--------+---------------------+-------------+-------------------------+-------------------------+
| 1 | The life on a query | lorem ipsum | sql;query;select-query | sql;query;select-query |
+--------+---------------------+-------------+-------------------------+-------------------------+
But I'm getting this result instead where bar tags (i.e. the last STRING_AGG selected) are duplicated.
| postId | title | body | fooTags | barTags |
+--------+---------------------+-------------+-------------------------+-----------------------------------------------+
| 1 | The life on a query | lorem ipsum | sql;query;select-query; | sql;sql;sql;query;query;query;select-query;select-query;select-query |
+--------+---------------------+-------------+-------------------------+-----------------------------------------------+
Putting barTags
last in the SELECT statement makes it so that barTags
gets the duplicates instead of fooTags
. The amount of duplicates created seem to be bound to the amount of rows columns being aggregated together in the first STRING_AGG
result column, so if fooTags
has 5 rows to aggregate together there will be 5 duplicates of each barTag
in the barTags
column in the result.
How would I get the result I want without duplicates?
2条答案
按热度按时间piok6c0g1#
Your problem is caused by each row in
fooTags
creating that many rows ofbarTags
in theJOIN
, hence the duplication. You can work around this issue by performing theSTRING_AGG
in thefootags
andbartags
tables beforeJOIN
ing them:hrirmatl2#
You can simplify the query like so:
No GROUP BY required, in your case it'll be an expensive operation.