I have these tables:
CREATE TABLE staff
(
[case_id] int,
[staff] varchar(11),
[stafftype] varchar(10)
);
INSERT INTO staff
([case_id], [staff], [stafftype])
VALUES
(1, 'Daffy, Duck', 'Primary'),
(1, 'Bugs, Bunny', 'Additional'),
(1, 'Elmer, Fudd', 'Additional'),
(2, 'Daffy, Duck', 'Primary'),
(2, 'Bugs, Bunny', 'Additional');
CREATE TABLE cases
(
[case_id] int,
[casedate] datetime,
[caselocation] varchar(4)
);
INSERT INTO cases
([case_id], [casedate], [caselocation])
VALUES
(1, '2023-01-01 00:00:00', 'Home'),
(2, '2023-01-03 00:00:00', 'Away');
I want to return a single row per case. Each case can have only 1 'Primary' stafftype and a maximum of 2 'Additional' stafftypes.
For example, the result set for case_id = 1:
| case_id | caselocation | PrimaryStaff | AdditionalStaff1 | AdditionalStaff2 |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | Home | Daffy, Duck | Bugs, Bunny | Elmer, Fudd |
SQL Fiddle Demo: http://sqlfiddle.com/#!18/83bbc9/6
4 Answers
hgb9j2n61#
Provided that at most 2 'Additional' stafftypes are needed and their order is arbitrary, a plain GROUP BY will do.
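The answer's original query is not preserved here, so the following is only a minimal sketch of what a plain GROUP BY could look like for the question's `staff` and `cases` tables. It assumes that staff names are unique within a case and that there are never more than two 'Additional' rows per case; the use of `MIN`/`MAX` with `NULLIF` is my own choice, not necessarily the author's.

```sql
-- Sketch only: one row per case using plain aggregation, no row numbering.
-- Assumes at most two distinct 'Additional' names per case.
SELECT c.case_id,
       c.caselocation,
       MAX(CASE WHEN s.stafftype = 'Primary' THEN s.staff END) AS PrimaryStaff,
       -- alphabetically first 'Additional' name
       MIN(CASE WHEN s.stafftype = 'Additional' THEN s.staff END) AS AdditionalStaff1,
       -- alphabetically last 'Additional' name, or NULL when it equals the first
       NULLIF(MAX(CASE WHEN s.stafftype = 'Additional' THEN s.staff END),
              MIN(CASE WHEN s.stafftype = 'Additional' THEN s.staff END)) AS AdditionalStaff2
FROM cases AS c
LEFT JOIN staff AS s
       ON s.case_id = c.case_id
GROUP BY c.case_id, c.caselocation
ORDER BY c.case_id;
```

With the sample data, case 2 has a single 'Additional' row, so `MIN` and `MAX` return the same name and `NULLIF` turns the second column into NULL.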
htzpubme2#
One way to do this is conditional aggregation.
Fiddle: http://sqlfiddle.com/#!18/83bbc9/10/0
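The query itself did not survive in this copy of the answer. A minimal sketch of conditional aggregation over a row-numbered CTE, reconstructed from the explanation in this answer, might look like the following. The CTE name `numbered` and the tie-break ordering by `staff` are assumptions, and the sketch relies on every case having exactly one 'Primary' row so that it lands on `RN` = 1.

```sql
-- Sketch only: number the staff rows per case, then pivot with MAX(CASE ...).
WITH numbered AS (
    SELECT s.case_id,
           s.staff,
           s.stafftype,
           ROW_NUMBER() OVER (
               PARTITION BY s.case_id
               -- 'Primary' sorts first, then 'Additional' rows by name
               ORDER BY CASE WHEN s.stafftype = 'Primary' THEN 0 ELSE 1 END,
                        s.staff
           ) AS RN
    FROM staff AS s
)
SELECT c.case_id,
       c.caselocation,
       MAX(CASE WHEN n.RN = 1 THEN n.staff END) AS PrimaryStaff,
       MAX(CASE WHEN n.RN = 2 THEN n.staff END) AS AdditionalStaff1,
       MAX(CASE WHEN n.RN = 3 THEN n.staff END) AS AdditionalStaff2
FROM cases AS c
LEFT JOIN numbered AS n
       ON n.case_id = c.case_id
GROUP BY c.case_id, c.caselocation
ORDER BY c.case_id;
```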
The CTE assigns the 'Primary' stafftype a row number (`RN`) of 1, then assigns 2 and 3 to the 'Additional' stafftypes. The final select then uses `RN` to organize the data, and `MAX()` and `GROUP BY` combine to roll the data up into single rows.
You can select only the inner portion of the CTE, then remove the `MAX()` and `GROUP BY` in the bottom query and run that to see what is happening step by step. That is, run this:
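The snippet for this step is missing from this copy. As a hedged reconstruction, the inner portion of such a CTE could be run on its own as below; the ordering expression is an assumption based on the description of `RN`.

```sql
-- Sketch only: the row-numbering step by itself, one output row per staff row.
SELECT s.case_id,
       s.staff,
       s.stafftype,
       ROW_NUMBER() OVER (
           PARTITION BY s.case_id
           -- 'Primary' gets RN = 1; 'Additional' rows follow, ordered by name
           ORDER BY CASE WHEN s.stafftype = 'Primary' THEN 0 ELSE 1 END,
                    s.staff
       ) AS RN
FROM staff AS s
ORDER BY s.case_id, RN;
```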
Then this:
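This snippet is also missing here. A sketch of the second step, assuming the same hypothetical CTE name `numbered`, is the full query with `MAX()` and `GROUP BY` removed:

```sql
-- Sketch only: same pivot expressions, but without MAX() and GROUP BY,
-- so each staff row stays on its own line.
WITH numbered AS (
    SELECT s.case_id,
           s.staff,
           ROW_NUMBER() OVER (
               PARTITION BY s.case_id
               ORDER BY CASE WHEN s.stafftype = 'Primary' THEN 0 ELSE 1 END,
                        s.staff
           ) AS RN
    FROM staff AS s
)
SELECT c.case_id,
       c.caselocation,
       CASE WHEN n.RN = 1 THEN n.staff END AS PrimaryStaff,
       CASE WHEN n.RN = 2 THEN n.staff END AS AdditionalStaff1,
       CASE WHEN n.RN = 3 THEN n.staff END AS AdditionalStaff2
FROM cases AS c
LEFT JOIN numbered AS n
       ON n.case_id = c.case_id
ORDER BY c.case_id, n.RN;
```

Each output row has a value in only one of the three staff columns and NULL in the others, which is why wrapping the columns in `MAX()` and grouping by case collapses them into a single row.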
mzmfm0qo3#
There are other great answers in this thread, and in many engineering decisions the "best" solution follows an "it depends" rule. However, this is a small enough example of a common enough architectural decision point that I think it's worth showing some other ways it can be solved. I had considered using `STRING_AGG` and then splitting the result back out as one way to do this. This requires some `SUBSTRING` + `PATINDEX` logic:
Option 1:
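The Option 1 query is not preserved in this copy, so what follows is only a sketch of the idea, not the author's code. It assumes SQL Server 2017 or later (for `STRING_AGG`), at most two 'Additional' rows per case, and a `'|'` delimiter that never occurs in a staff name; the CTE name `agg` and column name `additional_list` are hypothetical.

```sql
-- Sketch only: aggregate the 'Additional' names into one string per case,
-- then split that string back into two columns.
WITH agg AS (
    SELECT s.case_id,
           STRING_AGG(s.staff, '|') WITHIN GROUP (ORDER BY s.staff) AS additional_list
    FROM staff AS s
    WHERE s.stafftype = 'Additional'
    GROUP BY s.case_id
)
SELECT c.case_id,
       c.caselocation,
       p.staff AS PrimaryStaff,
       -- text before the first '|', or the whole list when there is no '|'
       CASE WHEN PATINDEX('%|%', a.additional_list) = 0
            THEN a.additional_list
            ELSE SUBSTRING(a.additional_list, 1,
                           PATINDEX('%|%', a.additional_list) - 1)
       END AS AdditionalStaff1,
       -- text after the first '|', or NULL when there is only one name
       CASE WHEN PATINDEX('%|%', a.additional_list) > 0
            THEN SUBSTRING(a.additional_list,
                           PATINDEX('%|%', a.additional_list) + 1,
                           LEN(a.additional_list))
       END AS AdditionalStaff2
FROM cases AS c
LEFT JOIN staff AS p
       ON p.case_id = c.case_id
      AND p.stafftype = 'Primary'
LEFT JOIN agg AS a
       ON a.case_id = c.case_id
ORDER BY c.case_id;
```

A comma cannot serve as the delimiter here because the sample names ('Daffy, Duck') already contain one.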
But I also realized this is a great example to show some of the power of the built-in XML processing that is available on all supported versions of SQL Server. It can abstract the need for a "row numbering" scheme away from the data query itself. `XML` and `JSON` both provide ways to take "a subset of data from a join or subquery" that may contain "0 or more rows" and turn that subset into a "logically single entity/expression" that can be parsed to get specific information "back out" at the end of (or later in) the query pipeline. This can sometimes be useful in more complex query scenarios when the underlying "data architecture" is fixed.
A snippet showing the basic premise of what the XML-based query will do "per case id":
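The original snippet is missing from this copy. As a rough illustration of the premise, the sketch below builds one XML value from the 'Additional' rows of a single case and reads the first and second names back out by position. The element names `staff` and `name`, the variable `@case_id`, and the ordering by name are all assumptions.

```sql
-- Sketch only: collapse 0..n rows into one XML value, then index into it.
DECLARE @case_id int = 1;

SELECT x.doc AS AdditionalStaffXml,
       x.doc.value('(/staff/name)[1]', 'varchar(11)') AS AdditionalStaff1,
       x.doc.value('(/staff/name)[2]', 'varchar(11)') AS AdditionalStaff2
FROM (
    SELECT (
        SELECT s.staff AS [name]
        FROM staff AS s
        WHERE s.case_id = @case_id
          AND s.stafftype = 'Additional'
        ORDER BY s.staff
        -- TYPE keeps the result as xml so .value() can be applied to it
        FOR XML PATH(''), ROOT('staff'), TYPE
    ) AS doc
) AS x;
```

When a position does not exist in the XML (or the subquery returns no rows at all), `.value()` yields NULL, so no explicit row number is needed.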
Your data constraints might also look something like this, as a way to enforce "only one primary staff per case" and to ensure that "each staff can only be on a given case_id once":
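The constraint definitions are not preserved in this copy. One way such rules could be expressed in SQL Server, with hypothetical constraint and index names, is a unique constraint plus a filtered unique index:

```sql
-- Sketch only: constraint and index names are illustrative assumptions.

-- Each staff member can appear only once per case.
ALTER TABLE staff
    ADD CONSTRAINT UQ_staff_case_staff UNIQUE (case_id, staff);

-- At most one 'Primary' row per case (filtered unique index).
CREATE UNIQUE INDEX UX_staff_one_primary_per_case
    ON staff (case_id)
    WHERE stafftype = 'Primary';
```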
And so, the following XML-based query is straightforward to read, and could be adjusted to only join once. This gives you the option to change to an `INNER` join on the primary staff:
Option 2:
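The Option 2 query itself is missing from this copy. The sketch below shows one possible shape for it under the same assumptions as before (element names `staff` and `name`, 'Additional' rows ordered by name); the use of `OUTER APPLY` is my own choice and may differ from the author's query.

```sql
-- Sketch only: one join for the primary staff, one XML value per case
-- for the additional staff.
SELECT c.case_id,
       c.caselocation,
       p.staff AS PrimaryStaff,
       a.doc.value('(/staff/name)[1]', 'varchar(11)') AS AdditionalStaff1,
       a.doc.value('(/staff/name)[2]', 'varchar(11)') AS AdditionalStaff2
FROM cases AS c
-- change LEFT JOIN to INNER JOIN to return only cases that have a primary staff
LEFT JOIN staff AS p
       ON p.case_id = c.case_id
      AND p.stafftype = 'Primary'
OUTER APPLY (
    SELECT (
        SELECT s.staff AS [name]
        FROM staff AS s
        WHERE s.case_id = c.case_id
          AND s.stafftype = 'Additional'
        ORDER BY s.staff
        FOR XML PATH(''), ROOT('staff'), TYPE
    ) AS doc
) AS a
ORDER BY c.case_id;
```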
Further Reading: Earlier, I mentioned cases where you need to query a "fixed data architecture". If the architecture is still being designed or developed, you may be able to provide input or adjust how the schema is represented, so that queries can be constructed differently in the first place or can take advantage of additional features of the storage and processing engines of SQL Server. This shows one possibility for an initial schema design (where I have added "zz" and "yy" to names to (a) prevent naming conflicts with the OP's schema if testing on the same database, and (b) clearly mark these objects as testing and development schema items).
As additional context not directly presenting a solution to the OP, I'll leave the below without further explanation:
gudnpqoy4#