Updated to specify using MS SQL Server.
I have a table that has one row for each action performed by a worker, each action with a date/time stamp.
Action_Log
| Action_ID | Action_DTTM |
| ------------ | ------------ |
| 1 | 04/19/2023 11:15:37 |
| 2 | 04/19/2023 12:16:14 |
| 3 | 04/19/2023 12:16:14 |
| 4 | 04/19/2023 14:21:03 |
| 5 | 04/19/2023 16:55:41 |
| 6 | 04/19/2023 18:10:43 |
| 7 | 04/19/2023 20:24:15 |
| 8 | 04/19/2023 22:43:29 |
| 9 | 04/20/2023 13:21:04 |
| 10 | 04/20/2023 13:55:57 |
| 11 | 04/20/2023 13:55:57 |
| 12 | 04/20/2023 16:28:11 |
| 13 | 04/20/2023 17:35:45 |
| 14 | 04/20/2023 18:48:01 |
| 15 | 04/20/2023 19:32:26 |
| 16 | 04/20/2023 20:16:39 |
| 17 | 04/20/2023 21:03:25 |
I need a query that will aggregate the actions into shifts. This table above represents actions during 2 shifts by a single worker. There are a few different shift types a worker might work - ShiftA is 11am-11pm, ShiftB is 1-9pm - but I don't have a shift schedule telling me when workers were scheduled for each type of shift. All I have is the action log.
Example of the kind of output I'm hoping for:
| Action_ID | Shift_Number | Shift_Type |
| ------------ | ------------ | ------------ |
| 1 | 1 | ShiftA |
| 2 | 1 | ShiftA |
| 3 | 1 | ShiftA |
| 4 | 1 | ShiftA |
| 5 | 1 | ShiftA |
| 6 | 1 | ShiftA |
| 7 | 1 | ShiftA |
| 8 | 1 | ShiftA |
| 9 | 2 | ShiftB |
| 10 | 2 | ShiftB |
| 11 | 2 | ShiftB |
| 12 | 2 | ShiftB |
| 13 | 2 | ShiftB |
| 14 | 2 | ShiftB |
| 15 | 2 | ShiftB |
| 16 | 2 | ShiftB |
| 17 | 2 | ShiftB |
I'm not even sure if this can be accomplished in SQL? Also not sure if my sample tables are rendering in a legible way. I can add screenshots if that's easier to read
3条答案
按热度按时间gwbalxhn1#
OP, action id 17 goes over shift by 225 seconds. Charlieface's answer makes assumptions about a first shift's action (magic numbers). My approach gets this naturally. Code has all the comments to explain everything. I also added to your sample set of data to get data for all 4 shifts to prove this works. Also, my solution supports overlapping shifts, but makes the assumption there is only one shift working any given second. If that is not the case, then there is absolutely no way of doing this.
My suggestion is to have shift times in a user table (not a variable), similar to my setup, so you can easily update that table instead of updating your query with magic numbers. You never want to have a production query using magic numbers. You can even go further and add two datetime columns - date_begin and date_end describing date ranges a shift is valid. In this setup, you would leave the date_end null until it becomes obsolete, then add new rows as needed to describe the new shift times. The reason you would want to do this is because if shift times change, and you need to query back in time, then you will have to either do multiple queries to support the different shift time change, or, use the updated shift times table to support all of that gracefully.
Unfortunately, the result set is large (many columns to show supporting data, and 34 rows deep), and will not post a picture until asked. I also do not use db fiddle. I believe some on here are tightly related to whoever owns it and drives traffic to it.
xggvc2p62#
This is a classic Gaps-and-Islands problem.
You need to first work out the beginning of each shift (by looking for a gap of more than 8 hours). Then you need to apply that calculation across the whole shift, and then based on that you can assign a shift.
I note that your shifts are overlapping, so the results might not be perfectly accurate.
One option is to assign a conditional count of shift starts which gives you a number for each shift, then taking the minimum
Action_DTTM
per shift.A better (more efficient) option might be to just copy over the
Action_DTTM
only when it is the start of a shift, then use that to create the shift number as well as applying it to all other columns.db<>fiddle
jckbn6z73#
If you truly have data without long breaks then I'd think you would want to find those clusters and then use minimum (and/or maximum) times within each span.
I assume there is potential to start shifts early and late so I tried fudge the ranges a bit. Clearly you'll want to play with that logic or make it slightly more robust by looking at end times or total durations. You also appear to be able to distinguish workers which I assume is a worker ID of some kind. All in all this is nearly the same as Charlie's after reading it over.