Improve SQL Server Query to be "thread safe"

1rhkuytd  于 2023-10-15  发布在  SQL Server
关注(0)|答案(3)|浏览(127)

I have an application that process incoming messages, but based on what type the message is I want to process them in a singleton approach meaning that 2 message of the same type is not allowed to be processed at the same time.

So in my database I have this MessageInformation table that contains the messages and an ExecuteStatus . Where 0 = NotProcessed, 1 = Processing and 6 = OnHold.

Now I want to find out if the new message that arrives should be marked to processing or to be put on hold.

I the following SQL statement do this pretty easy:

IF EXISTS (SELECT 1 FROM [dbo].[MessageInformations] WHERE MessageType = @MessageType AND ExecuteStatus IN (1,6))
BEGIN
    UPDATE [dbo].[MessageInformations]
    SET ExecuteStatus = 6
    OUTPUT Inserted.ExecuteStatus
    WHERE [Identifier] = @MessageIdentifier
END
ELSE
BEGIN
    UPDATE [dbo].[MessageInformations]
    SET ExecuteStatus = 1
    OUTPUT Inserted.ExecuteStatus
    WHERE [Identifier] = @MessageIdentifier
END

The problem here is that it's not "thread safe" meaning if 2 message arrive at the same time, the could both be marked as Processing when only 1 should be it. I could put it in a transaction and some lock on the MessageInformation table, but I don't want to put a lock on the whole table because there could be other messages for other types that don't want to wait for the lock to be finished, I only run this query for the messages of that specific message type. Other message types will simple be set to Processing .

And also there is A LOT of messages in the table, it will stable around 40-50 millions (cleaning out old messages).

So I need either a single UPDATE statement that can do this for me or a way to lock only the rows in the table that matches the MessageType (I could add a date-range to that to get fewer records to match, because if a message is onHold for more than a couple of hours something is wrong).

To make it a bit more clear. Here is several steps included.

  • First there is a step that created the message in the database and mark it as 'NotProcessed'
  • Then there is several processor that gets a messageIdentifier assigned to them to try to process that messages. So basically the processor stands with a messageidentifier and needs to verify if it's okay to proccess this message or it should put it onHold. If it's marked as 'Processing' = 1, then the processor can keep processing the message, but if it's marked as 'OnHold' = 6 the the processor must stop and end the processing. And this is what I want my SQL query to figure out.
  • When a processor is done processing a "singleton" message, it will mark it as 'Processed' = 2 and will tell the "Manager" that it's fine to take the first (based on createtime) 'onhold' message and process it, and so it will continue until there is no more 'onhold' messages.
  • Any new messages that arrives when there is an 'onhold' or 'proccessing' messages in the database for this messagetype should allways go into an 'onhold' status, so the onhold messages are processed in the order they arrive.

I could create a seperate table and put the onhold message in that, but if I can solve it by simple changing the status in the MessageInformation table it will be so much easier, since there is a lot of other logic that is depending on that table.

monwx1rj

monwx1rj1#

To ensuring that messages of the same type are not processed concurrently and to avoid locking the entire table, you can use the UPDLOCK hint with a Common Table Expression (CTE).

;WITH CTE AS (
    SELECT TOP 1 [Identifier], ExecuteStatus
    FROM [dbo].[MessageInformations] WITH (UPDLOCK, ROWLOCK)
    WHERE MessageType = @MessageType
    AND ExecuteStatus IN (0, 6) -- NotProcessed or OnHold
    ORDER BY <your_ordering_column> -- Specify an appropriate column for ordering
)
UPDATE CTE
SET ExecuteStatus = 1
OUTPUT Inserted.ExecuteStatus
WHERE [Identifier] = @MessageIdentifier;
hl0ma9xz

hl0ma9xz2#

You are using multiple statements to solve a problem that can be solved with one (or two). @amit mohanty is right in that you can use locking to force the solution for this problem. In general, however, you can also use a higher isolation level (serializable) to achieve the same outcome for a broader class of queries as needed.

You may also want to consider if you have to one transaction per row or if your application can do multiple rows at a time. If possible, this can further simplify the problem by reducing the number of transactions needed. If you can move to a model where you do update once for the whole table (or once per state transition type), you may get further gains in performance and concurrency.

pqwbnv8z

pqwbnv8z3#

Amit's post helped me out, but I needed to tweak the SQL query a bit to meet my requirements.

;WITH CTE AS (
    SELECT *
        FROM [dbo].[MessageInformations] WITH (UPDLOCK, ROWLOCK)
        WHERE MessageType = @MessageType
            AND ExecuteStatus IN (0, 1, 6) -- NotProcessing, Processing or OnHold
)
UPDATE CTE
    SET ExecuteStatus = CASE WHEN (SELECT count(1) FROM CTE WHERE ExecuteStatus IN (1, 6) ) = 0 THEN 1 ELSE 6 END
    OUTPUT Inserted.ExecuteStatus
    WHERE [Identifier] = @MessageIdentifier
        AND ExecuteStatus != 1;

The 'Execution plan' only show 1 Update TSQL, so I guess it's safe for multiple execution.

相关问题