Friday, March 30, 2012

long running transaction blocking queue

Hello,

I seem to be misunderstanding the way transactions work with Service Broker queues. We have developed and deployed a Service Broker application that has 5 queues and a Windows service for each queue on multiple servers (3 currently). Due to a last-minute issue, we had to forgo transactions when the services executed a RECEIVE. I am now updating the code base to use transactions and am running into blocking issues. One of the services runs for 90 seconds (spooling to the printer), and all of the servers block on the RECEIVE operation for this queue. I thought that if I was receiving messages from a single conversation, other receives against this queue would not block.

Thanks,

Jim Stallings

RECEIVEs indeed should not block each other. One RECEIVE should either return messages for an unlocked conversation or return an empty resultset. Can you share how you do the RECEIVEs? E.g., are you using WAITFOR, are you using a WHERE clause, are you using TOP, etc.? Also, how many conversations, and how many messages per conversation, are typically in the queue?
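
For reference, these are the kinds of variants I mean, sketched against a hypothetical queue named [MyQueue] and a hypothetical @handle variable (not your code):

[code]
-- Plain RECEIVE: returns messages from the next unlocked conversation group
RECEIVE TOP(1) conversation_handle, message_type_name, message_body
FROM [MyQueue];

-- RECEIVE wrapped in WAITFOR with a timeout in milliseconds
WAITFOR (
    RECEIVE TOP(1) conversation_handle, message_type_name, message_body
    FROM [MyQueue]
), TIMEOUT 5000;

-- RECEIVE restricted to a single conversation
RECEIVE TOP(1) conversation_handle, message_type_name, message_body
FROM [MyQueue]
WHERE conversation_handle = @handle;
[/code]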

Can you look at what the lock contention is on? sys.sysprocesses will show the type and name of the resource a thread is blocked on.
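
Something along these lines should show the blocked sessions and the resources they are waiting on (a rough sketch using the sys.sysprocesses compatibility view):

[code]
select spid, blocked, lastwaittype, waitresource, cmd
from sys.sysprocesses
where blocked <> 0;
[/code]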

Thanks,
~ Remus

|||Hello Remus,
Thanks for your response. I took a long weekend, so I didn't see your response Friday. At the moment I'm just doing a RECEIVE TOP (1) (count passed in), as this is a very simple implementation where each message has its own queue and I receive one message per access until an end conversation is encountered. I (and others) were under the impression, from how SSB is presented, that long-running transactions could be left open on one conversation without blocking others, but your answer indicates that is not the case? My problem is that I need to have a transaction around the processing of the message so that I don't have to build recovery logic if the Windows service fails to complete.

From the PDC demo I saw, they showed that you could be processing from one conversation group and other processes could still access another conversation group. That led me to the conclusion that locking is by conversation group. Is that not true? Am I supposed to receive by conversation group? Is receiving by conversation handle supposed to prevent blocking of subsequent receives?
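
For what it's worth, the conversation-group pattern I have in mind is roughly the documented GET CONVERSATION GROUP / RECEIVE sequence below ([MyQueue] is just a placeholder, not our actual queue):

[code]
DECLARE @cg UNIQUEIDENTIFIER;

BEGIN TRANSACTION;

-- Lock the next available conversation group on the queue
WAITFOR (
    GET CONVERSATION GROUP @cg FROM [MyQueue]
), TIMEOUT 1000;

IF @cg IS NOT NULL
BEGIN
    -- Receive only the messages that belong to the locked group
    RECEIVE conversation_handle, message_type_name, message_body
    FROM [MyQueue]
    WHERE conversation_group_id = @cg;
END

COMMIT;
[/code]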

I worked with our DBA to track down the blocking, and it is occurring in my stored procedure that does the receive. The sequence is that Windows service A (server A) does a receive in a transaction. Windows service B (server B, etc.) does another receive and blocks on a select by conversation handle. This select, by the way, is a holdover from the original coding I converted from the PDC code and is not used anywhere. We went from code complete to production so fast that I didn't have time to go back and trim this kind of unused feature from the conversion. I can easily modify the receive to select by conversation group if this would help with the blocking, but since our analysis showed that we were blocking on the clustered index of the internal table, I had doubts whether this would make a difference, so I decided to post and ask. My receive proc is below; in the case I'm asking about, I'm passing a count of 1, no timeout (so no WAITFOR is generated), and no conversation handle.

Thanks,
Jim

ALTER PROCEDURE [DDPT].[ssb_Receive]
/*
****************************************************************************************************************
* File Name  : DDPT.ssb_Receive.sql
* Description: Execute RECEIVE against the specified queue. A wait time of 0 will return immediately
*              if there are no messages in the queue; a non-zero wait time will perform a WAITFOR of the
*              specified duration. The receive will be for any conversation unless a conversation is
*              specified. A message count can be specified to limit the number of messages received.
*
* Author     : Jim Stallings
*
* Example    : EXEC DDPT.ssb_Receive
*                  @queueName  = 'LabelRequestQueue'
*                , @serverName = 'WSHB8257'
*
* History    : Date        Initials  Description
*              ----------  --------  -------------------
*              03/13/2006  JAS       Initial Development
*
****************************************************************************************************************
*/
    @queueName      NVARCHAR(255)
  , @conversationId UNIQUEIDENTIFIER = NULL
  , @waitTime       INT = NULL
  , @messageCount   INT = 0
  , @serverName     NVARCHAR(128) = NULL
AS
BEGIN

SET NOCOUNT ON;

DECLARE @strSQL NVARCHAR(2000), @errNo INT
DECLARE @conversation_handle uniqueidentifier
      , @message_type_name   nvarchar(128)
      , @message             xml
      , @service_name        nvarchar(128)
      , @taskid              bigint

-- Look up the next message on the queue (leftover from the original PDC-based code; not used downstream)
SELECT @strSQL = 'SELECT TOP 1 @conversation_handle = conversation_handle,
    @service_name = service_name, @message_type_name = message_type_name, @message = message_body
    FROM ' + CAST(@queueName AS VARCHAR(255))

SELECT @strSQL = @strSQL + ' WITH (NOLOCK) WHERE conversation_handle = ''' + CAST(@conversationId AS VARCHAR(36)) + ''''
WHERE @conversationId IS NOT NULL

EXEC sp_executesql @strSQL
    , N'@conversation_handle UNIQUEIDENTIFIER OUTPUT, @service_name nvarchar(128) OUTPUT, @message_type_name NVARCHAR(128) OUTPUT, @message xml OUTPUT'
    , @conversation_handle OUTPUT, @service_name OUTPUT, @message_type_name OUTPUT, @message OUTPUT

-- Build the RECEIVE, optionally wrapped in WAITFOR and limited by TOP and/or a conversation handle
SET @strSQL = ''
IF @waitTime IS NOT NULL
    SET @strSQL = 'WAITFOR('

SELECT @strSQL = @strSQL + 'RECEIVE '
IF @messageCount != 0
    SELECT @strSQL = @strSQL + 'TOP(' + CAST(@messageCount AS VARCHAR) + ') '

SELECT @strSQL = @strSQL
    + 'conversation_group_id, conversation_handle
     , message_sequence_number, service_name
     , service_contract_name, message_type_name
     , validation, message_body
     FROM ' + CAST(@queueName AS VARCHAR(255))

SELECT @strSQL = @strSQL + ' WHERE conversation_handle = ''' + CAST(@conversationId AS VARCHAR(36)) + ''''
WHERE @conversationId IS NOT NULL

IF @waitTime IS NOT NULL
    SELECT @strSQL = @strSQL + '), TIMEOUT ' + CAST(@waitTime AS VARCHAR)

EXEC(@strSQL)
SELECT @errNo = @@ERROR

RETURN @errNo

END
|||

Hi Remus,

I wanted to correct my last post. When I said the demo showed that you could receive from one group while other processes could receive from another, I misspoke. I meant that one of the PDC presentations I listened to gave me the impression that this was a capability provided by conversation group locking: once you start to receive against a conversation group, another process receiving against the same queue would return messages from the next group. I'm assuming that for this to work, a transaction must be kept open against the group until all messages are processed.

Also, I forgot to add that when we looked at the blocking during the receive, we determined that we were blocking on the clustered index on the internal table.

Jim

|||

I'm not sure I understand the logic of this procedure. But one thing I can tell is that looking up a conversation by means of a SELECT and building a RECEIVE ... WHERE based on the SELECT's result is not going to work. Concurrent procedure calls will return the same conversation, and your RECEIVEs will block each other because they'll try to lock the same conversation (as you already found out).

Also, mixing RECEIVEs with and without a WHERE clause on the same queue is not going to work. The RECEIVEs without a WHERE clause will 'steal' messages you expect to be returned by the RECEIVEs with a WHERE clause. The typical pattern is to use WAITFOR(RECEIVE ...) without a WHERE clause and then process the messages. This pattern allows the RECEIVE to pick up the next available conversation.

My recommendation is to remove the optional @conversationId and simplify the procedure to a plain RECEIVE without a WHERE clause. This way you benefit from conversation group locking and from RECEIVE's built-in ability to find the next available conversation group.
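
For instance, something along these lines (just a sketch; the queue name is the one from your example and the message processing is a placeholder):

[code]
declare @handle uniqueidentifier,
        @message_type sysname,
        @body varbinary(max);

begin transaction;

-- let RECEIVE pick the next available (unlocked) conversation group
waitfor (
    receive top(1)
        @handle = conversation_handle,
        @message_type = message_type_name,
        @body = message_body
    from [LabelRequestQueue]
), timeout 5000;

if @handle is not null
begin
    -- ... process the message here ...

    if @message_type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
        end conversation @handle;
end

commit;
[/code]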

BTW, you should really consider whether using dynamic T-SQL in a stored procedure is justified in your case. Doing so negates much of the performance benefit of stored procs.

HTH,
~ Remus

|||

Hi Remus,

First, many DBAs have been pounding into me that there are no performance benefits to stored procs since they are compiled on the fly now. Is this not true for SQL 2005?

The logic behind the proc was to provide a general-purpose receive stored proc that could be used by multiple services, with and without timeouts and/or conversation handles. While this could result in the mixed access you mentioned, the respective services (and therefore queues) would consistently either use or not use conversation handles (and timeouts with WAITFOR). I don't know how else to do a receive proc that works for multiple queues, and I don't want to build a stored proc for every queue.

The convoluted logic you mention is the logic I said in my post was initially coded, never used, and should have been (and is going to be) tossed. In any case, it was just to do a lookup in the queue; if the conversation handle was not found, the result of executing the select would have nulled the conversation ID parameter so that it would not be built into the subsequent receive command. BTW, I thought I read a post from someone in the Service Broker group saying that optimizer hints work against the internal tables for Service Broker; if that is incorrect, it would be nice to know. Doing the receive as you suggest is not an issue; that is what I had decided to do anyway. All messages are independent and followed by an end-conversation message.

Anyway, down to the real issue (and I apologize for my poor description of the problem): it is the Windows service that runs for 90 seconds processing a specific message. I have read and heard that Service Broker can manage a conversation that takes seconds, minutes, or even weeks to complete. If a queue is locked for the duration of a transaction, how would this be implemented without single-threading on that queue? That is exactly what we are looking for, but locking the queue for the 90-second duration means that I can't have another server start working on another message. Taking the processing out of the transaction means that I have to do additional coding to track failures and resend messages.

Jim

|||

Jim,

Let's do a simple experiment using Management Studio.

1. First, create two queues and two services in your database of choice:

[code]
create queue [initiator];
create queue [target];
create service [from] on queue [initiator];
create service [to] on queue [target] ([DEFAULT]);
go
[/code]

2. Now let's send two messages, on two distinct dialogs, by running the following snippet twice:

[code]
declare @dh uniqueidentifier;

begin dialog conversation @dh
    from service [from]
    to service N'to', N'current database'
    with encryption = off;

send on conversation @dh;
go
[/code]

3. On a new Query, issue a RECEIVE and don't commit (to simulate an arbitrarily long transaction). This will grab the first enqueued message:

[code]
begin transaction;
receive * from [target];
-- commit;
[/code]

4. On a new Query, issue a second RECEIVE and don't commit (to simulate an arbitrarily long transaction). This will grab the second enqueued message:

[code]
begin transaction;
receive * from [target];
-- commit;
[/code]

5. On a new Query, issue a third RECEIVE, with a WAITFOR and don't commit. This will block waiting for a message:

[code]
begin transaction;
waitfor (receive * from [target])
-- commit;
[/code]

6. Run the snippet from step 2 again to send a new message.

7. Switch back to the query window opened at step 5. The WAITFOR has now completed, returning the newly enqueued message.

8. Looking at the transaction locks shows how each RECEIVE has concurrently locked a different conversation group:

[code]
select * from sys.dm_tran_locks
where resource_subtype = 'CONVERSATION_GROUP';
[/code]

This example shows that one can issue concurrent RECEIVEs and they will not block each other. If, in your case, you are seeing RECEIVEs blocked by each other, that would normally indicate that those RECEIVEs have a WHERE clause and are competing for the same conversation.

HTH,
~ Remus

|||

Hi Remus,

This is what I was asking.

Thanks,

Jim

|||

Jim Stallings wrote:

First, many DBAs have been pounding into me that there are no performance benefits to stored procs since they are compiled on the fly now. Is this not true for SQL 2005?

A stored proc that contains a statement (e.g. a RECEIVE statement) will be compiled once and cached, and then at execution time that statement's plan is simply invoked. If instead, at the same position, the proc contains an EXEC(@sqlBatch) or an sp_executesql @sqlBatch, then at execution time the T-SQL batch contained in @sqlBatch has to be parsed, compiled, and executed. Parsing and compilation are steps that take significant time, comparable with the execution time itself.

If the @sqlBatch is identical each time (character by character), then the system might already have a cached ad-hoc compiled plan for it from the last time it was executed. But if the batch differs by even a single character, it has to be parsed and compiled from scratch. If the @sqlBatch is constructed with an embedded GUID value, then you are guaranteed to have a different @sqlBatch each time, causing parsing and compilation from scratch on every call. If dynamic SQL is a must, a good practice is to construct the @sqlBatch with an embedded parameter and use sp_executesql instead of EXEC (since sp_executesql can pass in parameters).
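
To illustrate the difference (a sketch with a made-up table name, dbo.SomeTable):

[code]
declare @handle uniqueidentifier, @sql nvarchar(500);
set @handle = newid();

-- The GUID is embedded as a literal, so the batch text changes on every call
-- and a fresh parse/compile is needed each time.
set @sql = N'select * from dbo.SomeTable where conversation_handle = '''
         + cast(@handle as nvarchar(36)) + N'''';
exec(@sql);

-- The batch text is constant and the value is passed as a parameter,
-- so the cached ad-hoc plan can be reused.
set @sql = N'select * from dbo.SomeTable where conversation_handle = @h';
exec sp_executesql @sql, N'@h uniqueidentifier', @h = @handle;
[/code]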

A good read on the subject is: http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx. Have those DBAs read it too ;-)

HTH,
~ Remus
