Friday, March 30, 2012

long transaction

Hi,

I am still not very proficient in SQL Server, so apologies if the
question sounds basic.

We have a script to clean out old, unwanted data. It basically deletes
all rows that are more than 2 weeks old. It deletes data from 33
tables, and the number of rows in each table runs into a few million.
What I see in the script (not written by me :-) ) is that all the data
is deleted within a single BEGIN TRANSACTION and COMMIT TRANSACTION.
I have a background in Informix, where such an action may result in a
"LONG TRANSACTION PROBLEM". Does SQL Server have a similar concept?

Also, won't there be a performance problem if all the rows stay locked
until they are committed?

TIA."rkusenet" <rkusenet@.sympatico.ca> wrote in message
news:bq071h$1tbn6v$1@.ID-75254.news.uni-berlin.de...
> Hi,
> I am still not very proficient in SQL Server, so apologies if the
> question sounds basic.
> We have a script to clean out old, unwanted data. It basically deletes
> all rows that are more than 2 weeks old. It deletes data from 33
> tables, and the number of rows in each table runs into a few million.
> What I see in the script (not written by me :-) ) is that all the data
> is deleted within a single BEGIN TRANSACTION and COMMIT TRANSACTION.
> I have a background in Informix, where such an action may result in a
> "LONG TRANSACTION PROBLEM". Does SQL Server have a similar concept?

I'm not sure what the "LONG TRANSACTION PROBLEM" in Informix is, but yes,
you basically don't want transactions to run for long periods of time.

As you point out, this can cause blocking on reads depending on your
isolation level. In addition, it can make recovery that much harder.

Assume your transaction takes 20 minutes to run (admittedly a really
long time) and your server gets rebooted 19 minutes into it. You're
looking at probably at least 19 minutes of rollback time when the
server comes back up, and during that time the database will be
completely inaccessible.

You have a couple of options that depend on your environment.

One is simply to break this up into multiple transactions. Of course,
it then depends on what happens if one fails. Does that matter to other
transactions? Are there any dependencies?
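For example (a minimal sketch; the Orders/OrderHistory tables and the
CreatedDate column are made-up names, since the original schema isn't
shown), each table could get its own short transaction:

-- One short transaction per table instead of one giant transaction
-- around all 33 deletes. Table and column names are hypothetical.
BEGIN TRANSACTION
DELETE FROM Orders WHERE CreatedDate < DATEADD(wk, -2, GETDATE())
COMMIT TRANSACTION

BEGIN TRANSACTION
DELETE FROM OrderHistory WHERE CreatedDate < DATEADD(wk, -2, GETDATE())
COMMIT TRANSACTION

-- ...and so on for the remaining tables.

A failure then only rolls back the delete on the table it happened on,
which is fine as long as the tables don't need to stay consistent with
each other.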

It may also be possible to copy the data you want to keep into a temp
table, truncate the original table, and move the data back.
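Something along these lines (again a sketch with invented names, and it
assumes the table has no foreign keys referencing it and no identity
column, since TRUNCATE TABLE and a plain INSERT ... SELECT won't handle
those):

-- Copy the recent rows out, truncate, and copy them back in.
SELECT *
INTO #KeepRows
FROM Orders
WHERE CreatedDate >= DATEADD(wk, -2, GETDATE())

TRUNCATE TABLE Orders   -- minimally logged, unlike a huge DELETE

INSERT INTO Orders
SELECT * FROM #KeepRows

DROP TABLE #KeepRows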

But regardless, I would try to redesign this.

> Also, won't there be a performance problem if all the rows stay locked
> until they are committed?
> TIA.

"rkusenet" <rkusenet@sympatico.ca> wrote in message
news:<bq071h$1tbn6v$1@ID-75254.news.uni-berlin.de>...
> Hi,
> I am still not very proficient in SQL Server, so apologies if the
> question sounds basic.
> We have a script to clean out old, unwanted data. It basically deletes
> all rows that are more than 2 weeks old. It deletes data from 33
> tables, and the number of rows in each table runs into a few million.
> What I see in the script (not written by me :-) ) is that all the data
> is deleted within a single BEGIN TRANSACTION and COMMIT TRANSACTION.
> I have a background in Informix, where such an action may result in a
> "LONG TRANSACTION PROBLEM". Does SQL Server have a similar concept?
> Also, won't there be a performance problem if all the rows stay locked
> until they are committed?
> TIA.

If the data is unwanted, it seems unusual to delete it like that,
unless there really is a genuine requirement to do it inside a single
transaction. A large transaction like that can easily cause
performance and locking problems.

Assuming there isn't a real need to use a single transaction, you could
do batch deletes instead, perhaps as an overnight maintenance job so
that users don't see unexpected results if they run queries during the
deletion.
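For example (a sketch only; the Orders table and CreatedDate column are
invented names), SET ROWCOUNT caps how many rows each DELETE touches,
and with no explicit BEGIN TRANSACTION each pass through the loop
commits on its own:

-- Delete rows older than two weeks in chunks of 5000, so no single
-- transaction ever grows large. On SQL Server 2005 and later,
-- DELETE TOP (5000) ... achieves the same thing.
SET ROWCOUNT 5000

WHILE 1 = 1
BEGIN
    DELETE FROM Orders
    WHERE CreatedDate < DATEADD(wk, -2, GETDATE())

    IF @@ROWCOUNT = 0 BREAK   -- nothing left to delete
END

SET ROWCOUNT 0   -- reset to unlimited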

Simon

"Simon Hayes" <sql@hayes.ch> wrote
> If the data is unwanted, it seems unusual to delete it like that,
> unless there really is a genuine requirement to do it inside a single
> transaction.

I don't see any such requirement. It can always continue next time if
the delete fails for some reason.

> A large transaction like that can easily cause
> performance and locking problems.

That's what I thought.

> Assuming there isn't a real need to use a single transaction, you
> could do batch deletes instead, perhaps as an overnight maintenance
> job so that users don't see unexpected results if they run queries
> during the deletion.

I plan to remove the BEGIN and COMMIT. That way each DELETE will be
atomic by itself.

thanks.
