Wednesday, March 28, 2012

long running query

Using SS2000 SP4. I have a query that has been running for 17 hours now and
I'm wondering if it is doing anything. The transaction log for the database
has the same time stamp on it that it had yesterday before I started running
the query and the file size is the same. Shouldn't something have been
written to the transaction logs by now?
There is nothing else running on the server of any consequence. There is
disk activity (reads but no writes ) on the disk where the database is. Ther
e
are no reads or writes on the disk where the transaction logs are located.
The processor is at 2%. There is no blocking going on.
Here's the query. I have indexes on all columns in the join and where
clause. There are about 11M records in the table. There is no problem with
free disk space. The tempdb and logs also have a time stamp before I started
the query.
UPDATE tblwf3
SET BRANCH = a.branch_id
from dbo.tblBranchZip a, tblwf3 b
WHERE a.zip = b.zip5
and a.branch_id < '2900'
and a.Franchise is null
and a.closed is null
and a.commercial is null
and a.fume is null
--and b.branch is null
go
Any idea what might be going on? Thanks.
--
Dan D.How about some nasty cartesian product? joining on a zip field (assuming
zip code) would likely result in billions or possibly trillions of joins
between the tables.
Do this. Take the update statement and paste it into another QA window.
Hit CTRL-L to have the Estimated Query Plan displayed. Place the cursor
over the connecting lines in the plan and note the estimated row count.
Also check the left most icon for total estimated rows and the estimated
query cost. I think you may be surprised by what you see.
Also, note that your join is not ANSI compliant and will not be supported by
sql server at some point in the future. Start working to rewrite stuff like
that into this format:
UPDATE tblwf3
SET BRANCH = a.branch_id
from dbo.tblBranchZip a inner join tblwf3 b ON a.zip = b.zip5
WHERE a.branch_id < '2900'
and a.Franchise is null
and a.closed is null
and a.commercial is null
and a.fume is null
TheSQLGuru
President
Indicium Resources, Inc.
"Dan D." <DanD@.discussions.microsoft.com> wrote in message
news:EAFCE68D-33C7-48BC-A16A-AA42821FBB17@.microsoft.com...
> Using SS2000 SP4. I have a query that has been running for 17 hours now
> and
> I'm wondering if it is doing anything. The transaction log for the
> database
> has the same time stamp on it that it had yesterday before I started
> running
> the query and the file size is the same. Shouldn't something have been
> written to the transaction logs by now?
> There is nothing else running on the server of any consequence. There is
> disk activity (reads but no writes ) on the disk where the database is.
> There
> are no reads or writes on the disk where the transaction logs are located.
> The processor is at 2%. There is no blocking going on.
> Here's the query. I have indexes on all columns in the join and where
> clause. There are about 11M records in the table. There is no problem with
> free disk space. The tempdb and logs also have a time stamp before I
> started
> the query.
> UPDATE tblwf3
> SET BRANCH = a.branch_id
> from dbo.tblBranchZip a, tblwf3 b
> WHERE a.zip = b.zip5
> and a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
> --and b.branch is null
> go
> Any idea what might be going on? Thanks.
> --
> Dan D.|||As usual, TheSQLGuru is spot on, although I am not sure whether the SQL
(s)he posted is actually standard.
The version below is, and because it (also) eliminates the implicit
cross join it should run in a reasonable amount of time.
UPDATE tblwf3
SET BRANCH = (
SELECT a.branch_id
from dbo.tblBranchZip a
WHERE a.zip = tblwf3.zip5
and a.branch_id < '2900'
and a.Franchise is null
and a.closed is null
and a.commercial is null
and a.fume is null
)
--where b.branch is null
HTH,
Gert-Jan
Dan D. wrote:
> Using SS2000 SP4. I have a query that has been running for 17 hours now an
d
> I'm wondering if it is doing anything. The transaction log for the databas
e
> has the same time stamp on it that it had yesterday before I started runni
ng
> the query and the file size is the same. Shouldn't something have been
> written to the transaction logs by now?
> There is nothing else running on the server of any consequence. There is
> disk activity (reads but no writes ) on the disk where the database is. Th
ere
> are no reads or writes on the disk where the transaction logs are located.
> The processor is at 2%. There is no blocking going on.
> Here's the query. I have indexes on all columns in the join and where
> clause. There are about 11M records in the table. There is no problem with
> free disk space. The tempdb and logs also have a time stamp before I start
ed
> the query.
> UPDATE tblwf3
> SET BRANCH = a.branch_id
> from dbo.tblBranchZip a, tblwf3 b
> WHERE a.zip = b.zip5
> and a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
> --and b.branch is null
> go
> Any idea what might be going on? Thanks.
> --
> Dan D.|||I had checked the execution plan and didn't get any large numbers that seeme
d
out of the ordinary. On the lower level the first step uses a clustered inde
x
and gives a row count 0f 504. The step on the level above it has another
clustered index and gives a row count of 24,000+. Then there is a nested loo
p
and parallelism steps that give row counts of 11M but since I know that it's
updating every row in the table I think that is to be expected. The left mos
t
icon shows 11M rows.
Thanks,
--
Dan D.
"TheSQLGuru" wrote:

> How about some nasty cartesian product? joining on a zip field (assuming
> zip code) would likely result in billions or possibly trillions of joins
> between the tables.
> Do this. Take the update statement and paste it into another QA window.
> Hit CTRL-L to have the Estimated Query Plan displayed. Place the cursor
> over the connecting lines in the plan and note the estimated row count.
> Also check the left most icon for total estimated rows and the estimated
> query cost. I think you may be surprised by what you see.
> Also, note that your join is not ANSI compliant and will not be supported
by
> sql server at some point in the future. Start working to rewrite stuff li
ke
> that into this format:
> UPDATE tblwf3
> SET BRANCH = a.branch_id
> from dbo.tblBranchZip a inner join tblwf3 b ON a.zip = b.zip5
> WHERE a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
>
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "Dan D." <DanD@.discussions.microsoft.com> wrote in message
> news:EAFCE68D-33C7-48BC-A16A-AA42821FBB17@.microsoft.com...
>
>|||I didn't respond to your suggested rewrite of the query. That is normally ho
w
I write queries and it is what I tried first but there was no difference in
performance. I was grasping at straws. I can update the with either query in
< 10 secs if I have 100K rows. I just didn't expect 11M rows to take so long
.
I have 1.74T of free space on RAID 0. We wanted to do RAID 10 but found out
the controller wouldn't support it so I decided to test with RAID 0.
Thanks,
--
Dan D.
"TheSQLGuru" wrote:

> How about some nasty cartesian product? joining on a zip field (assuming
> zip code) would likely result in billions or possibly trillions of joins
> between the tables.
> Do this. Take the update statement and paste it into another QA window.
> Hit CTRL-L to have the Estimated Query Plan displayed. Place the cursor
> over the connecting lines in the plan and note the estimated row count.
> Also check the left most icon for total estimated rows and the estimated
> query cost. I think you may be surprised by what you see.
> Also, note that your join is not ANSI compliant and will not be supported
by
> sql server at some point in the future. Start working to rewrite stuff li
ke
> that into this format:
> UPDATE tblwf3
> SET BRANCH = a.branch_id
> from dbo.tblBranchZip a inner join tblwf3 b ON a.zip = b.zip5
> WHERE a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
>
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "Dan D." <DanD@.discussions.microsoft.com> wrote in message
> news:EAFCE68D-33C7-48BC-A16A-AA42821FBB17@.microsoft.com...
>
>|||I should have added that I can also run a query and update 5M rows in about
10 minis. It's just when I try 11M rows something happens and it takes
forever.
--
Dan D.
"TheSQLGuru" wrote:

> How about some nasty cartesian product? joining on a zip field (assuming
> zip code) would likely result in billions or possibly trillions of joins
> between the tables.
> Do this. Take the update statement and paste it into another QA window.
> Hit CTRL-L to have the Estimated Query Plan displayed. Place the cursor
> over the connecting lines in the plan and note the estimated row count.
> Also check the left most icon for total estimated rows and the estimated
> query cost. I think you may be surprised by what you see.
> Also, note that your join is not ANSI compliant and will not be supported
by
> sql server at some point in the future. Start working to rewrite stuff li
ke
> that into this format:
> UPDATE tblwf3
> SET BRANCH = a.branch_id
> from dbo.tblBranchZip a inner join tblwf3 b ON a.zip = b.zip5
> WHERE a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
>
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "Dan D." <DanD@.discussions.microsoft.com> wrote in message
> news:EAFCE68D-33C7-48BC-A16A-AA42821FBB17@.microsoft.com...
>
>|||I can see that the execution plan is quite different from the one I was
using. I tried your query on 5M rows and it finished in 11mins which is the
same as the query I was using. However, when I ran it on the 22M row table i
t
only took 2mins. The indexes are the same so I'm not sure what is going on.
I'll test some more.
Thanks,
--
Dan D.
"Gert-Jan Strik" wrote:

> As usual, TheSQLGuru is spot on, although I am not sure whether the SQL
> (s)he posted is actually standard.
> The version below is, and because it (also) eliminates the implicit
> cross join it should run in a reasonable amount of time.
> UPDATE tblwf3
> SET BRANCH = (
> SELECT a.branch_id
> from dbo.tblBranchZip a
> WHERE a.zip = tblwf3.zip5
> and a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
> )
> --where b.branch is null
> HTH,
> Gert-Jan
> Dan D. wrote:
>|||I don't doubt that you are updating 11M rows. The question is how many
TIMES are you updating those 11M rows. 24000+ times perhaps? Not sure
since I don't have access to the data.
However, I can state with almost 100% certainty that this update should NOT
be using a nested loop!! Try using OPTION (HASH JOIN) or MERGE JOIN (prolly
not right if not in sorted order already). See BOL for details.
TheSQLGuru
President
Indicium Resources, Inc.
"Dan D." <DanD@.discussions.microsoft.com> wrote in message
news:E77D9053-A6FE-4E46-9A2F-A42CA967B0B2@.microsoft.com...[vbcol=seagreen]
>I had checked the execution plan and didn't get any large numbers that
>seemed
> out of the ordinary. On the lower level the first step uses a clustered
> index
> and gives a row count 0f 504. The step on the level above it has another
> clustered index and gives a row count of 24,000+. Then there is a nested
> loop
> and parallelism steps that give row counts of 11M but since I know that
> it's
> updating every row in the table I think that is to be expected. The left
> most
> icon shows 11M rows.
> Thanks,
> --
> Dan D.
>
> "TheSQLGuru" wrote:
>|||1) From SQL 2005 BOL,
ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/40e63302-0c68-4593-af3e-
6d190181fee7.htm,
about 3/4 down the page, you will find this example:
USE AdventureWorks;
GO
UPDATE Sales.SalesPerson
SET SalesYTD = SalesYTD + SubTotal
FROM Sales.SalesPerson AS sp
JOIN Sales.SalesOrderHeader AS so
ON sp.SalesPersonID = so.SalesPersonID
AND so.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE SalesPersonID =
sp.SalesPersonID);
GO2) "As usual, TheSQLGuru is spot on..." - thanks very much for the
compliment!!
3) "(s)he" - it is he actually. Kevin. Glad to meet you! :-)
TheSQLGuru
President
Indicium Resources, Inc.
"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:46324FBF.204D08B8@.toomuchspamalready.nl...[vbcol=seagreen]
> As usual, TheSQLGuru is spot on, although I am not sure whether the SQL
> (s)he posted is actually standard.
> The version below is, and because it (also) eliminates the implicit
> cross join it should run in a reasonable amount of time.
> UPDATE tblwf3
> SET BRANCH = (
> SELECT a.branch_id
> from dbo.tblBranchZip a
> WHERE a.zip = tblwf3.zip5
> and a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
> )
> --where b.branch is null
> HTH,
> Gert-Jan
> Dan D. wrote:|||I've tried your method on a couple of other queries and it is definitely
faster. How would you write the query if I wanted to update several rows. Fo
r
instance,
UPDATE tblwf3
SET BRANCH = a.branch_id, term=a.term, hhseg=a.hhseg
from dbo.tblBranchZip a inner join tblwf3 b
ON a.zip = b.zip5
and a.branch_id < '2900'
and a.Franchise is null
and a.closed is null
and a.commercial is null
and a.fume is null
Thanks,
Dan D.
"Gert-Jan Strik" wrote:

> As usual, TheSQLGuru is spot on, although I am not sure whether the SQL
> (s)he posted is actually standard.
> The version below is, and because it (also) eliminates the implicit
> cross join it should run in a reasonable amount of time.
> UPDATE tblwf3
> SET BRANCH = (
> SELECT a.branch_id
> from dbo.tblBranchZip a
> WHERE a.zip = tblwf3.zip5
> and a.branch_id < '2900'
> and a.Franchise is null
> and a.closed is null
> and a.commercial is null
> and a.fume is null
> )
> --where b.branch is null
> HTH,
> Gert-Jan
> Dan D. wrote:
>

No comments:

Post a Comment