Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

troubleshooting question

Status
Not open for further replies.

wbodger

Programmer
Apr 23, 2007
769
US
OK, I have been doing the 'dba' thing for a few years, but it was a role that I just kind of slipped into. There have not been any problems (so far) that were not solvable, but if I were to ask the question (as is often asked in an interview) 'If a user came to you and said the system was really sluggish today, what would you do?'. How would you answer that question? My thoughts are to check the system status, run sp_who3 to see what processes are running and then drill deeper on any processes that seem to be taking a lot of system resources. Perhaps check the execution plan to see if there are any steps that seem to be taking the lions share of the execution time, then check to see that the tables are properly indexed. What am I missing, how should my thinking change with these kind of problems, what other things should I be looking at?

Still trying to learn...

wb
 
What do you mean it was sluggish?
What were you running or trying to do at the time?
What error did you get?

Those are two of the questions I ask back. I want to know what they were running since the issue might be a poor script. It could also be they are running an application and the issue is with the application. I usually know their location, so I can decide if the network might be an issue (we have users in almost every state).

The difference between 'what were you running' and 'what were you trying to do' can be huge. It's a whole different issue if it is 'sluggish' when they are trying to connect and if it is 'sluggish' when they are running a query.

So my answer is first you have to narrow down where the issue is? Network? Application? Connection? Script? Other?

-SQLBill



The following is part of my signature block and is only intended to be informational.
Posting advice: FAQ481-4875
 
That makes sense. Narrow down the scope, is it the connection or is it internal to the database. So, if they says 'Well, I was running that script to get the TPS report', knowing that report has run well every other day for the fast year, where do you start? Do you look first at their computer or at the database? On the database, do you look at just that query or do you look at performance as a whole? Where do you look to check performance as a whole? If just that script, do you go thru the execution plan (even though it used to work fine) or do you start elsewhere?

Thank you very much for your help as I am trying to get a better handle on approaching these kind of problems!

wb
 
If it's been running well every other day, I want to know two things first:

1. did they make a recent change to it or any of the databases/tables/procedures/functions that it uses?

2. what else was running at that time?

For #2, I recently had a user complain a report was taking a long time to run. #1 turned up nothing. #2 showed there was a reindexing job added to the server that was running at the same time. I stopped that and the report ran quickly again. Sometimes you need to tweak schedules.

Another time, a report was running slow...sp_who2 showed there was an insert happening on the database and it was a huge one. Then I checked the report script and they were not using (NOLOCK) on the script. I suggested they use that if they could afford 'dirty' reads for their report. Otherwise, they would have to wait until the insert finished.

Basically, if a script has been running fine and all of a sudden isn't...I look for the issue to be somewhere else. Although sometimes I've done a lot of searching only to have the user finally say...well I did make a change to the script, but it was a small one and wouldn't have caused this issue...yeah right.

As for when I check a script...I'll see if I can run it. If so, that gives me a timeline of how long it takes. Now I know that if it runs faster than that, I've improved it. Then I look through the script for obvious issues. I might look to see if columns are heavily used - maybe an index is needed and it wasn't before (this may also indicate the amount of data has drastically increased in the table/database). Then I'll get the execution plan. Next, I may run portions of the script to see where the slowness is. I've found scripts where one subselect slowed the entire thing down. As part of my testing, I'll use hard-coded values instead of parameters...sometimes that will speed it up. If there is a function that does something with the data, try to remove it and hard-code what it does - for example, one script had a function that took a string (a,b,c,d) and converted it for use with an WHERE .... IN. So the script portion looked like this:

WHERE something IN dbo.userfn_fix_string @string

I hard-coded the statement as:

WHERE something IN ('a', 'b', 'c', 'd')

and the script ran much quicker - from 20 minutes down to less than one minute. Bingo! there was the issue.

-SQLBill

-SQLBill

The following is part of my signature block and is only intended to be informational.
Posting advice: FAQ481-4875
 
Great, thank you so much for the input, very helpful in helping me to think more thoroughly in my problem-solving.

Willie
 
Glad I could help. One other 'trick' for troubleshooting scripts that has come in handy for me is the PRINT statement. I will often add PRINT statements to show me where the script is at in that moment or what it is really doing.

For example:
PRINT 'Starting Step1 ' + convert(varchar(20), getdate(), 121)

then later on

PRINT 'End Step1 ' + convert(varchar(20), getdate(), 121)

That will let me know how long specific parts run.

I also add PRINT statements for loops. That let me catch one time when the script was starting from the beginning each time. The output looked something like this:

1-100
1-200
1-300

When it should have been:
1-100
101-200
201-300.

-SQLBill

The following is part of my signature block and is only intended to be informational.
Posting advice: FAQ481-4875
 
So, you can use Print for blocks of code, but what if you have some inherited code that has a number of subqueries making it all one code block. Is there some way to use Print in a situation like this to see what is happening at specific locations in the code?

wb
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top