Teaching Troubleshooting

Status
Not open for further replies.

spamly

MIS
Apr 1, 2004
US
I know that there are a few different forums that I could post this, but I thought I'd try this one as most, or all, here should understand where I'm coming from.

I'd like to get some advice from the masses concerning UNIX cross-training. My workcenter has a few "new" UNIX admins who are fairly good at following detailed instructions, but are lost if anything comes up outside our standard documentation. They'd prefer to have documentation on every conceivable procedure and issue that could arise. I've tried directing them to a man page or a UNIX admin book, but they either ignore me or tell me that they can't understand it. They have also attended two UNIX/Linux sysadmin courses, but I don't think they have retained much.

This gets even more troublesome as they want to have a single fix for a given problem. One question I had recently was "Why doesn't ftp work for me on this server?" I told them there are a lot of things that can prevent ftp from working and asked them what they had checked so far. Typically, they don't bother checking anything and they are looking for the single command to fix the issue. As you probably know, there isn't a single-command fix for everything.
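To illustrate "what have you checked so far?", here is a minimal sketch of a layered check for the ftp question: name resolution first, then a TCP connection, then the server greeting. The function name `diagnose_ftp` and the three-step order are assumptions made for illustration, not anything from our documentation, and real causes (firewalls, passive mode, account lockouts, a full filesystem) go well beyond these three steps.

```python
import socket


def diagnose_ftp(host, port=21, timeout=3.0):
    """Run layered checks; return a list of (check, ok, detail) tuples.

    Hypothetical helper: it stops at the first failing layer, so the last
    entry in the result shows where to start looking.
    """
    results = []

    # 1. Can the name be resolved at all?
    try:
        addr = socket.gethostbyname(host)
    except OSError as exc:
        results.append(("resolve", False, str(exc)))
        return results
    results.append(("resolve", True, addr))

    # 2. Is anything accepting TCP connections on the control port?
    try:
        conn = socket.create_connection((addr, port), timeout=timeout)
    except OSError as exc:
        results.append(("connect", False, str(exc)))
        return results
    results.append(("connect", True, f"{addr}:{port}"))

    # 3. Does the service greet us? FTP servers normally send a 220 reply.
    with conn:
        try:
            banner = conn.recv(256).decode("ascii", "replace").strip()
        except OSError as exc:
            results.append(("banner", False, str(exc)))
            return results
    results.append(("banner", banner.startswith("220"), banner))
    return results


# Example use (the host name is a placeholder):
# for check, ok, detail in diagnose_ftp("ftp.example.com"):
#     print(f"{check:8} {'ok' if ok else 'FAILED'}  {detail}")
```

The point is less the script than the habit: each layer either passes or tells you which direction to dig.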

I believe that they really need troubleshooting skills to progress as administrators. How do you teach this? I've encouraged them to build test servers. I've also built test servers for them. I've had one-on-one sessions with them where I explain and demonstrate how I troubleshoot a problem. I've also encouraged them to tackle a problem while I take a backseat. This hasn't worked for us yet and it has been over two years.

These admins are new to UNIX, but have extensive Windows Server, Novell, and/or DataComm experience. They are also categorized as "senior" or "lead" administrators, but perform as junior administrators and don't show much initiative.

Any ideas as to what I can do? Is there anyone else in a similar situation, either as an admin who wants to progress or as a lead who is struggling to cross-train peers? All advice is welcome. Thanks in advance!
 
Maybe it's the old fogey in me but there's no substitute for experience.

Perhaps you could try 'throwing them in at the deep end'. Use a non-time-critical problem as a training exercise. Point them towards 'man', Google and Tek-Tips and let them get on with it. I learnt most on late night support when there was no one else around and I had to find the solution.

Ceci n'est pas une signature
Columb Healy
 
Insert a known bug in a testbed system and put them to work finding it. Nothing like spending a day realizing that your previous notions are worthless.
Had it happen to me at an IBM school. All the previous experience and book stuff didn't mean diddly. Two of us chased a blown fuse all day and ended up stumped.
My favorite example, several years later, was a student telling me that a broken wire problem didn't happen in real life. This after he had spent 8 hours not understanding that the machine he was working on hadn't made the first required A or B decision, which was about a 20 second determination in normal troubleshooting.
Your hardest task is creating realistic problems.
 
There are 'cut down' (shortened) courses from some vendors for experienced sys admins in other UNIX varieties. And I remember attending a 'Troubleshooting' course for an older version of Tru64. Other courses may exist. Use Google.

From your second last paragraph, it sounds like you can make up your own 'job titles' (it appears some now match the 'ego' not the experience).


I hope that helps.

Mike
 
Thanks for all the feedback! I really appreciate it.

Columb, I learned by getting thrown in the deep end, so I know where you're coming from. Maybe I can get them working on more of the phone calls coming into the workgroup. Most 'critical' calls go straight to a real lead administrator and, to be honest, I'm not sure that I want them working on mission critical server problems. I work in a health care setting and some server outages could significantly impact the quality of patient care we provide.

I'll poke around for different troubleshooting courses on the net. I really need to encourage them to get the fundamentals as well as troubleshooting experience. Maybe I can talk with our manager and see if I can get him to require a couple of specific technical classes a year.

I'll also check with these other admins to see if they are okay with me creating problems in test systems for them to work out. I'm not sure if they'd be able to handle some basic problems and I don't want to come off as condescending.

Our job titles are tied to years of consciousness in the IT field, not necessarily to skills. As you can imagine, the people that have been around for a while love the system, but the newer or younger IT employees dislike it. It really promotes loyalty to the IT industry. When I criticized it I was simply told "We can't please everyone" and "It's not a perfect system". Oh, well.

Thanks again for all the feedback. All is welcome!
 
In my limited experience, there are "operators" and there are "administrators", and in the several attempts I've seen, no one has jumped successfully from the operator mindset to the administrator mindset.

Good luck with it though.
 
When I see the "They need a...documentation on every conceivable procedure",
my first impression is: They don't understand how the system works.
That's what we all need to be able to troubleshoot,
we need to understand how the machine works.
If they come from a Windows environment, they may not "get it",
and give up right away.
And let's face it, not everybody has what it takes to
become a good SysAdmin.

Maybe the book "The Practice of System and Network Administration"
could be a good read for them ;-)
 
We suffer something similar in my work place.
No procedure means no knowledge.

The solution is draconian. Put them on 24x7 support.
When they start getting calls at 3am they will take the time to learn their systems and be proactive with issues.

However, as Columb quite rightly said, experience counts for everything in the sys admin world.
 
I don't consider someone an admin who can only function because of well-laid out instructions. Anyone who can read and think can do that. Part of being an admin is being able to get a feel for what might be causing your problem, or even where to start looking. As a few others have said before, you can't teach experience.

There are a few givens, like checking the messages/syslog files, console logs, checking for stale NFS mounts, full filesystems, etc., but the rest of it comes with years of 'Oh, I didn't know that' revelations. I've been a Solaris admin for a while, and I'm learning new things, small and large, every day. You can't learn to troubleshoot by any way other than getting in there and figuring it out. As someone also said previously, tek-tips and google are my two biggest resources for when something comes up that I'm not familiar with, and usually one or both of them pays off.
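As a rough sketch of those "givens", the two helpers below flag nearly full filesystems and pull suspicious lines out of a log. The function names, the 90% threshold, and the keyword list are my own assumptions for illustration; the percentage is computed as used/total, so it can differ slightly from what `df` reports.

```python
import shutil


def full_filesystems(mount_points, threshold_pct=90.0):
    """Return (mount_point, percent_used) for filesystems at or above the threshold."""
    flagged = []
    for mp in mount_points:
        usage = shutil.disk_usage(mp)
        if usage.total == 0:
            # Some pseudo filesystems report zero size; skip them.
            continue
        pct = 100.0 * usage.used / usage.total
        if pct >= threshold_pct:
            flagged.append((mp, round(pct, 1)))
    return flagged


def suspicious_log_lines(lines, keywords=("error", "fail", "panic", "stale")):
    """Return log lines containing any keyword, compared case-insensitively."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits


# Example use (paths are placeholders; log locations vary by OS):
# print(full_filesystems(["/", "/var", "/tmp"]))
# with open("/var/adm/messages", errors="replace") as log:
#     for hit in suspicious_log_lines(log):
#         print(hit)
```

None of this replaces actually reading the logs, but it shows the kind of first pass worth doing before asking for the one magic command.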

The best thing to do in my opinion is give those people some older/unused systems to take home. Let them jumpstart/load them, have them just play around and figure things out. That doesn't compare to a large company's trouble tickets, but it's a start for getting to know how the OS works, and possibly where to look when something gets borked. When a problem comes up, give it to all of them and see if they can independently come up with a solution. Being told how something was fixed won't teach you anything (except maybe how to take a lot of notes) so they have to experience it for themselves.
 
The real question is, "do they live it" or do they just "work it." It seems in the computer world there are people who are genuinely interested and like to learn, and then there are those that went into it for money or other reasons and don't live it.

Urge them to install Linux or BSD on their home boxes, shove the latest technical journals (with a few random man pages you print out stuck in them) at them and urge them to read them FOR PLEASURE in their off time, and try to get them excited about what they are doing. Granted, you can't force this stuff, but gentle nudging can work well.

Create a very obscure "once in a lifetime" type error on a test server and push them to solve it... Make it a competition (watch the beginning of this season's House for ideas on how to make this interesting) and then offer a reward to whoever solves it first. When one wins, tell them both that "this is how you should have been doing all your tickets/troubleshooting from the beginning..."

 
These are all good suggestions, and positive.

Personally I've resigned positions as lead when personnel who were unsuitable and incompetent were employed and management refused to act to correct the situation.

In my opinion this is not a technical issue. This is an HR and management accountability issue. If I had it to do over again I'd have escalated to the hiring manager's boss.
 
Thanks for all the feedback!

One thing that I should probably point out is that I'm the "young" guy here. I do, however, have more education (both formal university and sysadmin training) than almost anyone. I also have over 13 years of really good UNIX experience.

The "rookies" have 25-35+ years of experience in the IT field, but nothing UNIX specific. I have created test servers for them in the past, but they have not taken advantage of the situation to learn anything. I even have a test system sitting idle right now in a spare cubicle. I've also installed VMWare on their local PCs and supplied them with media for various flavors of Linux. In case they didn't want to spend the time building a server, I've also created several images that they can load directly on their VMWare servers, but you can guess where this went.

I work on some more complex servers (clusters, multiple databases on a single server, 1000+ "custom" printer queues, complex monitoring scripts tied into our enterprise monitoring solution), and I know it's not easy to transfer this knowledge. In response to this, I've coordinated training sessions where I've distributed diagrams detailing key server and application architectures.

We have some informal hierarchy in our workgroup as it pertains to our positions. They know that they are the junior admins. Unfortunately this doesn't create a relationship that puts me, or any senior admin, formally above them. This means that I can only suggest courses of work or training for them. I can't order them to do anything. I also don't have the authority to put them on call 24x7. I do have a good working relationship with our director and he respects me professionally, so I am going to tactfully request that he implement some of your suggestions. He knows there is a problem and is looking for a solution that doesn't alienate any of his staff.
 
That leads to the next step. Money or perks go to those who follow the manager's suggestions to improve themselves.

 
Spamly, I am a previous (and current) sysadmin in Windows and have only recently inherited 30+ Linux servers of different flavours. From my experience:

- People have to WANT to learn/master Linux. As the other posts have indicated, experience is paramount. You have set them up correctly with training, VMWare images and lab testing. It's all a mindset
- As a Windows admin, I try to create documentation where I map how Windows tasks are done in comparison to Linux
- I jump on forums such as this and try to pick up some troubleshooting techniques from the posts and even try to answer questions
- I have worked with some excellent Linux sysadmins and developers and pick their brains as much as possible. If you can find someone who has more of a willingness to learn, mentor him/her and have them keep training the others
- Again, as other posts have indicated, if you are already a lousy sysadmin on other platforms, you definitely won't be any better on a Unix platform -- I would get rid of those guys

That's my 2 cents -- Hope it helps
 
It may help to set up a knowledge base: when someone else fixes a fault, get them to document the solution and stick it in a document repository like, say, owl (Google it, it's free). It would also be a nice project / learning exercise for your team to set it up in the first place (it will need them to think).

For noddy stuff you could also get them to implement webmin.

But at the end of the day there is only so much you can do if they are not willing to help themselves; it may be either a confidence thing or a can't-be-bothered thing. If it's the second, get rid of the deadwood and bring new people in.

Mike

"Whenever I dwell for any length of time on my own shortcomings, they gradually begin to seem mild, harmless, rather engaging little things, not at all like the staring defects in other people's characters."
 
I thought I'd add this in..

Possibly your company should back up and take a whole-hearted look at protecting themselves. Have every employee take an Analytic Troubleshooting course or two. Start the mind thinking in the right direction and you'll be halfway to solving all your world's problems ;)

ps> I don't work for them nor have anything to do with the company. No plug intended.. I did take one of these courses though. It's not for the experienced troubleshooter (or you wouldn't have had the job you do for as long as you have), but may be just what these junior types need. First teach em how to troubleshoot, then set up databases and all that fun stuff.

-ram
 