
backup non-domain server

Status
Not open for further replies.

mansley

IS-IT--Management
Jun 15, 2009
US
I'm running version 7.0, and the CommCell and most of my agents are on a domain inside the LAN.

I have installed the iData file system agent on several servers in my DMZ. Those that are domain members work flawlessly, but those that are in a workgroup fail with:
Error Code: [19:599] Description: Loss of control process ifind.exe. Possible causes: 1. The control process has unexpectedly died. Check dr watson log or core file. 2. The communication to the control process machine hydra might have gone down due to network errors. 3. If the machine hydra is a cluster, it may have failed over. 4. The machine hydra may have rebooted.

I know the connectivity test returns 'ready' and that the firewall ports work, since domain-based DMZ machines work fine.

I think this has something to do with permissions, since my CommCell is using a domain account for all services. How do I back up a workgroup server?
 
Quick question, do you specify both host names and IP addresses in your FWHosts and FWPeers files?

 
This is very likely to be a connectivity problem. You can get connectivity "ready" but still have connectivity problems - I had this situation yesterday! Some things to check:

1. From the GUI, right click the client and select Properties. Carefully note the Host Name and CommServe HostName, especially whether they use fully qualified domain names (FQDNs) or short names.

2. Log on to the client. Start regedit and, under the CommServe registry key (HKLM\SOFTWARE\CommVault Systems\Galaxy\Instance001\CommServe), note the sCSHOSTNAME value (the location of the key may be different on your system but it will be something resembling my example). This is the name that the client actually uses to contact the CommServe (which I will refer to as the "CS" from now on). It should exactly match the "CommServe HostName" that you got from step 1 above. If not, someone's been fiddling - make the two match.

3. On the client, run FirewallConfig.exe in the Galaxy Base folder (which by default is C:\Program Files\CommVault\Galaxy\Base). Check that the port range is correct and matches the firewall rules. Check that the CS is named correctly using the previously obtained name (exactly as it appears in the GUI and the registry).

4. On the client, start a command prompt and do ALL of the following:

(a) Ping the CS using the previously obtained name (exactly as it appears in the GUI and the registry). Ping may or may not be blocked by your firewall, so do not be worried if the ping doesn't work, but do note the address it's trying to reach. Is it the correct address for the CS? Fix if not (it may be worth checking the contents of the HOSTS file).

(b) Run nslookup on the CS name previously obtained (exactly as it appears in the GUI and the registry). Is it the correct address for the CS? Fix if not. We do an nslookup as well as a ping because they can and do return different addresses in some circumstances: nslookup queries the DNS server directly and ignores the HOSTS file, whereas ping uses the normal Windows resolver, which also consults the HOSTS file.

(c) To establish that the firewall has the right ports open, telnet to the CS using the following ports and you should see the following results. Be sure to use the CS name from the GUI/registry.

- telnet <commserver-name> 8400
This should give a blank screen. If so, press Ctrl-] (control right-square-bracket) and type quit. If not, you cannot communicate with the CS properly.

- telnet <commserver-name> 8401
This should give a few characters of garbage. If so, press Ctrl-] (control right-square-bracket) and type quit. If not, you cannot communicate with the CS properly.

- telnet <commserver-name> 8402
This should give a few characters of garbage. If so, press Ctrl-] (control right-square-bracket) and type quit. If not, you cannot communicate with the CS properly.

If any of the above telnet connection attempts fail, look at your firewall logs to find out where the problem is, or check the client's persistent routes. Type "route print" at the command prompt to check. See a networks guru if you don't know what to do.

5. Having checked connectivity from the client to the CS, we now need to do the reverse, since the CS initiates connections to the client. Log on to the CS.

6. On the CS, run FirewallConfig.exe in the Galaxy Base folder. Check that the port range is correct and matches the firewall rules. Check that the client is named correctly using the previously obtained Host Name (exactly as it appears in the GUI).

7. On the CS, start a command prompt and do ALL of the following:

(a) Ping the client using the previously obtained name (exactly as it appears in the GUI). Ping may or may not be blocked by your firewall, so do not be worried if the ping doesn't work, but do note the address it's trying to reach. Is it the correct address for the client? Fix if not (it may be worth checking the contents of the HOSTS file).

(b) Run nslookup on the client name previously obtained (exactly as it appears in the GUI). Is it the correct address for the client? Fix if not (bear in mind that nslookup queries DNS directly and ignores the HOSTS file, so check that file separately).

(c) To establish that the firewall has the right ports open, telnet to the client using the following ports and you should see the following results. Be sure to use the client name from the GUI.

- telnet <client-name> 8400
This should give a blank screen. If so, press Ctrl-] (control right-square-bracket) and type quit. If not, you cannot communicate with the client properly.

- telnet <client-name> 8401
This should actually fail!

- telnet <client-name> 8402
This should give a few characters of garbage. If so, press Ctrl-] (control right-square-bracket) and type quit. If not, you cannot communicate with the client properly.

If any of the above telnet connection attempts fail (except the one to 8401), look at your firewall logs to find out where the problem is, or check the CS's persistent routes. Type "route print" at the command prompt to check. See a networks guru if you don't know what to do.

8. Kill any of the client's backups that might be still trying and run another backup. If it works, problem solved! Be happy! If not, be sad, and continue reading...

9. Having established 2-way communications between the CS and the client, guess what? We now have to do the same between the client and the Media Agent (hereinafter called the "MA"). Using the GUI, find out what storage policy the particular sub-client is writing to and go look at it. Find out all of the MAs that it uses. Many (most?) sites will only have one MA per storage policy but do not assume this if you don't know your site well. [If you are finding that your backup sometimes works and sometimes doesn't, it may be able to communicate freely with one MA and not with another!] In the GUI, expand Policies then Storage Policies. Click on the correct storage policy on the left pane, then on the right pane look at the "Copy Type" column. Right click the copy listed "Primary" and select Properties. Go to the Data Paths tab and note down all the MAs in the "MediaAgent Name" column. Your backup could use any of these MAs (those with a tick in "Enabled" anyway).

10. We now need to get the network names used for each of these. Still in the GUI, expand "Storage Resources" then "MediaAgents". Right click each of the MAs obtained from the storage policy, click Properties, and note down the "Host Name". This is the name that we need to check connectivity with.

11. Now we're going to repeat all of our connectivity checks above, this time between the MA (or MAs) and the client. Repeat steps 3 to 7 above, substituting "MA" for "CS" wherever it appears in those steps. Do it all again for each MA. For example, if your check of the storage policy shows 2 enabled MAs, MA1 and MA2, which are listed under MediaAgent properties with the Host Names "ma1.mydomain.com" and "ma2.mydomain.com", then you need to do the above connectivity tests between the client and ma1.mydomain.com (and the reverse) and between the client and ma2.mydomain.com (and the reverse).

12. Now repeat step 8!

13. If that doesn't fix it, please go quietly outside and then let out a loud scream.

14. I haven't yet mentioned DNS reverse lookups. CommVault will tell you that just as names must translate to the correct addresses, addresses *must* also translate back to names (and to the same names as the forward lookup), although I've seen plenty of backups work without this being true. If you've gotten here and it's still not working for you, you should check this. CommVault provide a neat little tool to help with this. It's called CVIPInfo and it's in the resource kit. Get it. Run it on the CS, all relevant MAs, and of course the client. (I do hope you're not going to ask me what the resource kit is. You can download it from the CommVault Maintenance Advantage web site and it contains loads of helpful tools and utilities.)

15. Other things: Try allowing everything through the firewall temporarily. Does it work now? Also make sure the Windows software firewall isn't interfering (turn it off). Check persistent routes.

16. If you've got this far and it's STILL not working for you, then you've got to the limit of my diagnostic abilities without the benefit of seeing your site. And of course my consulting fees are very reasonable! :)
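
If you'd rather script the checks than type them by hand, the telnet tests from steps 4(c) and 7(c) and the forward/reverse DNS comparison from step 14 can be roughly sketched in Python. This is only an illustration, not a CommVault tool and not a substitute for CVIPInfo. The function names are made up for this example, the port numbers are simply the ones used in the telnet tests above, and the script only tells you whether a TCP connection opens, not whether you get the blank screen or garbage that telnet shows.

```python
import socket

# Ports taken from the telnet checks in steps 4(c) and 7(c) above.
GALAXY_PORTS = (8400, 8401, 8402)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def forward_reverse_match(name, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve name to an address, look that address up again, and compare.

    Returns (address, reverse_name, matched). The comparison is exact apart
    from case and a trailing dot, so a short name will not match an FQDN.
    The default lookups go through the system resolver (like ping), so the
    HOSTS file is taken into account, unlike nslookup.
    """
    try:
        address = resolve(name)
    except OSError:
        return None, None, False
    try:
        reverse_name, aliases, _ = reverse(address)
    except OSError:
        return address, None, False
    wanted = name.lower().rstrip(".")
    found = {n.lower().rstrip(".") for n in [reverse_name, *aliases]}
    return address, reverse_name, wanted in found


def check_host(name, ports=GALAXY_PORTS):
    """Print a short report for one peer (CS, MA, or client)."""
    address, reverse_name, matched = forward_reverse_match(name)
    print(f"{name} -> {address} -> {reverse_name} (match: {matched})")
    for port in ports:
        state = "open" if tcp_port_open(name, port) else "closed or blocked"
        print(f"  port {port}: {state}")
```

Run check_host("ma1.mydomain.com") on the client, for instance, and then check_host with the client's Host Name on the CS and on each MA, always using the names exactly as they appear in the GUI. Remember that from the CS to the client, port 8401 is expected to show as closed.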
 
In fact, reverse DNS did seem to be the problem.....thanks!! Craig is a genius!! I have another question about 'importing' a first full backup from other media due to slow WAN links, but will post in a new thread.
 