
Authentication Fails 2


a2jam
Nov 9, 2007
I have an Exchange cluster (active/passive) with two physical nodes. CommVault is configured to back up the virtual server, no matter which node is active. If the first node is active, CommVault works just fine. If the Exchange server fails over to the second node, the backup job sits in a pending state with the message "Waiting for the services on the client [servername] to come online." When I check connectivity to the virtual server, I get "Failed to connect to client computer. Authentication failed." However, the connectivity check comes up successful for both physical nodes.

I'd appreciate any help I can get before the Exchange admin gets tired of me asking him to fail the cluster back to the first node.

thanks
 
Hi

I'd first check whether the Exchange iDA was installed correctly....what version of Galaxy is this?
Do you see 3 nodes for the cluster within the CommCell Console?...there should be 2 x physical showing just the filesystem iDA and 1 x virtual showing the filesystem and Exchange iDAs.

If Galaxy 6.1, then the Exchange agent should have been installed like so:
1) Install the filesystem iDAs on both physical nodes (using CD 1b)
2) Install the Exchange iDA on the virtual node from the active node (using CD 5, which is cluster aware)

After installing the Exchange agent you get the option to update the passive node....you should do this.

You should have no problems when the cluster fails over if it's installed correctly, but you will see a slight delay before the backup resumes.

CommVault Certified Engineer
 
You also have to have the physical Win FS iDA on both nodes prior to starting.

Also, the install needs to be done from the primary node.
Also, verify the hosts file entries for each cluster instance.
If it has a cluster name, it has to have an entry; see the sketch below.
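
Just as an example, the hosts file on each physical node should end up looking something like this (names and addresses here are placeholders, not values from this thread):

rem Display the hosts file on each physical node
type %SystemRoot%\System32\drivers\etc\hosts

rem Expect entries along these lines (placeholder values):
rem   10.0.0.11   node1.domain.local       node1
rem   10.0.0.12   node2.domain.local       node2
rem   10.0.0.20   exchvirt.domain.local    exchvirt    (the virtual/cluster name)
rem   10.0.0.5    commserve.domain.local   commserve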

Other than that, I would go with Birky and check your install.

Your physical host nodes will always show connectivity from the GUI. It is the virtual servers that are tricky.

I would install on the primary virtual node and answer yes to install on the passive... then fail the system over and check Cluster Administrator to see if the "GX..." services started on failover; one way to check is sketched below.
If they are running, it should work.
If they are not, there is a hosts file or install issue, and you should repost what you see.
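
A quick way to look for them without opening Cluster Administrator (just a sketch; the exact Galaxy service names, like GxCVD, vary by version and instance, so treat them as examples):

rem List all services and pick out the Galaxy ones (names start with Gx)
sc query type= service state= all | findstr /i "Gx"

rem Or query the CVD service directly (GxCVD is an example name)
sc query GxCVD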

Let us know...

 
Thank you very much for the quick response

It is version 6.1, and the filesystem iDAs (CD-1b) are installed on both physical servers and the Exchange iDA (CD-5) on the virtual. Right now it is on the second node and all Galaxy services are running.....four services in total. And the connectivity check is fine to the physical nodes but not to the virtual.

It looks like I installed it correctly as you mentioned.....installing from the active node and then updating the passive node.

How does the authentication work between a client and the CommServe, and where is the hosts file located? Is there anything else I need to be looking at?

Thanks
 
This is the message I am getting for the job delay:

[CVSession::authenticateClient]:Remote system [virtual FQDN]. Failed authentication returned from server.
 
Here you go:

Cause:
When a job is initiated, the CommServe and client perform an authentication to verify each other. The error above indicates a problem has occurred during the authentication process.

In most cases, however, this error is the result of a communication problem between the client and the CommServe rather than a failure of the actual authentication process.

To begin authentication, the CommServe connects to the client's cvd process on port 8400. The client then connects back to the CommServe's cvd process, also on port 8400. If the client is unable to connect back to the CommServe for any reason, the above error will be triggered. The most common causes are:

Client is unable to resolve the name of the CommServe, or is resolving the name to the wrong address.
Client is unable to reach port 8400 on the CommServe, usually due to a firewall.
Client is attempting to reach the CommServe using the wrong name. Typically this occurs following a CommServe name change.

--------------------------------------------------------------------------------

Resolution:

1) From the CommServe, make note of its IP address.

2) From the client computer, start regedit.exe and navigate to
HKEY_LOCAL_MACHINE\SOFTWARE\CommVaultSystems\Galaxy\Instance00x\CommServe
and note the value of sCSHOSTNAME. This should be the name of the CommServe. If you have changed the name of your CommServe, or the value shown is not the correct name, correct this value from the client properties page in the CommCell Console.

3) From the client computer, ping the CommServe using the name exactly as shown in the sCSHOSTNAME registry entry above. Verify that the name resolves to the IP address noted in step 1. If the client is unable to properly resolve the name of the CommServe, check DNS and the hosts file on the client computer to correct the problem.

4) Open a command prompt on the client computer and enter the following command, using the host name exactly as shown in the sCSHOSTNAME registry entry. If the connection succeeds you will see an empty screen; if it fails, check your network configuration to ensure that the client computer can properly communicate with the CommServe.

telnet <commservename> 8400
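
If it helps, the client-side checks above can be strung together from one command prompt. A sketch, assuming Instance001 and using <commservename> and <commserve IP> as placeholders for the values from steps 1 and 2:

rem 1) Read the CommServe name the client is configured with
reg query "HKLM\SOFTWARE\CommVaultSystems\Galaxy\Instance001\CommServe" /v sCSHOSTNAME

rem 2) Check forward name resolution of that name
nslookup <commservename>

rem 3) Check reverse resolution of the CommServe IP noted earlier
ping -a <commserve IP>

rem 4) Check the port
telnet <commservename> 8400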




CommVault Certified Engineer
 
Do you notice the port number above, FRbutler? ;-) lol



Birky
CommVault Certified Engineer
 
I checked the registry and there are two entries; sCSHOSTNAME in each is my CommServe FQDN.

HKEY_LOCAL_MACHINE\SOFTWARE\CommVaultSystems\Galaxy\Instance001\CommServe
HKEY_LOCAL_MACHINE\SOFTWARE\CommVaultSystems\Galaxy\Instance002\CommServe

and the telnet command to the CommServe checks out fine. Nslookup resolves the IP address to the FQDN and vice versa.

I found this in the CVD log on the CommServe. I had to edit out a couple of things.....the FQDN and cluster name.

5876 5e4 11/09 16:40:07 ###### [CVD ] ** CVD_CVSESSION_ERROR: RemoteHost=FQDN? Error replying to attach on socket: 9000025=[CVSession::authenticateServer]:Remote system [FQDN]. Could not get client password from database - authentication failed.
5876 c64 11/09 16:45:09 ###### ** CVSession::getClientPasswordLocal: Attempting to get client [Cluster_Name] CommCell [2] password from database. Remote [FQDN]Encountered DB error [Error for operation [Close] on table [client]. No rows were affected or no data returned from operation.].
5876 c64 11/09 16:45:09 ###### [CVSession::authenticateServer]:Remote system [FQDN]. Could not get client password from database - authentication failed.
5876 c64 11/09 16:45:09 ###### ** CVSession::replyAttach (PlatformType)
- RemoteHost=FQDN.
- authenticateServer failed. Error=9000025.
 
So it isn't a comms error....the weird thing is that cvd does connect on port 8400.

When Galaxy was installed, did the update of the passive node complete successfully?

If I were in your place, I would do the following to troubleshoot and try to resolve the DB error:
1) Perform a failover to node 2 and reboot node 1, then vice versa....try a backup whilst node 2 is the active node (see the command-line sketch after this list).

2) If the error still occurs from node 2 when it's the active node after the reboot, uninstall the iDAs on all 3 nodes and reinstall....ensuring that SP4 and post-SP4 updates are installed. (Don't delete/deconfigure anything from the CommServe itself, just uninstall/reinstall on the client nodes.)

3) If still no joy, then log a call with CV Support, because they may need to look into the DB.
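
For step 1, the failover can also be driven from the command line on either node with cluster.exe (syntax from memory for Windows Server 2003; "Exchange Group" and NODE2 are placeholders for your group and node names):

rem List the resource groups and which node currently owns each
cluster group

rem Move the Exchange group to the second node
cluster group "Exchange Group" /moveto:NODE2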



Birky
CommVault Certified Engineer
 
