Active Directory Nightmare....

Status
Not open for further replies.

reynolwi

IS-IT--Management
Sep 7, 2006
452
US
I am really starting to think that someone doesn't like me...

A domain controller at the remote site here in town crashed. I took the old file server that I had taken offline here at central over there, brought it up, and got the crashed server back up. I installed AD on the file server, and then the other server crashed on me again and now it's not coming back up. The server I installed AD on is now having problems replicating data from the FSMO holder, which happens to be the AD server here at the central site. Then, to add to everything, I learned that the AD server at the site 4 hours from us has been offline, and now it's back online looking for the old domain controller that held the FSMO roles, which was replaced last Friday by a new server with a new server name. On top of that, it's looking for the AD server at the remote site here in town, which doesn't exist right now.

I'm in a nightmare.... I can see this really screwing stuff up: all the old AD info on the server that has apparently been offline for a week and a half, and a new server online that is trying its hardest to get AD information.


Right now the new AD server at the remote site here in town has its primary DNS server set to the AD server here at central, which is the FSMO holder. I told DNS on that computer to set up the AD partitions and I didn't see an error, so I'm praying that it starts loading data. I'll be checking on it at 6am 3/20/08.

Someone.... Anyone?

Wm. Reynolds
RRWDS | TxPSS


- - - - - - - - - - - - -
Network Error:
Hit any user to continue
 
It is now 6:50am and I have still gotten nowhere. Nothing is replicating at all. I manually tried creating replication links from the AD server here at the central site and will try to do the same on the other servers.

Wm. Reynolds
RRWDS | TxPSS


 
I created links manually in Sites and Services but could not create links on the server that is 4 hours away. It is still seeing the old AD servers and I cannot list the new ones. The remote site has yet to pick up any AD information, even though I created links on it manually hoping it would help. I don't get any errors on this other server here in town when I go into DNS and tell it to create the AD partitions, yet it refuses to create them.

If I do an nslookup for the domain name on the AD server here at central, it still lists the old AD servers.

> nslookup
default server: servername.domain.net
address: 10.25.18.11

> domain.net
server: servername.domain.net
address: 10.25.18.11

name: domain.net
addresses: 10.25.18.10, 10.25.18.11, 10.25.19.10, 10.25.19.11, 10.25.20.10


That shouldn't be right. How do I remove the old server info from that query result?
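
As a rough aid for the question above, here is a minimal Python sketch that pulls the addresses out of the pasted nslookup answer and flags any that are not in a list of domain controllers known to be online. The `LIVE_DCS` set is a placeholder: the thread only confirms that 10.25.18.11 answered the query, so the set has to be filled in with the real addresses of the current servers before the result means anything. The function names are hypothetical, and the parser only handles output shaped like the paste above, with all addresses on one line.

```python
import re

# Answer text as pasted in the post above.
NSLOOKUP_OUTPUT = """\
server: servername.domain.net
address: 10.25.18.11

name: domain.net
addresses: 10.25.18.10, 10.25.18.11, 10.25.19.10, 10.25.19.11, 10.25.20.10
"""

# ASSUMPTION: placeholder set of DCs that are really online. Only 10.25.18.11
# is confirmed by the thread; replace this with the addresses of your live DCs.
LIVE_DCS = {"10.25.18.11"}


def answer_addresses(output):
    """Return the IPv4 addresses listed on the 'addresses:' line(s)."""
    found = []
    for line in output.splitlines():
        label, _, rest = line.partition(":")
        if label.strip().lower() == "addresses":
            found.extend(re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", rest))
    return found


def stale_addresses(output, live):
    """Return answer addresses that are not in the known-live set."""
    return [ip for ip in answer_addresses(output) if ip not in live]


if __name__ == "__main__":
    for ip in stale_addresses(NSLOOKUP_OUTPUT, LIVE_DCS):
        print("check the DNS host record for", ip)
```

Anything the sketch flags is only a candidate for review; whether a record is really stale still has to be checked against the servers that exist.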

Wm. Reynolds
RRWDS | TxPSS


 
So is your entire AD structure blown up and you are working completely from backups? Have you installed the support tools yet? What have you restored, and have you seized the roles to the new box? Have you restored the system state in its entirety, which would have restored your DNS configs? Are these accurate anymore? Have you gone through them to update or delete the server records? Run netdiag yet? Sounds like quite the spiderweb of stuff you're trying to handle at once.

Cory
 
OK.... as of 9am the new server that was promoted at the remote site here in town is now replicating with the central site. I'm guessing it finally picked up on the links I manually created and used those. It is now replicating just fine.

The new problem is that the AD server that is 4 hours away is still not picking up on the new AD servers here and is trying to poll the old server names.

Below is how the enterprise used to be set up...

Central Site:
--------
Server1 - AD, DC, GC, DNS, WINS, DHCP, ANTIVIRUS (HOLDS ALL FSMO ROLES)
Server2 - FILE, VPN
Server3 - EXCHANGE

SiteCS 1 (remote site in town):
--------
Server1 - AD, DC, GC, DNS, WINS, DHCP, FILE, VPN

SiteRP 1 (remote site 4 hours away):
--------
Server1 - AD, DC, GC, DNS, WINS, DHCP, FILE, VPN


Everything replicated fine. Then last Wednesday I changed the central site and moved 2 servers onto the new server we just got. So the central site changed to this...

Server1 - DEMOTED/REMOVED
Server2 - REMOVED/BACKUP IF NEEDED
ServerW1 - AD, DC, GC, DNS, WINS, FILE, ANTIVIRUS, VPN (HOLDS ALL FSMO ROLES)

The new server became the new AD server at the central site and also assumed the role of the file server. I was not aware that the AD server at the remote site 4 hours away had gone offline sometime while all this was changing, so it is still showing the old AD server at the central site as the primary. Then, when the AD server at the remote site here in town crashed on me, I had to change that site as well, so it is now as shown below...

Server1 - DOES NOT EXIST. WILL NOT BOOT BACK UP
ServerH1 - AD, GC, DC, WINS, DHCP, DNS, FILE, VPN

The new server at that site is finally replicating with the AD server here at central. All that is left is to get the AD server at the remote site 4 hours away to recognize that there have been server changes and that those old servers do not exist anymore. I also need to remove the old server addresses from AD completely, because they still show up when you do an nslookup on the domain name.
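
To make the before/after layouts above easier to compare, here is a small Python sketch that records which server was the domain controller at each site and works out which ones were retired and which are new. The site labels and the function name `dc_changes` are shorthand chosen for this sketch; only the DC role is tracked, since that is what the stale references are about.

```python
# Domain controller per site, transcribed from the layouts in this post.
# The dictionary keys are shorthand labels chosen for this sketch.
BEFORE = {
    "Central": {"Server1"},
    "SiteCS 1": {"Server1"},
    "SiteRP 1": {"Server1"},
}
AFTER = {
    "Central": {"ServerW1"},
    "SiteCS 1": {"ServerH1"},
    "SiteRP 1": {"Server1"},
}


def dc_changes(before, after):
    """Return (retired, added) as sorted lists of (site, server) pairs."""
    retired, added = [], []
    for site in sorted(set(before) | set(after)):
        old = before.get(site, set())
        new = after.get(site, set())
        retired.extend((site, name) for name in sorted(old - new))
        added.extend((site, name) for name in sorted(new - old))
    return retired, added


if __name__ == "__main__":
    retired, added = dc_changes(BEFORE, AFTER)
    print("retired:", retired)
    print("added:", added)
```

The retired pairs are the names the server 4 hours away is still trying to reach, and the added pairs are the ones it has never seen.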

Wm. Reynolds
RRWDS | TxPSS


 
Here's one... I just saw this event warning in the system event log on the AD server at the central site...

Event Type: Warning
Event Source: Rasman
Event Category: None
Event ID: 20209
Date: 3/20/2008
Time: 7:13:11 AM
User: N/A
Computer: ServerW1
Description:
A connection between the VPN server and the VPN client "IP ADDRESS" has been established, but the VPN connection cannot be completed. The most common cause for this is that a firewall or router between the VPN server and the VPN client is not configured to allow Generic Routing Encapsulation (GRE) packets (protocol 47). Verify that the firewalls and routers between your VPN server and the Internet allow GRE packets. Make sure the firewalls and routers on the user's network are also configured to allow GRE packets. If the problem persists, have the user contact the Internet service provider (ISP) to determine whether the ISP might be blocking GRE packets.

For more information, see Help and Support Center at

This is kind of funny considering I have never seen this message before, nor have I set up any of the servers to do this.

Wm. Reynolds
RRWDS | TxPSS


 
Sounds about like my week last week, except on a smaller scale.

My replication wasn't working; it was throwing weird errors. I ended up completely deleting the links and roots and cleaning up the metadata.


Then I ran chkdsk on all the drives to clean up any messed-up stuff.

chkdsk [drive] /x
- if you run it on the OS drive it will make you restart

So I did that on both computers, on all 6 or so drives.

Then I deleted the files on the 2nd server that were part of the replication set and then restarted both computers to get the systems refreshed.

Then lastly I added the replication links back and let everything sync from server1 to server2 using a ring topology. I disabled the links to server2 while all this was happening so that everyone just went to the main server, where everything was.

Make sure that there are no traces of any dead servers in your AD that can screw you up.

If it comes down to it, you can do a non-authoritative restore and keep everything; it just puts it back to a state that worked.

Let me know what happens



--
-TheCloak

"You Never Know What Hits You, A Gunshot is the Perfect Way" - JFK
 
Also make sure... double, even triple check that your computers' firewalls are off. I always turn them off when messing with stuff like this so that can't even be a problem.

Also, sometimes a simple restart of the networked computers (clients) can help out. I have had my systems acting like hell and they were fixed after I restarted 20-some-odd computers, haha.

--
-TheCloak
 
The problem, though, is that I have to get this server which is 4 hours away to recognize that the old AD servers DO NOT EXIST anymore. It's refusing to let them go. How can I manually go in and create a link or something to get it to connect so it sees the new AD servers?

Wm. Reynolds
RRWDS | TxPSS


 
So let me understand you better... what were the roles of the servers that are "old", and what is the role of the server 4 hours away?

Hope the main AD is with you and that the 4 hour away one is just a DC.

It's always better when you have the engine with you!

--
-TheCloak
 
The main AD is with me... the server 4 hours away is just a remote AD domain controller. All FSMO roles are here with me at the central site.

I went in and manually rebuilt the DNS entries on that AD server 4 hours away, updated them to reflect the new servers, and removed the old servers. What else do I need to do?

Wm. Reynolds
RRWDS | TxPSS


 
Is there any way to get the server to remove the old AD servers? I manually created the new servers in each site on the server that's 4 hours away, but it's not helping any. I've gone through and ensured that the DNS server files on that server do not reflect any of the old server names, and I've flushed the DNS cache. I've also gone in and removed the old servers from Sites and Services on that server.

Wm. Reynolds
RRWDS | TxPSS


 
That's where the metadata comes in yet again.

Your next step would be to do a restore to a previous working state from a backup.

Then again, if you have the option to redo the schematics, that would fix the problem and more than likely make it all run smoother.

So is the main AD computer working fine and it's just the DCs that can't connect?



--
-TheCloak
 
The main AD server that holds all the FSMO roles is just fine. It is on a NEW server with a NEW server name. The old main AD server is OFFLINE permanently.

The domain controller at the remote site here in town is now fine as well. It is on a NEW server with a NEW server name and is replicating with the central site. The old domain controller that was there and crashed is now OFFLINE permanently.

The domain controller at the remote site which is 4 hours from me is HAVING ISSUES replicating because it was offline during the time I changed server hardware. It still thinks the old servers are in the enterprise and doesn't realize that the enterprise has changed.


The problem is that I already removed the old servers from Sites and Services, so metadata cleanup says it can't remove the server because there is no data or server there to remove. I'm wondering if I can just remove AD and reinstall it, since I'm nowhere close enough to sit in front of this server and mess with it more... I might just see what it'll do if I do that.

Wm. Reynolds
RRWDS | TxPSS


 
OK... somebody... If I remove AD from this stupid server that's 4 hours from me, since I cannot get it to replicate data with the new servers, what is it going to do to the enterprise? I walked through the removal steps and it says it's holding the last replica of one or more AD partitions, and when you tell it to remove them it won't, because it thinks it's the last domain controller in the domain. If I go back and tell it that it's the last domain controller, a message pops up saying that AD detects that there may be other domain controllers.

This server IS NOT replicating any data at all with the new servers. It still thinks the old servers are operating.

What can I do? I mean, can I just remove all the server IPs and point the primary and secondary DNS back to itself?

Wm. Reynolds
RRWDS | TxPSS


 
OK, after looking at my options and counting to 10... I chose to do a DCPROMO /FORCEREMOVAL on this stupid server and am working on rejoining it as a domain controller under a new server name. I went through Microsoft and googled different topics, but could not for the life of me get it to work and replicate, so I don't think I had any other choice. I removed the server and cleaned up the metadata on the primary AD server that holds all the FSMO roles, and so far it's all looking good.
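
For anyone following along, the metadata cleanup step after a forced removal can be sketched as the sequence of prompts typed into ntdsutil. This assumes the cleanup was done with ntdsutil, which the post does not actually state. The Python helper below (the function name is hypothetical) just builds that sequence so it can be printed as a checklist; it does not run anything. The numeric indexes are placeholders that must be read from the `list domains`, `list sites`, and `list servers in site` output on your own system, and `ServerW1` is used as the live DC only because it is the FSMO holder named in this thread.

```python
def metadata_cleanup_steps(live_dc, domain_idx, site_idx, server_idx):
    """Return the ntdsutil prompts, in order, for removing one dead DC.

    live_dc is a working DC to connect to; the indexes are placeholders
    that must come from the 'list ...' output shown by ntdsutil itself.
    """
    return [
        "ntdsutil",
        "metadata cleanup",
        "connections",
        f"connect to server {live_dc}",
        "quit",                      # back to the metadata cleanup menu
        "select operation target",
        "list domains",
        f"select domain {domain_idx}",
        "list sites",
        f"select site {site_idx}",
        "list servers in site",
        f"select server {server_idx}",
        "quit",                      # back to the metadata cleanup menu
        "remove selected server",
        "quit",
        "quit",
    ]


if __name__ == "__main__":
    # Placeholder indexes: replace with the numbers ntdsutil lists for you.
    for step in metadata_cleanup_steps("ServerW1", 0, 0, 0):
        print(step)
```

Double-check the selected server against the list output before the remove step, since the removal applies to whichever entry is currently selected.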

Wm. Reynolds
RRWDS | TxPSS


 
If your main one and the other one are working well, I wouldn't mess with them any longer. I'd make a trip to the one that's not with you and redo it, because if it's not replicating but the other is, then it's not a local problem. If you already have two new servers I think it'd be smart to redo the other one so it doesn't bottleneck.

--
-TheCloak
 
Already got it fixed. It's all replicating now with no problems. Granted, the enterprise has changed a lot and I have to redo my layout and systems diagram, but it's working. The dcpromo /forceremoval did the trick: I got AD reinstalled with no problems and it instantly picked up on what it was replicating with.

Wm. Reynolds
RRWDS | TxPSS


 
Awesome, glad we got the answer.

--
-TheCloak
 