
Adding secondary DC 1

Status
Not open for further replies.

ColdFlame

Technical User
Jul 24, 2007
48
CA
I've been reading through thread931-1270962 and there was mention of the FSMO role being an issue when running Exchange.

I am running Windows 2003 Enterprise and Exchange 2003 Enterprise. Currently, and likely against best practice, my PDC (CGYPDC01) is running DNS, DHCP, and doing all of the file/print sharing as well.

I have just promoted a second Windows 2003 Ent. server (CGYPDC02) to the role of DC and am curious as to what issues I could have as a result of doing so? Did I do something bad in promoting this second DC?

I should mention that I also have 3 remote sites, each with a DC for its own subnet, all in the same domain.

Thanks in advance!

ColdFlame
 
What that thread is talking about is having the Infrastructure Master (IM) role on a DC that is also a Global Catalog (GC). It is not recommended, but it is not a huge deal. If the location that holds the FSMO roles has 2 DCs, I would make the one that does NOT hold the IM role the GC, and would NOT set up the IM role holder as a GC. If you only have 1 DC at that site, I would make sure it's a GC and not worry about it.

Exchange has nothing to do with the FSMO role that you speak of.
 
The problem with having the Infrastructure Master role on a GC only surfaces in a multi-domain environment; if you run a single domain, there is no issue.
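
A minimal Python sketch of the rule as stated in the replies above, for anyone who prefers to see it as a check. The `DomainController` class, the `infrastructure_master_concern` function, and the GC flag given to CGYPDC01 are illustrative assumptions; nothing here is read from a real directory.

```python
from dataclasses import dataclass


@dataclass
class DomainController:
    name: str
    is_global_catalog: bool


def infrastructure_master_concern(im_holder, domain_count):
    """Return True when the IM holder is a GC and there is more than one domain.

    This encodes only the rule of thumb given in the thread: the combination
    matters in a multi-domain environment and not in a single domain.
    """
    return im_holder.is_global_catalog and domain_count > 1


# Assumption for illustration: CGYPDC01 holds the IM role and is a GC.
cgypdc01 = DomainController("CGYPDC01", is_global_catalog=True)

print(infrastructure_master_concern(cgypdc01, domain_count=1))  # False
print(infrastructure_master_concern(cgypdc01, domain_count=2))  # True
```

With a single domain, as described in the original question, the check comes back False.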

You did nothing wrong in adding a second DC; in fact, best practice recommends more than one DC for redundancy anyway.

If I read your post correctly, you are running Exchange on your DC. That is bad practice and really should not be done.

Paul
MCSE


"Two things are infinite: the universe and human stupidity; and I'm not sure about the universe."
Albert Einstein
 
Pagy,

No, I am not running Exchange on my DC. I may not follow best practices to a "T" at times, but I certainly would never do that! =)

I have separate boxes for Exchange, SQL, IIS, SMS/MOM (not finished yet), as well as a few others.

Thanks for your responses. However, since this promotion, two of my remote sites can't communicate anymore! I don't know if it's related at all, but none of the routes have been changed, and both sites can ping each other's gateways. They just can no longer ping each other's hosts or communicate with one another.

For the time being, I thought I would demote that secondary DC just to see if it alleviated the problem. When I run dcpromo, it fails with "Failed to configure the service NETLOGON as requested: The wait operation timed out."

Any ideas what that is? Did I do something wrong? Did some of these FSMO roles transfer automatically by doing the promotion of the secondary DC yesterday?

In case you haven't noticed, I'm not overly familiar with the FSMO roles and what they do and how they pertain. Can someone help/enlighten me?
 
Obviously I didn't read it properly then :)

Here is info on FSMO roles


The FSMO roles would not have transferred automatically; transferring them is a manual process.

On a DC, open a command window and run:
netdom query fsmo

This will show you which server holds the FSMO roles.
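
If you want to check the result programmatically, here is a rough Python sketch that groups the roles by holder. The sample text is a stand-in, not captured output: the role labels differ between Windows versions, and the host name `dc1.example.local` is hypothetical. The only layout assumption is that each role line is a label and a host separated by two or more spaces.

```python
import re

# Illustrative stand-in for the command's output (assumed layout, hypothetical host).
SAMPLE_OUTPUT = """\
Schema master               dc1.example.local
Domain naming master        dc1.example.local
PDC emulator                dc1.example.local
RID master                  dc1.example.local
Infrastructure master       dc1.example.local
The command completed successfully.
"""


def parse_fsmo(text):
    """Map each role label to its holder, assuming 'label<2+ spaces>host' lines."""
    roles = {}
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2:  # skips status lines that have no wide gap
            roles[parts[0]] = parts[1]
    return roles


roles = parse_fsmo(SAMPLE_OUTPUT)
holders = sorted(set(roles.values()))
print(len(roles), holders)
```

If `holders` contains a single name, every role is still on one server, which is what you would expect when nothing has been transferred by hand.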

Paul
MCSE
 
Pagy,

Thank you, that clears up a lot of things. Now to figure out why I can't demote that DC, and why one of my sites is suddenly logging all these NTDS KCC errors and is unable to communicate with another remote site.

The event log is recording the following event IDs every 15 minutes: 1566, 1311, 1865. These show up 3 times in that order, and then a 1925 event is logged.

I am assuming that the KCC errors are due to the connectivity issue, but it still doesn't make sense that this would start so suddenly.

Site A can ping B, C, & D.
Site B can ping A, C, & D.
Site C can ping A, B, & D.
Site D can ping A, & C, not B.
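
The ping results above can be written down as a small table and checked for gaps. This is only a sketch in Python; the function names are made up for illustration, and the data is exactly the four lines listed above.

```python
# Ping results as listed above: source site -> set of sites it can reach.
can_ping = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "C"},
}


def failed_pairs(results):
    """Return every (source, destination) pair where the ping fails."""
    sites = sorted(results)
    return [
        (src, dst)
        for src in sites
        for dst in sites
        if src != dst and dst not in results[src]
    ]


def one_way_failures(results):
    """Return failed pairs whose reverse direction succeeds."""
    return [(src, dst) for src, dst in failed_pairs(results) if src in results[dst]]


print(failed_pairs(can_ping))      # [('D', 'B')]
print(one_way_failures(can_ping))  # [('D', 'B')]
```

The only failure is D to B, and it is one-way, since B can still ping D. That pattern usually points at a filter or a return-path problem rather than a link that is completely down, though the table alone cannot prove it.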

dcdiag /e resulted in not being able to ping site B (which we know and can't figure out), and then at the end, had this for each site:

***Error: The remote site Default-First-Site, has no servers that can act as bridgeheads between the Default-First-Site and the local site FtSask for the writeable NC TAPI3Directory. Replication will not continue until this is resolved.
 
So the link problem is Site D to Site B??

It sounds like a WAN link problem really, which as you noted will result in your KCC errors. How are your WAN links configured?? What routers?? etc etc

Paul
MCSE
 
Pagy,

Yes, Site D to Site B, and vice versa. A WAN link problem was my thought too; however, Site D can ping Site B's gateway and Site B can ping Site D's gateway...

So the WAN link is there, at least to the routers.

The WAN link is a 10mb fibre line going out from our head office, and each of the 3 satellite offices runs a DSL connection that is tied directly to that fibre line (although they go straight out to the Internet for surfing). All Cisco gear... the satellite offices run Cisco 800 series routers and the head office runs a Cisco 1800 series router.

Essentially, the links are VLANs that a third party set up and hosts for us. Each site runs its own subnet/IP scheme. Does that make sense or provide enough info?
 
It still sounds to me like something's gone a bit splat somewhere on that WAN link.

Can you ping anything else on those LANs at Site D and Site B?? Or is it just the server you can't get to??

Paul
MCSE
 
Paul,

Strangely, Site B can ping anything at Site D with the exception of the server.

I just noticed now that Site D can't ping anything at all in Site B.

Wow, I am very confused. Even at our head office (main DC), the logs are filling with Group Policy errors (event IDs 1000, 1030, and 1058).


As an aside, what harm would there be in forcing CGYPDC02 to demote (dcpromo /forceremoval), seeing as it won't demote gracefully?
 
Steve,

Where do I go about determining if that's the case? The hosts file is clean...
 
ColdFlame,

Microsoft makes a great utility that you should start with: Netdiag. I think it comes with the support tools on the Windows Server CD. However, given the results you're getting, it's probably going to say something's broken. You'll probably need to run these tests anyway.

First, verify that a user's machine in Site D can ping the server and that the server can ping the user's machine. With that out of the way, check for any differences in the network setup between the clients at Site D and the server. Maybe there's a different default gateway in use by the clients. Also check for VPN tunnels, firewall settings, IPSec filters, or things like that on the server at Site D.

(if you know how to open a command window, just skip the next part)
Open a command window by clicking start -> run -> type "cmd" and press return.

Type the command "route print". You should see results like:

Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0      10.1.13.140     10.1.13.140       1

First, make sure you *have* a default gateway, and that it's correct. This is the network entry that looks like "0.0.0.0" in the list. If you do, then what you're looking for next is the very bottom of the screen. You want to check for persistent routes. Perhaps a previous tech put in a static route trying to diagnose a problem or during the setup of the WAN. Compare your results with a client PC. Maybe there are static routes assigned in the login script and the server doesn't run login scripts, or the clients pick up something from DHCP that you need to do manually. Anyway, the fact that your clients at that site can ping you but not the server makes the server's networking stack a likely culprit.
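
As a rough illustration of the default gateway check, here is a Python sketch that pulls the default route out of route table lines and compares a server against a client. The server lines are the sample shown above; the client lines and the function name are hypothetical, added only to show a mismatch.

```python
def default_gateways(route_lines):
    """Return the gateway of every 0.0.0.0 / 0.0.0.0 entry in a route table."""
    gateways = []
    for line in route_lines:
        fields = line.split()
        # A route row has five columns: destination, netmask, gateway, interface, metric.
        if len(fields) >= 5 and fields[0] == "0.0.0.0" and fields[1] == "0.0.0.0":
            gateways.append(fields[2])
    return gateways


# Sample from the post above.
server_routes = [
    "Network Destination Netmask Gateway Interface Metric",
    "0.0.0.0 0.0.0.0 10.1.13.140 10.1.13.140 1",
]

# Hypothetical client output, invented for the comparison.
client_routes = [
    "Network Destination Netmask Gateway Interface Metric",
    "0.0.0.0 0.0.0.0 10.1.13.1 10.1.13.55 20",
]

print(default_gateways(server_routes))  # ['10.1.13.140']
print(default_gateways(server_routes) == default_gateways(client_routes))  # False
```

An empty list means no default gateway at all, and a mismatch between server and client is worth explaining before looking any further out on the WAN.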

For detailed network info, run "ipconfig /all" from a command prompt. Compare the clients' settings to the server.

If there are no static routes, and the client networking settings match your own, do your ping tests again but this time also perform this test (Traceroute):

C:\>tracert
Tracing route to [64.233.167.99]
over a maximum of 30 hops:

If you can ping the Internet, that's a good sign. Then run "tracert siteB", where siteB is the router or server across the WAN (use the IP, not the name). Since you know the steps that each packet *should* take to get to the other side, you should be able to spot the problem quickly.

If your packets get to the router ok but then get stuck and time out, or get stuck in a loop, then the router itself needs to be looked at. This is very unlikely, but still a possibility. It's always possible someone put in a firewall rule on the router that only applies to the server. Hey, it could happen.


I hope this helps, I probably haven't covered everything. I hope some other folks monitoring this page can help fill in the gaps.







Once everything is up and running, here are some free neat tools to help keep tabs on your WAN:


Also this one for monitoring up/down status, bandwidth and errors:

This one has some pretty good SNMP capabilities.

There's bound to be lots more out there too!
 
Hi Steve,

Firstly, thank you for your response. It was very informative and useful. I am familiar with tracert, ping, etc...

I am seeing strange anomalies though; I'll detail them below.

Site D can't contact Site B at all, whether by ping, tracert, or anything else. Everything times out. Internet traces are fine.

Site B can trace to the router at Site D fine, but no further, as is evident here:
Tracing route to 10.0.11.1 over a maximum of 30 hops

1 1 ms 1 ms <1 ms 10.0.10.1
2 19 ms 22 ms 20 ms 172.21.255.254
3 39 ms 37 ms 38 ms 10.0.11.1

Trace complete.

However, when I trace direct to the server at Site D from Site B, I get this:
Tracing route to ftsaskdc.geminicorp.ab.ca [10.0.11.2]
over a maximum of 30 hops:

1 1 ms 1 ms 1 ms 10.0.10.1
2 42 ms 20 ms 20 ms 172.21.255.254
3 36 ms 39 ms 38 ms 172.21.223.38
4 * * * Request timed out.
5 * * * Request timed out.
6 * * * Request timed out.
7 * ^C
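
For readers comparing the two traces above, a small Python sketch that finds the last hop that answered and the hop where the two traces first differ. The hop addresses are copied from the output above, with `None` standing for a timed-out hop; the function names are illustrative.

```python
# Hop addresses from the two traces above; None marks "Request timed out".
trace_to_router = ["10.0.10.1", "172.21.255.254", "10.0.11.1"]
trace_to_server = ["10.0.10.1", "172.21.255.254", "172.21.223.38", None, None, None]


def last_responding_hop(hops):
    """Return (hop_number, address) of the last hop that answered, or None."""
    for number in range(len(hops), 0, -1):
        if hops[number - 1] is not None:
            return number, hops[number - 1]
    return None


def first_divergence(trace_a, trace_b):
    """Return the 1-based hop number where two traces first differ, or None."""
    for number, (a, b) in enumerate(zip(trace_a, trace_b), start=1):
        if a != b:
            return number
    return None


print(last_responding_hop(trace_to_server))                # (3, '172.21.223.38')
print(first_divergence(trace_to_router, trace_to_server))  # 3
```

Both traces agree for the first two hops and differ at hop 3. That does not by itself mean the packets took a different path: a router that is the target of a trace answers from the address you traced to, while the same router answering as an intermediate hop often replies from its WAN-side interface. Either way, the last device that answers on the way to the server is the place to start asking questions.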

And for the record, neither server has any persistent routes. Workstations are seeing the same results. Nobody but our VLAN provider could have touched those firewalls and they claim they didn't, so I'm not too sure. I may have my users reboot the firewall in both locations and see if something's hung up, though I've never experienced that problem before.

I'll check out those utilities; thank you for posting them!

Regards,

ColdFlame
 
I may have to defer to pagy's wisdom (above) that the problem is in the WAN link.

Let me try to help as much as I'm able. I once saw a really stupid PIX problem where it had to be rebooted before it would ping or perform any TCP/IP action. What would make it work was reassigning the device (might be out of the question for you) to a different IP. Only 10 IP addresses out of a whole subnet would actually work; the rest may as well not have existed. Rebooting seemed to fix it. I suggested to the onsite tech that he buy a $13 lamp timer and have it reboot the PIX early in the AM. His WAN vendor was unwilling to address the problem. The lamp timer is still doing the trick.

If the problem is centered on the router there, and rebooting doesn't do the trick, then I would complain to the vendor that configured the WAN for you. This is certainly getting into that territory.

There are bound to be gurus here that can explain this issue: When the site is up, of course.
 