Unable to ping VM on different VLAN

Status
Not open for further replies.

Borvik

Programmer
Jan 2, 2002
US
Forgive me if this isn't the right forum; I'm not certain which one this should go under.

This just doesn't make sense to me anymore. This setup used to work until the middle of September. Something must have changed and accessing my server no longer works properly (though I don't remember changing anything).

Here's the setup to this server:
Code:
Internet -> Watchguard -> HP 5406zl -> HP 2650 -> SERVER_A -> SERVER_A.1
                                        |          VLAN 4      VLAN 7
                                        |
                                        |-> SERVER_B (VLAN 7)
                                        |
                                        |-> CLIENT (VLAN 2)

From CLIENT I can ping SERVER_A and SERVER_B, but not SERVER_A.1
From SERVER_B I can ping SERVER_A and SERVER_A.1.
SERVER_A.1 can ping SERVER_B.
The HP 5406zl can ping SERVER_A.1.
The HP 2650 CANNOT ping SERVER_A.1.

The default route on the 2650 points to the 5406zl, and the 5406zl has routes for each of the subnets to point to the VLANs, with a default route to the Watchguard.

From what I can tell from the way the routes are set up, my ping request should go from my client through the 2650, up to the 5406zl which should route it to the right VLAN, then back down to the 2650 and on to the server - but this doesn't work. I plan on using Wireshark tomorrow to try and further analyse this.

I'm looking for help to try and figure out why I can't see SERVER_A.1 from CLIENT. I know there is probably some information missing that would help determine what is going on, so feel free to ask me for more info and I'll provide what I can. Hopefully the Wireshark capture reveals more as well.

Thanks.
 
ServerA.1 is a Win2k3 server that has its default gateway pointing to one of the IPs of the 5406zl (one IP for each subnet, and there are seven of them).

ServerA.1 CAN access the internet, and the internet CAN see it as well - NATed of course.
 
Running Wireshark, first monitoring the port on the 2650 and then on SERVER_A.1 itself, I found that the ping requests are reaching SERVER_A.1. I did not see any replies going to CLIENT, but I could see replies going to SERVER_B.

It almost seems as if the replies to CLIENT aren't even being generated, though that seems kind of absurd thinking about it.

Broadening my Wireshark capture beyond ICMP, I do see an ARP broadcast for CLIENT's IP address but no response to it, whereas the ARP broadcast for SERVER_B's IP address does get a response.

I see the same ARP request/response pattern (i.e. failing for CLIENT but not for SERVER_B) both when running Wireshark on SERVER_A.1 and when monitoring the 2650 from another client port - so the ARP request is leaving the VMware server (so it is not the ESXi ARP drop issue).

Any ideas?
 
Please provide the following information:

ServerA.1
IP address
subnet mask

Client
IP address
subnet mask
 
Code:
SERVER_A (vlan 4)
SERVER_A.1 (vlan 7)
-----------------
IP: 192.168.107.7
Sb: 255.255.0.0
Gw: 192.168.107.1

CLIENT (vlan 2)
------------------
IP: 192.168.102.27
Sb: 255.255.255.0
Gw: 192.168.102.1

5406zl
-----------------
vlan 2
   name "Clients"
   untagged A8-A10,A12-A13,A15-A22,A24-B1,B4-B14,B16-B17,B19-B21,B23
   ip helper-address 192.168.104.2
   ip address 192.168.102.1 255.255.255.0
   exit

vlan 4
   name "Int. Services"
   untagged B2
   ip helper-address 192.168.104.2
   ip address 192.168.104.1 255.255.255.0
   exit

vlan 7
   name "Servers"
   ip helper-address 192.168.104.2
   ip address 192.168.107.1 255.255.255.0
   exit

                                IP Route Entries

  Destination        Gateway         VLAN Type      Sub-Type   Metric     Dist.
  ------------------ --------------- ---- --------- ---------- ---------- -----
  0.0.0.0/0          192.168.101.254 1    static               1          1
  127.0.0.0/8        reject               static               0          0
  127.0.0.1/32       lo0                  connected            1          0
  192.168.101.0/24   Management      1    connected            1          0
  192.168.102.0/24   Clients         2    connected            1          0
  192.168.103.0/24   Printers        3    connected            1          0
  192.168.104.0/24   Int. Services   4    connected            1          0
  192.168.107.0/24   Servers         7    connected            1          0
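
As a rough illustration of how the 5406zl would consult the route table above, here is a minimal sketch of a longest-prefix-match lookup. The `ROUTES` list and the `lookup` helper are hypothetical names made up for this example (the loopback entries are left out), and it assumes the switch picks the most specific matching route, which is the usual behaviour:

```python
import ipaddress

# Route entries copied from the 5406zl table above (loopback entries omitted).
# The right-hand labels are just descriptive strings for this sketch.
ROUTES = [
    ("0.0.0.0/0", "default via 192.168.101.254 (VLAN 1)"),
    ("192.168.101.0/24", "Management (VLAN 1)"),
    ("192.168.102.0/24", "Clients (VLAN 2)"),
    ("192.168.103.0/24", "Printers (VLAN 3)"),
    ("192.168.104.0/24", "Int. Services (VLAN 4)"),
    ("192.168.107.0/24", "Servers (VLAN 7)"),
]

def lookup(dst):
    """Return the label of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for prefix, label in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, label))
    # The default route always matches, so matches is never empty.
    return max(matches)[1]

print(lookup("192.168.107.7"))   # SERVER_A.1
print(lookup("192.168.102.27"))  # CLIENT
```

Under these assumptions both SERVER_A.1 and CLIENT resolve to directly connected VLANs, so the switch has a route in each direction.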
 
I should probably also list SERVER_B, which does work:
Code:
SERVER_B (vlan 7)
IP: 192.168.107.6
Sb: 255.255.255.0
Gw: 192.168.107.1
 
OK, so think about it - what does SERVER_A.1 do with a packet addressed to CLIENT?

What's the first thing it checks?

(Hint: you said you saw an ARP request - why is it sending an ARP request? What is an ARP request designed to do?)
 
SERVER_A.1 doesn't know where CLIENT is so it sends out the ARP request to figure out where to send the packet. With no response it doesn't know where to send the ICMP reply, which is why my ping isn't working. I got that when I first noticed the ARP in the Wireshark log.

From my understanding either CLIENT or the switch should be replying with the MAC address of where to send it - either the MAC address of CLIENT or of the port on the switch (I'm about 95% certain it should be CLIENT's but I'm going by supposition and observed patterns).

I'm fairly certain the different subnets are NOT the issue, thanks to the routes set up in the switch and the fact that SERVER_B is working just fine (though it also seems to have a different subnet mask from SERVER_A.1).

My problem is that it USED to work and now it doesn't. We've now identified WHY it's not working, but I still don't know what would have changed (I am a programmer, with _some_ network knowledge but not a lot). I found the problem when our network backup stopped being able to connect to this server, and as my CLIENT can't access it either, I'm sure that the problem is related.
 
Right, that's not how it works.

The first thing a sender does in an IP connection is compare the destination address against its own network (its IP address masked by its subnet mask) to see if the client is local or remote.

If it's a local client (the network portion of the IP address is the same), then an ARP request is sent to find it.
ONLY THE CLIENT can reply to that ARP request. Not the switch.

If it's a remote client (different subnet), then an ARP request is not sent, because it would never work: an ARP request works within a broadcast domain, not outside of it.
In this case, the frame is addressed to the default gateway instead, and the default gateway consults its routing table to forward the packet.

Anyway, to cut to the chase - your subnet mask is wrong. I hope you can see why.
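
To make the local-or-remote check concrete, here is a minimal sketch using Python's standard `ipaddress` module and the addresses posted earlier in the thread. The `next_hop` function is a hypothetical helper written for this illustration; it only models the decision described above, not a real IP stack:

```python
import ipaddress

def next_hop(src_ip, src_mask, gateway, dst_ip):
    """Model the sender's decision: ARP for dst directly, or use the gateway."""
    # The sender's own network, derived from its IP address and subnet mask.
    local_net = ipaddress.ip_interface(f"{src_ip}/{src_mask}").network
    if ipaddress.ip_address(dst_ip) in local_net:
        # Destination looks local: ARP for it in the sender's broadcast domain.
        return ("arp", dst_ip)
    # Destination looks remote: hand the frame to the default gateway.
    return ("gateway", gateway)

CLIENT = "192.168.102.27"

# SERVER_A.1 as posted, with mask 255.255.0.0
print(next_hop("192.168.107.7", "255.255.0.0", "192.168.107.1", CLIENT))
# → ('arp', '192.168.102.27')

# SERVER_B as posted, with mask 255.255.255.0
print(next_hop("192.168.107.6", "255.255.255.0", "192.168.107.1", CLIENT))
# → ('gateway', '192.168.107.1')
```

With the 255.255.0.0 mask, SERVER_A.1 treats everything in 192.168.0.0/16 as local, so it ARPs for CLIENT inside VLAN 7, where CLIENT can never answer. SERVER_B, with 255.255.255.0, sends the same reply to 192.168.107.1 and lets the 5406zl route it.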
 
OK, the first part I assumed but wasn't sure about (CLIENT responding to the ARP).

I didn't have a clue about the second part. I figured the routing in the switch would forward the ARP request, but that DOES explain it. I had only just started figuring out how to use static routes to route between subnets rather than relying on less restrictive subnet masks. I don't remember changing either at the time it went down, but changing the subnet mask DID fix my pings.

Thanks for helping me understand that.
 