Hi All,
I'm having some problems with a particular application (called Isis) which sends multicast data between two machines. Every 5 minutes, the application sends a multicast 'heartbeat' between the two machines. This is a business-critical application and it works well most of the time, but for the last week it has been losing its heartbeat once every night, normally at 11pm or 3am, which makes it all the more pressing to get fixed!
We have three IOS-based Cat6500 routers in our core and two Layer 2 CatOS switches near the servers, running PIM sparse-dense mode and IGMP snooping.
Packet traces near the source and receiver show that although the receiver is issuing IGMP membership reports, and the sender is 'publishing' data to the correct group, packets just aren't arriving at the receiver, and hence the heartbeat is lost. This is my evidence for some kind of network problem.
I have written a script to ask the routers for the following every 2 minutes:
>show ip mroute <src> <grp>
>show ip igmp group <grp>
>show multicast group 01-00-5e-xx-xx-xx
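For the last command, the MAC address follows from the group address: an IPv4 multicast group maps to 01-00-5e followed by the low 23 bits of the IP address. A minimal Python sketch of that mapping, in case anyone wants to reproduce the lookup (the function name `multicast_mac` is only for illustration, and the hyphenated lowercase output is an assumption chosen to match the command above):

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Return the Ethernet MAC for an IPv4 multicast group (hypothetical helper)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Only the low 23 bits of the group address are copied into the MAC.
    low23 = int(addr) & 0x7FFFFF
    octets = [
        0x01, 0x00, 0x5E,          # fixed multicast prefix
        (low23 >> 16) & 0xFF,
        (low23 >> 8) & 0xFF,
        low23 & 0xFF,
    ]
    return "-".join(f"{o:02x}" for o in octets)
```

Because only 23 bits are kept, 32 different groups share each MAC address, so an entry on the switch for that MAC does not by itself prove it belongs to this group.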
What I have observed is that very often the leaf router near the listener goes into a (*,G) state instead of (S,G). Could this cause a problem? Why does it move away from the (S,G) state every now and again?
When the listener finally keeled over at 3am (great way to spend a Friday night!) I observed that the sender carried on trying to send messages to the multicast group.
However, the sender's router did not have an (S,G) entry. I thought that the router nearest the sender should create an (S,G) entry and send the messages to the rendezvous point, but this was not happening. The sender continued to send, but packets were not being forwarded to the rendezvous point. Are there any PIM gurus out there who could tell me if this is consistent with the PIM sparse mode protocol? Am I correct in saying that when a sender sends a packet to a multicast group, the first-hop router should immediately create (S,G) state?
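To make those state changes easier to spot in the collected output, something like the following could flag which entries exist for the group at each poll. This is only a rough sketch, not my actual script: it assumes each mroute entry starts with a header line of the form `(*, <grp>), ...` or `(<src>, <grp>), ...`, and the function name `mroute_states` is made up for illustration.

```python
import re

# Assumed header format of an mroute entry: "(*, G), ..." or "(S, G), ..."
ENTRY_RE = re.compile(r"^\((\*|\d+\.\d+\.\d+\.\d+),\s*(\d+\.\d+\.\d+\.\d+)\)")


def mroute_states(output: str, group: str) -> dict:
    """Report whether a (*,G) entry exists and which sources have (S,G) entries."""
    has_star_g = False
    sources = []
    for line in output.splitlines():
        match = ENTRY_RE.match(line.strip())
        if not match or match.group(2) != group:
            continue  # not an entry header, or a different group
        if match.group(1) == "*":
            has_star_g = True
        else:
            sources.append(match.group(1))
    return {"star_g": has_star_g, "sources": sources}
```

Logging the result with a timestamp at every poll would show exactly when the (S,G) entry disappears relative to the lost heartbeat.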
Any help would be much appreciated.
Thanks!
James.