Problems with localhost

Status
Not open for further replies.

suicidal

Programmer
Nov 14, 2002
6
BE
Hello,

I've had some problems with a TCP connection through localhost. Data was lost under very heavy load. We moved the two programs to separate computers and the problem was solved. I know that localhost works a bit differently than 'real' IP addresses, but does anybody know what the difference between the two is (e.g. a smaller stack, ...)?


Greetings
 
As far as I know, it is the same stack, without any Layer 2 or below.

The problems that you experienced are probably only due to the fact that your single machine was acting as both client and server. You may have experienced buffer overruns, or overwriting of memory arrays because the data is arriving faster than you can write it out to hardware. Communicating through the NIC hardware is much slower than passing data down the stack and back up, so your application will have more time to perform "housecleaning" before the next packet arrives.

Can you provide more details about where the data was lost, and what was under a heavy load?

For instance, did the application drop packets (shouldn't if it was actually TCP) or did the application receive the TCP packets and then not store the data? A simple tcpdump would allow you to eliminate the IP stack as a problem in this case.

And was the computer running at 100% CPU utilization, or was the memory heavily utilized, or was the network running at or near capacity? If your application was pushing the limits of CPU or memory, it was probably pushing data through the stack at rates that could not possibly be achieved across a network.

Good luck!

pansophic
 
Hello,

Thanks for your help.

The problem appeared before I was working here, but I doubt the processor load was 100%.

My colleagues tried to use tcpdump (or windump), but traffic to 127.0.0.1 doesn't appear in its logs. So no luck there.

P.S. I'm not sure that the data passes through the NIC hardware. If so, transmission rates shouldn't be a problem, should they?
 
On the loopback interface traffic does not pass through the NIC hardware, so it is possible for the system to achieve enormous transmission rates across the loopback. This can cause problems for applications if the data handling is not perfect, because new data can arrive before the last data has been saved.

When data has to pass through a network, things happen much more slowly, so the application will have time to handle data before the next packet actually arrives. Plus, it is only doing 1/2 of the work (either client or server) so there is less chance for contention.

If you have any shared libraries that you were using for both client and server, it is possible to overwrite pointers as well.

I'd recommend that if you want to test on a single machine, you use something like VMWare to emulate multiple separate systems on a single computer. You eliminate a lot of the shared library issues that way, and you can simply reconfigure the virtual machine to have differing amounts of memory and have multiple operating systems (or patch levels).

But based on what you have said, I doubt that you can figure it out without doing additional testing.

pansophic
 
My guess would be that the applications are written with assumptions that are not valid when the "network" is very fast, as for example when using loopback connections.

A lot of people who write VB programs based on the Winsock Control, for example, make a lot of assumptions about how data arrives in the buffer and how event triggering and handling will flow.
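One common assumption of this kind is that each send shows up as exactly one receive. TCP is a byte stream, so on a fast loopback connection several sends can be delivered together or split differently. Below is a minimal sketch in Python (not VB, and not taken from the original programs) of length-prefixed framing that does not depend on arrival timing. The names send_msg, recv_exact and recv_msg and the 4-byte length prefix are illustrative assumptions.

```python
import socket
import struct

def send_msg(sock, payload: bytes) -> None:
    # Prefix each message with its length as a 4-byte big-endian integer.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until n arrive.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock) -> bytes:
    # Read the length prefix first, then exactly that many payload bytes.
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

def demo() -> list:
    # Client and server on the same machine, connected through 127.0.0.1.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    try:
        # Three back-to-back sends; the receiver may see them merged.
        for text in (b"first", b"second", b"third"):
            send_msg(client, text)
        return [recv_msg(conn) for _ in range(3)]
    finally:
        client.close()
        conn.close()
        server.close()
```

With framing like this, the receiver recovers the same message boundaries whether the bytes arrive in one chunk or many.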

Anyway, to debug things like this I use TCPTap, a program that can be used to "plug in" between a TCP client and server and monitor message flow.

You'd run TCPTap on your test machine, and tell it to connect to your "localhost" server also on that same machine. Then you give TCPTap a DIFFERENT port number to use, so that it can masquerade as your server to your client program.

Then you have your client program (still on the same machine) connect to the TCPTap client port. TCPTap passes all the traffic through and logs it to a test file of your choosing.

Normal connection:

Code:
[client]3002-----3002[server]

TCPTapped connection:

Code:
[client]6002---6002[TCPTap]3002---3002[server]

TCPTap doesn't care if you change the server's port or the client's port, but you need to change one of them unless TCPTap is on a separate machine from the client, server, or both. I.e. this won't work:

Code:
[client]3002---3002[TCPTap]3002---3002[server]

Problem is, you can't have two "servers" on the same machine via localhost that bind to the same TCP port.

This WILL work though:

Code:
[client]3002                      3002[server]
            \                    /
             --3002[TCPTap]3002--

Where both the client and server are on Machine A and TCPTap is on Machine B. That won't help you though, because your traffic will be going over a physical network now.

Ah well, TCPTap is free and can be DLed from
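For readers who want to see the idea rather than the tool, here is a rough Python sketch of a pass-through logger in the same spirit. It is not TCPTap and makes no claim about how TCPTap is implemented; make_listener, pump and serve_tap are hypothetical names, and the sketch handles a single connection only.

```python
import socket
import threading

def make_listener(port, host="127.0.0.1"):
    # The port the client connects to (6002 in the diagram above).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(1)
    return listener

def pump(src, dst, label, log):
    # Copy bytes one way until src closes, recording each chunk first.
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            log.append((label, data))
            dst.sendall(data)
    except OSError:
        pass  # treat a reset connection like a close
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass

def serve_tap(listener, server_addr, log):
    # Accept one client, connect to the real server, relay both directions.
    client, _ = listener.accept()
    upstream = socket.create_connection(server_addr)
    back = threading.Thread(
        target=pump, args=(upstream, client, "server->client", log))
    back.start()
    pump(client, upstream, "client->server", log)
    back.join()
    for s in (client, upstream, listener):
        s.close()
```

A possible use matching the diagram would be serve_tap(make_listener(6002), ("127.0.0.1", 3002), log) with log = [], after which log holds every chunk seen in each direction.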
 
I'm confused.

Does it matter how quickly data arrives?
If you follow these steps:
open socket
read from socket
process data
read from socket (succeeds if data is available and blocks otherwise)

How can a program mess up things if data arrives fast?
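The steps above can be sketched as a plain blocking loop. This is Python with made-up names (read_all_serially, demo), purely to illustrate the serial pattern being asked about, not code from the programs in question.

```python
import socket

def read_all_serially(sock, process):
    # Read, process, read again. recv() blocks until data is available
    # and returns b"" once the peer has closed the connection.
    while True:
        data = sock.recv(4096)
        if not data:
            break
        process(data)

def demo() -> bytes:
    # A connected pair of sockets stands in for client and server.
    sender, receiver = socket.socketpair()
    sender.sendall(b"abc")
    sender.sendall(b"def")  # arrives before the first chunk is processed
    sender.close()
    received = bytearray()
    read_all_serially(receiver, received.extend)
    receiver.close()
    return bytes(received)
```

In this pattern, bytes that arrive while the program is busy simply wait in the socket's receive buffer until the next read, so the speed of arrival alone does not lose anything.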
 
It doesn't if your program is written serially, but the Winsock control uses an interrupt to notify that data has arrived. If you have not finished processing the data and more data arrives, the interrupt will fire. The data processing will stop at this point, and the data_arrived event will be handled. If you haven't already saved off the data to a local variable, and then appended the new data to that variable, then you will lose the data that was being processed.

You could check to see if this is your problem by clearing the local variable at the end of the process_data routine. When the data_arrived event fires, check to see if the variable has any data in it before adding the new data. Write any overflows to an error log.
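Here is a small Python simulation of that check, offered as a sketch only. It does not use the Winsock control; the Receiver class and its method names are hypothetical, and the event is simulated by calling the handler twice before processing runs. One deliberate difference: process_data takes a snapshot and clears the variable in one step, instead of clearing at the end, so that data appended during processing is not wiped.

```python
class Receiver:
    def __init__(self, overwrite: bool):
        self.overwrite = overwrite  # True mimics a handler that overwrites
        self.pending = b""          # data that has arrived but is unprocessed
        self.processed = []         # what process_data() actually handled
        self.overflow_log = []      # arrivals that found pending non-empty

    def on_data_arrived(self, data: bytes) -> None:
        # The check suggested above: was earlier data still waiting?
        if self.pending:
            self.overflow_log.append(data)
        if self.overwrite:
            self.pending = data     # earlier unprocessed data is lost
        else:
            self.pending += data    # append, so nothing is lost

    def process_data(self) -> None:
        # Snapshot and clear together, then work on the snapshot.
        work, self.pending = self.pending, b""
        if work:
            self.processed.append(work)

def simulate(overwrite: bool):
    r = Receiver(overwrite)
    r.on_data_arrived(b"AAA")
    r.on_data_arrived(b"BBB")  # fires before AAA has been processed
    r.process_data()
    return r.processed, r.overflow_log
```

In both modes the overflow log records that BBB arrived while AAA was still pending, but only the appending handler ends up processing AAA as well.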

pansophic
 