How to communicate between two VB apps across the web using sockets?

Status
Not open for further replies.

TheQuestioner

Programmer
Jun 29, 2002
78
GB
I seek advice from my fellow programmers.

My company has written a quote system in VB with an Access MDB as the backend. Our system works across a standard LAN, with a VB front-end installed on each client machine, connecting to the same Access MDB backend, which is on the server.

Management has decided that the next version of our software needs to have multi-branch functionality. This means that we need to be able to send small volumes of data from one machine to another over the internet.

I've been tasked with deciding which way to implement this. I can think of two choices:-

[ol]
[li]Via Outlook as email: sending data encrypted within emails to specific recipients. Data is automatically sent and received as emails.[/li]


[li]Directly through the internet using a TCP/IP Socket activeX control.[/li]
[/ol]

I'm strongly inclined to follow the second choice, but don't know much about sockets. I've done a bit of research, and would like to know about the following issues from my humble Tek-Tips counterparts:-
[ol]
[li]How do firewalls affect sockets?[/li]
[li]How would you set up a connection between two machines on the internet? Is it just a matter of specifying IP addresses and ports?[/li]
[li]How reliable is it?[/li]
[li]Is there a better ActiveX/COM control for this purpose than the standard MS one? Is there a better free version?[/li]
[li]Are there any problems with using sockets?[/li]
[li]What are the best practices?[/li]
[/ol]
Thank you for your help.
 
Gosh, there are a lot of questions there!

Your first big question left you with two approaches, automated email and simple TCP connections.

I'd suggest a third: MSMQ (Microsoft Message Queuing). It operates fairly well when fully connected, and it can also store and forward automatically, like email, if clients aren't always connected.

Setting that aside, you have a second question cluster:

[ol][li]How do firewalls affect sockets?

Well, a lot! To begin with a local firewall needs to allow outbound connections using the port you choose. Then the remote firewall needs to allow those connections in. If the remote end is doing NAT it would need to be set up to forward your port connections to a particular server behind that NAT device.

[/li]
[li]How would you set up a connection between two machines on the internet? Is it just a matter of specifying IP addresses and ports?

More or less. See the NAT issue above though.

[/li]
[li]How reliable is it?

Pretty reliable. TCP includes a lot of service to ensure proper delivery of data. Of course you need client to server connectivity: if the server can't be reached you can't connect.

[/li]
[li]Is there a better ActiveX/COM control for this purpose than the standard MS one? Is there a better free version?

Better? That's pretty subjective. Better at what? I've had pretty good luck with the regular MS Winsock control. Don't fall for that "CSocket" thing a lot of people will push on you. It has some serious warts.

[/li]
[li]Are there any problems with using sockets?

Most newbies make a dog's breakfast of the whole thing. Often their confusion leads them to throw in DoEvents() calls at random. Sort of a "Trust the Force, Luke" approach to programming.

Managing a Winsock control array in a server trips them up. Then properly handling socket state. But the big one is something I call "The Packet Fallacy." I think it stems from some "elite" feeling they get from using the word packet - but it usually leads them far off in the weeds.

[/li]
[li]What are the best practices?

Some really important tips can be found over at Winsock Programmer's FAQ. This is considered reasonably authoritative, and while written with the C/C++ johnny in mind it can be just as useful to a VB programmer working with the Winsock control.[/li]
[/ol]
All in all if you are planning to go the "socket" route having the server be visible to (connectable by) and available to (i.e. up and running for) your clients is the big issue. It is a lot easier if you can design things so clients "pull" data. A push approach means every client is actually a server, and adds a lot of complexity (lots of visibility and availability issues).
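As a rough illustration of the "pull" arrangement, here is a minimal sketch in Python (not VB, purely for brevity). The server listens, the client connects out and reads until the server closes the connection. The names serve_once, pull and demo are made up for this example, and it runs entirely over the loopback address, so no firewall or NAT is involved.

```python
import socket
import threading


def serve_once(listener: socket.socket, data: bytes) -> None:
    """Accept a single client, send it the data, then close the connection."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(data)


def pull(host: str, port: int) -> bytes:
    """Connect OUT to the server and read until the server closes the stream."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read means the peer closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    """Run server and client in one process over loopback."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.settimeout(5)            # don't wait forever in accept()
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once,
                                  args=(listener, b"quote data"))
        worker.start()
        try:
            return pull("127.0.0.1", port)
        finally:
            worker.join()


if __name__ == "__main__":
    print(demo())
```

Only the server side has to be reachable here. The client never listens, which is what keeps the pull design simpler than push.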

If you really want something that'll make the programming easier look for a component library that imposes a message structure on the TCP stream. I'm not aware of any free ones, though there are several commercial products. Some are specialized and only transfer files, while others give you a bit more control as a programmer.
 
Wow, thanks dilettante for all that information. I appreciate the fact that you've really taken the time and effort in answering my questions in such a fluid yet comprehensive way.

Going back to Point 1, you're basically talking about port forwarding, aren't you? As most businesses use routers with built-in firewalls, port forwarding is probably the norm when dealing with Winsock programming.

When you refer to a "push" implementation, do you mean requesting data from a remote machine, rather than sending data from a remote machine?

You wrote: "If you really want something that'll make the programming easier look for a component library that imposes a message structure on the TCP stream."
What do you mean by "message structure"? How does this make things easier?

You also wrote: "Some are specialized and only transfer files, while others give you a bit more control as a programmer."
Do you have any links or suggestions?

Thanks for all your help, I really do appreciate what you've explained.
 
You're welcome.


Push/Pull

By push I was getting at an architecture where the clients more or less go about their business, but when the server has new information for them it opens connections to the clients and "pushes" data to them. For some applications this could conceptually be a great thing, but it isn't very practical in most cases.

Most applications that need this simulate it by having the clients periodically poll a server for new data. They connect out to the server and then either send a version transaction to it or else make a version request. Those that send the version out usually let the server make the call, and if the server "thinks" they need new data it pushes it over the now-established connection. Those clients that make version requests basically ask the server what the newest version it has is, and then the client somehow makes the "judgement call" about whether or not to request the newest data.
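The two polling styles can be sketched as follows, in Python rather than VB and with no networking at all, just to show who makes the decision. The Server class and both poll_ functions are hypothetical names, and an integer version number is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    latest_version: int
    payload: bytes

    def handle_version_transaction(self, client_version: int) -> Optional[bytes]:
        """Style 1: the client reports its version and the server makes the call."""
        if client_version < self.latest_version:
            return self.payload
        return None

    def handle_version_request(self) -> int:
        """Style 2, step 1: the client asks what the newest version is."""
        return self.latest_version

    def handle_data_request(self) -> bytes:
        """Style 2, step 2: the client explicitly asks for the newest data."""
        return self.payload


def poll_server_decides(server: Server, client_version: int) -> Optional[bytes]:
    # The client sends its version and simply accepts whatever comes back.
    return server.handle_version_transaction(client_version)


def poll_client_decides(server: Server, client_version: int) -> Optional[bytes]:
    # The client compares versions itself and requests data only if needed.
    if server.handle_version_request() > client_version:
        return server.handle_data_request()
    return None
```

In both styles the client opens the connection, so nothing has to be reachable on the client side.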

A lot of the big-name Messenger clients and anti-virus products work this way.


Message Structure

Unless you can live with a simple model like "the client connects and the server slams a stream of data to it" you will probably need to design some sort of system of messages and signals. The client and server use these to "talk to" each other over the TCP stream connection.

If you are using a single-threaded development environment like typical VB 6.0 programming you don't want to "slam a stream" very often. Typically your client VB program is going to have some sort of user interface, and your server may need to service more than one client at a time. In order to avoid "starving the program for events" it is necessary for each end to occasionally take a breath, so to speak.

You end up needing to invent some sort of messaging protocol to use on top of TCP. Example:
[tt]
WELCOME WC
LOGON LO{ID}{Password}
AUTHFAIL AF
AUTHPASS AP
VERSIONCHECK VC{Version info}
VERSIONOK VO
VERSIONBAD VB
SENDNEW SN
FILESEND FS{File bytes}{File name}
FILEACCEPT FX
FILEREFUSE FR
FILEABORT FA
FILEBLOCK FB{Block of binary data}
FILEPAUSE FP
FILECONTINUE FC
FILEEND FE
BYE BY
[/tt]
So once you have these messages and their layout defined you have the basics for communication.
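One way to turn that table into code is a lookup keyed on the two-letter code, sketched here in Python rather than VB. The build_message and parse_message helpers are hypothetical, and since the layout of the fields inside the braces isn't specified above, the body is treated as opaque bytes.

```python
from typing import Tuple

# Two-letter codes taken from the example message list above.
MESSAGE_CODES = {
    "WC": "WELCOME", "LO": "LOGON", "AF": "AUTHFAIL", "AP": "AUTHPASS",
    "VC": "VERSIONCHECK", "VO": "VERSIONOK", "VB": "VERSIONBAD",
    "SN": "SENDNEW", "FS": "FILESEND", "FX": "FILEACCEPT",
    "FR": "FILEREFUSE", "FA": "FILEABORT", "FB": "FILEBLOCK",
    "FP": "FILEPAUSE", "FC": "FILECONTINUE", "FE": "FILEEND", "BY": "BYE",
}


def build_message(code: str, body: bytes = b"") -> bytes:
    """Prefix an opaque body with its two-letter message code."""
    if code not in MESSAGE_CODES:
        raise ValueError(f"unknown message code: {code!r}")
    return code.encode("ascii") + body


def parse_message(message: bytes) -> Tuple[str, bytes]:
    """Split one complete message into (message name, body bytes)."""
    code = message[:2].decode("ascii", errors="replace")
    if code not in MESSAGE_CODES:
        raise ValueError(f"unknown message code: {code!r}")
    return MESSAGE_CODES[code], message[2:]
```

Note that parse_message assumes it is handed exactly one whole message. Getting whole messages out of the TCP stream is a separate problem.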

The problem is that TCP is a reliable stream protocol. It doesn't provide anything like records or messages (or ugh, "packets") you can count on. A TCP sender sends out stream fragments and a receiver receives fragments - and what is sent out isn't necessarily what gets received! A lot can happen to the TCP stream between the sending application and the receiving one and the stream can be refragmented according to all sorts of criteria.

What you are guaranteed is that the fragments will arrive in proper sequence.

So, just as a Windows stream file (like a typical *.txt file) doesn't have any record structure neither does a TCP stream. By convention we usually separate "lines" of text in a textfile by CR, LF, or CRLF. For TCP streams we need to do something similar.

One might think sticking a CRLF after each "message" would do the trick and it can in some cases. Even then it is usually easier to choose a single character as a message delimiter though (maybe just CR). But what about moving binary data, like a transferred file? The odds are good that you'll have at least a few of whatever delimiting string you choose in your files now and then.

So usually a message length prefix makes more sense. A 16-bit unsigned integer is probably the most common form of length-prefix message framing.

Of course the receiver (and both ends, client and server, have to be receivers) needs to be careful to assemble whole messages before trying to process them. On top of that one TCP "receive" might even bring in several messages in one stream fragment.
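A minimal sketch of length-prefix framing and reassembly, in Python rather than VB. The 16-bit unsigned prefix follows the description above. The big-endian byte order and the names frame and MessageAssembler are assumptions made for this example.

```python
import struct
from typing import List

HEADER = struct.Struct(">H")   # 16-bit unsigned length; big-endian is an assumption
MAX_BODY = 0xFFFF              # largest body a 16-bit length can describe


def frame(body: bytes) -> bytes:
    """Prefix a message body with its length."""
    if len(body) > MAX_BODY:
        raise ValueError("body too long for a 16-bit length prefix")
    return HEADER.pack(len(body)) + body


class MessageAssembler:
    """Buffers received stream fragments and yields only complete messages."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, fragment: bytes) -> List[bytes]:
        """Add one received fragment; return every message now complete."""
        self._buffer.extend(fragment)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break                      # body not fully received yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]         # keep any bytes of the next message
        return messages
```

Feeding the same byte stream in three-byte fragments or in one large piece gives the same list of messages, and a body containing CR or LF bytes passes through untouched because no delimiter is being searched for.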


Message Oriented Components

By using a third-party messages-over-TCP component you save yourself writing and debugging stream buffering and message framing logic. You also save yourself some pain in terms of reliability and performance.

People have spent a lot of time trying to make code like this work and work quickly in VB. There is a lot about VB that conspires against you when dealing with large volumes of data. The most common headache they run into involves trying to get decent performance in the face of String concatenation overhead.

In spite of this most programmers seem to roll their own. Maybe that's why there are so few commercial component libraries out there. The other problem is that when you try to add abstraction you often tend to add complexity even when your goal was to simplify things.

I posted some all-VB sample code at another site. Maybe you could use this as a starting point, at least something to look over: Winsock TCP Message Framing Example
 
Oops, left a point out.

Yes, you'd need to deal with port forwarding if you have a NAT router in front of your server.
 
Once again dilettante, your information has been really helpful, and I am in your debt.

A lot of commercial applications, such as MSN Messenger, seem to communicate with a remote machine behind a NAT router without the need for port forwarding. How is this possible? Is it something to do with the port settings?

I've looked at the previous links that you sent (thanks – they were very useful), and have one question: how do you know which port number to use? (I understand that certain apps are allocated certain port numbers, but that's all I know).
 
Connecting out through a NAT router is no problem, it's connections in from the outside Internet that's a problem.

Which port? You pick one.

There are guidelines for doing so, like avoiding 0-1023 (the well-known ports), which are more or less reserved. Consult PORT NUMBERS.
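For reference, the IANA port registry divides the 16-bit port space into three ranges. A small Python sketch (the function name port_range is made up) that classifies a candidate port:

```python
def port_range(port: int) -> str:
    """Classify a TCP/UDP port number using the IANA registry ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    if port <= 1023:
        return "system"    # well-known ports, best avoided for your own app
    if port <= 49151:
        return "user"      # registered ports; check the list for conflicts
    return "dynamic"       # private/ephemeral ports, often used for outbound connections
```

A custom server would normally pick a number in the "user" range that isn't already registered to something likely to be running on the same machine.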
 
But doesn't MSN Messenger go both ways (in and out through the router)?
 
What worries me is:
"with an Access MDB as the backend."
Check if Access can handle the traffic. Access may be OK when you have up to 5 to 8 users at a time, but beyond that you may find that the DB cannot handle more users and will crash on you.

Some people, mostly ORACLE pushers, state that Access is a Fisher Price of Data Bases.

Keep that in mind. You may want to go to SQL, or MySQL, or even Oracle for your DB.

HTH

---- Andy
 
Perhaps a client/server approach would be better than peer/peer.

Suppose you have one high-availability server machine, say a web server, and you write a web application to run on it using your favorite server-side tool such as ASP/JSP/PHP/Perl/ISAPI or whatever.

Your remote client could persist an ADO recordset to file and then use the standard "POST" method of an HTTP request to send it to port 80, just like uploading a file to a web site or, more specifically, adding an attachment in a web-based email like hotmail, gmail, yahoo mail etc.
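A rough sketch of the client side of such an upload, in Python rather than VB. It only builds the request and sends nothing. The URL is a placeholder, build_upload_request is a made-up name, and it posts the persisted file as a raw request body, which is simpler than the multipart form encoding a browser uses for file uploads.

```python
import urllib.request


def build_upload_request(url: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying the persisted data."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


if __name__ == "__main__":
    # Placeholder URL and payload; a real client would read the saved
    # recordset file and pass the request to urllib.request.urlopen().
    request = build_upload_request("http://example.invalid/upload", b"<data/>")
    print(request.get_method(), request.full_url)
```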

The web-application would sort of pool the database connections of the various clients for you... it would sit as a sort of database proxy between the access database and the clients so that, from the database's point of view, there is only one client.

 
Programs like MSN Messenger normally work by connecting OUT to a server for basic communications. There are some cases where it can establish true peer-to-peer connections, and when it can't it may try a few tricks that often work to poke a temporary hole through NAT. When this fails it falls back on relaying traffic through a server that both peers connect to, and continues operating in client/server mode.


I wouldn't worry about using an MDB as your back-end datastore. While Jet certainly has its limitations (total MDB size limited to 2GB for one thing) an application like yours can get around most of them.

The single biggest problem with Jet (and Access) involves letting client systems "connect" (i.e. open the MDB file) over a LAN. Even worse, over a larger network with switches and routers involved. The BIG problem stems from a client who is updating (a.) crashing, (b.) losing network connectivity, or (c.) pretty much anything else that causes the client to fail to close the MDB gracefully.

This is where orphaned locks and actual database corruption come from most of the time.


In your case you'd want to add a middle tier "server" program that you probably don't have today. One way to go is the way Sheco suggests, using a web server and writing your application as a web application. A variation might be to use a web server and still have "fat" (a.k.a. "rich") client programs that talk to the web server using web services or a similar technique.

The other way is to write your own application server, perhaps in VB.

This would be a "socket server" that contains your database access logic and a substantial part of the application logic. The idea is to "crunch" as much over in the server as you can, to limit how much the client code needs to know about the database and to limit the amount of data to pump over the wire between clients and the server.

This server might be designed with one database connection per connected client or it might share a single connection for all database access. If you write the server in VB it'll almost certainly be single-threaded, and thus more than one open database connection won't help you very much.

You may well want to use one Recordset (and possibly one Command) object per connection though. This is especially true if your clients make repeated requests that don't always need a new query (like paging through longer Recordsets). This becomes a tradeoff between server complexity, server memory consumption, the amount of data to shove over the wire to clients if you push the whole Recordset worth of data, and whether or not the client will commonly end up fetching everything in the resultset (Recordset) anyway.

You won't be using Jet ("Access") security because everything will run in the server's user context. So you'll need a security scheme you develop yourself. This could just be a Users table with user IDs, a "rights level" field, and passwords, though storing the passwords in plain text isn't advisable. The server could always do an MD5 hash or something on passwords before storing them though. Your levels of rights might be simple (admin, read/write, and read-only or some scheme like that) or complex. Your server logic should enforce the rights.
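A minimal sketch of hashing passwords before storing them, in Python rather than VB. It uses a salted PBKDF2-HMAC-SHA256 instead of the plain MD5 mentioned above. The function names, the 16-byte salt and the iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both in the Users table, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the offered password with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)  # constant-time comparison
```

At logon the server looks up the user's stored salt and digest and calls verify_password with whatever the client sent.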

If at all possible, have the MDB file sit on a drive or drive array local to the box running the server program. Avoid a network connection between the server and the MDB.

Jet 4.0 can handle up to around 255 database opens if need be. Your server program should only really need ONE. I've handled as many as 340 users in this way with no problem, though the transaction rate was low (around 3 per second at peak times, maybe 25 per minute most of the time). This was on a machine with 256MB of RAM and a PIII at 500 MHz. Newer hardware dwarfs this.

The site only uses the application pretty much 7:00 AM to 6:00 PM Mon-Fri. At midnight each night the server program forcibly closes any connected users after a 5 minute warning.

Then it closes its single ADO Connection, renames the MDB from APP.MDB to APPYYYYMMDD.MDB, and then it runs JetComp.exe to clean and compact the new-named MDB as APP.MDB. Once that completes it re-opens the Connection to the clean MDB and begins listening for clients to connect again. Then it starts a separate script to copy the renamed MDB to a file server and remove it from the local drive upon successful completion.
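The renaming step in that nightly routine can be sketched like this, in Python rather than VB. The names archive_path and rotate are made up, and the compaction and copy steps are deliberately left out because they depend on external tools.

```python
from datetime import date
from pathlib import Path


def archive_path(db_path: Path, day: date) -> Path:
    """APP.MDB plus a date becomes APPYYYYMMDD.MDB in the same folder."""
    return db_path.with_name(f"{db_path.stem}{day:%Y%m%d}{db_path.suffix}")


def rotate(db_path: Path, day: date) -> Path:
    """Rename the live database to its dated name and return the new path.

    The caller must already have closed its database connection; compacting
    the renamed file back to the original name is a separate step not shown.
    """
    archive = archive_path(db_path, day)
    db_path.rename(archive)
    return archive
```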
 
I agree with dilettante regarding the choice of database. A properly designed database with good programming practices is more important than the choice of database.

I once saw a programmer bring SQL Server to its knees because of a poorly coded stored procedure. I've also seen an Access database burn through tables with a couple million rows with no appreciable performance problems.


-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 