Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to keep a local copy or "shadow" copy of a database locally so that software can run f

Status
Not open for further replies.

A_Radiolog_inFoxPro

IS-IT--Management
Mar 29, 2020
40
CY
Dear All ,

I would like to know if its possible to somehow keep in cache or keep a local copy of some big database files that are located in a remote share , because when loading some fields that are memo ( just long texts ) it takes time and sometimes the program freezes.

Is not a mater of speed but a mater of how VFP opens said remote tables.

Is there a way to keep some updated local copy of the tables so that they can be accessed fast enough ? and also to be updated constantly

If this is not a good practise let me know what is the proper solution for this.

Thank you in advance
 
Broadly speaking, you've got two options:

1. Keep a "shadow" copy on the local drive. (This is the usual meaning of the word "shadow" in this context.) This means for every insert, update and delete that you do on the remote copy, you do the same on the local copy. Clearly, this could involve a huge number of changes to your existing code. The advantage is that both copies will always be in sync.

2. Periodically copy the remote database to the local drive; for example, you might do this overnight. This will involve no code changes in respect of inserts, updates and deletes. For queries (including reports, etc), you will only need to point your code to the local copy. The obvious disadvantage is that, when using the local copy, the data will only be up to date as at the previous copying.

If you would like to consider the above, and then let us know which of the two options you would like to explore further, we will be able to provide more detailed help.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
By the way, the second of the above two options is the basis for data warehousing. In that case, you only create a local copy of the data that you will actually need for your queries, reports, etc. Furthermore, there might be opportunities to aggregate or summarise the data in advance.

For example, say you need a regular report showing total sales per customer. In the local copy, you would only need the totals of the sales data, not the individual sales. And you might only need the customer name, not the address or other contact details. You can achieve that with a simple SELECT ... INTO TABLE, thus greatly reducing the amount of data that you would need to store locally and thereby speeding up you queries and reports.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
A_Radiolog_inFoxPro said:
a matter of how VFP opens said remote tables.

Is it? VFP does not open remote tables, in any kind of remote database access you have to use remote SQL language to query data. And if you know you need the text of a single record "memo" field, (it should be equivalently a varchar(max) in MSSQL or a text field or longtext MySQL column), then you just do SELECT * FROM table WHERE id=x to fetch only one record. If that is slow, you should perhaps look for another ODBC driver.

Before I'd go deeper into that, first of all, no there is no such thing as a "shadow" copy you can do, you can query into a VFP cursor but you can't get changes in remote table forwarded to that VFP cursor. No matter if it's a cursor "connected" to the remote database via cursor properties making that an updatable cursor. This is always only allowing your local changes to be committed to the database, not changes of the databse be forwarded to the cursor. In short you don't get a USE of a remote table with means of ODBC connections. The only part the "equivalent" of SELECT * FROM tbale does is give you all data - at the time of query. Changes and new data, also deletions don't get done in that cursor.

Before I go any deeper and sugggest implementing a meta data architecture with which you could do things like query changes since you last queried the full table to apply them to your local cursor and therefore establish your own shadow copy mechanism, what remote database are we talking about?

Chriss
 
Mikes suggestions are interesting.

Ad point 1: If you do "double bookkeeping" like that, you only have all you own changes in your local copy, that does not include what other clients do in the remote database, so you'd also neeed to add a detection for those changes or a mechanism with which other clients tell you that and what they changed. This is an overhead opf programming I'd not recommend.

Ad point 2: This is an idea that would move the majority of data loading unnecessary. But whenever you depend on the latest changess of the current day, you only have a slight advantage of having all the old records available. At start of the application you could make an "append" of records with ids higher than the max id in your local copy. But - and that's a big but - this will not update you with changes in the bottom rows of your local shadow copy, i.e. in the area of max id and slightly lower ids. And in most cases of applications, the "action" is in the last rows of a table, the older a row, the more likely it is in its final stage anyway.

That last sentence suggests another approach: Keep the tables slim by moving all data that is in a final stage into an "archive" section. Not to be taken literally meaning you don't need them anymore at all, but they are not the core of todays work. Or more gnenerally speaking: Think of a database design where the currently worth loading data is in separate tables. Or, as a simpler idea: Have a lastchanged datetime field and always only query data where lastchanged > Gomonth(DateTime(),-1) or similar in the remote sql language. The problem still will be that in all cases your old VFP code was at the state of !FOUND() or EOF(), you'd have too look if data can be in the section of the remote tables with older lastchange datetimes. So you'll still need a lot of additional programming to do.

But get over it: Remote databases are not DBFs. You can't reapply all code you got with DBFs when changing to a remote database, you change to a client/server architecture.

Chriss
 
One suggestion for you as a radiologist: If you have patients oriented data, what would be the closest you get to shrink your data retrieval to what you need for a day: Query data by patient. Is it that hard? That will be very selective. Assuming - a very straight forward assumtion - your patients have appointments with you, you can know even weeks in advance what patients data you neeed during a specific day. That should be a fraction of all data. What remains is data not directly related to patients. I guess the texts you're concerned with are within the patient data, though.

And if you have emergency patients woth no appointment, that just needs the loading of that patients data, if it's not a new patient with no data, anyway.

So think in that direction. Isn't that straight forward?

Chriss
 
On the technical background reasoning why no such shadow copy mechanism is possible unless you develop one for your needs yourself:

The connection to a server is twoways in the sense that you read and write through a connection to a remote database server, but the server is in the server, literally "servant" role. It is not actively pushing data changes to clients.

The endpoint of data, like cursors in the case of VFP, arrays or objects in case of php, datatables or Linq objects ADO recordsets in VBA and anything else, do you really expect that a general serverside mechanism could keep all those different architectures up to date with the server data? No, that's not existing. The most you could expect is that a database server could serve changes up to the "doorstep" or an "inbox" the client side odbc object could have, but there is no such thing foreseen and it is just a waste of network bandwidth to implement this idea just in case some client may need that data. This presumption is wrong and actually does more harm than good, so servers will not develop in that direction, ever.

The role of a server is to serve, not to act, you're expecting it to know you don't just want all current data, but also all future data changes of a table.

I could point back to what I said about this in a recent thread: thread184-1827951

Chriss
 
Dear All ,

Thank you for your very very valuable input ,

One of the issues is that unfortunately in the medical filed the mayor key aspect is the trend of the patient.

Meaning I can lookup back 2 years worth of data to see how he was and how he is now for example ,

Double bookkeeping will be messy because there are counters that need to up to date with in all the remote users. So if one users orders exam 000001 I can't have one with a remote db with 000001 also.

Before the software used the db files in explicit mode.

After going multiuser it switched to shared db files.

Also they are discrete tables instead of a single db. So there are a lot of tables holding several of the data.

As you already know from the past the code is uncommented with no documentation.

I am trying to optimise some things , but as I understand are quite difficult to do.

Thank you in advance for your time and ideas ,



 
Chris said:
VFP does not open remote tables, in any kind of remote database access you have to use remote SQL language to query data.

I didn't realise that the OP was wanting to access a back-end database (as opposed to a VFP database that happened to hosted remotely). Although I don't think that makes a lot of difference to my original reply.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
True, Mike

All A_Radiolog_inFoxPro talked about was a remote share, not a remote database server.

I also don't think that changes much about what I wrote. If it's a DBF in a WAN instead of a LAN or "in the cloud", that mainly means a latency time that is much higher than with a local file or a LAN. In that case what can generally be said is that you have "offline files", but I don't see how to actively make use of that mechanism yourself and let WAN files be cached somehow. In the end the only master file is still the remote file, what is cached can become invalid at any moment and the problem is you can't predict when that moment comes so there would always need to be a cache validity check which can only be answered by the remote server, which has the same latency time as direct file access, so you can only exchange file access to querying some service on the same remote distance with the same latency time between question and answer.

Blame me for false assumptions, but don't blame me for not knowing what you don't explain about your situation.

Tackling this on a "no assumptions made" basis, well grant me one, that 'shadow' is referring to cached data. Well, any cache has the aim of providing something faster than going to the original. The compromise to be made with caching is that either you risk looking at outdated cached data or you go to the original master file to verify that and that means the cache performance advantage is gone.

Chriss

One last edit: "opportunistic locking" is essentially a caching mechanism and it does not work out nice for VFP, as many know by experience. It's not based on local files, though, it's part of the file system mechanisms.

And one last recommendation when latency to data becomes a problem, then run your software where the data is local and use remote desktop to access the software. The latency of dta access then becomes low, the latency of visual updates of the displayed interface may become slow, but usualy that's working out fine enough. It would be a solution to a WAN case, II just fear this assumption also isn't what really is the case.
 
If you ever heard of something like that, maybe you have heard about the Volume Shadow Copy Service (VSS)


This "shadow copy" exists server side though, not client side. And it's not usable to establish what you want to have, even if your remote database is SQL Server. Not only because this shadow is also remote (local from the persepective of the MSSQL Server), it isn't meant for client access, but as this documentation article describes it is meant to be used for backup applications, to give access to a snapshot moment of data without interrupting the database operation at all.

Chriss
 
A_Radiolog_inFoxPro said:
Thank you in advance for your time and ideas

If I take that statement as motivation to still provide further ideas you could use, then I don't know why you don't realize we still even puzzle about very basic things like whether your shared centralized data is in DBFs or a datbase server, if you involved a WAN or have the data in a cloud database and any more technical details.

Especially since you make it seem your application code is "set in stone", the only chance is to work on the peripheral, isn't it?

All those details on the technical architecture contribute to what can be done at all, so that's information you should provide. If you expect everybody to remember every detail you already told in previous threads or read them all, that's pretty darn... I don't even know what word to use for that.


For example, with a database system like MSSQL you could set up a database server that acts as master of a replication network with many slave servers each on local installations of your application. That would be a backbone of replicating all data to all the local (slave) servers. The appliction would use its local database as if it is the only one and all changes could be forwarded via master to all other slaves. That DBF files are royalte free and need no licensing is of no advantage here, as VFP has no replication mechanism like that and I know of none such thing being done.

It wouldn't be the simplest setup and needs many licenses, perhaps it's even more expensive than the remote desktop scenario I already suggested to use, i.e. install a terminal server with your software and data local to it, to be able to use the application anywhere else, too, without the need to make any application changes, which seem to be so impossible to do.

Chriss
 
Hi,

Are your images / graphs and other documents stored in the database itself or do you have a database with only linkes to documents on disk? If they are stored in the database, this may explain the slowness you experience.

Further I would tak care about (patients) privacy, you should not have unneeded data on you local workstation.

Regards, Gerrit
 
Gerrtit points out n important other criterion: Data privacy.

If we talk about DBFs your only level of protection of the privacy of the data is limiting access to you or other doctors or anybody, to which it should be allowed. I won't deep dive into that aspect alone, though.

I wonder about the latency times you have to struggle with. Take a look at a networking discussion here:
Measuring latency times to internet domains is a bad idea, as user58476 can learn from JFLs answer, but lets just concentrate on some other estimate made there: The lowest latency you can expect if something is really halfway around the earth is 200ms roundtrip time. Well, light could travel around the earth about 7 times per second, but even with a direct optical fibre there are times necessary for changing from electrical to optical and think of some nodes in between.

I don't know, but even considering a Foxpro table memo read that requirtes to first read a record in the DBF file, which points to where the actual memo content starts in the FPT file, from which you need to read a length first, so you know what portion of the FPT file to read, which means you have 3 roundtrips of data packages which mean 600ms wait time even for a short memo text, because this latency then is much longer than the time necessary to pull some bytes of the text itself, unless that's a MB chunk of the FPT file, or even larger.

Within the same country, expect shorter times than 600ms.

If you talk about a latency of 5 or 10 seconds, then you configured something wrong in your network.

You said
A_Radiolog_inFoxPro said:
Is not a mater of speed but a mater of how VFP opens said remote tables.
Well, in the end the speed of it depends on how data is stored and retrieved, that's true. You leave us in the dark with no concrete information about what you mean by this. If you add encryption into the game, lets say by using Cryptor, that will add more roundtrips, and the latency time will play a bigger and bigger role the more complex the access up to the actual text you want to get becomes. But then we can only help to make up a better plan, if the current technical architecture is known.

Chriss
 
Hello,

we have many application with HUGE data using Remote views + some SPT for reports, its fast enough.

So my tips :
- Use a server DB (like MS-SQL)as backend data , seems you are using DBF on a server
- Optimize queries, "select * from ... into cursor..." is a bad idea, use "where" and reduce fieldlist
- we have one view "adr_pkadr" which has a filter on primary key so result is 1 record for insert / update / delete and multiple views like "adr_sele" with only the fields needed for presentation / report and appropiate filters. We index the cursor to allow fast switch of sorting

- Use primary key / foreign key system and have index on foreign keys in all tables
- Add index for search fields
- Think of storing big documents (scans, pdf,..) in files and only store the path+filename in db (to keep it small). Or use pdfcompressor / free magick lib to compress doc / pic
- Check LAN/WAN, on LAN 1GB for cards, switches,..

HTH
Best regards
tom
 
Others made the assumptions you talk of documents or other files, while you clearly stated
A_Radiolog_inFoxPro said:
just long texts

Well, that should not take that devastatingly long.

A_Radiolog_inFoxPro said:
sometimes the program freezes

That does not sound like you just read one text. I'd guess you do something like SELECT * FROM dbfwithmemo, and that takes long, limit this to single records you actually need. And when you absolutely don't want to change the application logic and depend on a full list of records, then one step for less freezing is to only fetch all fields except memos, so:

Code:
SELECT fieldlist_without_memos, Cast('' as M) as memofieldwithsamename FROM someremote.dbf
assuming only one memo field, if there are more, use this CAST expression for all of them additional to the fieldlist without memos.
Then, when it later comes to displaying a record and its memo(s), fetch it (them) separately for this one record.

On the other hand, only pulling in data for your current patient should already make it take much less time.

I wonder about the reason for your "remote" database, it makes no sense to me in the first place. And one nightmarish thought I had would be you use a single VFP database to hold data for many radiologists, every doctor's office should have its own separate database, VFP databases work best locally, second best still shared in a LAN, with complications and lags in a WAN, but surely you'd never operate a multi-tenant database based on DBFs, neither on a WAN nor in the internet. Do you? This situation cries for a database server, no matter how fast your network cards are.

Again, what is the architecture? It seems broken to me, unrealistic. Or your demands are, and you want the text to appear immediately, while it takes a second. What is the lag in time you experience, really? Are you just having unrealistic demands, perhaps?

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top