
FoxPro data anomaly


dmusicant

Weekly, I back up my at-home FoxPro data to a portable hard drive and bring it to my volunteer job, where I copy the data to a laptop so I can access it just like at home. I use batch files to copy the data (from my NAS via WIFI), just copying what's changed.

Yesterday I did that, but when I used my app on the laptop at "work" to browse the data, it was screwy. The function I was using browses a single table. AFAIK, only that table was affected, but that may well not be the case. Browsing that one table, a considerable portion of the records I was seeing (guessing over 20%, maybe 30% or more) appeared to be blank. None of the fields were populated. Other records looked "fine."

I brought the portable hard drive home along with the laptop, hoping that the data I'd copied to them wasn't likewise screwy. To my relief, it looks fine.

Today, on my home laptop, I opened the questionable table from the portable hard drive (not through my app, just from the VFP 9 command window), and it looks just as screwy... there appear to be a whole lot of blank records.

I tried this: In a command window of VFP 9.0, I set the record pointer on a seemingly empty record and did this:

?len(alltrim(artist))

Artist is a character field of 80 characters length. It should never be empty and is almost never the full 80 characters.

VFP returned 80. If the field were empty it should say 0.

I then entered this:

?Subs(artist,1,1) = " "

It returned .F. I'd expect that for any populated record (artist is always left justified) but not for a record whose artist field appears blank in a browse.

Now almost all entries in the Artist field are less than 80 characters, IOW, LEN(ALLTRIM(ARTIST)) should return an integer less than 80, more typically maybe 25.

The number of records in the strangely acting DBF file is the same as the number on my NAS's version, which AFAIK is fine (26682). The table has a complex index and 30 fields. I don't know how to interpret what's going on here. I'd like to know, of course.

It seems that the data that wound up on the portable HD after I ran my batch file yesterday is corrupt. How and why I have no idea. I'd have no idea how to retrieve it if I didn't have a "working" copy on my NAS. Of course, at work yesterday I was barely able to use the data.
 
You seem to have zero bytes in some blocks of the DBF. It may be interesting to look at it at the block level, e.g. using VFP's HexEdit tool.

Zero bytes in a char(80) field look empty but are not trimmed, so LEN(ALLTRIM(field)) simply returns the field length. Check what LEN(ALLTRIM(REPLICATE(CHR(0),80))) is: it's 80.
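
You can reproduce the symptom in the Command Window with nothing but built-in functions (no table needed):

? LEN(ALLTRIM(SPACE(80)))                && 0  - real blanks are trimmed
? LEN(ALLTRIM(REPLICATE(CHR(0), 80)))    && 80 - zero bytes are not trimmed
? ASC(REPLICATE(CHR(0), 80)) = 0         && .T. - first byte is CHR(0), yet the field looks blank in a browse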

I know someone who struggled with such bursts of zero bytes appearing in a DBF, but that was well over a decade ago; he identified network errors as the cause, which let the file server write blocks of zeros into the DBF file.

When you say you copy files to an external drive, is that a NAS connected via Ethernet cable? Or is it a standard interface like USB?

Bye, Olaf.

Olaf Doschke Software Engineering
 
Hello,

The original is not damaged, so it seems to be a problem with the portable hard disk and/or the transfer.
We see this from time to time due to USB errors or damaged portable disks.

Can you copy to another device (a USB stick) and check it?
Or burn it to a CD/DVD and check that, or use another USB port?

You may also check the portable drive with S.M.A.R.T. tools or Windows chkdsk.
And there are tools which copy and verify, Total Commander for example.

Regards
tom

 
The portable HD not infrequently is reported to have file system errors that Windows offers to fix and does successfully. I didn't encounter such messages the other day, however.

Now, this I failed to say in the OP: the batch file that copies the data gets it from my NAS via WiFi. I could use an Ethernet-connected machine, but I don't; I have been running the batch file on a WiFi-connected machine. So I'm wondering if there was an error by virtue of that connection. The batch file didn't indicate that there had been a problem, AFAIK; it said it was "complete." I've been doing things like this for probably over 2 years and haven't seen this problem before. Maybe in the future I should run upstairs, where I have a laptop asleep that's Ethernet-connected to my router. I suppose that would be a more reliable way to get the current data onto the portable HD via the batch file.
 
Dan, I believe we had a thread here some years ago in which a user was seeing corrupted data on an external drive, and eventually solved it by switching from WiFi to an ethernet connection for the drive. I'm sorry I can't find the original thread, but if my memory serves me right, that would support your idea.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Well, you can stay with WiFi, as it works most of the time.

But verify your data instead of relying on the batch status. The batch only reports that all commands ran, doesn't it? As my colleague's experience showed, such zero-byte bursts can occur without putting a copy/xcopy/robocopy command, or a VFP COPY FILE, into an error state. And as far as I can see, none of these shell commands does more than copy, i.e. none of them verifies what it copied against the original, which would mean yet another read. Even if they did, if the error occurs reproducibly at certain blocks for whatever reason, the verification would just report back the same bad data.

What you can do is open the copied DBF with the low-level file functions FOPEN/FREAD/FCLOSE and read through your copy to see whether you find blocks of zero bytes longer than a record; that would alert you to repeat the copy.

Or, even simpler, scan through the table and check whether you again have 80-character-long artist entries, or ASC(artist)=0.
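
A minimal sketch of that low-level check, to be saved as a small PRG and run against the copy (the path and record length below are placeholders; take the real record length from RECSIZE() on the healthy table):

lcFile   = "E:\backup\music.dbf"        && hypothetical path to the copied table
lnRecLen = 450                          && placeholder - use RECSIZE() of the good table
lnHandle = FOPEN(lcFile)
IF lnHandle < 0
   ? "Could not open", lcFile
   RETURN
ENDIF
lcZeros = REPLICATE(CHR(0), lnRecLen)   && a zero-byte run at least one record long
DO WHILE NOT FEOF(lnHandle)
   lcBlock = FREAD(lnHandle, 65536)     && read 64 KB at a time
   IF AT(lcZeros, lcBlock) > 0
      ? "Zero-byte burst found near offset", FSEEK(lnHandle, 0, 1) - LEN(lcBlock) + AT(lcZeros, lcBlock)
      EXIT
   ENDIF
ENDDO
= FCLOSE(lnHandle)
* Caveat: a burst straddling two 64 KB blocks can slip through; re-reading with an
* overlap of lnRecLen bytes would close that gap.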

Bye, Olaf.

Olaf Doschke Software Engineering
 
Hi dmusicant,

Welcome back - we already had this thread in 2014.

Weekly, I back up my at-home FoxPro data to a portable hard drive and bring it to my volunteer job, where I copy the data to a laptop so I can access it just like at home. I use batch files to copy the data (from my NAS via WIFI), just copying what's changed.

1) How do you copy the data back to your NAS?
2) Did you ever consider remotely accessing your NAS (e.g. WebDAV, WebDrive, ...)? That would solve more than just these copying problems.
hth
MarK
 
For what it's worth, I had exactly the same issue when sending data to an external hard drive via WiFi. Some of the DBFs had data missing and one ended up with a "This is not a valid table" error.

This was resolved by simply plugging the external hard drive into a USB port.

I'm not an expert, just mentioning my issue, similar to yours, and its resolution to reinforce Mike's observations about WiFi.

Thank you

Steve Williams
VFP9, SP2, Windows 10
 
mjcmkrsr,

Fact is, on only rare occasions do I copy any data back to my NAS in this scenario. If I have any changes to make, they are recorded to paper and later entered via direct table entries using my app after I get home. That's with the exception of a single table, which very occasionally I will copy back to the NAS, using a simple copy/paste (Windows commands), from the portable HD.

Now, remote accessing the NAS sounds interesting. However, I think that would necessitate a WiFi connection. I do have one there... sometimes! Thing is, it's at UC Berkeley, and their guest WiFi is horribly iffy. More often than not it just doesn't work, at least where I'm situated; I don't know why. However, if I did have a reliable WiFi connection, the remote access possibility is intriguing. I'm not familiar with the technologies you mention... WebDAV, WebDrive...
 
Olaf,

Scan through the file looking for LEN(ALLTRIM(ARTIST))=80, and have a look? In the command window:

LOCATE FOR LEN(ALLTRIM(ARTIST))=80
?ARTIST
CONTINUE
?ARTIST

If I see any blank returns, it would indicate a similar problem

Correct?
 
Hi dMusicant,

Fact is, on only rare occasions do I copy any data back to my NAS in this scenario. [...] very occasionally I will copy back to the NAS, using a simple copy/paste (Windows commands), from the portable HD.

That's very risky, since you don't know whether the data has been corrupted during this NAS - HD - Laptop - HD - NAS copy process, and you might copy bad data back to your NAS without being aware of it.

Now, remote accessing the NAS sounds interesting. However, I think that would necessitate a WI-FI connection.

You'll need a stable internet connection (both at your home and your office) - for configuring the NAS please check its user guide.

hth
MarK

p.s. From an earlier thread I remember your NAS is a Synology - is that correct?
 
Yes, I have a Synology DS214play NAS.

I can't remember transferring data from the portable HD back to the NAS except for one single table. That table is relatively unimportant: it's generated by a function acting on the table that's the subject of this thread, so I can always regenerate it. I take that table (which is essentially a report of missing media) to the station, search for the missing media, and alter the table to indicate updated info, e.g. that such and such has been located. Then I take that (altered) table back home and work off it live to update the main table on the NAS. If anything seemed wrong with the table, I'd very likely know. There's little risk: nothing in that table gets transferred to data on the NAS without my direct keyboard entry, editing the main table on the NAS in a browse window.

Now, maybe I'm taking some risk by doing practically everything to my data over WiFi from my local machines to the NAS. That crossed my mind when I had this scare a few days ago. Fact is, I back up once in a while to other external HDs for offsite storage, but I haven't been doing it regularly. I'm thinking maybe I should institute a local backup routine, i.e. something like backing up all critical tables on a weekly basis and keeping those backups for a while, assuming space isn't an issue. I could automate that somewhat. The location of those backups would, I suppose, be the NAS itself (which has mirrored 3 TB disks), or a local external HD.
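
A weekly routine like that could be as small as the sketch below (folder names are placeholders, and the .cdx line assumes the compound index file sits next to the table):

lcStamp  = DTOS(DATE())                        && e.g. "20190513" - one dated folder per run
lcTarget = "\\NAS\backups\" + lcStamp + "\"    && hypothetical backup folder on the NAS
lcSource = "\\NAS\music\"                      && hypothetical folder holding the live tables
MD (lcTarget)
COPY FILE (lcSource + "music.dbf") TO (lcTarget + "music.dbf")
COPY FILE (lcSource + "music.cdx") TO (lcTarget + "music.cdx")
* Note: COPY FILE pulls the data through the machine running VFP, so a cabled
* connection is preferable for this too.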
 
Scan through the file looking for LEN(ALLTRIM(ARTIST))=80, and have a look?

Yes, but not quite.

1. If you have any artist entries with a full 80 characters, you have to verify those individually.
2. But you don't really have to look: no genuine artist name begins with CHR(0), and none begins with a blank either, so checking ASC(artist) = 0 is enough; see the sketch below.
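
In the Command Window that boils down to something like this (the path and alias are placeholders; ARTIST is the field from this thread):

USE E:\backup\music.dbf ALIAS chk     && hypothetical path to the copied table
COUNT FOR ASC(artist) = 0 TO lnBad    && ASC() of a CHR(0)-filled field is 0
? lnBad, "suspicious records"
LOCATE FOR ASC(artist) = 0
IF FOUND()
   ? "First suspicious record:", RECNO()
ENDIF
USE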

Bye, Olaf.

Olaf Doschke Software Engineering
 
As far as I follow the discussion, you also rarely copy something back to the NAS. In general, if file copies through WiFi are a problem, that speaks for an Ethernet cable. Anyway, the health check you do can also be a checksum of the file, and you should obviously avoid overwriting important, non-recoverable files: first copy into an "inbox" directory, check the file's validity, and only then move it to its destination by renaming it. If you did that last step by copying, you would go through WiFi twice more (reading the file in and writing it back out), which doubles the risk of the same problem occurring again.
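
One way to do that checksum from VFP 9 (SYS(2007) gained a 32-bit CRC option there), assuming both files are reachable from the machine doing the check; the paths are placeholders, and note that hashing the NAS original means reading it over the network once more:

lcSource = "\\NAS\music\music.dbf"              && hypothetical UNC path to the original
lcInbox  = "E:\inbox\music.dbf"                 && hypothetical path to the fresh copy
lcCrc1 = SYS(2007, FILETOSTR(lcSource), 0, 1)   && 32-bit CRC of the original
lcCrc2 = SYS(2007, FILETOSTR(lcInbox), 0, 1)    && 32-bit CRC of the copy
IF lcCrc1 == lcCrc2
   ? "Checksums match - safe to rename the copy into place"
ELSE
   ? "Checksums differ - repeat the copy"
ENDIF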

Synology is usually fine hardware. Did you put in the drives Synology ships for its NAS units, or did you pick something else?

Does the NAS get hot? If your drives are permanently hot, you will also get file problems; that points to the wrong drives being used in the NAS, or to insufficient cooling. The days of ordinary platter drives are almost over; maybe you even have SSDs in there, but those can also get too hot.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Actually, I do often copy files to the NAS. One of my two main applications deals in temporary files until processing is complete. At a certain point in the prg I'm asked this question:

@ 13,25 Say 'Do you want to update the permanent files now? Y/N'

This is almost always done via WiFi. However, the temporary files are not local; they reside on the NAS. After the permanent files are updated from the temporary files, the temporary files are zapped. Working in temporary files is one strategy taught by the teacher who introduced me to FoxPro in a class. It was FoxPro 2.0 for DOS in the mid-1990s. It was then that this application was spawned... it was my class project (everyone had to work up a project as part of the class requirements). Fortunately, mine was a project that has continued to be useful to me. I've built it out greatly, but the basic initial structure and core functions remain, though augmented considerably. It's basically all procedural code with some SELECT commands.
 
If file copying didn't work at all, WiFi would fail so miserably that you wouldn't just have sparse errors. Still, judging from the kind of block errors you have and everything I've heard or read about DBF file corruption, network errors are more likely to lead to such bursts of zero bytes than HDD errors. That's all I can say.

TEMP files should be on a local drive, not on a NAS. You want performance, and you don't get that through the bottleneck of Ethernet. For the same reason, GETENV("TEMP") also points to a C: drive directory (in your Windows account's user profile).
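
Two quick checks in the Command Window show where temporary work currently goes (both are standard VFP functions):

? GETENV("TEMP")   && the Windows per-user temp folder, normally on the local C: drive
? SYS(2023)        && the path VFP itself uses for its temporary work files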

I work without any temp DBFs; I only use the permanent backend and cursors. So you learned a very old technique that isn't in general use anymore. You won't be able to change that without rewriting code, and in itself it may contribute to the problem because, of course, COPY FILE reads the file blockwise into the RAM of the process and then writes it back out to the target folder and file(s). The same applies to shell copy commands: you don't instruct the NAS to copy the file itself, though in principle it could, as it is not just a network-attached drive but a whole Linux-based system in itself.

I don't want to suggest replacing the copying mechanism with something you implement in Linux on the NAS itself. Just be aware that if you have such file problems, they will keep happening, even if only infrequently: when every block you copy has a low chance, say 0.001%, of being swapped out with zero bytes, the transmission is still recognized as having worked, it just transferred those zero bytes. The other typical problem, failing HDD sectors, usually doesn't go undetected. SSDs even plan for a contingent of replacement memory cells, called overprovisioning, and to detect when it's necessary to swap out a failing block, every write is read back internally and verified. That doesn't happen with network packets, as it would cut the transfer speed in half.

Cable transmission is always much more reliable.

So, if you want reliability, introduce cabling; if you want to avoid that, avoid as many transfers through WiFi as you can. For files already on the NAS that you want to move around on the NAS, renaming works without copying the file content: you really just send over the command to change the table of contents (a tree structure of directory and file nodes, with the file nodes pointing to the starting sectors of the files). Renaming has just one drawback: unlike copying, you can't overwrite existing files, so you either rename the old ones into some archive folder (as a backup you can revert to) or you delete them first, of course.
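
A minimal sketch of that archive-and-rename shuffle, with hypothetical paths on the NAS share:

* Archive the current file first (RENAME won't overwrite an existing target),
* then move the verified copy from the inbox into place. Only directory entries
* change; no file content travels over WiFi again.
RENAME \\NAS\music\music.dbf TO \\NAS\music\archive\music_old.dbf
RENAME \\NAS\music\inbox\music.dbf TO \\NAS\music\music.dbf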

One advantage of working with TEMP files, of course, is that it leaves the untouched permanent backend DBFs as an archive of the data as it was before your application session. Fine, you can also do that inversely: use your "TEMP" files as the permanent storage and treat your previous backend as the backup. Then don't just overwrite it; make the copies into new, empty folders. Anything you detect went wrong can then a) be repeated, and b) you still have older versions of the data.

That may be oversimplifying your situation if your TEMP files are not a 1:1 copy of all the permanent backend files. But your technique only works as a single-user solution anyway. If multiple users each worked in temp files, the last one to end his application session and overwrite the central backend would wipe out all the edits the other users made. That is not how multi-user applications work; there's a whole chapter in the VFP help on programming for shared access.

The only session-specific data in such multi-user applications typically is the buffered changes in cursors or view cursors, before you update the backend with conflict checks (against concurrent changes made by other users). With a backend connected only by WiFi, that always failed in the user environments I had to deal with, on the order of 500 users working on lab notebooks, in parallel, really. So the company ended up introducing notebook docking stations connected with cables.

The fewer users you have, the less likely errors are to occur, but they do come up. It doesn't matter how rarely a WiFi packet fails unnoticed (the protocol works with checksums, too); when it happens, you do have data loss.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Just to highlight the history of the application with the lab notebooks I mentioned earlier.

Before I was even the maintainer of this application there was a central PC version (Windows) and a lab notebook version (DOS). Data transfer was via null modem cables. 3.5" diskettes were introduced when the serial transfer became too slow and serial ports became unavailable, until the data packages to transfer became more complex and of course larger, too large for 1.44 MB disks even zipped. As a last step, USB thumb drives were used.

The major strategy about the data always was - at least since the DOS app was also swapped out for a Windows version - having a complete backend with just the formulas and test data (data about lab tests) necessary for that lab notebook. Data retransfer worked the same way and was merged into the central database as necessary; e.g. for weighing data from electronic scales and other test data (let me oversimplify here), only one or two columns were updated in records of the PC backend.

Later, as WiFi was introduced, the thought came up to stop using local backends. And it failed miserably. Another department tried again a few years later and failed again. And in a third attempt, when the data was on SQL Server and the WiFi connection was merely about the connectivity between the ODBC driver and the SQL Server service, it failed yet again, for other reasons: WiFi access points on the different floors caused notebooks to switch the access point they connect to, and then SQL Server connections fail due to changing SPIDs for the same SQL Server connection handle, which is then rejected.

With that experience, to me WiFi is merely a connection you use for your smartphone and tablets for internet connectivity, not a LAN connection.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Hi,

Yes, I have a Synology DS214play NAS.

1) In order to remotely access your NAS from your office, a simplified scheme would look like this:

NAS (and other devices) - Modem/router (home) - Internet - Modem/Router (office) - OfficeLaptop and other devices

2) Your NAS can definitely be configured as a WebDAV server, over HTTP or HTTPS or both (see under Package Center).
3) In addition you'll need some software, e.g. WebDrive, installed on your OfficeLaptop to allow you to access your data on the NAS at home (the folder with the VFP data as a mapped network drive).

Since this isn't a VFP issue I'll stop here.

hth
MarK

 
Mark has a big point here, in that you could simplify the file transfer by opening your Synology for remote access. He surely knows more about it than I do. It compares to using Dropbox or Google Drive or similar things, except that it is all kept in your own hands, without a third party.

It's still not a LAN equivalent: WebDAV is essentially a better FTP; it works through HTTP and thus gets around some companies' policies on FTP ports, for example.

What it would end is your need to go through a portable drive at all: you sync your data. It's not a good idea to do that on a live backend, even if that is just your main network share you work on at home.

WebDAV, just like FTP, does no automatic syncing; you still need to connect and download files, but it all goes through the internet. In that way it doesn't compare to Dropbox, and that's good, because a backend put into a Dropbox folder means that with every single record added or changed, Dropbox syncs the whole file, even for small DBFs. I know there's more to Dropbox, there is an API, but that leads off-topic.

What WebDAV would be able to cover is the use case where you work on some data at home with laptop and NAS on home-office days or in the evenings; the next day in the office, the first thing you do is get the latest data onto your office NAS via WebDAV, then work with the same setup; going home, you again get the latest data via WebDAV before continuing work. This is a single-user scenario with the data kept outside your notebook, maybe for reasons of confidentiality and, of course, the practicality of using the NAS backend with a few users.

Otherwise, of course, a BitLocker-encrypted drive in your notebook would solve the problem, as you would never need any data transfer. The NAS would then just become a backup archive device. But multi-user applications are not possible with local data, obviously.

If I understand your situation correctly, you will always risk the data transfer problems you encountered; you may be able to reduce that risk via cable with your usual procedures, or by going the route through the internet, assuming your NAS is connected via Ethernet to your cable/DSL modem at home, for example, and likewise at the office.

The other cause of file corruption is still using the DBFs via WiFi while you work in your application. A TEMP dir on the NAS helps in that you still have the data as it was before making that copy, but of course the moment you copy it back, that backup or archive of the previous data is lost. So this only solves half the problem: if something gets corrupted during the day and you detect it via errors, you at least know you still have good copies.

Backup strategies would be a topic for data stability anyway, on top of all that.

In any case, you're far from a client/server and distributed-data situation: you don't replicate data, you don't sync data, you move data, and you will need restrictions on who works where and when with the data, so that you never need to merge changes multiple users make at separate data locations.

If you went for a real client/server scenario, of course, you'd need a database available to you from wherever you work, and one simple way would be to stop all these multiple-location scenarios: have one central backend and work either in the office or via TeamViewer on computers you leave on all the time.

That's the easiest way to avoid changing to either more intelligent and stable data syncing, or rewriting the code for real client/server with a database like MySQL hosted on the internet.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Last comment: your usage also reminds me of the shared repositories development teams sometimes used in the old days, before GitHub made it easier to work concurrently on code.

Something like the code repository VFP's IDE can easily integrate with: SourceSafe.

You check out files, making them exclusive to you and forbidding others to change them while you have ownership. Then you check them back in when you're finished, allowing another developer to check them out. You haven't reported the problem of two users overwriting each other's data changes; I just mention it, as it's an easy way to extend single-user software to multiple, but never concurrent, users.

It's still the way files are shared on a SharePoint server, so this, in general, is about collaboration tools. You could manage your data files the way such tools manage code or documents.

That doesn't solve the problem of such bad file blocks; you can only avoid WiFi and/or double-check that any file you just copied really is an exact copy.

Bye, Olaf.

Olaf Doschke Software Engineering
 