
Microsoft Visual FoxPro (Not responding)


dmusicant
Yes, I get this not infrequently. My machines run Visual FoxPro locally, the development environment. My apps are not compiled into EXEs or APPs. Well, even if they are, I don't run those; instead I run a top-level PRG that gets things going.

My data is on my network, on a Synology NAS which reboots once a day at 3 AM. I initiated the daily NAS reboot in an effort to alleviate this vexing problem. Usually my apps are very quick, but for some reason VFP will occasionally get dog-slow. Just opening it can take a minute or more even though no table is open yet. Then, when my app tries to open a table, it can take several additional minutes. There is nothing else going on at that point in the proceedings; in fact, it may just be opening a screen and a table.

I have taken pains to close all tables when putting any of my machines to sleep, thinking this might help eliminate the problem. However, that hasn't stopped it.

Yesterday I did a search and found an MSDN support thread that seemed to deal with a similar issue.

In that thread, Craig Berntson discussed some possibilities, including the possibility of SMB2 issues. I have no idea what that would entail (he said those issues were discussed in other threads there).

Yesterday, after seeing Craig's advice, I deleted my temporary files. That was followed by snappy VFP action, but today it evidently didn't help at all. Nor did a reboot of the machine on which I've been seeing the problem for the last week or two, a wifi-connected Windows 10 64-bit machine. I think I've seen the problem on several of my machines, including another Windows 10 machine that is wifi-connected but 32-bit, and also a Windows XP machine that's ethernet-connected.

I am really at my wits' end on this one. Last week I assigned IP addresses to my machines manually instead of automatically in my router config, but the problem returned a few days later. Rebooting the Wireless N router hasn't resolved the problem, nor has rebooting the NAS or the client machine.
 
You're having VFP open for days, months, years?

Any long-running processes I wrote had some flaws, which can be mitigated, but the simplest solution is to restart from time to time.

The first encounter I had with non-responsiveness was in a data migration application. It was a huge migration from a legacy LIMS (lab information management system), not just a few tables: about 20 years of collected data were migrated into a redesigned, normalised, and improved version of the data, reimplemented with VFP7 at that time.

Details don't matter much; all such migrations are alike: you query data from the legacy database in the form of the new database and then insert it, append it, whatever. You do some data cleansing on the way, some transformations, etc.

Coming to the core point: when you do things like shoveling data from one place to another, your EXE process (running within the IDE doesn't change that) is very unresponsive during all queries, and since it runs endless sequences of queries without any user interaction in between, it also doesn't matter how fast each single query runs. The core process thread is simply busy and doesn't react when the system pings the process with an "are you still alive?" message. The VFP runtime handles this, maybe even the C runtime the VFP runtime uses handles it, but surely not your own code, so all you can do to not hinder that response from the VFP/C runtimes is have DOEVENTS FORCE lines happening from time to time.
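For illustration, a minimal sketch of that idea; the table and field names are made up, not from any real app. The point is only the periodic DOEVENTS FORCE, which lets VFP answer Windows' responsiveness probes during a long batch run:

Code:
* Hypothetical migration loop; "legacy_orders" and "new_orders" are
* assumed names for illustration only.
lnRow = 0
SELECT legacy_orders
SCAN
    * ... transform fields, then append to the target table ...
    INSERT INTO new_orders (id, custname) ;
        VALUES (legacy_orders.id, UPPER(legacy_orders.custname))
    lnRow = lnRow + 1
    IF lnRow % 1000 = 0
        DOEVENTS FORCE  && yield so Windows' "still alive?" ping is answered
    ENDIF
ENDSCAN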

There are several things involved:
1. File handles. Often enough, even with all DBFs closed by USE, all low-level files opened by FCREATE or FOPEN closed by FCLOSE, all TEXT merges to files finished, all locks unlocked, and all coverage files closed, some file handles are only finally released when you exit VFP.
2. Memory allocations. All you can do is manually trigger garbage collection via SYS(1104) (see the sketch after this list).
3. The event queue, not only of your process but also of any OLE object you use, will sometimes be crammed with not-yet-processed events locked up by your process, because you never get back to your idle state, READ EVENTS.
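To make point 2 concrete, a minimal sketch of such a manual cleanup step, meant to be called between batch steps; the procedure name is my own, not a VFP built-in:

Code:
* Hypothetical helper: garbage collection plus message handling.
PROCEDURE BreatheAndClean
    = SYS(1104)       && purge VFP's memory cache (manual garbage collection)
    DOEVENTS FORCE    && drain pending Windows/OLE events from the queue
ENDPROC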

Within this data migration, the effect I had was: even after having implemented manual garbage collection and DoEvents, and even after the migration process ended and I did a CLEAR ALL EXTENDED, the IDE behaved sluggishly.

There are some more minor things, and hints on tricks like temporarily resetting SYS(3050) to a low memory value and back to how you set it at start, to enforce a purge of anything stuck in that memory. But all in all, a fresh start from time to time is all that helps to recover from an unresponsive state; the only thing that really cleans up the system is a restart.
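For what it's worth, a sketch of that SYS(3050) reset trick; the temporary size is an arbitrary assumption:

Code:
* Shrink the foreground buffer memory and restore it, forcing a purge.
lcOldSize = SYS(3050, 1)         && current foreground buffer size (string)
= SYS(3050, 1, 1024 * 1024)      && temporarily down to about 1 MB
= SYS(3050, 1, VAL(lcOldSize))   && back to the size set at start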

VFP is not meant to run this way, like a server process.


There's one thing related to that: if you want to run a VFP EXE as a system service, started at system startup and only reset when the server restarts, really meant to run forever on the server it runs on, like the MSSQLSERVER service, you really have to implement something for that Windows message testing whether your process is still responsive. That's even more complicated for a service: you have to react to the SERVICE_INTERROGATE probing message from the service control manager.

In the end, if whatever you run is not a service, your simplest solution is to restart from time to time. If you want that automated, don't try to fiddle with restarting yourself by quitting from within your code; either use a secondary EXE doing that, or let your EXE quit after 24 hours and let a daily Task Scheduler job restart it.
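If you go the 24-hour route, a minimal sketch of how the EXE could end itself so a Task Scheduler job can restart it; the class and property names are my own:

Code:
* Hypothetical lifetime timer: ends the event loop after 24 hours.
DEFINE CLASS LifetimeTimer AS Timer
    Interval = 60000          && check once per minute
    tStarted = .NULL.
    PROCEDURE Init
        THIS.tStarted = DATETIME()
    ENDPROC
    PROCEDURE Timer
        IF DATETIME() - THIS.tStarted >= 24 * 3600
            CLEAR EVENTS      && leaves READ EVENTS; the main PRG then ends
        ENDIF
    ENDPROC
ENDDEFINE

* In the main PRG:
* oLifetime = CREATEOBJECT("LifetimeTimer")
* READ EVENTS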

Bye, Olaf.


 
Thank you Olaf for your speedy reply. I have not read the whole thing but I absolutely will right now, you bet! But I want to reply here immediately that your assumption that I'm not restarting VFP for days/weeks/months is totally wrong. In fact, in an effort to stop this from happening I have gotten into the habit of exiting VFP before putting each machine to sleep, thinking this should eliminate any confusion on the part of the server (NAS) concerning the state of tables/files in my FoxPro data. I have no reason to believe that that has helped. I very often have to restart VFP anyway, because of weird things happening. My screens are all FPW 2.6 designed and generated. I will now read your post entirely and carefully. Again, thank you!

Edit: BTW, I have certainly tried to work around having to wait sometimes up to 5 minutes or more for VFP to become responsive again by shutting down the VFP process in Task Manager and then restarting VFP 9.0. I have done that over 100 times easily, I imagine. Very often it doesn't seem to have a beneficial effect, and I again see a blank screen and (Not responding) after I restart VFP 9.0. My apps/data are my life's blood, and this can be very aggravating. Even restarting the machine and starting VFP 9.0 from scratch doesn't always (maybe ever, I'm not sure) eliminate the long waits.

The great majority of the time, as I said, my apps are snappy and the problem is not evident. Then suddenly, for no apparent reason, I'll see wait states of pretty much any length of time, even up to and over 5 minutes, and I feel like tearing my hair out! :)

I continue to think that there is a solution. A contributor here was of the opinion that it's caused by my network being improperly set up. I have tried and retried to approach the problem from that perspective, but nothing I have done has eliminated the problem.
 
dmusicant said:
I have done that over 100 times, I imagine. Very often it doesn't seem to have a beneficial effect and I again see a blank screen and (Not responding).
Well, if you execute any long-running query or tight loop, you get that behavior "for free".

Just do
Code:
DO WHILE .T.   && endless tight loop: VFP never yields, so Windows flags "Not responding"
ENDDO

You have to avoid such things; partially you won't be able to, but this tight loop causes no harm if you extend it as
Code:
DO WHILE .T.
    DOEVENTS FORCE   && yields each pass, so the process stays responsive
ENDDO

Long-running queries can be seen as loops: they iterate over table rows, often over multiple tables, sometimes over files gigabytes in size. That's when VFP is unresponsive even from the start and on the first run; you can't do anything about that except process data in smaller chunks.
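A sketch of the smaller-chunks approach; the table name "bigsource" and its integer key "id" are assumptions for illustration:

Code:
* Process a large table in bounded slices, yielding between queries.
SELECT MAX(id) AS maxid FROM bigsource INTO ARRAY laMax
lnChunk = 50000
lnFrom  = 0
DO WHILE lnFrom < laMax(1)
    SELECT * FROM bigsource ;
        WHERE id > lnFrom AND id <= lnFrom + lnChunk ;
        INTO CURSOR curChunk
    * ... transform/append curChunk into the target here ...
    lnFrom = lnFrom + lnChunk
    DOEVENTS FORCE    && stay responsive between chunks
ENDDO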

Bye, Olaf.
 
There shouldn't be anything going on in terms of long-running queries, or queries at all, when this is happening. I have some queries in my apps, but they are rudimentary, on small, well-indexed tables, and they have nothing to do with what's going on here. I'm not running any code that uses SELECTs when this is happening.

One of my two favorite apps opens a table with a filter, i.e. limits the records shown in a browse window to those satisfying a filter condition. However, I'm usually seeing these delays before I'm able to establish the filter conditions, so there shouldn't be any crunching going on. Before I even establish which table I'm interested in, there is only one table that needs to be opened: a very simple table with maybe 15 records. That table is metadata; its records contain info about other tables of metadata. Those secondary metadata tables contain info about my many tables of actual useful data. Those tables are often not even in play here, because I can't get to the point of trying to open them. I hope that isn't confusing.
 
It doesn't only happen with queries; I already showed that. It also doesn't need to be a single loop. Anything that keeps VFP busy for a longer period before going back to READ EVENTS causes the non-responsive behaviour.

You can measure that via coverage logging, as that will typically log all lines running in split seconds before you go back to READ EVENTS.
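In case it helps, coverage logging is switched on like this; the log path is just an example:

Code:
SET COVERAGE TO c:\temp\slowrun.log ADDITIVE
* ... reproduce the slow operation here ...
SET COVERAGE TO    && stop logging; each log line starts with its execution time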

As you say, you have FPW code. All I know about it is that it has other READ cycles quite similar to what VFP does with READ EVENTS. I'm not even sure the interrogation messages Windows sends to processes to judge how responsive they are will be answered by legacy FoxPro code, even if you run it in VFP.

The other aspect might be what I think you've already tested in several ways: LAN performance. If you have both cable LAN and wifi, the wifi might still disturb the stable data flow you can have with cable LAN. On MS patch day I can regularly observe a big corporate LAN collapsing. Maybe, just maybe, a repeater could help; that depends on the geometry of your house/flat. But I'd turn off wifi on the PC and only leave it on at the router for your smartphone connection; otherwise, work purely on cable LAN.

Bye, Olaf.
 
Well, I'm trying to open a table that's on the NAS, and it's been around 15 minutes and it's still "Not Responding." I'm going to try rebooting the NAS. Accessing the data I want to see in that table (it's a memo field) normally takes under 10 seconds.
 
Uh huh. After rebooting the NAS I am back to quick access. That proves nothing; however, I have a notion that the problem involves the server's supplying of the tables to VFP. I'd checked and made sure that VFP was not open on any of my other machines.
 
How hot do the HDDs get within the NAS casing? I ask because I once bought very fast but also very high-RPM HDDs and could only run that NAS for a few hours at a time, just good enough for making backups. That was good enough for me, but it decides whether a NAS is good enough to act as a file server or just as a better external backup drive. Once drives get hot they don't stop working, but performance degrades and they age faster when run at these high temperatures in a casing that doesn't provide good enough cooling. I had a blessing in disguise, as the casing I used had a temperature alarm and automatic shutdown; otherwise that might have gone unnoticed and I wouldn't be running that NAS anymore. I recently bought a much simpler yet faster and higher-capacity 3.5" external drive and a Blu-ray M-disc burner as a combination for live and long-term backups, but I don't use a NAS for file serving anyway. A good NAS should be usable as a Linux/Samba file server and also provide 1-10 Gbit/s LAN bandwidth. If I look back a bit within the forum search, I think I'll find a discussion about the eligibility of a NAS as a DBF backend server.

Bye, Olaf.



 
Accessing the data I want to see in that table (it's a memo field) normally takes under 10 seconds.

You don't say what exactly you refer to. A query to find a certain record via the $ operator or a LIKE clause, for example, can surely take almost any time span depending on the amount of data; once a result is found, though, you should be able to get at the memo field value with a double-click in a browse window in a split second.

Again, it depends on what you're doing; a query on the field itself would mean it is already loaded and cached to check for a LIKE condition, for example. Technically, the browse cell of a memo field, as you know, displays "memo" for empty memo fields and "Memo" for memo fields with content, which is decided simply by the 4-byte value the DBF file stores for the memo, which is merely an offset address to load from the FPT file. So technically, loading memo content by double-clicking on "Memo" means opening the FPT file and seeking to that offset, where a metadata structure tells the length of the data, followed by the data itself. Unless the NAS is slow or bandwidth to the NAS is very low, that doesn't need much time. Opening an FPT may cause some small validation checks, quite like TABLEVALIDATE does for DBF opening, but otherwise a SEEK to an offset address already known from the DBF file doesn't mean reading all those bytes; it's merely positioning to the file block.
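Purely to illustrate that layout (read-only, and the file name is hypothetical), the FPT header can be peeked at with VFP's own low-level file functions:

Code:
* Peek at an FPT memo file header; "mytable.fpt" is a made-up name.
lnH = FOPEN("mytable.fpt")                && default mode: read-only
= FSEEK(lnH, 6)                           && header bytes 6-7: block size
lcRaw = FREAD(lnH, 2)
lnBlockSize = ASC(SUBSTR(lcRaw, 1, 1)) * 256 + ASC(SUBSTR(lcRaw, 2, 1))
* The 4-byte memo value in the DBF is a block number; the memo content
* starts at block number * block size, after an 8-byte block header
* (type and length).
= FCLOSE(lnH)
? "FPT block size:", lnBlockSize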

After rebooting the NAS I am back to quick access. That proves nothing
Well, that proves a lot, indeed. If it is always the case, there is some reason for slow file access, and it's not because some memo fields need multiple rounds of acquiring the memo byte position within the FPT file; that doesn't even happen through the memo bloat effect. Memo bloat simply means some offsets within the FPT file hold values which are no longer referenced from the DBF file; bytes within the FPT are not deallocated, just like deleted rows of a DBF are not removed from the file right away. So any such delay doesn't point to a VFP failure or weakness, but to a hardware weakness. The most obvious suspect is the drive itself; then of course the OS, caching memory, and the network could be involved and optimised. But you can only cache what has been read at least once, so the essential timing is always the first read, without cache acceleration. If that takes long, something is wrong.
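As a side note on that bloat: VFP can remove those unreferenced FPT blocks by packing the memo file. A quick sketch; the table name is assumed, and PACK needs exclusive access:

Code:
USE mytable EXCLUSIVE   && hypothetical table; exclusive access required
PACK MEMO               && rewrites the FPT without unreferenced blocks
USE                     && close the table again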

I could perhaps imagine some authentication process giving or blocking access to the NAS getting in the way, but such things typically happen once and aren't re-checked for every single file block access; you have a valid file handle, so you can access the file. Multi-user file locking could cause something like no response, but do you RLOCK/FLOCK anything?

Bye, Olaf.
 
I'm not using RLOCK/FLOCK anywhere in my code. In fact, in trying to stop this from happening I've pretty much been in the habit of just shutting down VFP when putting any of my machines to sleep or when I'm done using VFP. So, no other machines have VFP running when this is happening on a particular machine. I've checked that, I checked it yesterday when I had a 15+ minute delay.

I called Synology (my NAS) support yesterday, really for another issue but I got a guy on the phone who seemed pretty heads up, so I ran this by him. He said I should submit a support ticket and I did, from the machine with which I'd just experienced the issue -- 15+ minute wait to gain access to data (a Not responding situation). I got this message in reply, which is quite interesting, I think, for more than one reason. It implies (to me) that there could be a problem with the CDX for a VFP file. I will wait for the problem to occur again and then BEFORE REBOOTING THE MACHINE, do as the email suggests.
- - - -
My message to Synology support in my request:
- -
I'm using Microsoft Visual FoxPro 9.0 to access data kept on my
DS214play, which is set up with both HDDs in RAID1. My machines are wifi
or ethernet connected. I think I also have this issue with ethernet, but
usually am using my wifi connected machines, which have the data drive
mapped to a drive letter in Windows 10.

Usually the performance is quick and I have access to my data in a few
seconds (i.e. 5 or 10 seconds). Things can go well for a few days and
then without warning it will take a lot longer to gain access to the
data. For example, this morning it took over 15 minutes and I still
didn't have access. I rebooted the NAS and the problem was resolved this
time, but doing that doesn't always solve the problem.
- -
The email from Synology in answer later in the day:

Hello,

I understand you're having issues with intermittent poor performance with SMB.

I took a look at the debug logs and I believe this may be related:

../source3/smbd/oplock.c:714: [2017/07/26 11:09:50.062680, locking 0, pid=19195] oplock_timeout_handler
Oplock break failed for file Data/Ans/Metadata/invest.CDX -- replying anyway

Basically, oplocks are when a NAS detects that something is accessing a file, so it refuses to let other programs/users modify the same file to prevent data corruption. Unfortunately, since the logs were generated after a reboot, I'm not able to verify if this is the case since any locks would have been wiped out by the reboot.

When the issue occurs, are you able to access DSM normally? If so, what I'd like you to do is to go ahead and wait for the issue to reoccur, then before rebooting download a debug log using the instructions below and send it back as an attachment.

1. Log into DSM/SRM as the default 'admin' account.
2. Go to Main Menu - Support Center - Support Services and click Generate Logs.
2b. If on DSM 6.x, please do not enable "Log more detailed information for diagnosis" in Support Center.
2c. Please only check the "System" box in Log Generation unless otherwise specified.
3. Please download and save the debug.dat file to a location on your computer.
4. Attach the debug.dat file to your reply via email. Please do not put this file into another compressed file and simply attach the debug.dat file as is.

In the meantime, here's a few basic things you can try to at least work around the issue (assuming that you are indeed encountering file locking issues):

Turn off Preview pane in Windows Files Explorer. See this article on our website for more information.
Turn off Preview in Mac Finder: View menu > Hide Preview
Make sure the document was closed properly earlier, and there is no temporary file in the same folder.
Open the application (like Excel) first, and open the document from the File menu in the app, rather than double-clicking on the document directly to open it.

If you have any further questions, please feel free to reply to this email.
 
Oplocks are a known source of CDX corruption, at least if both client and server are Windows-based. Synology uses Linux, AFAIK, but for serving files it supports the SMB protocol, including oplocks. The usual solution is to turn oplocks off, but see my post at 26 Jul 17 08:13 in thread184-1779306.

I'm no Linux pro, but I have the faint hope that Samba (which usually is the Linux daemon for the SMB protocol) can be configured to not allow oplocks, unlike Windows servers, which force them upon you in SMB versions 2 and 3.
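If the Synology exposes its Samba configuration, the relevant smb.conf switches would look roughly like this; the share name, and whether DSM lets you set these at all, are assumptions on my part:

Code:
[YourShare]
   ; do not grant opportunistic locks on this share
   oplocks = no
   ; also disable read-only (level 2) oplocks
   level2 oplocks = no
   ; and don't let the kernel hand out oplocks either
   kernel oplocks = no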

Oplocks are in fact only granted as long as one client is the only, exclusive client accessing a file. When they are granted, they usually also cause read-ahead caching, and thus a whole DBF/CDX file is given into the hands of the client, which then has faster local access until any other client demands access to the file. Oplocks are not there to prevent corruption; they are there to offer a single client the opportunity to work faster with a file, without LAN involvement, after a block of the file is read. The server gives the client with an oplock the responsibility for the file, and only takes that responsibility back once further clients request file access.

The break of an oplock is the action that is finally done to prevent corruption: it prevents different clients from reading the out-of-sync server version of the file and, with modifications rooted in an older file version, causing corruption in turn. Not being granted an oplock should in itself not hinder file access, especially when only unchanged file blocks are to be accessed. But it means an oplock granted to another client needs to be broken: that other client has to write back any locally cached changes (held due to its oplock on the file), and then you can access the file. That shouldn't take 15 minutes.

So, did you use the same DBF on other clients? Are they powered off? Maybe something isn't working correctly in file management, as of course any correctly shut-down client would commit its changes to files and release any oplocks it held. But what seems to be happening is that oplocks remain which should already be gone, especially if you know your client is the only one accessing the files.

Bye, Olaf.

Edit: As oplocks, like normal file handles, locks, or any other allocation aspects of files, are reset after a NAS restart, this must always resolve access problems.

dmusicant said:
but doing that doesn't always solve the problem.
The only way I can take that seriously is: even though you reset the NAS, the same unfortunate oplock situation establishes itself very quickly and repeats. Or there is yet another problem that hasn't been uncovered.

You can read up on how, in some situations, even one misconfigured, outdated, or misbehaving network card in a whole LAN can cause havoc in such a system. Oplocks are for systems with a low number of concurrent accesses; database systems don't really profit from them. You can run tests with two clients, and yes, performance will drop drastically without oplocks, but actually the higher performance of single-user access is the lie. It's a nice idea that the very common multiple writes to the same block of a file are kept at one client, but oplocks don't just prevent double, triple, or more writes by combining them into one; they keep all changes at the client even when there is no load on the server that would hinder it from starting to write back and commit changes, as write-through caching would do. So when another client requests file access, even just read access, that's when you get your unbearable waiting times, as the server was too lazy to take in the oplock-holding client's earlier, long-since committed changes.



 