Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Live Shared REINDEX

Status
Not open for further replies.

Rick C. Hodgin

Programmer
Oct 15, 2021
172
US
We have customers that have bad networking software / settings. We've worked with their IT to try and get them squared away, but some customers just won't address it or can't seem to get it fixed. As such, we run into missing records not in index (but they are in the table) on a regular basis.

I've been coding a live reindex feature that reindexes the table while it's in shared use. It works, but when I was almost finished it occurred to me there's a VFP9 solution.

If your system doesn't use the DELETED() flag / condition on any records, but instead uses something like lDeleted, then you can create your index tags with an added FOR clause, which is FOR NOT DELETED(). This will include every record by default, and give you the live reindex ability.

If a record is ever not in index, which can be detected through a variety of ways, then it can be put back into the index by going to that record with no active index set (SET ORDER TO), and then issuing DELETE, and then issuing RECALL.

It seems to work well on single-records.

On large-scale DELETE and RECALL operations it also works, even in a SHARED manner, but it is slow and produces a CDX that is much larger than the original. In one test with 10M records, the table was 48 MB, the index 49 MB before the DELETE and RECALL. After, the index had grown to 115 MB. So it's not a wide solution for everything, but single point failures it might be doable.

--
Rick C. Hodgin
 
Hello Rick,

nice. I see how using the FOR NOT DELETED() filter of an index would help. And it does what many assume is the case anyway, that deletion of a record removes it from index tags, too. This also enables things like reusing IDs, which is a two edged sword, as you may want to prevent that. Well, you can make the FOR NOT DELETED() decision per index, so you can decide per use case.

Rick C. Hodgin said:
we run into missing records not in index (but they are in the table) on a regular basis.
I had cdx corruptions, but I only had missing records in IDX, where they are easily explained by your responsibility to open an index when data is changed.

I guess DELETE and RECALL is causing quite the same for an index as two UPDATES changing a field and then go back to the original value, or not.
I checked how a RECALL is interpreted by VFPs triggers: It's interpreted as an INSERT.

So, if you have such a defect could you test whether two UPDATEs of a record missing in an index put it back, too? Like

Code:
original = alias.field
REPLACE field with dummyvalue
REPLACE field with original

I'd just like to know whether that could cause less bloat in a single point failure fix, because it only affects the tags related to that specific field and not all.

Chriss
 
The issue arises due to network errors related to caching. When we disable network caching it significantly helps, but still doesn't resolve the issue at all of our customer sites.

This is the first solution we've had that actually allows a fix while the users are in the data. The live reindex algorithm I wrote accomplishes the same thing, but I think the VFP9-only solution is cleaner.

The design: Use an external app to constantly monitor the DBF and verify things that were written remain in index. If any are identified as not being in index, send a message indicating the table, tag, and RECNO() to the main app. That app goes in and uses an algorithm written to DELETE the record without using VFP's own DELETE command (can be done with FOPEN(), FSEEK(), FWRITE()). It then USEs the table, and uses VFP's normal RECALL command on that RECNO() to re-insert it back into the index.

In addition, if the system design knows that some parent / child relationships should exist, if it ever doesn't find the child that it expects to be there, it can force the scanner to process that table on-the-fly and do the immediate recovery if it's missing.

It's actually a clean, elegant solution for single-point index failures.

--
Rick C. Hodgin
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top