Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Using Related Tables 2

Status
Not open for further replies.

linousa

IS-IT--Management
Mar 8, 2013
79
US
Have five related tables with one-to-one and one-to-many relationships, is it better to unite one-to-one ones together for cleaning and performance purposes? right now, it is a little bit confusing, having quite a few commands done with "tableX.fieldX" addressing. Any good resources(samples, books, video, faq, etc.) about using related tables and uniting/merging tables?

Thank you.
 
Using Related Tables in VFP covers a LOT of topics.
Therefore it is difficult to point you to one or two references to assist you.

You would be better with a more specific question regarding the use of Related tables and we would be better able to assist you.

While it would not cover all possible uses, you might want to look at the free online (or downloadable) VFP tutorial videos at:
One of them is described as:
Building a Simple Application - Pt. 5

In this video we continue to develop the screens (forms) for our Issue Tracking application. Highlights of this video are creating temporary relationships between two tables in the data environment, using expressions in grids for display formatting and binding a combo box to a lookup table.​

Good Luck,
JRB-Bldr
 
it is a little bit confusing, having quite a few commands done with "tableX.fieldX" addressing.

What you refer to as "tableX.fieldX" is not directly connected to related tables. It is simply a way of referencing a particular field in a particular table. You use it when you have more than one table open, and you want to reference a field in a table other than the one in the current work area. Some developers also consider it good practice to use this notation at all times, regardless of work areas, etc., as it removes any doubt or ambiguity as to which table you are talking about.

is it better to unite one-to-one ones together for cleaning and performance purposes?

I'm not sure what you mean by that. Are you asking if you should combine two or more related tables into a single table? If so, there's no simple answer to that. It depends on several factors. But, most of the time, related tables exist for a reason. Unless you have a good reason to do otherwise, you should keep them as they are.

If this doesn't help, it would be useful if you could make your questions more specific.

Mike



__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
I watched his videos yesterday and the best one was: "Q&A: Using Related Tables In A Report", but it is still a totally different animal. Using related tables in reports and forms is clear and simple enough for me. In my case, let make it simple, I have two tables with >150M records(all addresses in the nation), related by one-to-one "id" index and I have to do data manipulations, like:

do while !eof()
if lower(table1.fname)="john"​
replace table1.fname with "david"​
replace table2.recupdate with dtos(date())​
endif​
skip​
enddo

Don't ask me why we didn't use "replace all", because we need to use loop for many reasons and this is just an example. And my question- right now it takes 10's of minutes to run all data manipulations. Want to try simplify few things up, what is the best way to merge/unite these two tables? I think, I have seen the solution a while ago, but I don't remember where(something exchange) and I can't find it. Next, how to address them more efficiently, if possible.
 
Mike said:
What you refer to as "tableX.fieldX" is not directly connected to related tables. It is simply a way of referencing a particular field in a particular table. You use it when you have more than one table open, and you want to reference a field in a table other than the one in the current work area. Some developers also consider it good practice to use this notation at all times, regardless of work areas, etc., as it removes any doubt or ambiguity as to which table you are talking about.
Ok, when two tables related and you jump to any record in table1, your "tableX.fieldX" reference will result in corresponding record in table2. Another words, same reference in two not related tables will always result in table2 current record, no matter where go in table1.

Mike said:
If this doesn't help, it would be useful if you could make your questions more specific.
Making long story-short: want to try different options, need efficient code for:

Option1:
Search for a record and pull all related data from several tables into memory variables.
Example:
given:
table1(id,fname,lname,add1,add2,add3)
table2(id,acct,race,sex)
task:
Pull by acct into m.fname,m.lname,m.add1,m.add2,m.add3,m.acct,m.race,m.sex

Option2:
Merge/unite two one-to-one related tables into the new table/cursor:
given:
table1(id,fname,lname,add1,add2,add3)
table2(id,acct,race,sex)
task:
Create table3 with corresponding fname,lname,add1,add2,add3,acct,race,sex fields

Option3:
Merge/unite two one-to-many related tables into the new table/cursor, case when people have up to 10 acct's:
given:
table1(id,fname,lname,add1,add2,add3)
table2(id,acct,race,sex)
task:
Create table3 with corresponding fname,lname,add1,add2,add3,acct(1..10),race,sex fields
 
I use Related tables a lot, but based on your 3 items above, much of what you want to do would be best done via a SQL Query to accumulate the desired records into a single composite table and then utilize its records as needed.

Most likely you would want to have this SQL Query utilize the JOIN option.

Look over this website to get an understanding on which 'flavor' of JOIN would best meet your needs
or

Then a very broad generality of what you might do could be something like:
Code:
SELECT <whatever fields from whatever table(s)>;
   FROM Table1 <JOIN however you want to Table2>;
   AND <if you need additional selection criteria>
   INTO TABLE Table3

SELECT Table3
< do whatever >

Search for a record and pull all related data from several tables into memory variables.
Remember that since the memory variables you describe represent field values, they can only hold one table record's field values from each table at a time, not multiple records (fields from single records in multiple tables OK, fields from multiple records NO).

So if you had the following
Code:
SELECT Table2
SET ORDER TO id
SELECT Table3
SET ORDER TO id
* --- RELATE Table1 to Both Table2 & Table3 simultaneously ---
SELECT Table1
SET RELATION TO id INTO Table2 ADDITIVE
SET RELATION TO id INTO Table3 ADDITIVE
LOCATE FOR Table2.Acct = "XXXXX"
IF FOUND()
   SCATTER MEMVAR  && Populate memory variables from Table1 Fields
   SELECT Table2
   SCATTER MEMVAR  && Add to those memory variables field values from Table2 Fields
   SELECT Table3
   SCATTER MEMVAR  && Add to those memory variables field values from Table3 Fields
   < do whatever with the various memory variables >
ENDIF

You would be OK with that - although that might not be how I would do it.
But if you changed record(s) all of those memory variables would need to be re-populated with new values.

Good Luck,
JRB-Bldr
 
1:1 relations are special. Most of the time they are needed, because the number of total fields is more than 255. The split into two vertical halfs also could be done becuse of the 2GB limit. It's a bad choice, as a further split due to reaching the 4GB limit then would take much effort to change all code. You better split tables horizontal, if the field number limit is not the problem.

The RELATION is binding the two half in a way you actually could use REPLACE, as the record pointers are bound and move along. In such situation you may even do the relation based on the record number, as a 1:1 relation means exactly same number of records most of the time, unless it rather is a 1:0-1 relation and the right side of data are all optional fields.

So what is your situation with the two tables? Do they share a common ID each in field 1 or not? From the DO WHILE loop code example you give I detect your code already does a 1:1 relation, as you only do 1 skip and still work in table1 and table2. So table2 is autoskipped by the relation.

Let's address your performance problem: A locate of rows can make use of an index (even if the relation would enforce table2 to be ordered by a certain index), so you better use loops that locate a start point and then SCAN REST WHILE or you even begin and continue with SCAN FOR.

Your sample loop...
Code:
do while !eof()
    if lower(table1.fname)="john"
       replace table1.fname with "david"
       replace table2.recupdate with dtos(date())
    endif
    skip
enddo

...can be done better
Code:
SCAN FOR table1.fname = "john"
       replace table1.fname with "david", table2.recupdate with dtos(date())
ENDSCAN

...and if there is no name collision (eg no field you address exists in table1 and table2)

Code:
SCAN FOR fname = "john"
       replace fname with "david", recupdate with dtos(date()) [b]IN table1[/b]
ENDSCAN
notice I added an IN clause here, stating IN table1, though recupdate is part of table2. The relation starts in table1 and the table name you specify in the IN clause has to be the table the relation starts in.

Final step then really is just one replace ALL
Code:
Replace [b]ALL[/b] fname with "david", recupdate with dtos(date()) [b]FOR[/b] fname="john" [b]IN table1[/b]

You don't profit from the fact you can do a single loop and test several different conditions and in the different cases make diffent replacements, the UPDATE-SQL also doesn't allow you to do different things on different conditions met, you simply do two updates in that complicated case and it also pays in case of REPLACES, as you never loop all rows, if you index your data according to rushmore optimization.

So it seems you're under the wrong immpresssion the repeated iteration will kill the process and mean much more time spended. No, its vice versa: Even the single full iteration is your killer, you don't make use of indexes this way, as the conditions you check are only checked for the current record with your if statements. If you do a SCAN FOR loop, that FOR clause like LOCATE can make use of an index to optimize the for condition and ideally can visit the relevant records directly and skip all rows not fulfilling the condition. That alone pays so much, that you SCAN multiple times to work on other conditions and make other changes, as you would also be forced to do in Update-SQL.

I don't say you have to change your main loops to all single REPLACES, you can work with SCAN FOR , LOCATE+ SCAN REST and other variants and test what is best. Just get the idea out of your head you're better off with visiting all rows in a single loop and do all the stuff while being there. If the idea is out of your head crush it, kill it, so it never can get back to you.

Imagine you would really travel USA and visit all homes, ask if a John is living there and if there is tell them to now listen to the name David. You rather look at your index of first names and only visit addresses you know a John is living there to tell them the news about their renaming.

Bye, Olaf.
 
Some more reasoning, why a single iteration and checking conditions on all rows is so much more expensive:

1. The skip itself is inexpensive, you can do an empty SCAN..ENDSCAN very fast. What is done? The record pointer is moved, it always moves the same number of bytes, the RECSIZE(). There is no reading of data, and even RECSIZE() only needs to be determined once.
2. The IF conditions are expensive: You test a condition, more probable a list of conditions, so even in case nothing has to be done on a record, you read parts of it maybe even several times in different IF condditions.

To compare to the real travelling example you travel all addresses with a list of conditions you then ask and at most addresses will have this lengthy conversation of all the questions which all are answered with no. You don't do that.
As I said the FOR clause of SCAN can skip all rows not fulfilling the condition, it's not the skipping paying most, it's knowing to be able to skip without asking the records each.

An index has one place for a value and in that value stores the record numbers having that value. That means one IF check doesn't only tell you yes or no for a current row, it gives the row numbers you need to visit. That's what rushmore does.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top