Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

compare values from different size tables

Status
Not open for further replies.

fanlinux90

Programmer
Oct 31, 2022
22
CO
With vfp 6 I am trying to find the different values that two tables have but the result is empty, I am using two scans to go through the tables simultaneously.
The SopFact table has 100 records and Fac_Bire 102. Inconbi should have 2 records: 1020 and 850 , but it comes out empty

select Fact
scan
select SopFact
scan
if SEEK(SopFact.codigo)
else
SELECT inconbi
APPEND BLANK
replace incon.codigo WITH Fact.codig
endif
endscan
endscan
 
First of all, you should realize that this code does not do what you describe because it acts on the aliases Fact and Sopfact, not Fact and Fac_Bire. Then your code is missing the section where you set the index orders, you can't just SEEK without an index order set. It should be ordered by codigo, as that's what you seek, but your error could be as simple as having the wrong index order set.

Then, it makes no sense to have nested scan loops with SEEKS, you can only scan one table and seek within the other for missing records. That will only tell you about records in table1 not in table2, to alo find records in table2 not in table1 you then need to do the same the other way around, but that's not done nested, that's done one after the other in two scan loops. Note: At the begin of SCAN loops you are one record further from where you were at the endscan, so if you SEEK in table2, the inner scan loop does continue from the found reecord or none, if none was found. So that'll stop early and won't enable to search in both tables at the same time.

I hear you. In short, you say you want to list "the different values that two tables have", well. Let me show you how that's best done with a full outer join in SQL.

Code:
Create Cursor alias1 (codigo int)
Insert into alias1 values (1)
Insert into alias1 values (2)
Insert into alias1 values (3)

Create Cursor alias2 (codigo int)
Insert into alias2 values (2)
Insert into alias2 values (3)
Insert into alias2 values (4)

From the sample data it's clear alias1 has a codigo=1 record, which is missing in alias2 on the other hand alias2 has a codigo=4 that's missing in alias1.

Now here's the SQL that'll list these two records without a match in the other table:

Code:
Select alias1.codigo as codigo1, alias2.codigo as codigo2 from alias1 full outer join alias2 on alias1.codigo = alias2.codigo;
where IsNull(alias1.codigo) or IsNull(alias2.codigo) into cursor result
browse

And here's the result:
codigo_dvrecb.jpg


It's not what you wanted. But it's not just a list with 1 and 4, it has two fields instead of just one. With this structure of two fields, it tells you more than just a plain list. It tells you that codigo1=1 exists in alias1 and is missing in alias2, whereas codigo2=4 exists in alias2 and is missing in alias1. So it has more information for you than just a list of 1 and 4.

If you really just aim for a straight-forward list that's also doable with:
Code:
Select Iif(IsNull(alias2.codigo), alias1.codigo,alias2.codigo) as codigo from alias1 full outer join alias2 on alias1.codigo = alias2.codigo;
where IsNull(alias1.codigo) or IsNull(alias2.codigo) Into Cursor result2
Browse

I'm pretty sure both queries avoid using anything VFP6 is incapable of executing. In VFP9 I would make use of NVL(), but I know that's not available in VFP6.

Chriss
 
Hi,

... or if you don't feel comfortable with SQL, you may want to try something like this

Code:
use table1 in 0
use table2 in 0
use table3 in 0

select table1
scan
[indent]If not INDEXSEEK(codigo, .t., table2, TheIndex)[/indent]
[indent]Append Blank in table3[/indent]
[indent]replace table3.codigo WITH table1.codigo in table3[/indent]
[indent]endif[/indent]
endscan 

Select table 3
browse

hth

Mark
 
MarK,

that's only getting the 1 in my sample data, you have to do this in reverse, too.
Code:
use table1 in 0
use table2 in 0
use table3 in 0

select table1
scan
If not INDEXSEEK(codigo, .f., table2, TheIndex)
Append Blank in table3
replace codigo WITH table1.codigo in table3
endif
endscan 

select table2
scan
If not INDEXSEEK(codigo, .f., table1, TheIndex)
Append Blank in table3
replace codigo WITH table2.codigo in table3
endif
endscan 


Select table 3
browse

Also, no need to go to the record in the other table, when it's found in indexseek, you just want to know whether it exists or not, if not you take the codigo of the source table for the seek, that has the codigo value to note down in table3. Therefore lMovePointer can be .f.

I spare to replace the APPEND+REPLACE with a single INSERT-SQL if you really want to avoid SQL, but see, it's the logical alternative and does this in one go, also using indexes on the codigo fields in both tales to optimize the query, including to create and populate the result alias.

Chriss
 
Hi Chris,

you have to do this in reverse, too.

That depends on what you want - see fanlinux's code

Also, no need to go to the record in the other table, when it's found in indexseek, you just want to know whether it exists or not, if not you take the codigo of the source table for the seek, that has the codigo value to note down in table3. Therefore lMovePointer can be .f.

You're right

to replace the APPEND+REPLACE with a single INSERT-SQL

Again you're right, but if fanlinux were comfortable with SQl he would have used it and not his code - don't you think?

MarK
 
That depends on what you want - see fanlinux's code
Well, his code is not really descriptive. And from the record count sample he gives, he might only need the 2 records of Fac_Bire not in SopFact. But what if SopFact had 104 records and Fac_Bire still only 102?
My sample also shows, that even two tables with an equal number of records can differ in the set of codigo values.

Fanlinux, you might have had an idea that actually works, too, just not as you coded it, based on your description:
fanlinux said:
I am using two scans to go through the tables simultaneously.

If you expect just a few differences when sorting both tables by codigo it's likely both start with the same value and then skip forward 1 record in both at the same time that remains until you get to a point where one has a codigo value missing in the other. You might have thought the way to seek within the inner scan you do that. But you forget that a SEEK not finding a record goes to EOF, sabotaging the parallel scanning by ending prematurely.

Here's a solution for scanning both tables in parallel until differences are found and noted:
Code:
Create Cursor alias1 (codigo Int)
Insert Into alias1 Values (1)
Insert Into alias1 Values (2)
Insert Into alias1 Values (3)
Insert Into alias1 Values (6)
Index On codigo Tag codigo

Create Cursor alias2 (codigo Int)
Insert Into alias2 Values (2)
Insert Into alias2 Values (3)
Insert Into alias2 Values (4)
Insert Into alias2 Values (5)
Insert Into alias2 Values (6)
Insert Into alias2 Values (7)
Index On codigo Tag codigo

Create Cursor result (codigo int, missingin c(10))

Select alias1
Scan
   If not alias2.codigo==alias1.codigo
      Do While alias1.codigo<alias2.codigo
         Insert into result values (alias1.codigo, 'alias2')
         Skip 1 in alias1
      EndDo 
      Do While alias1.codigo>alias2.codigo
         Insert into result values (alias2.codigo, 'alias1')
         Skip 1 in alias2
      EndDo 
   EndIf   
   Skip 1 in alias2
EndScan

Select alias2
Scan rest 
   Insert into result values (alias2.codigo, 'alias1')   
EndScan 

Select result
Browse
You see, I use one scan loop and "scan" through alias 2 by SKIPs. For a series of matching records, the only code that executes is the If not alias2.codigo==alias1.codigo which doees nothing and I skip 1 in alias2. So it does parallel scanning. In case there is a mismatch that points out a missing record in one of both tables. In case alias1.codigo<alias2.codigo the alias1 value is missing from alias 2, and it could be a series of missing values, so that's done in a while loop, the same for the other way around. In a normal case only one of the while loops will iterate once.The end situation after both while loops is alias2.codigo==alias1.codigo again, and then I skip 1 in alias2, as the scan loop (skips 1 in alias 1), so that's always for parallel scanning.

You might have thought nesting scans does the same job combined with the seek, but it doesn't. Because a not found record will sabotage your idea of using that for parallel scanning, you go to eof of the second table earlier than you want to.

You can slightly vary this code. For example it seems unnecessary to skip in alias1 in a while loop, as the scan loop does anyway. But it becomes convoluted pretty fast, if you try streamline the code for minimum branching. It becomes hard to understand and therefore also bad to maintain. I still favor the SQL approach for the best clarity, you just have to understand what a full outer join result is and what part of it you want to keep by where clause filtering.

Chriss
 
fanlinux90,

there's still the open question requiring your feedback about what you really need. Your example in count does suggest, that you may also only need codigo values missing from one table, no matter what mayy be missing in the other. No matter if both tables can miss values from the other or that's actually only possible in one direction.

So, in short, even with the data example I gave so far where both tablesmiss values, you can also use both the SQL and the non-SQL code to only look for missing data in one of the two aliases.

1. left/right outer join instead of full

That's the solution to that problem in SSQL speak, instead of the full outer join you only look from the left or right side of it. And by the way, any right join can be turned into a left join by simply swapping table names around, so we can limit that to left outer joins, too.

All code samples need the same sample data, so just take the start section of code creating cursors, inserting values, and indexing codigo fields, then this code finds the data missing in one table only:
Code:
Select alias1.codigo as codigo from alias1 Left outer join alias2 on alias1.codigo = alias2.codigo;
where IsNull(alias2.codigo) Into Cursor result1

Select alias2.codigo as codigo from alias1 Right outer join alias2 on alias1.codigo = alias2.codigo;
where IsNull(alias1.codigo) Into Cursor result2

*or as left join 
Select alias2.codigo as codigo from alias2 Left outer join alias1 on alias1.codigo = alias2.codigo;
where IsNull(alias2.codigo) Into Cursor result3
* so result2 and result3 are the same and show records missing in alias 2

That's the way to do it if you ask me.

And the non-sql code in the first place only needs to do the inserts about the alias your interested it, not the other one, so the straightforward change to only care for records missing in alias2 would be:
Code:
Create Cursor result4 (codigo int, missingin c(10))

Select alias1
Scan
   If not alias2.codigo==alias1.codigo
      Do While alias1.codigo<alias2.codigo
         Insert into result4 values (alias1.codigo, 'alias2')
         Skip 1 in alias1
      EndDo 
      Do While alias1.codigo>alias2.codigo
         *Insert into result4 values (alias2.codigo, 'alias1')
         Skip 1 in alias2
      EndDo 
   EndIf   
   Skip 1 in alias2
EndScan

*Select alias2
*Scan rest 
*   Insert into result4 values (alias2.codigo, 'alias1')   
*EndScan 

Select result4
Browse

Notice I did remove the whole scan rest loop, as that's all about records missing in alias1, which isn't interesting when only looking in one direction, but I kept the Do while that skips over records in alias2, which is necessary to synchronize until both record pointers are at the same places in both aliases in the next scan loop start. Could we skip forward faster? Not really, there's no SEEK REST command, for example, there could be made a nested SCAN REST replacing the do while loop:

Code:
Select alias1
Scan
   If not alias2.codigo==alias1.codigo
      Do While alias1.codigo<alias2.codigo
         Insert into result4 values (alias1.codigo, 'alias2')
         Skip 1 in alias1
      EndDo 
      Select alias2
      Scan Rest While alias1.codigo>alias2.codigo
      EndScan
   EndIf   
   Skip 1 in alias2
   * ? Recno('alias1'), Recno('alias2')
EndScan

Which doesn't shorten it very much and still means a series of 1 record sskips instead of a leap forward. To me it also showcases how more intricate you have to think about the no-SQL code to adapt it to exact needs instead of mainly flipping from full to left/right join. Aside from the fact that this use of noSQL is not the meaning of the term when actually using a noSQL database, just literally meaning to not use SQL to solve a problem.

In comparison the sql version mainly changes from a full to a left/right join and once you understand the principle it's easier to maintain that and adapt it to changes, when they become necessary.

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top