Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations biv343 on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Reconciliation utility needed - find difference in long list of recs

Status
Not open for further replies.

DwayneS

Instructor
Mar 29, 2002
70
Accounting user. Novice+ vb programming skills. Using Access 2003 for lots of accounting tasks and records.

I need a utility routine to help with a common reconciliation problem. I have a known difference in a table of thousands of entries (as many as 200,000).

I have a difference, say 12,218.13. The difference should be in one item, but often it is in more than one item. What I would love to be able to do is evaluate the long list of items for "candidates" that either (1) add up to the difference I'm looking for, or (2) Find two amounts, when subtracted from each other yield the difference.

I've done a fair amount of Access and vb work, but I don't know how to approach this. I think it's a logical problem, not technical.

Any ideas how I might be able to accomplish this without ending up with something that runs 24 hours?

Dwayne Streeter, CPA, CITP
 
Dwayne,

Is there some way for you to narrow down the search parameters? It looks to me like you've stepped into the world of combinatorial mathematics. As fascinating as it may be, it's not much fun.

Find two amounts, when subtracted from each other yield the difference.

Finding the difference of the first value compared to all others means 199,999 calcs. Plus the second value adds 199,998 calcs, plus the 3rd, 4th, ... 200,000th.

Probably the quickest check would be a query that subtracts your 'variance' from each value and then uses DCount() to see if that value exists in the original field.

I don't have a fix on your table structure or how you determine the discrepancy, but something along the lines of the following may help.

Code:
SELECT myVal, myVariance, myVal - myVariance AS myCompare, DCount("*","tblValues","[myVal] = " & Abs(myVal - myVariance)) AS myReturn 
FROM tblValues;

If myVal - myVariance = zero, then your discrepancy is equal to an existing value. If the DCount returns a value greater than zero, then your discrepancy is equal to the sum of two of the values.


HTH


John






When Galileo theorized that Aristotle's view of the Universe contained errors, he was labeled a fool.
It wasn't until he proved it that he was called dangerous.
[wink]
 
First of all BoxHeads approach will not work due to floating point arithmetic. First thing to remember is 2 - 1 = 1, but 2.0 - 1.0 may not equal 1.0. You can read about doing subtraction between floating point numbers.

Therefore you need this
Code:
Public Function myRound(ByVal Number As Double, Optional ByVal NumDigitsAfterDecimal As Integer = 0) As Double
  myRound = Fix(Number * (10 ^ NumDigitsAfterDecimal) + (0.500000000001 * Sgn(Number))) / (10 ^ NumDigitsAfterDecimal)
End Function

If you had a super powerful computer here is your solution
Code:
SELECT B.ID, B.dblValue, A.ID, A.dblValue, Abs(myRound([A].[dblValue]-[B].[dblValue],3)) AS Diff
FROM tblValue AS B, tblValue AS A
WHERE (((A.ID)<>[B].[ID]) AND ((A.dblValue)<>[B].[dblValue]) AND ((Abs(myRound([A].[dblValue]-[B].[dblValue],3)))=[Enter Value]));

However, since this does a cartesian product you would have N*(N-1) records. Close to 400 million.

So I think you need to use a binary search. Basically take your first record and then go to the middle of the recordset to get your second value. Determine if the difference is greater or less than your desired difference and then either split the upper or lower half. Keep splitting until you find your desired difference or you can split no more. This will radically decrease the number of checks you would have to do.

It is an interesting problem, I will see if I can code some of it.
 
Hiya, MajP.

You bring up a good pont. I hadn't considered the floating point numbers.

I had assumed (from the "CPA" and "Accounting User" and Dwayne's website) that the numbers we're looking at are dollars and cents and since currency is treated by Access as a Fixed Point number the simple subtraction would, I believe, be reliable.

I shouldn't assume.

______

Dwayne, I would repeat that if you can "narrow down the search parameters" you would be doing yourself a big favor. It may even be a step toward eliminating the discrepancies upstream.

You haven't posted enough information to really latch onto so I am left to some assumptions and my imagination. [ponder]
I have to imagine that there could be fields in the table where a comparison by Groupings could narrow the search.
For example, my calculated FICA wthholding on the total payroll doesnt match my reported withholding. I could maybe group by Department or Cost Center?

Food for thought...


John

 
You're helping. I'm getting ideas already and will keep you posted. I already thought of the FP issue and planned to deal with it.

What I've always tried to do with large records sets is to find logical candidates and post them in a "temp" table, then run my logic on the smaller sample. Not foolproof, of course, but makes the size more manageable. I'm trying to think of a logical way to find the candidates.

Dwayne Streeter, CPA, CITP
 
Duh! I slept on it and just came up with a big help. Thousands of items on my list will have exact offsets - i.e. 2,400.00 debit, 2,400.00 credit. Those items don't need to be tested. Can I use a "find duplicates" query some way to eliminate them from my population?

Dwayne Streeter, CPA, CITP
 
Are the fields in the same table or different tables? Is there a any key that relates a 2400 debit to a specific 2400 credit? Do posting dates and cost centers align?





When Galileo theorized that Aristotle's view of the Universe contained errors, he was labeled a fool.
It wasn't until he proved it that he was called dangerous.
[wink]
 
All the numbers are in one table. Typically, there should be a Debit and a matching credit, but in the real world the money out (Debit) does not always equal the amount in (Credit). My approach would be to use an absolute(round(number)) to make the amounts comparable, then look for unequal amounts. In theory, a very large majority of my "problem" items will be unmatched items. I know I can't really reconcile this thing, but it would be very, very helpful to quickly locate the majority of items.

Dwayne Streeter, CPA, CITP
 
Would it be safe to say that with any given number in the table, there should always be an even number of entries?

If so, combining DCount and the Mod functions could get you a list of amounts that are entered an odd number of times. Like DCount("*","myTbl","[myAbsRdAmt] = " & [myAbsRdAmt]) Mod 2

That should return zero for any value that has an even number of entries and 1 for values that have an odd (or no) number of matches.

A Totals query would probably yield the same results faster.


HTH

 
If myVal - myVariance = zero, then your discrepancy is equal to an existing value.
First of all BoxHeads approach will not work due to floating point arithmetic.
What if you compare the absolute difference to a threshold? For example:
Code:
abs(myVal - myVariance) < 0.5
 
Here is a pretty good solution using a binary search. Assuming I did this correctly. It seems to give logical solutions. This is actually only half the code, because it looks forward finding only those values where

Val2 - Val1 = yourDifference
does not find
-(Val2-Val1) = your Difference

But this is pretty fast. I ran this on some sets of a thousand records, and it only takes about 1900 loops (1/2 the answer). If you checked every one it would be 1,000,000 calculations. Orders of magnitude faster.
It needs some cleaning up on the output.

Code:
Public Function myRound(ByVal Number As Double, Optional ByVal NumDigitsAfterDecimal As Integer = 0) As Double
  myRound = Fix(Number * (10 ^ NumDigitsAfterDecimal) + (0.500000000001 * Sgn(Number))) / (10 ^ NumDigitsAfterDecimal)
End Function


Public Sub findDifference(strTable As String, strIDFld As String, strValFld As String, dblDiff As Double)
  Dim RS As DAO.Recordset
  Dim strSql As String
  Dim intRangeHi As Integer
  Dim intRangeLo As Integer
  Dim intVal1Position As Integer
  Dim intMovePosition As Integer
  Dim intLoops As Integer
  Dim val1 As Double
  Dim val2 As Double
  Dim ID1 As Long
  Dim ID2 As Long
  Dim dblCalcDifference As Double
  Dim intRecords As Integer
  Dim blnFound As Boolean
  
  strSql = "Select " & strIDFld & " AS ID, " & strValFld & " AS dblVal FROM " & strTable & " AS tblValues"
  strSql = strSql & " ORDER BY dblValue"
  Set RS = CurrentDb.OpenRecordset(strSql, dbOpenDynaset)
  RS.MoveFirst
  intRecords = RS.recordcount
  intRangeHi = intRecords
  
  Do While Not RS.EOF
      val1 = RS.Fields("dblVal")
      ID1 = RS.Fields("ID")
      intRangeLo = RS.AbsolutePosition
      intVal1Position = RS.AbsolutePosition
    Do
      RS.AbsolutePosition = (fncMovePosition(intRangeLo, intRangeHi))
      'intCurrentPosition = rs.AbsolutePosition
      val2 = RS.Fields("dblVal")
      ID2 = RS.Fields("ID")
      dblCalcDifference = val2 - val1
      'Debug.Print dblCalcDifference & ": val 1= " & val1 & " val 2= " & val2 & "  position= " & RS.AbsolutePosition
      If dblCalcDifference = dblDiff Then
        blnFound = True
        Debug.Print "Difference: " & dblCalcDifference
        Debug.Print "val1= " & val1
        Debug.Print "   val12 " & val2
        Debug.Print "   ID2= " & ID2 & "   ID1 = " & ID1
        Call checkAllLikeVal2(RS, val1, val2)
       ElseIf dblCalcDifference > dblDiff Then ' Go down
         intRangeHi = RS.AbsolutePosition
         RS.AbsolutePosition = (fncMovePosition(intRangeLo, intRangeHi))
       Else 'Go up
         'MsgBox "move up " & intRangeLo & " " & intRangeHi
         intRangeLo = RS.AbsolutePosition
         RS.AbsolutePosition = (fncMovePosition(intRangeLo, intRangeHi))
         intLoops = intLoops + 1
       End If
    Loop Until blnFound Or (intRangeHi - intRangeLo) <= 1
    RS.AbsolutePosition = intVal1Position
    RS.MoveNext
    blnFound = False
  Loop
  MsgBox intLoops
End Sub


Public Sub testDiff()
  Call findDifference("qryValue", "autoID", "dblValue", 2)
End Sub

Public Function fncMovePosition(intLo As Integer, intHi As Integer) As Integer
  fncMovePosition = Fix((intHi - intLo) / 2) + intLo
End Function

Public Sub checkAllLikeVal2(ByVal RS As DAO.Recordset, val1 As Double, val2 As Double)
  'Go to the beginning of like val2's
  Do While Not RS.BOF
    RS.MovePrevious
    If Not RS.Fields("dblVal") = val2 Then
      RS.MoveNext
      GoTo lblFoundFirst
    End If
  Loop
lblFoundFirst:
 Do
    Debug.Print "Possible Others:   ID2= " & RS.Fields("ID")
    RS.MoveNext
  Loop Until Not (RS.Fields("dblVal") = val2) Or RS.EOF
  Debug.Print
End Sub
 
Oops. I sent you the version I was using with integers for test purpose.

replace
dblCalcDifference = val2 - val1
with
dblCalcDifference = myRnd(val2 - val1, the number of digits)
 
Oops again. This version needed this at the end

Loop Until blnFound Or (intRangeHi - intRangeLo) <= 1
RS.AbsolutePosition = intVal1Position
RS.MoveNext
intRangeHi = intRecords
blnFound = False
 
One more thing. I mispoke about the complexity of doing it with a SQL join. You would not have to do a cartesian product, but an unequal join

where A.dblValue < B.dblValue

This would significantly reduce the number of records from
(n)*(n-1) to Summation(N-i) from (i:1 to N)

Also I suggest that my algorithm is efficient, but the code is not necessarily fast due to the recordset data structure. However, this code could be easily modified to an array and it would be very fast.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top