Is it possible to Compare Arrays directly

DocBus · Aug 8, 2005

I have a problem with a large table with redundant data (>1GB).

The table looks something like this

ID, date, sequence number, details
1, x , 1 , aaaaa
1, x , 2 , bbbbb
1, x , 3 , ccccc
1, x+1 , 1 , aaaaa
1, x+1 , 2 , bbbbb
1, x+1 , 3 , ccccc
.....

1, x+n , 1 , ZZZ

The problem arises in the fact all data is duplicated until it changes in the details field (e.g. from the series aaaaa, bbbbbb, cccccc to ZZZ). I must prune this table so that only the first instance of any combination (e.g. the aaaaa, bbbbb, ccccc AND ZZZ) is kept and the rest is deleted. All values in the table except for the date (the set) should exactly match before I can delete it and I need to keep the first unique set.
I wrote what I thought to be a very elegant code (which is obviously not) to identify the first instances and process it into arrays.

The code stored the first three rows into an array aFirstSeries and the next three rows into array aRedundantSeries

My problem comes in here

if aFirstSeries = aRedundantSeries ...

the logical value always returns true! In debugging the arrays clearly differ when for example the first rows of ZZZ is stored to aRedundantSeries. (array aFirstSeries now have 3 row elements and aRedundantSeries only one.

Does anyone have a help with this?

Thanks in advance!

mm0000 · Aug 8, 2005

I'm not sure if I understand what you require but wouldn't a simple SQL statment do it:

select distinct id, sequence_no,detail from redundant_table into new_table

Stella740pl · Aug 8, 2005

I didn't understand whether you need to keep a sample of each detail (one of each aaaa, bbbb, ...., zzzz)
or just the first one, like aaaa, is all you need.
Can you clarify, please?

What do you want to get as a result? Something like this:

ID, date, sequence number, details
1, x , 1 , aaaaa
1, x+1 , 1 , aaaaa
1, x+n , 1 , aaaaa

2, x , 1 , aaaaa
2, x+1 , 1 , aaaaa
2, x+n , 1 , aaaaa

Or it would be more like this:

ID, date, sequence number, details
1, x , 1 , aaaaa
1, x , 2 , bbbbb
1, x , 3 , ccccc

2, x , 1 , aaaaa
2, x , 2 , bbbbb
2, x , 3 , ccccc

Does any sequence number always start with 1 or it can start with something else?

In any case, it seems that you don't need to compare arrays for this.
You solution could be anything from a simple DELETE FOR … to a SELECT (sql) … GROUP BY statement to a code consisting of 2-3 SELECT statements.

Please clarify your requirements so a more targeted solution can be defined.

chpicker · Aug 8, 2005

You might try something like this:

Code:

SELECT field1, field2, field3, field4, MIN(datefield) AS datefield FROM MyTable GROUP BY 1,2,3,4 INTO TABLE pruned

If I understood your post correctly, the redundant records match each other in every field except for the date field. The above select statement assumes there are only 4 fields other than the date field and that the date field is the last field in the table. Obviously, you will need to change "field1", "field2", etc. to your actual field names and add additional fields if you need them. Just make sure the "GROUP BY" clause contains the field numbers for ALL of your fields except the DATE field.

Anyway, this statement will create a new table which contains only one record for each redundant record it finds and use the earliest date it finds for the date field. If you wanted the most recent instead, use MAX instead of MIN.

Ian

DocBus · Aug 9, 2005

Thank you all for the posts!

I have read my original post and apologize for the lack of clarity in the question. To clarify the problem:

The ID's vary from numeric to text. They represent a location. The result I need is something like this if the original table is this :

ID, date, sequence number, details
1, x , 1 , aaaaa
1, x , 2 , bbbbb
1, x , 3 , ccccc
1, x+1 , 1 , aaaaa
1, x+1 , 2 , bbbbb
1, x+1 , 3 , ccccc
1, x , 1 , ZZZ
1, x+1 , 1 , ZZZ
2, x , 1 , yy
2, x , 2 , xbxb
2, x , 3 , xaxa
2, x+1 , 1 , yy
2, x+1 , 2 , xbxb
2, x+1 , 3 , xaxa

I need

ID, date, sequence number, details
1, x , 1 , aaaaa
1, x , 2 , bbbbb
1, x , 3 , ccccc
1, x , 1 , ZZZ
2, x , 1 , yy
2, x , 2 , xbxb
2, x , 3 , xaxa
etc.

Therefor the first three records form a set that should be kept (the dates differ and the contents of details) but the group of details forms a set which repeats redundantly (aaaaa,bbbbb,ccccc) is repeated on the next day until you get to a record which contains the same date as the original set but the details are different but NOW the set only consists of one record containing -ZZZ - which repeats redundantly. The sets vary from one record only to 20.

I hope this helps and thanks already for the help. I could not get it to work with SQL.

JimWWinter · Aug 9, 2005

ChPicker already gave you this, but the following should work:

SELECT id, ;
SequenceNumber, ;
Details, ;
MIN(Date) as Date ;
FROM Source ;
GROUP BY 1, 2, 3 ;
INTO TABLE Target

Regards,
Jim

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Is it possible to Compare Arrays directly

DocBus

IS-IT--Management

mm0000

IS-IT--Management

Stella740pl

Programmer

chpicker

Programmer

DocBus

IS-IT--Management

JimWWinter

Programmer

Similar threads

Part and Inventory Search

Sponsor