SQL Puzzle - updating based on another field 2

gbaughma · Jul 17, 2017

So... I have a couple of tables (simplified view here)

Code:

Documents table  (The "Documents" table is the "pages" in the "files" table)
DocumentID    File ID 
1             1234    
2             1235  
3             1236    
4             1237    
5             1238    

Files Table  (The "Files" table is the header information)
FileID            Field2         Status
1234              Project1       Active
1235              Project1       Active
1236              Project2       Active
1237              Project2       Active
1238              Project3       Active

This would normally represent a one-to-many... where one Document would have multiple Pages. However, because of the way the importer worked, every page got its own Document and File.

... It *Should* look like this:

Code:

Documents table  (The "Documents" table is the "pages" in the "files" table)
DocumentID    File ID 
1             1234    (Linked to the first occurrence of "Project1" in the files table)
2             1234    (Linked to the first occurrence of "Project1" in the files table)
3             1236    (Linked to the first occurrence of "Project2" in the files table)
4             1236    (Linked to the first occurrence of "Project2" in the files table)
5             1238    (Linked to the first, and only, occurrence of "Project3" in the files table)

Files Table  (The "Files" table is the header information)
FileID            Field2         Status
1234              Project1       Active
1235              Project1       Deleted   (Now logically deleted, because no documents associated with it)
1236              Project2       Active
1237              Project2       Deleted   (Now logically deleted, because no documents associated with it)
1238              Project3       Active

So, I'm trying to figure this out. It's about 28,000 lines I have to go through. Essentially, I need to turn a one-to-one table association into a one-to-many, updating the Documents table pointing to the first occurrence of the project in the Files table, and logically deleting (by updating the status) in the Files table to logically delete it so it doesn't display with no documents inside of it (since the documents will have been "moved" to point to the proper Files header.

Thank you in advance for pointing me in the right direction!

--Greg

Just my $.02

"What the captain doesn't realize is that we've secretly replaced his Dilithium Crystals with new Folger's Crystals."

--Greg

bborissov · Jul 17, 2017

How about:

Code:

SELECT FileId
      ,Field2
      ,CASE NOT EXISTS(SELECT * FROM Documents WHERE Documents.FileID = Files.FileId)
            THEN 'Deleted'
       ELSE 'Active' END AS Status
FROM Files

NOT TESTED!

Borislav Borissov
VFP9 SP2, SQL Server

gbaughma · Jul 17, 2017

Mmm... that would do for setting the "Deleted" flag for orphaned records... but I need to figure out the UPDATE to make the one-to-one tables a one-to-many table.

Just my $.02

"What the captain doesn't realize is that we've secretly replaced his Dilithium Crystals with new Folger's Crystals."

--Greg

bborissov · Jul 18, 2017

And how do you know which record is deleted and which is orphaned?

Borislav Borissov
VFP9 SP2, SQL Server

Olaf Doschke · Jul 18, 2017

I think you should change your importing routine. Import into a staging table and then feed files and documents table from there in the right way.

Bye, Olaf.

gbaughma · Jul 18, 2017

I agree with you, Olaf... but unfortunately I'm constrained by the importer itself. I had to import the files using the vendor's importer, then import a CSV file, and rename fields after cross-referencing which file they were (it's a document storage system).

This means that I had to bring them in as individual documents then rename them in the database; now I have to consolidate the pages of the documents that are associated. It's a PITA.

Borislav: To answer your question, it doesn't really matter. As long as the associated files are pointed to the same "header", the rest of the now unused headers can be logically deleted.

So, to explain (since I may not have made myself clear the first time).... Let's say I have 4 files. These 4 files belong to two documents; document "A" and document "B"

So, I have header rows: Document "A", Document "A", Document "B", Document "B". I don't need two Document "A" and "B"'s, but the files are each pointing to the header rows, in a one-to-one relationship. I need to say "File 1 --> Document A" and "File 2 --> Document A", then delete the second "Document A" which is now no longer needed. So I need to switch the one-to-one to a one-to-many.

Just my $.02

"What the captain doesn't realize is that we've secretly replaced his Dilithium Crystals with new Folger's Crystals."

--Greg

Olaf Doschke · Jul 18, 2017

Greg said:
I need to say "File 1 --> Document A" and "File 2 --> Document A", then delete the second "Document A" which is now no longer needed. So I need to switch the one-to-one to a one-to-many.

I think you meant to say: "File 2 --> Document B". Then delete secondary rows of both Document A and B, and that finally means turning 1:n to 1:1 not vice versa. You now have 1:n as you have repeated foreign keys A (twice) and B (twice).

Referring to your original posted data you have multiple foreign key values for the same data, you want to get rid of those. I would say you need to group Files Table data by Field2 and determine MIN(FieldID) as the remaining foreign key. But you also need to know all the other FileIDs for same Field2 value, to be able to translate them to the min key.

Round1: Select Min(FileID) as MinFileID, Field2 from Files Group By Field2
Round2: Select FieldID, Field2, MinFileID from Files INNER JOIN (Round1 Query.MinFileID) ON Field2 matching and FileID>MinFileID
Round3: Use Round2 data to update Documents records with FileIDs to the MinFileID they correspond to
Round4: Delete File data with FileIDs as found in Round2.

Bye, Olaf.

gbaughma · Jul 18, 2017

Well, I think I got it.... but I had to use a cursor. That's not the end of the world, since this should be a "once in a great while" situation.

Code:

DECLARE @DocIndex VARCHAR(20)

DECLARE MY_CURSOR CURSOR  LOCAL STATIC READ_ONLY FORWARD_ONLY

FOR SELECT DISTINCT Field2 FROM Files 

OPEN MY_CURSOR
FETCH NEXT FROM MY_CURSOR INTO @DocIndex 

WHILE @@FETCH_STATUS = 0
BEGIN 

UPDATE Documents SET FileID=(
SELECT MIN(Documents.FileID) 
  FROM [Documents] WHERE FileID IN (SELECT FileID FROM [Files] WHERE Field2 = @DocIndex ))
  WHERE FileID IN (SELECT FileID FROM [Files] WHERE Field2 = @DocIndex)

UPDATE Files SET Status=7 WHERE Field2=@DocIndex  AND NOT EXISTS (SELECT * FROM Documents WHERE Files.FileID = Documents.FileID)

FETCH NEXT FROM MY_CURSOR INTO @DocIndex

END

CLOSE MY_CURSOR
DEALLOCATE MY_CURSOR

... I know... it's messy, and it's brute force...

Just my $.02

"What the captain doesn't realize is that we've secretly replaced his Dilithium Crystals with new Folger's Crystals."

--Greg

johnherman · Jul 20, 2017

CURSOR is not a dirty programming technique. Profanity is one language all IT professionals know fluently (humor).

A CURSOR is a reasonable solution to a problem, esp one time solutions. However, since CURSORs are usually less efficient than pure SQL, it may be necessary to write pure SQL if performance timings are critical.

Traditional modular programming techniques teach that the GO TO statement is an abomination. However, it is one of the most efficient statements. Needless to say, too many GO TO statements causes spaghetti code. One or two, however, can shorten programming time and perhaps even make the code more understandable to others.

A non-elegant solution is still a solution.

==================================
advanced cognitive capabilities and other marketing buzzwords explained with sarcastic simplicity

Olaf Doschke · Jul 21, 2017

If I had the time I would have put this up in CTE, you know you can have multiple CTE referencing previous CTE, so you can not only do a two step solution with CTE + query on the CTE.
Otherwise I agree with John. Sure we all know profanity

Bye, Olaf.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

SQL Puzzle - updating based on another field 2

gbaughma

IS-IT--Management

bborissov

Programmer

gbaughma

IS-IT--Management

bborissov

Programmer

Olaf Doschke

Programmer

gbaughma

IS-IT--Management

Olaf Doschke

Programmer

gbaughma

IS-IT--Management

johnherman

MIS

Olaf Doschke

Programmer

Similar threads

Part and Inventory Search

Sponsor