Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

split one row into multiple rows... help pls 2

Status
Not open for further replies.

peac3

Technical User
Jan 17, 2009
226
AU
Hello SQL gurus,

I am trying to split some of the data from one row into multiple rows with conditions.
in the data set, the original value of ActionType could be Changed, Added, or deleted.

The condition of changes are:
[ul]
[li]If ActionType = Changed, we must split it into 2 rows : ActionType = Deleted and then ActionType = Added. The Deleted row will have IdentityCode = FromIdentityCode and the Added row will have IdentityCode = ToIdentityCode[/li]
[li]If ActionType = Added, we don't need to split the row, replace IdentityCode = ToIdentityCode.[/li]
[li]If ActionType = Deleted, we don't need to split the row, replace IdentityCode = FromIdentityCode.[/li]
[/ul]

Here are the examples:

from the original dataset is as below:
Code:
RowNumber ParentID ChildID   ActionType      Date           FromIdentityCode ToIdentityCode
1         ABC        123     Change            2024-05-03              T         P
2         BCD        234     Deleted           2024-05-05              T         NULL
3         CDE        345     Change            2024-05-07              P         T
4         DEF        456     Added             2024-05-08              NULL      P

The expected result as per below:
Code:
RowNumber ParentID ChildID   IdentityCode ActionType      Date
1         ABC        123     T            Deleted    2024-05-01 
2         ABC        123     P            Added      2024-05-03 
3         BCD        234     T            Deleted    2024-05-05 
4         CDE        345     P            Deleted    2024-05-06
5         CDE        345     T            Added      2024-05-07 
6         DEF        456     P            Added      2024-05-08

Any idea on how to achieve this?

Thank you so much in advance,
 
This is much easier done, by going back to the data you previously had in thread183-1829988, isn't it?

It still is in the system, as SQL Server is a tranactional RDBMS server, i.e. it has a transaction log you can roll back.


Chriss
 
Hi Chris,

This is the other way around of the approach.

Previously, we would have liked to merge the original data and then match it with the other table.
but merging the data has a data quality issue hence we thought it would be better for another table to split than match with the original data in order to get more granular matching.

appreciate it if you could help us with this.

Thank you
 
Well, do you still have the original data? The script I gave you just created a result, it didn't change your data.
So you should still have that, then there's alsmost nothing to do, is there?

Chriss
 
I tried to avoid complicated SQL statements, so I loaded you example data into python-pandas data frame:

ksnip_20240619-204249_ea1rca.png


and processed it to the desired output:

ksnip_20240619-204252_nfr1lk.png
 
Your expected results are a little weird with respect to the dates.

There are various methods of creating multiple rows from a single row. You could join to a table that has multiple matches, this is a bit more complicated but can sometimes work well depending on your situation. In this particular case, I would approach it with a simple union all query, like this.

Code:
-- Setup sample data
Declare @Temp 
Table   (
          RowNumber Int,
          ParentId VarChar(20),
          ChildId int,
          ActionType VarChar(20),
          Date	DateTime,
          FromIdentityCode VarChar(20),
          ToIdentityCode VarChar(20)
        );

Insert
Into	@Temp
Select	1,'ABC',123,'Change' ,'2024-05-03','T', 'P'
Union All
Select 2,'BCD',234,'Deleted','2024-05-05','T', NULL
Union All
Select 3,'CDE',345,'Change' ,'2024-05-07','P', 'T'
Union All
Select 4,'DEF',456,'Added'  ,'2024-05-08',NULL,'P' 

-- Query goes here, once you understand how this works, apply it to your specific requirements.
Select  ParentId,
        ChildId,
        FromIdentityCode,
        'Deleted' As ActionType,
        Date
From    @Temp
Where   ActionType = 'Change'

Union All

Select  ParentId,
        ChildId,
        Case When ActionType = 'Deleted' Then FromIdentityCode Else ToIdentityCode End,
        Case When ActionType = 'Change' Then 'Added' Else ActionType End As ActionType,
        Date
From    @Temp
Order By ParentId, ActionType DESC


-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Hey Mikrom,
Do you mind showing the codes, I have put the thread on the other forum.

Hi Chris,
In a simple explanation, There are 2 datasets we need to match, dataset1 (the unmerged version) and dataset2 (the merged version).

Initially, I thought it would be better to match both datasets in a merged version hence merge the dataset1 first and then match it with dataset2.(which was in my previous thread)
however, it is not perfect because it could be duplicate added records and duplicate deleted records.

Hence it is better to match it in the unmerged version which means unmerge the dataset2 then match it with dataset1 which is what this thread is talking about.
 
Hi peac3,
As I wrote above, I tried it first with Python (not C#), so it would be more appropriate if i post the python code in the Python Forum and not in C# Forum where you created the thread. Would you create a thread in Python Forum please?

 
The idea of ​​gmmastros is amazingly simple and works well.
 
peac3,

I gave you a much easier idea for matching the merged and unmerged records: Just do not renumber the rows and keep the row numbers.

What you're obviously missing is the concept of a primary key, not only to be unique, non null, but how to propagate it, let it be constant and thereby enable easily to match origins with what you make of them. If you just would have listened to that, you now wouldn't need this thread at all.

Chriss
 
Hi mikrom,

our environment is Windows environment hence .Net/C# would be the native language. Last time you gave us the c# solution I believe this time you can do it again :)

I will try gmmatros solution.
 
Hi peac3,
Since the Python script already exists, i'm posting it in the C# forum.
When I have time, I can do something similar in C#, but I'm sure you can convert it too.
 
I've added two steps "Reexpanded" and "Verification" in a new post to your previous thread183-1829988

I was sure and now am even more sure about the solution, I think you misunderstood my caution warnings about continuing with the result for an insecurity about producing the desired result. On one side I told you you should want the result without renumbering it - for the exact reason you now want the reversed result, on the other side I told you that the resulting data will need a slightly different approach of further merging, but once you understand the mechanics of LEAD()) and LAG() and the OVER clause and PARTITION, then you could easily amend the code to whatever else you need.

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top