Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SSIS Derived columns

Status
Not open for further replies.

andre1802

Programmer
Dec 2, 2014
2
ZA
I need help getting the following case statement into the derived column format in SSIS


CASE
WHEN SUBSTRING(REPLICATE('0', 9 - LEN(Data)) + CAST(Data AS VARCHAR(9)), 4, 6) IS NULL
THEN 0
ELSE RTRIM(SUBSTRING(REPLICATE('0', 9 - LEN(Data)) + CAST(Data AS VARCHAR(9)), 4, 6))
END

Here are examples for the data in Data
row 1 - 1201303
row 2 - 123456789
row 3 - null
row 4 - 12345678
the first row result would be 201303
row 2 result would be 456789
row 3 result would be 0
row 4 result would be 345678

 
Why you need derived column? Why not just include that case in your select or update statement and run on database site
 
Thanks gk53, so you would say a select into another table with the case statements would be faster than using a Data flow task with derived columns, what I am looking for is increased performance. I am not sure in derived columns will be a heavier workload than a straight select into. Note we are taking about 700,000,000 records in some cases
 
Code:
(DT_STR,9,1252)(TRIM(data) == "" || UPPER(TRIM(data)) == "NULL" ? (DT_STR,9,1252)"0" : (DT_STR,9,1252)(SUBSTRING(REPLICATE("0",9 - LEN([COLOR=#CC0000]data[/color])) + [COLOR=#CC0000]data[/color],4,6)))

assumptions are
the input is a file, not a sql server table - if it is a sql server then the convert is better done on the server side unless that server is under heavy CPU pressure - in this case SSIS should not run on that server anyway.
that the "null" value you mentioned on your example is either a empty string or word "null"
if it is a file contents of data do not contain leading zeros - if it does part in red needs to be changed

Note that with volumes you are talking about you should consider the following

1 - split flow into parallel loads into final table (which should be partitioned) - at least 5 can run at the same time, more if done properlly.
2 - if input is a file see if extract can be split into several files instead of a single one. if a single one use a conditional split to split input into its parallel logical files
3 - derived columns transformation - set as many as required so no more than 10 fields are being transformed in each one - e.g. transform1 - 10 fields, transform2 - 10 fields. flow becomes datasource->transform1->transform2->destination

4 - fine tune buffers/rows per buffer
5 - fine tune data type validation if you are certain input data is always valid

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top