I have been asked to review a fairly normalized datawarehouse where a portion of the FK's are set to be nullable.
This of course has an impact on performance and it induces outer joins for correct reporting.
The issue is how to find a remedy and repair the missing FK values (either by adjusting the ETL process , setting proper defaults or building post-job procedures that fix things)
With starschema's this would be quite simple, cause there are 1-2 levels of dependencies, but a relational model is more complex.
I've played around with some scripts to seperate the tables in groups like:
No FK, only PK:
A 2nd group that directly references the first group:
and a 3rd group that is the rest.
Does anyone experience in bringing this sort of an exercise into manageable steps?
Ties Blom
This of course has an impact on performance and it induces outer joins for correct reporting.
The issue is how to find a remedy and repair the missing FK values (either by adjusting the ETL process , setting proper defaults or building post-job procedures that fix things)
With starschema's this would be quite simple, cause there are 1-2 levels of dependencies, but a relational model is more complex.
I've played around with some scripts to seperate the tables in groups like:
No FK, only PK:
Code:
SELECT T.TABLE_NAME,S.CONSTRAINT_NAME,
MAX(CASE WHEN S.POSITION = 1 THEN S.COLUMN_NAME ELSE NULL END) AS COL1,
MAX(CASE WHEN S.POSITION = 2 THEN S.COLUMN_NAME ELSE NULL END) AS COL2,
MAX(CASE WHEN S.POSITION = 3 THEN S.COLUMN_NAME ELSE NULL END) AS COL3,
MAX(CASE WHEN S.POSITION = 4 THEN S.COLUMN_NAME ELSE NULL END) AS COL4,
MAX(CASE WHEN S.POSITION = 5 THEN S.COLUMN_NAME ELSE NULL END) AS COL5,
MAX(CASE WHEN S.POSITION = 6 THEN S.COLUMN_NAME ELSE NULL END) AS COL6,
MAX(CASE WHEN S.POSITION = 7 THEN S.COLUMN_NAME ELSE NULL END) AS COL7,
MAX(CASE WHEN S.POSITION = 8 THEN S.COLUMN_NAME ELSE NULL END) AS COL8,
MAX(CASE WHEN S.POSITION = 9 THEN S.COLUMN_NAME ELSE NULL END) AS COL9,
MAX(CASE WHEN S.POSITION = 10 THEN S.COLUMN_NAME ELSE NULL END) AS COL10
FROM SYS.ALL_TABLES T,SYS.all_CONS_COLUMNS S,SYS.ALL_CONSTRAINTS X
WHERE T.OWNER = S.OWNER AND T.TABLE_NAME = S.TABLE_NAME AND
T.OWNER = X.OWNER AND T.TABLE_NAME = X.TABLE_NAME AND
T.OWNER = 'KIMO' AND T.TABLE_NAME LIKE 'TB%' AND X.CONSTRAINT_TYPE = 'P' AND
S.CONSTRAINT_NAME NOT LIKE 'SYS%' AND X.STATUS = 'ENABLED' AND
T.TABLE_NAME NOT IN
(SELECT V.TABLE_NAME FROM SYS.ALL_CONSTRAINTS V WHERE V.OWNER = 'KIMO' AND V.CONSTRAINT_TYPE = 'R')
GROUP BY T.TABLE_NAME,S.CONSTRAINT_NAME
A 2nd group that directly references the first group:
Code:
SELECT DISTINCT D.TABLE_NAME FROM SYS.ALL_CONSTRAINTS D WHERE D.CONSTRAINT_TYPE = 'R'
AND D.OWNER = 'KIMO' AND
D.R_CONSTRAINT_NAME IN
(SELECT S.CONSTRAINT_NAME
FROM SYS.ALL_TABLES T,SYS.all_CONS_COLUMNS S,SYS.ALL_CONSTRAINTS X
WHERE T.OWNER = S.OWNER AND T.TABLE_NAME = S.TABLE_NAME AND
T.OWNER = X.OWNER AND T.TABLE_NAME = X.TABLE_NAME AND
T.OWNER = 'KIMO' AND T.TABLE_NAME LIKE 'TB%' AND X.CONSTRAINT_TYPE = 'P' AND
S.CONSTRAINT_NAME NOT LIKE 'SYS%' AND
T.TABLE_NAME NOT IN
(SELECT V.TABLE_NAME FROM SYS.ALL_CONSTRAINTS V WHERE V.OWNER = 'KIMO' AND V.CONSTRAINT_TYPE = 'R'))
INTERSECT
SELECT Y.TABLE_NAME FROM SYS.ALL_TAB_COLUMNS Y,SYS.ALL_CONSTRAINTS Z
WHERE Y.TABLE_NAME = Z.TABLE_NAME AND Z.CONSTRAINT_TYPE = 'R' AND
Y.TABLE_NAME LIKE 'TB%' AND Y.NULLABLE = 'Y'
and a 3rd group that is the rest.
Does anyone experience in bringing this sort of an exercise into manageable steps?
Ties Blom