Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Tough query please help 1

Status
Not open for further replies.

peac3

Technical User
Jan 17, 2009
226
AU
Hi guys,

I am struggling with query below and hope you guys can help me with this.
I have 2 tables that I need to join which are Sales table and Customer_code_history table as per below.

Code:
[b]Sales[/b]
Sales_Code	Customer_code
ABC	         X-A01
DEF	         X-B01
GHI	         X-C01

[b]Customer_code_history[/b]
Current_Cust_Code	Prev_cust_Code	start_date	end_date
X-A03	                   X-A02	31/01/2016	31/12/9999
X-A02	                   X-A01	20/01/2016	30/01/2016
X-B01	                   NULL	        3/01/2016	31/12/9999
X-C04	                   X-C03	6/03/2016	31/12/9999
X-C03	                   X-C02	31/01/2016	5/03/2016
X-C02	                   X-C01	20/01/2016	30/01/2016

I am trying to get the new customer code from joining the sales table and customer code history table.
The problem is the customer code from the sales table can exist in prev_cust_code or Current_cust_code in customer code history table

The pseudo code is

if the customer code in sales table exist in current cust code column
then take the current cust code and remove the first 2 characters ("X-")​
else if the customer code in sales table not exist in current cust code column but exist in prev code column
then we need to track the current cust code by finding the latest date which is 31/12/9999​
and remove the first 2 characters ("X-")​
else
null​


The expected result would be as per below

Code:
Sales_Code	Customer_code	New_Customer_Code
ABC	            X-A02	    A03
DEF	            X-B01	    B01
GHI	            X-C01	    C04

If you could help me with SQL query that would be much appreciated.

Thanks,
 
Hi there,

if I understand correctly, you could SELECT SUBSTRING(Sales.Customer_code, 2, LEN(Sales.Customer_code)-2) using
- a JOIN on Sales.Customer_code = Customer_code_history.Current_Cust_Code
- UNION ALL
- a JOIN on Sales.Customer_code = Customer_code_history.Prev_cust_Code

Would that do the trick?

Cheers,
MakeItSo

"Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family." (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
you will need a recursive CTE to do this.

No time just yet to do it but will be able to give you a working example 7 hours from now

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
As far as I see the outset is a bit wrong, as every customer_code at some time was the Current_Cust_code. But if the end date of that record is passed, it is not the current Current_Cust_code.
If taking that into account records for X-A01 and X-C01 are missing but should look like the sample history record of X-B01 with NULL as previous code but an end date 1 day previous to the next history record start date, obviously.

We can always know a latest Current_Cust_Code by filtering for enddate='31/12/9999', but you don't necessarily find Sales.Customer_Code in there. That's why you need a recursive query with a CTE, as Frederico Fonseca said.

I'm fine with being topped by him later, but here's what I do in a similar situation:
Code:
Declare @Customer_Code_history as Table
   (Current_Cust_Code char(5),	
    Prev_Cust_Code char(5),	
    start_date date, 
    end_date date
    );

Declare @Sales as Table 
   (Sales_Code char(3), 
    Customer_Code char(5)
   );

SET DATEFORMAT dmy;

Insert Into @Customer_Code_history Values 
('X-A03','X-A02', '31/01/2016', '31/12/9999'),
('X-A02','X-A01','20/01/2016','30/01/2016'),
('X-B01',NULL,'03/01/2016','31/12/9999'),
('X-C04','X-C03','06/03/2016','31/12/9999'),
('X-C03','X-C02','31/01/2016','5/03/2016'),
('X-C02','X-C01','20/01/2016','30/01/2016'),
('X-A01',NULL,'01/01/2016','19/01/2016'),
('X-C01',NULL,'01/01/2016','19/01/2016');

Insert Into @Sales Values 
('ABC','X-A01'), 
('DEF','X-B01'), 
('GHI','X-C01');

WITH CustCode
     AS 
     (
       SELECT Current_Cust_Code as curCode, 
              Cast(NULL as char(5)) as prevCode, 
              Current_Cust_Code as initCode, 1 as Lvl
       FROM @Customer_Code_history
       WHERE Prev_cust_code IS NULL
     UNION ALL
       SELECT n.Current_Cust_Code as curCode, 
              n.Prev_cust_Code as prevCode, 
              p.initCode, p.Lvl+1 as Lvl
       FROM @Customer_Code_history n
       INNER JOIN CustCode p
       ON p.curCode = n.Prev_cust_Code
     )

Select Sales_Code,
   Customer_Code,
   Substring(
   (Select Top 1 curCode From CustCode 
           Where initCode = (SELECT initCode From CustCode Where curCode = Customer_Code) 
		   Order By Lvl Desc)
        ,3,1000) as New_Customer_Code
From @Sales

Bye, Olaf.
 
The general idea behind it is the recursive query to be able to determine the initial customer code from any customer code of all the history.
You therefore union a root element (where prev_cust_code IS NULL) with all child rows (ON p.curCode = n.Prev_cust_Code). And the recursion comes from joining the CTE within the CTE query (INNER JOIN CustCode p).

You could almost do without that, if all your history rows would retain the initial customer code as own column. See how similar the CTE table itself is by doing [tt]SELECT * FROM CustCode[/tt]. I only skipped the dates and instead added a level, besides forwarding the initCode to each row of the same customer.

Bye, Olaf.

 
Here's how it would work without a CTE, if you modify your history table to contain a Root_Cust_Code, as I called it now...

Code:
Declare @Customer_Code_history as Table (Current_Cust_Code char(5),	Prev_Cust_Code char(5),	Root_Cust_Code char(5), start_date date, end_date date);
Declare @Sales as Table (Sales_Code char(3), Customer_Code char(5));

SET DATEFORMAT dmy

Insert Into @Customer_Code_history Values 
('X-A03','X-A02','X-A01','31/01/2016','31/12/9999'),
('X-A02','X-A01','X-A01','20/01/2016','30/01/2016'),
('X-B01', NULL  ,'X-B01','03/01/2016','31/12/9999'),
('X-C04','X-C03','X-C01','06/03/2016','31/12/9999'),
('X-C03','X-C02','X-C01','31/01/2016','05/03/2016'),
('X-C02','X-C01','X-C01','20/01/2016','30/01/2016'),
('X-A01', NULL  ,'X-A01','01/01/2016','19/01/2016'),
('X-C01', NULL  ,'X-C01','01/01/2016','19/01/2016');

Insert Into @Sales Values 
('ABC','X-A02'),
('DEF','X-B01'),
('GHI','X-C01');

Select S.Sales_Code,
   S.Customer_Code,
   Substring(
   (Select Current_Cust_Code From @Customer_Code_history 
           Where Root_Cust_Code = (SELECT Root_Cust_Code From @Customer_Code_history Where Current_Cust_Code = S.Customer_Code)
		     AND end_date='31/12/9999')
        ,3,1000) as New_Customer_Code
From @Sales S;

After letting this sink in the idea to change your history table schema makes more and more sense. Every history data should have on common value like Root_Cust_Code to mark a family of records. Typically you have such a value in the once generated and never changing primary key of a record, that could either be this root code or CustomerID or a rowguid. Anything that simply is the never changing key, even if your customer code (printed on some bills/cards) is needing to change from time to time. This way the previous code is obsolete, you can sort by valid dates to see the history of values valid from start_date to end_date and don't need a previous value to crawl back the hostory recursively. If you have a common code, root code, the way to lookup the newest most current code is a two step process of looking up the common/root code from any code you have from any time and then lookup the latest code in that family of codes all sharing the root/common code. Besides that, you should also be able to tell the date any record was created or updated, so you can always take that to lookup the history record, which has that record date between it's star and end date. Then you should find the current_cust_code matching the one of the record. This does not just verifiy you used the right code at the time, but you can also lookup the root and latest value from there.

You might really be missing a non changing primary key in your data, then your customer code does not apply all rules of a primary key, not only to be unique, but also to never change in time. If you are of camp favoring natural keys, you have to accept cascading updates of a natural key change to all related and related related and so fourth data. A history table is a good idea in any case, but its true value comes from being able to easily query all rows belonging to one records history. The easiest for that is via a constant surragate primary key and not natural keys, even for natural key history you can always add a rowguid as secondary key and first citizen for your history data without the need to change the whole system from natural key usage to surrogate keys.

Bye, Olaf.
 
Hi OlafDoschke,

This table was designed before my time and I have been contemplating to add the Root_Cust_code as an anchor point to simplify the query as you pointed out.
However this table has a lot of historical data and have web front end on top of it so it may take a bit of effort to add new column, populate the data and refactor the web front end.

Thank you so much for your answer and recursive CTE and I really appreciate it, it works!
God bless.
 
Well, you could use the CTE code to find the root code and populate that new column. I'm unsure how old code would need a change, as you just add a new column. You can make use of it in this place and surely later more, but don't need to mend any code already working without it, which should continue running undisturbed - nothing is removed or changed oherwise.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top