Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

select top 4 values from columns 2

Status
Not open for further replies.

LaurenM

Programmer
Jun 5, 2006
62
0
0
CA
Hi Everyone,

I have a table with the following structure
EmployeeID Number
Year1 Date
Salary1 Number
Year2 Date
Salary2 Number
Year3 Date
Salary3 Number
Year4 Date
Salary4 Number
Year5 Date
Salary5 Number

This table stores the top five salaries of the employee, ordered by year, so year1 always be the most recent year of the top five. Now, my problem is I need the top four out of this list ordered by salary, not year. Any hints or pointers?

Thanks,

Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Try:
Code:
SELECT *
  FROM (SELECT   t.*
            FROM mytable t
        ORDER BY GREATEST (salary1,
                           GREATEST (salary2,
                                     GREATEST (salary3,
                                               GREATEST (salary4, salary5)
                                              )
                                    )
                          ))
 WHERE ROWNUM <= 4;
[3eyes]

----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
LK,

Unless I'm missing something, your query suggestion displays data without re-arranging the salaries into salary-magnitude order for the top 4 salaries for the 5-year period (which, I believe, is what Lauren is looking for).

Lauren, it would be helpful to see a couple of rows of sample data and the corresponding output for which you are looking.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
“Beware of those that seek to protect you from harm or risk. The cost will be your freedoms and your liberty.”
 
Here is some sample data:
ID YEAR1 SALARY1 YEAR2 SALARY2 YEAR3 SALARY3 YEAR4 SALARY4 YEAR5 SALARY5
123 31/12/2008 45000 31/12/2007 20000 31/12/2006 4700 31/12/2005 8700 31/12/2003 17000
321 31/12/2007 40000 31/12/2005 25000 31/12/2004 9500 31/12/2000 5000 31/12/1999 50000
222 31/12/2006 8500 31/12/2003 50000 31/12/2002 55000 31/12/2001 55000 31/12/2000 55000
665 31/12/2008 75000 31/12/2007 70000 31/12/2006 65000 31/12/2005 65000 31/12/2004 60000

So for employee 123 I need to ignore data for year3, for employee 321 ignore year4, for 222 ignore year1, and for 665 ignore year5.

There is an average that I need to calculate that only uses the top 4, and there is another that uses the top 5. In the past I just import this data to excel and calculate the averages there, but I would like to be able to do it in the SQL code itself.


Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Lauren,

your difficulty appears to stem from the fact that the db appears to have its table structured like columns from a spreadsheet. To get what I believe you want try the following.
Code:
DROP TABLE LAUREN CASCADE CONSTRAINTS PURGE;

CREATE TABLE LAUREN
(
ID INTEGER,
YEAR1 DATE,
YEAR2 DATE,
YEAR3 DATE,
YEAR4 DATE,
YEAR5 DATE,
SALARY1 INTEGER,
SALARY2 INTEGER,
SALARY3 INTEGER,
SALARY4 INTEGER,
SALARY5 INTEGER
);

INSERT INTO LAUREN (ID, YEAR1, YEAR2,YEAR3,YEAR4,YEAR5,SALARY1,SALARY2,SALARY3,SALARY4,SALARY5) 
VALUES (123,
        TO_DATE('31/12/2008','DD/MM/YYYY'), 
        TO_DATE('31/12/2007','DD/MM/YYYY'),
        TO_DATE('31/12/2006','DD/MM/YYYY'),
        TO_DATE('31/12/2005','DD/MM/YYYY'),
        TO_DATE('31/12/2004','DD/MM/YYYY'),
        45000,20000,4700,8700,17000);
        
INSERT INTO LAUREN (ID, YEAR1, YEAR2,YEAR3,YEAR4,YEAR5,SALARY1,SALARY2,SALARY3,SALARY4,SALARY5) 
VALUES (321,
        TO_DATE('31/12/2008','DD/MM/YYYY'), 
        TO_DATE('31/12/2007','DD/MM/YYYY'),
        TO_DATE('31/12/2006','DD/MM/YYYY'),
        TO_DATE('31/12/2005','DD/MM/YYYY'),
        TO_DATE('31/12/2004','DD/MM/YYYY'),
        40000,25000,9500,5000,50000);

INSERT INTO LAUREN (ID, YEAR1, YEAR2,YEAR3,YEAR4,YEAR5,SALARY1,SALARY2,SALARY3,SALARY4,SALARY5) 
VALUES (222,
        TO_DATE('31/12/2008','DD/MM/YYYY'), 
        TO_DATE('31/12/2007','DD/MM/YYYY'),
        TO_DATE('31/12/2006','DD/MM/YYYY'),
        TO_DATE('31/12/2005','DD/MM/YYYY'),
        TO_DATE('31/12/2004','DD/MM/YYYY'),
        8500,50000,55000,55000,55000);        
        
INSERT INTO LAUREN (ID, YEAR1, YEAR2,YEAR3,YEAR4,YEAR5,SALARY1,SALARY2,SALARY3,SALARY4,SALARY5) 
VALUES (665,
        TO_DATE('31/12/2008','DD/MM/YYYY'), 
        TO_DATE('31/12/2007','DD/MM/YYYY'),
        TO_DATE('31/12/2006','DD/MM/YYYY'),
        TO_DATE('31/12/2005','DD/MM/YYYY'),
        TO_DATE('31/12/2004','DD/MM/YYYY'),
        8500,50000,55000,55000,55000);
COMMIT;
        
DROP TABLE LAUREN_TIDY CASCADE CONSTRAINTS PURGE;

CREATE TABLE LAUREN_TIDY
(
ID     INTEGER,
YEAR   DATE,
SALARY INTEGER
);

INSERT INTO LAUREN_TIDY SELECT ID, YEAR1, SALARY1 FROM LAUREN;
INSERT INTO LAUREN_TIDY SELECT ID, YEAR2, SALARY2 FROM LAUREN;
INSERT INTO LAUREN_TIDY SELECT ID, YEAR3, SALARY3 FROM LAUREN;
INSERT INTO LAUREN_TIDY SELECT ID, YEAR4, SALARY4 FROM LAUREN;
INSERT INTO LAUREN_TIDY SELECT ID, YEAR5, SALARY5 FROM LAUREN;

COMMIT;

WITH DATA AS
(
SELECT ID, SALARY, YEAR, RANK() OVER (PARTITION BY ID ORDER BY SALARY DESC) RANK
  FROM LAUREN_TIDY
 ORDER BY ID
)
SELECT ID, AVG(SALARY) AVERAGE_SALARY
  FROM DATA
 WHERE RANK <= 4
 GROUP BY ID;

I created what I believe your data looks like, then used it to a table with a structure more suited to a relational db, and then used analytics to do the trick. Is this what you had in mind?

Regards

T
 
Thanks Tharg, the preliminary looks good. I was really hoping to get away without using another table. This table is an annual pension statement that is sent out to everyone, and their pension is based on a top 5 and top 4 average. So everyone has one record in this table for every calendar year that they are in the pension plan.

Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Lauren,

I surmised as much. Although you could do what I did with a view, it would require a horrible union of a union (5 times) which would probably have dire performance if significant quantities of data were involved.

The data feed which drives this is presumably coming from a db and not a spread sheet. If so, get the provider to give you direct access to the data, instead of splattering it across numerous columns, which only makes life difficult.

I think the analytics could be slicker, but maybe Dagon or Santa can tweak that bit for you.

Regards

T
 
Hey tharg, I have access to the raw data, but that is data going back 40+ years and for over 70000 people, when you are looking for each person's top 4 and top 5 earning years, for each and every year, that query would take quite a while to return.

Lauren

Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Well if performance is an issue I would first want to know how often this query has to be run.

If it's a weekly or monthly thing, then just schedule the refresh of an appropriate materialized view overnight. 70,000 people multipled by 40 years is only 2.8 million rows of data. By modern standards, that really isn't so very much.

Also, since you work for the pensions people in Saskatchewan, I presume that this is only relevant to retired people. Can the number of rows involved not be reduced by judicious predicate logic?

If it has to be run hourly or daily, then obviously you'd need a slightly different approach. For faster times I'd suggest a CTAS to create the table, report on it, and then drop the just-made table.

Can you let us know the structure of the real source table (ideally with a create table statement) and some sample rows of data?

Given a template of the data, I could probably cobble up something to generate 2.8 million rows easily enough, and do some timing tests.

Regards

T
 
Hey Tharg, its part of our annual audits on this data before we send out the annual statement to our members who are still working, we are required by law to send this notice every year, we are currently in the middle of reviewing the audits, seeing if they can be refined at all. Part of those is investing members whose 4 and 5 year averages took a significant drop or increase. So being able to grab just the four individual years that make up the 4yr average and compare those four earnings against last year's top four to see what happened to the top 4, would be extremely helpful. Was there data corrections over the year? Was the previous year just an extremely high year? etc. Hence the reason for grabbing just the four years out of the five.

Lauren

Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Well, if this is only needed annually, performance is not an issue.

If the query you come up with takes a fortnight to run, what do you (or anyone) care? As long as you start it running a couple of months before the due data, you're fine.

I surmise that what you really want is to take the 'top 4' from last year, and compare and contrast it with the 'top 4' from this year, to detect any significant differences. Is this the case? If so, post the create table statement for the table in your db, and post say 20 or 30 insert statements (see my examples above for how to do this). Obviously I would expect data to be anonymous, i.e. no real names or addresses etc.

We can then work on a single query that does precisely what you want, with no need for spreadsheets. We can then also make performance suggestions.

Regards

T
 
Hey Targ, here you go.

Code:
DROP TABLE ANNUALSTATEMENTS_TEST;

create table annualstatements_test
    (ID INTEGER,
    YEARENDDATE DATE,
    YEAR1 DATE,
    SALARY1 NUMBER(9,2),
    YEAR2 DATE,
    SALARY2 NUMBER(9,2),
    YEAR3 DATE,
    SALARY3 NUMBER(9,2),
    YEAR4 DATE,
    SALARY4 NUMBER(9,2),
    YEAR5 DATE,
    SALARY5 NUMBER(9,2)
);
  
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (111, TO_DATE('31/12/2007','DD/MM/YYYY'), TO_DATE('31/12/2007','DD/MM/YYYY'), 78559.73, TO_DATE('31/12/2006','DD/MM/YYYY'), 70938.91, TO_DATE('31/12/2005','DD/MM/YYYY'), 61888.30, TO_DATE('31/12/2004','DD/MM/YYYY'), 52638.78, TO_DATE('31/12/2003','DD/MM/YYYY'), 45403.50);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (111, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 82620.68, TO_DATE('31/12/2007','DD/MM/YYYY'), 78559.73, TO_DATE('31/12/2006','DD/MM/YYYY'), 70938.91, TO_DATE('31/12/2005','DD/MM/YYYY'), 61888.3, TO_DATE('31/12/2004','DD/MM/YYYY'), 52638.78);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (112, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 47928.71, TO_DATE('31/12/2006','DD/MM/YYYY'), 45567.35, TO_DATE('31/12/2005','DD/MM/YYYY'), 40962.39, TO_DATE('31/12/2004','DD/MM/YYYY'), 39166.5, TO_DATE('31/12/2003','DD/MM/YYYY'), 34418);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (112, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 49695.51, TO_DATE('31/12/2007','DD/MM/YYYY'), 47928.71, TO_DATE('31/12/2006','DD/MM/YYYY'), 45567.35, TO_DATE('31/12/2005','DD/MM/YYYY'), 40962.39, TO_DATE('31/12/2004','DD/MM/YYYY'), 39166.5);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (113, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 37304.79, TO_DATE('31/12/2006','DD/MM/YYYY'), 36570.26, TO_DATE('31/12/2005','DD/MM/YYYY'), 34930.09, TO_DATE('31/12/2004','DD/MM/YYYY'), 35861.88, TO_DATE('31/12/2003','DD/MM/YYYY'), 34503.33);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (113, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 38524.79, TO_DATE('31/12/2007','DD/MM/YYYY'), 37304.79, TO_DATE('31/12/2006','DD/MM/YYYY'), 36570.26, TO_DATE('31/12/2005','DD/MM/YYYY'), 34930.09, TO_DATE('31/12/2004','DD/MM/YYYY'), 35861.88);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (114, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 95392.65, TO_DATE('31/12/2006','DD/MM/YYYY'), 92334.42, TO_DATE('31/12/2005','DD/MM/YYYY'), 86150.75, TO_DATE('31/12/2004','DD/MM/YYYY'), 85642.59, TO_DATE('31/12/2003','DD/MM/YYYY'), 3144.44);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (114, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 106657.82, TO_DATE('31/12/2007','DD/MM/YYYY'), 95392.65, TO_DATE('31/12/2006','DD/MM/YYYY'), 92334.42, TO_DATE('31/12/2005','DD/MM/YYYY'), 86150.75, TO_DATE('31/12/2004','DD/MM/YYYY'), 85642.59);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (115, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 62991.56, TO_DATE('31/12/2006','DD/MM/YYYY'), 60700.68, TO_DATE('31/12/2005','DD/MM/YYYY'), 57997.96, TO_DATE('31/12/2004','DD/MM/YYYY'), 59736.46, TO_DATE('31/12/2003','DD/MM/YYYY'), 56908);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (115, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 64871.29, TO_DATE('31/12/2007','DD/MM/YYYY'), 62991.56, TO_DATE('31/12/2006','DD/MM/YYYY'), 60700.68, TO_DATE('31/12/2005','DD/MM/YYYY'), 57997.96, TO_DATE('31/12/2004','DD/MM/YYYY'), 59736.46);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (116, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 55061.5, TO_DATE('31/12/2006','DD/MM/YYYY'), 51594.15, TO_DATE('31/12/2005','DD/MM/YYYY'), 47178.91, TO_DATE('31/12/2004','DD/MM/YYYY'), 43614.83, TO_DATE('31/12/2003','DD/MM/YYYY'), 36628.89);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (116, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 57906.94, TO_DATE('31/12/2007','DD/MM/YYYY'), 55061.5, TO_DATE('31/12/2006','DD/MM/YYYY'), 51594.15, TO_DATE('31/12/2005','DD/MM/YYYY'), 47178.91, TO_DATE('31/12/2004','DD/MM/YYYY'), 43614.83);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (118, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 71553.06, TO_DATE('31/12/2006','DD/MM/YYYY'), 70018.5, TO_DATE('31/12/2005','DD/MM/YYYY'), 70018.5, TO_DATE('31/12/2004','DD/MM/YYYY'), 72711.56, TO_DATE('31/12/2003','DD/MM/YYYY'), 69423.17);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (118, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 73693.74, TO_DATE('31/12/2007','DD/MM/YYYY'), 71553.06, TO_DATE('31/12/2006','DD/MM/YYYY'), 70018.5, TO_DATE('31/12/2005','DD/MM/YYYY'), 70018.5, TO_DATE('31/12/2004','DD/MM/YYYY'), 72711.56);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (119, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 87924.49, TO_DATE('31/12/2006','DD/MM/YYYY'), 84728.57, TO_DATE('31/12/2005','DD/MM/YYYY'), 80953.88, TO_DATE('31/12/2004','DD/MM/YYYY'), 73731.56, TO_DATE('31/12/2003','DD/MM/YYYY'), 73123.33);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (119, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 92281.5, TO_DATE('31/12/2007','DD/MM/YYYY'), 87924.49, TO_DATE('31/12/2006','DD/MM/YYYY'), 84728.57, TO_DATE('31/12/2005','DD/MM/YYYY'), 80953.88, TO_DATE('31/12/2004','DD/MM/YYYY'), 73731.56);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (120, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 42820, TO_DATE('31/12/2006','DD/MM/YYYY'), 37312.82, TO_DATE('31/12/2005','DD/MM/YYYY'), 32239.32, TO_DATE('31/12/2004','DD/MM/YYYY'), 29172.31, TO_DATE('31/12/2003','DD/MM/YYYY'), 16106.67);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (120, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 47189.93, TO_DATE('31/12/2007','DD/MM/YYYY'), 42820, TO_DATE('31/12/2006','DD/MM/YYYY'), 37312.82, TO_DATE('31/12/2005','DD/MM/YYYY'), 32239.32, TO_DATE('31/12/2004','DD/MM/YYYY'), 29172.31);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (121, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 51830.61, TO_DATE('31/12/2006','DD/MM/YYYY'), 44154.83, TO_DATE('31/12/2005','DD/MM/YYYY'), 38454.53, TO_DATE('31/12/2004','DD/MM/YYYY'), 34180.51, TO_DATE('31/12/2003','DD/MM/YYYY'), 21866.22);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (121, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 56147.35, TO_DATE('31/12/2007','DD/MM/YYYY'), 51830.61, TO_DATE('31/12/2006','DD/MM/YYYY'), 44154.83, TO_DATE('31/12/2005','DD/MM/YYYY'), 38454.53, TO_DATE('31/12/2004','DD/MM/YYYY'), 34180.51);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (122, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 124220.95, TO_DATE('31/12/2006','DD/MM/YYYY'), 118183.67, TO_DATE('31/12/2005','DD/MM/YYYY'), 112333.33, TO_DATE('31/12/2004','DD/MM/YYYY'), 103802.72, TO_DATE('31/12/2003','DD/MM/YYYY'), 98075);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (122, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 130136.46, TO_DATE('31/12/2007','DD/MM/YYYY'), 124220.95, TO_DATE('31/12/2006','DD/MM/YYYY'), 118183.67, TO_DATE('31/12/2005','DD/MM/YYYY'), 112333.33, TO_DATE('31/12/2004','DD/MM/YYYY'), 103802.72);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (123, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 51719.59, TO_DATE('31/12/2005','DD/MM/YYYY'), 42834.83, TO_DATE('31/12/2004','DD/MM/YYYY'), 44331.02, TO_DATE('31/12/2003','DD/MM/YYYY'), 42462.83, TO_DATE('31/12/2002','DD/MM/YYYY'), 41219.09);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (123, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 57906.94, TO_DATE('31/12/2007','DD/MM/YYYY'), 51719.59, TO_DATE('31/12/2005','DD/MM/YYYY'), 42834.83, TO_DATE('31/12/2004','DD/MM/YYYY'), 44331.02, TO_DATE('31/12/2003','DD/MM/YYYY'), 42462.83);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (124, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 63282.31, TO_DATE('31/12/2006','DD/MM/YYYY'), 55265.03, TO_DATE('31/12/2005','DD/MM/YYYY'), 46859.59, TO_DATE('31/12/2004','DD/MM/YYYY'), 37753.85, TO_DATE('31/12/2003','DD/MM/YYYY'), 33608);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (124, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 67860.68, TO_DATE('31/12/2007','DD/MM/YYYY'), 63282.31, TO_DATE('31/12/2006','DD/MM/YYYY'), 55265.03, TO_DATE('31/12/2005','DD/MM/YYYY'), 46859.59, TO_DATE('31/12/2004','DD/MM/YYYY'), 37753.85);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (125, TO_DATE('31/12/2007','DD/MM/YYYY'),	TO_DATE('31/12/2007','DD/MM/YYYY'), 34040.51, TO_DATE('31/12/2006','DD/MM/YYYY'), 28488.38, TO_DATE('31/12/2005','DD/MM/YYYY'), 25530.09, TO_DATE('31/12/2004','DD/MM/YYYY'), 25021.54, TO_DATE('31/12/2003','DD/MM/YYYY'), 16306.44);
INSERT INTO ANNUALSTATEMENTS_TEST (ID, YEARENDDATE, YEAR1, SALARY1, YEAR2, SALARY2, YEAR3, SALARY3, YEAR4, SALARY4, YEAR5, SALARY5)
    VALUES (125, TO_DATE('31/12/2008','DD/MM/YYYY'),	TO_DATE('31/12/2008','DD/MM/YYYY'), 35174.87, TO_DATE('31/12/2007','DD/MM/YYYY'), 34040.51, TO_DATE('31/12/2006','DD/MM/YYYY'), 28488.38, TO_DATE('31/12/2005','DD/MM/YYYY'), 25530.09, TO_DATE('31/12/2004','DD/MM/YYYY'), 25021.54);
    
COMMIT;

Now obviously the table has more columns, but these are the ones that I am looking at right now. Year1, and Salary1 will always be the most recent year that the member's earnings made it into the top five, but not necessarily the highest of the top 5, although the trend is that the most recent year is the highest earning year. Same with year2, year3, year4, and year5. The earnings are sorted in chronological order rather than being ordered by earnings, which would make a lot more sense. Any help that you can provide would be greatly appreciated.



Lauren McDonald, B.Sc.
Programmer Analyst
Saskatchewan Healthcare Employees' Pension Plan

 
Lauren,

thanks for posting the statements - so few do.
It's nearly midnight here at the moment, and tomorrow is my birthday.

Please bear with a 24 hour delay in responding, whilst I feign delight as I get increasingly old and doddery. I'll have a squint at this when I have begun my 49th year of existence - sigh.

Regards

T
 
Lauren,

I've built the table and loaded the sample data. Your problem is that the data is de-normalised and duplicated.

For example, user 111 has entries for YEAR1 ending in both 2007 and 2008. Two rows exist for this (and every user) with most of the data repeated.

You must get to the source of this data, and stop it being provided in such a poor form. All you need is a user id, the relevant year and salary. Then store them in a normalised table such as the one I showed you.

The source data is bad here Lauren. Although it might well be possible to clean it up, it's much better to eliminate the problem at source. The code I provided will not work correctly because of this data duplication. Push back to the source of this. Somebody somewhere must have a simple list of each person and the salary they earned each year. Get that, and your problems will largely disappear.

Regards

T
 
Lauren,

any joy with the data source?

Regards

T
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top