Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Gaps and islands

Status
Not open for further replies.

peac3

Technical User
Jan 17, 2009
226
AU
Hi guys,

I have data like below with a list of employees, applications, status and dates.

Code:
EMPL_I     APPT_N        STUS_C                EFFT_D
123        001            NEW                  2023-10-02
123        001            NEW                  2023-10-09
123        001            DEFER                2023-10-09
123        001            DEFER                2023-10-10
234        001            APPROVED             2023-10-10
234        001            APPROVED             2023-10-11
123        001            APPROVED - YES       2023-10-11
123        001            APPROVED - YES       2023-10-16
123        001            NEW                  2023-10-16
123        001            APPROVED             2023-10-16
123        001            APPROVED - YES       2023-10-16
123        001            APPROVED - YES       2023-10-27
123        001            NOT PROCEED          2023-10-27

I would like to derive for each empl_i, appt_n and status I would like to get the Minimum efft_d, the problem is some of the statuses may come back with the same employee and application.
I have a problem creating the gaps and island and would like to get the expected output as per below.

Code:
EMPL_I     APPT_N        STUS_C                EFFT_D
123        001            NEW                  2023-10-02
123        001            DEFER                2023-10-09
234        001            APPROVED             2023-10-10
123        001            APPROVED - YES       2023-10-11
123        001            NEW                  2023-10-16
123        001            APPROVED             2023-10-16
123        001            APPROVED - YES       2023-10-16
123        001            NOT PROCEED          2023-10-27


Any help would be greatly appreciated.

Thanks.
 
First of all I would say this is more of a reporting problem than an SQL problem, especially since you can't have empty rows in an SQL result.

If you have exactly the result as in your first code section, then a reporting tool supressing double printing of the first three fields would do the job, there's no need to get the reow out of the sql query result. Indeed if solving that in SQL you would not get gaps at all and I don't see how the sql result would inform a report to print a gap line.

Chriss
 
Hi Chris,

Don't worry about the empty rows... It is meant to be no empty row.

I am just meant to show the rows that may be deleted. let me update it
 
Do you have your current query, or is the first listing just the table (top rows only)?

It''s also still true that a reporting engine getting the full first list can suppress repeated values and that solves the problem without any sql change.

Chriss
 
The simplest query giving about that result is

Code:
Select EMPL_I, APPT_N, STUS_C, MIN(EFFT_D) as EFFT_D FROM YOURTABLE GROUP BY EMPL_I, APPT_N, STUS_C ORDER BY APPT_N, EFFT_D, EMPL_I

This assumes all the data coems from one table, if you have a current query to modify then it likely pulls the columns from several different tables and that join mechanism has to stay as is.

Perhaps the result should also be ordered by some number that sorts the status texts not in alphabetical order but in the stages starting with NEW, I assume and ending on either APPROVED_YES or NOT PROCEED. A numerical status order column is what I expect to do this, but you don't list it.

Chriss
 
This is the data we have from raw data.

This query won't work as I said in the previous post, because the status may come back later on (e.g. Approved - Yes comes twice or two islands) and will look like below:

Code:
EMPL_I     APPT_N        STUS_C                EFFT_D
123        001            NEW                  2023-10-02
123        001            DEFER                2023-10-09
234        001            APPROVED             2023-10-10
123        001            APPROVED - YES       2023-10-11
123        001            APPROVED             2023-10-16
123        001            NOT PROCEED          2023-10-27
 
A query not giving you the desired result is still a query that can be changed to give the desired result, it can contain very correct and viable informations about all the involved tables and joins necessary and not giving it in a post just because you don't accept its result is just thinking in the wrong direction about how to amend code. Or are you talking about my query, now?

Chriss
 
Sorting priorities should be given by you, as that also will change if two rows with no status change appear successive or not. If you mainly sort by EFFT_D, as I think then many employes could change status and you'd not remove double lines. So I guess the sorting has to be at least by emplyoee as secondary columns, which your data isn't at all.

That's where the problems begin. If you want a query that can react to the value of a previous row or next row, this has to be a query that fully specifies sorting major, secondary, etc sort order. Otherwise you can't access a next or previous row. You may think why, the result is always in some order, but you don't access the result, you build up a result.

Technically the things available are LAG() and LEAD(), which need an OVER (ORDER BY ...) option to be specified.

The thing I would personally do is: Make employee_i the secondary sort order, if not even the first, because a row with employee 234 between several rows of employee 123 copuld easily counteract finding repeated values for employee 123.

So, first tell me how to sort the data exactly, then we can talk about the suppression of repeats with the help of LAG() or LEAD() expressions. I would say it is important to go per employee in the first order, but chronological order of the result seems more important to you. I also don't know whether the status is per appt (department/appartment?) and so an employee could have a status per department or whether its just his status indpendent of appt_n. As it's listed, I would assume it plays a role, which means major sort order would be by employee, secondary the appt_no and only then you could sort by effective date.

If you really only look for a way to amend the initial result by deleting repeats, that's actually the job of something that processes the sql result. Indeed that could also be sql itself, but in a more complex query like using a comon table expression or a nested query. There are no technical means to supress result rows as aftermath aside from a two staged process which a cte or a nested query are. And such things are indeed easier done as aftermath on the result instead of doing them when building the result, so for example with a report engine supressing repeats, back to square one.

Chriss
 
Did you figure it out yourself with the hint on LAG() and LEAD() or didn't you understand my demand of exact sort order?

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top