I am using an Oracle 9i db but I thought I would post this question here as a) there is no specific Oracle SQL forum available, b) the Oracle developer forum looks pretty sparse, and c) I thought it might be interesting to see if/how my problem can be solved in 'pure' ANSI SQL. If I'm in the wrong place please do let me know and I'll try elsewhere.
Here is the background to my question. I have a pupil database consisting of a number of tables, each containing tens of thousands rows. I need to find TWIN pupils, and the criteria I'm using to determine a twin are:
1. Same birthdate
2. Same current address
The two tables involved are PUPIL and PUPIL_ADDRESS.
PUPIL contains a pupil_id column and various other columns such as forename, surname, birthdate, etc.
PUPIL_ADDRESS contains pupil_id (as a FK to PUPIL), address_id (as a FK to ADDRESS), main_address_ind (a column restricted to values M for Main - meaning current address - and P for Previous, meaning an old address), address_start_date (the date the address started for the record), address_end_date (the date the address ended for the record, if applicable - can be NULL).
In order to find the twins in the dataset I can run the following query:
So, I'm selecting five columns from two tables, I have joined PUPIL and PUPIL_ADDRESS on the PUPIL_ID column, and I have set the condition that the MAIN_ADDRESS_IND is 'M' to restrict the addresses to current addresses only.
I have used the 'exists' condition with a sub-query to ensure that the rows returned are those where there are related records in the same table where the PUPIL_ID cols don't match but the BIRTHDATE and ADDRESS_IND cols do match (also with the same restriction on the address). I hope this makes sense - and I hope ANSI SQL allows the 'exists' condition! The query does work on Oracle, I hasten to add.
My question is this. Can I transform this query so that instead of outputting a standard list like this:
I could use some kind of sub-query to have the first pupil details and the twin on the same line:
Imagining there are ten twins in the dataset, I would like the output to be five rows, with the first of the twins followed by the second on the same row.
I hope this makes sense. If anyone could give me any help with this I'd be most grateful.
Best wishes,
cj ashwell
Here is the background to my question. I have a pupil database consisting of a number of tables, each containing tens of thousands rows. I need to find TWIN pupils, and the criteria I'm using to determine a twin are:
1. Same birthdate
2. Same current address
The two tables involved are PUPIL and PUPIL_ADDRESS.
PUPIL contains a pupil_id column and various other columns such as forename, surname, birthdate, etc.
PUPIL_ADDRESS contains pupil_id (as a FK to PUPIL), address_id (as a FK to ADDRESS), main_address_ind (a column restricted to values M for Main - meaning current address - and P for Previous, meaning an old address), address_start_date (the date the address started for the record), address_end_date (the date the address ended for the record, if applicable - can be NULL).
In order to find the twins in the dataset I can run the following query:
Code:
select p.pupil_id
, p.forenames
, p.surname
, p.BIRTHDATE
, pa.address_id
from
pupil p
, pupil_address pa
where
p.pupil_id = pa.pupil_id
and pa.main_address_ind = 'M'
and exists
(select 1
from pupil p2, pupil_address pa2
where p2.pupil_id = pa2.PUPIL_ID
and pa2.main_address_ind = 'M'
and p2.birthdate = p.birthdate
and p2.pupil_id <> p.pupil_id
and pa2.address_id = pa.address_id
)
order by p.birthdate, p.surname, p.forenames
So, I'm selecting five columns from two tables, I have joined PUPIL and PUPIL_ADDRESS on the PUPIL_ID column, and I have set the condition that the MAIN_ADDRESS_IND is 'M' to restrict the addresses to current addresses only.
I have used the 'exists' condition with a sub-query to ensure that the rows returned are those where there are related records in the same table where the PUPIL_ID cols don't match but the BIRTHDATE and ADDRESS_IND cols do match (also with the same restriction on the address). I hope this makes sense - and I hope ANSI SQL allows the 'exists' condition! The query does work on Oracle, I hasten to add.
My question is this. Can I transform this query so that instead of outputting a standard list like this:
Code:
PUPIL_ID[TAB]FORENAME[TAB]SURNAME[TAB]DOB[TAB]ADDRESS_ID
106458[tab]Samatha[tab]Brown[tab]01/06/1982[tab]161583
107262[tab]Sasha[tab]Brown[tab]01/06/1982[tab]161583
I could use some kind of sub-query to have the first pupil details and the twin on the same line:
Code:
PUPIL_1_ID[TAB]PUPIL_1_FNAME[TAB]PUPIL_1_SNAME[TAB]PUPIL_2_ID[TAB]
PUPIL_2_FNAME[TAB]PUPIL_2_SNAME
106458[TAB][tab]Samanatha[tab][tab]Brown[tab][tab]107262[tab][tab]
Sasha[tab][tab]Brown
Imagining there are ten twins in the dataset, I would like the output to be five rows, with the first of the twins followed by the second on the same row.
I hope this makes sense. If anyone could give me any help with this I'd be most grateful.
Best wishes,
cj ashwell