Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Parsing Delimited Fields in SQL

Other

Parsing Delimited Fields in SQL

by  Golom  Posted    (Edited  )
Suppose we have a Table like this
[tt]tblEmployees
EmployeeID Children

123 Sally,Mary,Tom
234 William,Todd
[/tt]
and we want to see[tt]
EmployeeID Child

123 Sally
123 Mary
123 Tom
234 William
234 Todd
[/tt]
First, Create an "Integers" table with one field "Num"
Code:
CREATE TABLE Integers 
(Num Long, CONSTRAINT Num PRIMARY KEY (Num))
Then Load it with the numbers from 0 to 9
Code:
INSERT INTO Integers (Num) Values(0)
INSERT INTO Integers (Num) Values(1)
:
INSERT INTO Integers (Num) Values(9)
Then create a query (let's call it "qryInt2")
Code:
Select 10*I1.Num + I2.Num As [N]
From   Integers I1, Integers I2
... and now we're finally ready. All that building an integers table stuff (above) is something that needs to be done only once.
Code:
SELECT EmployeeID, 
       Mid("," & E.Children & ",", 
            S1.N + 1 , 
            S2.N - S1.N - 1) AS [Child]

FROM tblEmployees AS E, qryInt2 AS S1, qryInt2 AS S2

WHERE Mid$("," & E.Children & "," ,S1.N , 1 ) = ','
  and Mid$("," & E.Children & "," ,S2.N , 1 ) = ','
  and S1.N > 0 AND S2.N > 0
  and S2.N <= Len(E.Children) + 2
  and S1.N + 1 < S2.N
  and InStr(1,Mid("," & E.Children & ",", 
                        S1.N + 1 , 
                        S2.N - S1.N - 1),",")=0

ORDER BY 1, 2;
So what's going on here?

First the FROM clause
This clause does a cross join of three tables: the original employee table and TWO copies of the qryInt2 query. That will produce
2 * 100 * 100 = 20,000 records.

Now the WHERE clause that does all the work
[tt]
[color blue]Mid$("," & E.Children & "," ,S1.N , 1 ) = ','
and Mid$("," & E.Children & "," ,S2.N , 1 ) = ','[/color][/tt]
These conditions require that any string returned by the query starts and ends with a comma. That's why we attached commas before and after the field.

[color blue]and S1.N > 0 AND S2.N > 0[/color]
Character counting starts at 1 so we eliminate the zeros

[color blue]and S2.N <= Len(E.Children) + 2[/color]
we don't need to look past the end of the field where there are no characters

[color blue]and S1.N + 1 < S2.N[/color]
S1.N is the start of the string we are extracting and S2.N is the end and the start must always be before the end.

[color blue]and InStr(1,Mid("," & E.Children & ",", S1.N + 1 , S2.N - S1.N - 1),",")=0[/color]
we want only those strings that start and end with a comma but do not have any commas inside them.

and finally, the SELECT

[color blue]Mid("," & E.Children & ",", S1.N + 1 , S2.N - S1.N - 1) [/color]
Pulls out the substrings within E.Children. The only ones remaining after WHERE has done its work are the names we want.

Cautionary Notes:
[li]Obviously the task of reducing 20,000 records down to 5 (as in this example) is time consuming. It is probably best to use this to create normalized tables from un-normalized ones. I would not suggest using it as something you run every day.[/li]
[li]The "qryInt2" table needs to have enough values in it to span the length of the longest string in the field you are parsing plus 2. If you have long strings (more than 97 characters in our example) then you would need to supply a table or query with enough values to span the string.[/li]

Credit to Joe Celko who originally developed this SQL. My contribution was just to convert it to MS/Access SQL.

Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top