Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

parse string query 1

Status
Not open for further replies.

ksnpc

Technical User
Jan 23, 2004
91
0
0
US
I would like to take a scientific name string a parse it into three separate columns - genus, species, and subspecies.

Examples:
Mustela nivalis -->
genus: Mustela
species: nivalis
subspecies: <blank>

Ophisaurus attenuatus longicaudus -->
genus: Ophisaurus
species: attenuatus
subspecies: longicaudus

Etheostoma sp. 9 -->
genus: Etheostoma
species: sp. 9
subspecies: <blank>

Noturus flavus pop. 1 -->
genus: Noturus
species: flavus
subspecies: pop. 1

I was trying to use substr/instr -
select substr(sn.scientific_name, 1,Instr(sn.scientific_name,' ') - 1) as Genus
but it doesn't really work for "pop. 1" or "sp. 9" where there are spaces that are supposed to be included.

Does anyone have any suggestions on how to do this?
 
ksnpc,

I notice that in the "exceptional" cases that you listed, both have an embedded ". ". Is that coincidental, or can you depend upon a ". " in each of the situations where a blank in not a reliable delimiter?

Is there a comprehensive master list of valid, unique values for GENUS, SPECIES, and SUBSPECIES? If so, then we can formulate a parsing method that uses the list of values.

Let us know,

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 
Thanks for responding Mufasa,
I believe all the non-standard names would all have a "." included. I'm afraid I don't have a master list though.

 
Sorry to be tardy in responding, ksnpc...I was out late at a wedding and just saw your response.

Here is code that reproduces, exactly, what you originally specified. I've enhanced your original data, slightly, to show that the code handles "Genus-only" entries.

Section 1 -- Test Data:
Code:
select * from classification;

TITLE
---------------------------------
Mustela nivalis
Ophisaurus attenuatus longicaudus
Etheostoma sp. 9
Noturus flavus pop. 1
Mustela

5 rows selected.
Section 2 -- Query code:
Code:
col x heading "Scientific Classifications" format a45
select replace(title,'.','. ')||' -->'||chr(10)
||'genus: '     ||replace(substr(title,s1,l1),'.','. ')||chr(10)
||'species: '   ||replace(substr(title,s2,l2),'.','. ')||chr(10)
||'subspecies: '||replace(substr(title,s3,l3),'.','. ')||chr(10) x
  from (select trim(title) title
              ,1 s1
              ,instr(title,' ',1,1)-1 l1
              ,instr(title,' ',1,1)+1 s2
              ,instr(title,' ',1,2)-instr(title,' ',1,1)-1 l2
              ,instr(title,' ',1,2)+1 s3
              ,length(title)-instr(title,' ',1,2) l3
          from (select replace(title,'. ','.')||'   ' title
                  from classification))
/

Scientific Classifications
-------------------------------------
Mustela nivalis -->
genus: Mustela
species: nivalis
subspecies:

Ophisaurus attenuatus longicaudus -->
genus: Ophisaurus
species: attenuatus
subspecies: longicaudus

Etheostoma sp. 9 -->
genus: Etheostoma
species: sp. 9
subspecies:

Noturus flavus pop. 1 -->
genus: Noturus
species: flavus
subspecies: pop. 1

Mustela -->
genus: Mustela
species:
subspecies:
                                                                  

5 rows selected.
Let us know if this satisfies your need.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 
I had to adjust it slightly, but that got me what I needed. Thanks so much Mufasa!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top