Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Advice about tables structure 1

Status
Not open for further replies.

wvandenberg

Technical User
Oct 24, 2002
125
CA
I am designing a db with the specs outlined below. I am not trained in database development and have taken over this existing db and then modified it. I am hoping for some advice on my design. I would like to develop something that is “normalized”.

The system I am designing will be used to store water quality data collected at various sites on a number of rivers under various projects. Samples are collected by my company and then the laboratory sends us the analysis results in an Excel file. We also receive Excel files with data collected by other agencies. I will need to import all data we receive into this db. The system will also have to export stored data based on user selected parameters (eg. site name, hydrologic unit, sample name etc.).

tblAnalysis_Methods: Stores all analyte analysis methods
pkAnalysis_Method_ID
Analysis_Method_Description

tblAnalysis_Methods tblAnalytes
pkAnalysis_Method_ID (1) -> (M) fkAnalysis_Method_ID
Attributes: Enforced

------------------------------------------

tblAnalytes: Stores All iterations of analyte names from various analysis sources
pkAnalyte_ID
fkLab_Group_ID
fkReporting_Units_ID
Analyte_Name
Detection_Limit
fkAnalysis_Method_ID

tblLab_Groups tblAnalytes
pkLab_Group_ID (1) -> (M) fkLab_Group_ID
Attributes: Enforced

tblAnalysis_Methods tblAnalytes
pkAnalysis_Method_ID (1) -> (M) fkAnalysis_Method_ID
Attributes: Enforced

tblAnalytes tblGALGrouped_Analytes
pkAnalyte_ID (1) -> (M) pkfkAnalyte_ID
Attributes: Enforced

tblAnalytes tblResults
pkAnalyte_ID (1) -> (M) pkfkAnalyte_ID
Attributes: Enforced

tblReporting_Units tblAnalytes
pkReporting_Units_ID (1) -> (M) fkReporting_Units_ID
Attributes: Enforced

------------------------------------------

tblGAL_Groups: Grouping for analytes to be exported
pkGAL_Group_ID
GAL_Group_Name

tblGAL_Groups tblGALGrouped_Analytes
pkGAL_Group_ID (1) -> (M) pkfkGAL_Group_ID
Attributes: Enforced

------------------------------------------

tblGALGrouped_Analytes: Stores select groups and assigned analytes in specified order
pkfkGAL_Group_ID
GG_Order
pkfkAnalyte_ID
GA_Order

tblAnalytes tblGALGrouped_Analytes
pkAnalyte_ID (1) -> (M) pkfkAnalyte_ID
Attributes: Enforced

tblGAL_Groups tblGALGrouped_Analytes
pkGAL_Group_ID (1) -> (M) pkfkGAL_Group_ID
Attributes: Enforced

------------------------------------------

tblHydrologicUnits: Self-referencing table containing Basins, Watersheds and Rivers
Unit_ID
Unit_Name
Unit_Type
Parent_Unit_ID

I am still trying to determine relationship between tblHydrologicUnits and tblSites

------------------------------------------

tblIDLocations: Keeps track of the original location from which the data were acquired
pkIDLoc
IDLoc_Description

tblIDLocations tblSamples
pkIDLoc (1) -> (M) fkIDLoc
Attributes: Enforced

------------------------------------------

tblLab_Groups: All iterations of group names from analysis sources
pkLab_Group_ID
Lab_Group_Description

tblLab_Groups tblAnalytes
pkLab_Group_ID (1) -> (M) fkLab_Group_ID
Attributes: Enforced

------------------------------------------

tblOrganizations: Organizations responsible for initiating sampling programs
pkOrganization_ID
Organization_Short
Organization_Name

tblOrganizations tblProjects_Programs
pkOrganization_ID (1) -> (M) fkOrganization_ID
Attributes: Enforced

------------------------------------------

tblProjects_Programs: Project numbers, name and purpose
pkProject_Code
Program_Project
fkOrganization_ID
Project_Purpose

tblOrganizations tblProjects_Programs
pkOrganization_ID (1) -> (M) fkOrganization_ID
Attributes: Enforced

tblProjects_Programs tblSamples
pkProject_Code (1) -> (M) fkProject_Code
Attributes: Enforced

------------------------------------------

tblReplicates: Replicate results
pkfkAnalyte_ID
pkfkSample_ID
Sample_Name
Result_Value
Symbol
Qualifier_Code
Outlier
QCFail
Comments
IDLoc
pkfkAltSample_ID

tblResults_Qualifiers tblReplicates
pkQualifier_Code Qualifier_Code
Attributes: Not Enforced

tblSamples tblReplicates
pkSample_ID pkfkSample_ID
Attributes: Not Enforced

tblSamples tblReplicates
pkSample_ID pkfkAltSample_ID
Attributes: Not Enforced

------------------------------------------

tblReporting_Units: List of result reporting units
pkReporting_Units_ID
Reporting_Units_Description

tblReporting_Units tblAnalytes
pkReporting_Units_ID (1) -> (M) fkReporting_Units_ID
Attributes: Enforced

------------------------------------------

tblResults: Numerical results of sample analysis
pkfkAnalyte_ID
pkfkSample_ID
Sample_Name
Result_Value
Symbol
Qualifier_Code
Outlier
QCFail
Comments
IDLoc

tblAnalytes tblResults
pkAnalyte_ID (1) -> (M) pkfkAnalyte_ID
Attributes: Enforced

tblResults_Qualifiers tblResults
pkQualifier_Code Qualifier_Code
Attributes: Not Enforced

tblSamples tblResults
pkSample_ID (1) -> (M) pkfkSample_ID
ttributes: Enforced

------------------------------------------

tblResults_Qualifiers: Codes used to further quantify a result
pkQualifier_Code
Qualifier_Description

tblResults_Qualifiers tblReplicates
pkQualifier_Code Qualifier_Code
Attributes: Not Enforced

tblResults_Qualifiers tblResults
pkQualifier_Code Qualifier_Code
Attributes: Not Enforced

------------------------------------------

tblSample_Types: Sample collection types
pkSample_Type
Sample_Type_Description

tblSample_Types tblSamples
pkSample_Type (1) -> (M) fkSample_Type
Attributes: Enforced

------------------------------------------

tblSamples: Samples collected at Sites
pkSample_ID
fkIDLoc
fkProject_Code
fkSite_ID
Sample_Name
Start_Date
Missing_Time
End_Date
fkSample_Type
Comment

tblIDLocations tblSamples
pkIDLoc (1) -> (M) fkIDLoc
Attributes: Enforced

tblSites tblSamples
pkSite_ID (1) -> (M) fkSite_ID
Attributes: Enforced

tblProjects_Programs tblSamples
pkProject_Code (1) -> (M) fkProject_Code
Attributes: Enforced

tblSample_Types tblSamples
pkSample_Type (1) -> (M) fkSample_Type
Attributes: Enforced

tblSamples tblReplicates
pkSample_ID pkfkSample_ID
Attributes: Not Enforced

tblSamples tblReplicates
pkSample_ID pkfkAltSample_ID
Attributes: Not Enforced

tblSamples tblResults
pkSample_ID (1) -> (M) pkfkSample_ID
Attributes: Enforced

------------------------------------------

tblSites: Site location information
pkSite_ID
Northing
Easting
UTM_Zone
GAL_Site_Name
AENV_Site_Name
RAMP_Site_Name
Industry_Site_Name
Other_Site_Name
Place_Name_Description

tblSites tblSamples
pkSite_ID (1) -> (M) fkSite_ID
Attributes: Enforced

------------------------------------------

tblUnit_Types: Basin, Watershed, River
UnitTypeID
UnitTypeName

tblUnitTypes tblHydrologicUnitsSites
pkUnitTypeID (1) -> (M) fkUnit_Type_ID
Attributes: Enforced

I have two specific questions in addition to general advice anyone can offer.

1. I would like to be able to "link" similar analyte names to a standard analyte group and analyte name. tblAnalytes.Analyte_Name will contain data like:
Mercury (Hg)
LL Mercury
Trace Mercury
Hg DISSOLVED
So, I would like to keep the original data but be able to allow the user to match up the analyte name to a standard list that my company uses. Not all analyte names would match to one in the standard list.

2. I'm not sure how to link tblHydrologicUnits to tblSites. Should I put a foreign key in tblSites to store the lowest child record ID in tblHydrologicUnits? Will this allow me to create a form that will allow users to "drill down" to find a site?

Sorry for the informaiton overload but I hope someone can make some sense of it all. Any advice, criticism etc is greatly appreciated. Thanks.

Wendy

 
I saw your post the other day about watershed, basin, and river. I have never faced this issue myself, so I dont have the definitive answer you seek. I feel there must be a formal answer to this problem, but it will be at an abstract level. Having said all that.

An aspect of the problem is that different sites seem to be at different levels in a hierarchy, some are one-deep, some are two-deep, some three-deep. What about defining some pseudo-branches where there are always three coordinates, just that some of them map to the same place. Like a projection. So we think of points in 3-space as (1,3,5); but a lot of points are in two-space, (1,3,0). Does a site have exactly one location? If so, why not say that the Mendota bypass canal is in the (WestSierra, WestSierra, WestSierra) Hydrologic Unit if there is no particular Watershed or River for that site?

When you wish to summarize by Basin, that is dimension one, that can always break down into dimension two, because there always is a value for dimension two. Although sometimes it is a catch-all value.

BTW, I would have guessed that a watershed contains basins, which contain rivers. No?

 
rac2,

Thanks for your input. To address your points;
1. Yes, all sites have exactly one location.
2. I don't think I completely understand the pseudo-branches. The heirarchy is a Basin has one main river and that main river has Watersheds branching off which contain many rivers as outlined below.

Athabasca Basin
Athabasca River (main Basin river)
ARSite1
ARSite2
Clearwater Watershed
Clearwater River
CRSite1
CRSite2
Hangingstone River
HRSite1
Beaver Basin
Beaver River (main Basin river)
BRSite1
BRSite2
BRSite3
Manatokan Watershed
Manatokan Creek
MCSite1
MCSite2

Based on the data above, how would I apply the pseudo-branches?

Wendy
 
If the maximum depth of the hierarchy is fixed then you could locate a site like so. A table with four columns could store the location of any site if the maximum depth is four. Here, there is an actual Creek = Manatokan. The other Creek values, Athabasca and Clearwater are pseudo-values, there are no actual creeks with those names. If you GROUP BY Creek, you get data for sites that have four levels of location as well as those that have one, two or three. So data for ARSite2 is included even though there is no Athabasca Creek.
Site Basin River Watershed Creek
ARSite2 Athabasca Athabasca Athabasca Athabasca
CRSite2 Athabasca Clearwater Clearwater Clearwater
MCSite2 Beaver Beaver Manatokan Manatokan

However, it is looking like there is no maximum number of levels in the hierarchy. So this approach will break when someone wishes to record data for the Farmer Brown ditch that flows into the Manatokan Creek in the Manatokan Watershed for the Beaver River in the Beaver Basin. In that case you will need to represent the hierarchy in a Parent-Child structure. The reporting for that, which is a kind of linked list will be more complicated. I believe there are ways to handle that, but I am not familiar with them.

You may find a better answer in the ANSI-SQL forum, . Your last post is a very clear statement of the requirement so just start a new thread with it in that forum.
 
Would it be correct to say that a Basin is an area, that a Watershed is an area that is a subdivision of a Basin. And that a Basin comprises one or more watersheds. So that is a hierarchy of regions.

Then to say that a Basin has one or more rivers.
And that a watershed has one or more rivers. The rivers are not subdivisions of the areas, they are properties of the areas.

That creeks are small rivers, nonetheless they are streams just as rivers are streams. Streams are another hierarchy, generally correlated, perhaps exactly correlated, with the hierarchy of areas.

Rivers, creeks, and streams drain areas.

Sites are places in a stream.

The smaller areas drained by the smaller streams are subdivisions of watersheds, that, because these areas are small they do not have names other than the stream within them.

Instead of
tblHydrologicUnits: Self-referencing table containing Basins, Watersheds and Rivers
Unit_ID
Unit_Name
Unit_Type
Parent_Unit_ID
Is it possible that Unit_Type is used to distinguish areas from streams? And that instead of a Unit_Type column separating areas and streams in this table another table could represent the streams and their location in an area.

Again, rivers are a property of a Hydrologic Unit. And that the relationship of Hydrologic Unit to stream is one to many, an HU may have one or many streams.

So an alternative structure
tblHydrologicUnits: Self-referencing table containing Basins, Watersheds and smaller areas
Unit_ID
Unit_Name
Parent_Unit_ID

with tblStreams: a list of rivers, creeks, streams, canals, ditches, etc located in a Hydrologic Unit
Stream_ID
Stream_Name
Hydro_Unit_ID


tblSites
Site Stream_Name Unit_Name
ARSite2 Athabasca River Athabasca Basin
CRSite2 Clearwater River Clearwater Watershed
MCSite2 Manatokan Creek Manatokan Watershed

tblHydrologicUnits
Unit_ID Hydro_Unit_Name Parent_Unit_ID
1 Athabasca Basin NULL
2 Beaver Basin NULL
3 Clearwater Watershed 1
4 Manatokan Watershed 2

tblStreams
Stream_ID Stream_Name Hydro_Unit_ID
1 Athabasca River 1
2 Clearwater River 3
3 Hangingstone River 3
4 Beaver River 2
5 Manatokan Creek 4

 
Yes, rac2, that is exactly what I was looking for. I have updated my tables to reflect your alternative structure. It will allow my client to further subdivide Watersheds if they so choose in the future (which they swear they won't ;-) ).

One last question is: Shouldn't I use Unit_ID and Stream_ID rather than Unit_Name and Stream_Name as the foreign keys in tblSites?

Wendy
 
Yes of course you should. I was just noodling there. Good that you saw what I meant, not what I wrote. Good luck with your project.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top