Database Design, choose between circle reference or constraint problem 1

Olaf Doschke · Apr 19, 2012

I have a problem with either a circle reference or a constraint. It seems I have to choose one or the other to me, but perhaps there is something SQL 2008R2 offers I have overlooked.

To describe a bit, first:

1. There are production charges, which are tested in several tests. (so I have charges, 1:n related to tests)
2. Each test will require 1 or more samples (so I have samples 1:n related to tests)

So far a simple 1:n:m relationship between Charges-<Tests-<Samples

3. The samples are samples of the charges of course

I avoid the circle reference problem, by not letting samples reference the charges they come from, they indirectly reference the charge by the tests they are assigned to.

The number of samples needed depends on the tests, so they are created while putting together the test details, that's also a good reason to let samples reference the tests they are for, not the charge they come from.

The samples are simply numbered sequentially and the business logic of the application creates unique sample numbers for each charge by determining the max sample number from all samples related to any test of the same charge, and that works OK, even if not all tests of a charge are loaded into the test design UI. But by Murphy's law, we found double sample numbers in the database, where there shouldn't be.

The constraint that I would need to define in the database, to make it failsafe against this, would need to take the sample and lookup the indirectly related charge to make a check for uniqueness of charge ID and sample number.

I would be ok in doing such a constraint rule, but is there a better design I could also apply? For the ease of the sample number constraint, I could let samples reference the charges they come from, then simply all value tuple sample.chargeid,sample.number would need to be unique.

I just fear this will lead to even worse nightmares with samples referencing tests that in turn don't reference the same charge as the sample would. Each foreign key constraint can be okay and still you can get such references not closing the circle, but making a spiral reference to a wrong test or charge. This would lead to a ripple effect in the worst case you could hardly know which reference is right and which wrong.

Finally the question: How could I harden the database to not allow double sample numbers? Is there something SQL2008R2 offers about constraints with data from parent tables? Or is there something checking if a circle reference of three tables really does form a circle and has no inconsistency of direct vs indirect references?

Bye, Olaf.

Olaf Doschke · Apr 19, 2012

Maybe moving the charge reference from the test to the sample would solve it half ways, making the samples table a junction table.

Charge table
ID (,production date, produced by user id,...)

Test table
ID (, testbegin, test evaluation assigned to user ID, ...)

Sample table
ID, ChargeID, TestID, Samplenumber (, weight, ...)

This just introduces a new problem: As it is now, a test is for a certain production charge. This design allows the same test to be combined with different charges. I might make this a feature, but I have to think about the consequences. The problematic part about this is, that the tests happen at a series of test dates, which depend on the production date.

But the unique constraint on (ChargeID, Samplenumber) is easy now, of course.

Bye, Olaf.

Qik3Coder · Apr 19, 2012

Olaf,

You can add a unique constraint to the Sample table for the ChargeID/TestID/and SampleNumber, preventing dupes.

Are the samples results of the tests, or are they a subgroup of the charges table?

Your use of the word test is vague, so i'm not sure if you are "flexing" the integrity of the Charges or if you are validating the integrity of the charges.

Flexing-actually using the charges to validate they are okay.
validating-inspecting a sample of the charges to ensure weight/height/length

Lodlaiden

You've got questions and source code. We want both!

Olaf Doschke · Apr 20, 2012

Tests are quality assurance tests. samples of a production charge of some formulation are stored at several storage temperatures and tested in several ways.

A unique constraint on ChargeID, TestID, Samplenumber will not solve the problem, that one test must be for one charge only, but a Charge is allowed to be tested with multiple tests (with different test conditions), therefore a charge is split into several samples.

Simplyfying with numbers, this is possible, but shouldn't be:

Charge1, Test1, Sample1
Charge2, Test1, Sample1

In this case the uniqueness is not violated, but Test1 would reference two charges, which it can't.

In turn, what should be allowed is:
Charge1, Test1, Sample1
Charge1, Test1, Sample2
Charge1, Test2, Sample3
Charge1, Test2, Sample4

A Charge can be tested with many tests, and each test can need many samples.

And again what is not allowed is

Charge1, Test1, Sample1
Charge1, Test2, Sample1

A Sample is always exclusive to use for one test. That follows from the test conditions of sample storage. You can only store a sample between test dates at one temperature and not at another temeperature, at the same time, so in the general case a sample can't be used for an incubator test (at >100 °F) and also as a fridge test, that needs to be done with seperate samples.

At least this sample table structure has all the data to be checked in it, I try to figure out using stored procs in a constraint.

Bye, Olaf.

Qik3Coder · Apr 20, 2012

So, could you put a unique key on the sample column alone in the Tests table?

You've got questions and source code. We want both!

Olaf Doschke · Apr 20, 2012

There is no sample column in the test table. There is a test column in the sample, no matter if we talk about my idea of a future redesigned table for the samples or the situation as is. I initially said there is a 1:n:m relationship between Charges-<Tests-<Samples

The real world objects are the charges, initially. Once they have been produced samples are taken from each charge to get - well - small samples of that charge to test and to store under different conditions and with different test criteria and measurements.

From that perspective the real world would be better reflected by designing the tables as Charges-<Samples-<Tests. The tests are taken on samples, which are taken from a charge. But any test is about many samples of the same charge. A test has a very complex substructure of a schedule of dates and test criteria/measurements taken at these dates.

That's why there really is a circle relationship between the three entities. First a charge is produced to make some tests on it, and as the test is planned, depending on what test plan is used, the number of samples needed are generated in the database and the user then splits the charge into these number of samples.

So the relations are, numerically:

1 charge has n tests
1 test needs m samples
1 sample is for 1 test of the charge it is taken from.
1 charge has Sum(m) samples, aggregating over all tests with each m samples.

And in regard of the conditions, that must be met:
A charge is produced to make several tests on it.
A charge is split into several samples for this.
A test is about 1 charge
A test is measuring data of several samples of that 1 charge.
A sample is taken of 1 charge (physically it can't be of two seperated charges, samples are never a mix of two charges)
A sample is dedicated to 1 test.

And the last two conditions suggest to let each sample reference both the charge it's taken from and the test made with it.

But the condition "A test is about 1 charge" also suggests to let the test reference the charge it is meant for.

Bye, Olaf.

JonFer · Apr 21, 2012

Charges have 1 or more Samples
Charges have 1 or more Tests
Test belongs to 1 Charge
Sample belongs to 1 Charge
Tests have 1 or more Samples but ONLY for the Charge associated with the Test

Test table:
Foreign Key ref to the Charge table
Unique Key of TestID + ChargeID (to allow Foreign key referencing)
TestID is Primary key

Sample table:
Unique constraint for (ChargeID + SampleNumber)
Foreign Key ref to Charge table on Charge ID
Foreign Key ref to Tests on (TestID + ChargeID)

Olaf Doschke · Apr 23, 2012

Thanks Jon,

I'll star that, even though this means the circular reference I'd like to avoid, but then the uniqie key on TestID,ChargeID used for the foreign key ref will ensure it's the correct circle of records.

The sample table would have a sampleid, testid, chargeid and reference to charge via chargeid and to test via (testid,chargeid), correct?

How do I setup such a foreign key constraint in code?

Thanks in advance,
Olaf.

Olaf Doschke · Apr 23, 2012

Ok, found out by SSMS, after doing it manually.

Code:

USE [Database
GO

/****** Object:  Index [xcTestBatch]    Script Date: 04/23/2012 17:55:40 ******/
CREATE UNIQUE NONCLUSTERED INDEX [TestCharge] ON [dbo].[Tests] 
(
	[TESTID] ASC,
	[CHARGEID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
[code]

Also thanks to @Qik3Coder, for making me refine my specifications.

Bye, Olaf.

Olaf Doschke · Apr 24, 2012

Actually the harder to find option to find (for me) was, that you can reference multiple fields in a foreign key as in

Code:

ALTER TABLE [dbo].[Samples]  WITH CHECK ADD  CONSTRAINT [FK_Samples_Tests] FOREIGN KEY([TestID], [ChargeID])
REFERENCES [dbo].[Tests] ([ID], [ChargeID])
GO

For future reference of a similar database design: This is a script creating the full database with the needed unique indexes and foreign key constraint

Code:

--create a new database "sandbox", then...
USE [sandbox]
GO

CREATE TABLE [dbo].[Charges]
(	[ID] [int] IDENTITY(1,1) NOT NULL,
	[number] [int] NOT NULL,
 CONSTRAINT [PK_Charges] PRIMARY KEY CLUSTERED 
        (
	[ID] ASC
        )
)
GO

CREATE TABLE [dbo].[Tests]
(	[ID] [int] NOT NULL,
	[number] [int] NOT NULL,
	[ChargeID] [int] NOT NULL,
 CONSTRAINT [PK_Tests] PRIMARY KEY CLUSTERED 
        (
	[ID] ASC
        )
)
GO

CREATE TABLE [dbo].[Samples]
(	[ID] [int] IDENTITY(1,1) NOT NULL,
	[TestID] [int] NOT NULL,
	[ChargeID] [int] NOT NULL,
	[number] [int] NOT NULL,
 CONSTRAINT [PK_Samples] PRIMARY KEY CLUSTERED 
        (
	[ID] ASC
        )
)
GO

--essential ingredient #1: unique index on chargeID and number to prevent double charge sample numbers
CREATE UNIQUE NONCLUSTERED INDEX [SampleNumber] ON [dbo].[Samples] 
(
	[ChargeID] ASC,
	[number] ASC
)
GO

--essential ingredient #2: unique index on ID (=TestID) and ChargeID in the Test table
CREATE UNIQUE NONCLUSTERED INDEX [TestCharge] ON [dbo].[Tests] 
(
	[ID] ASC,
	[ChargeID] ASC
)
GO

--essential ingredient #3: using that unique index for reference in a foreign key of the Samples table.
ALTER TABLE [dbo].[Samples]  WITH CHECK
   ADD CONSTRAINT [FK_Samples_Tests] FOREIGN KEY([TestID], [ChargeID])
   REFERENCES [dbo].[Tests] ([ID], [ChargeID])
GO

ALTER TABLE [dbo].[Tests]  WITH CHECK ADD  CONSTRAINT [FK_Tests_Charges] FOREIGN KEY([ChargeID])
REFERENCES [dbo].[Charges] ([ID])
GO

Problem solved, thanks for all your participation.

Bye, Olaf.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Database Design, choose between circle reference or constraint problem 1

Olaf Doschke

Programmer

Olaf Doschke

Programmer

Qik3Coder

Programmer

Olaf Doschke

Programmer

Qik3Coder

Programmer

Olaf Doschke

Programmer

JonFer

Programmer

Olaf Doschke

Programmer

Olaf Doschke

Programmer

Olaf Doschke

Programmer

Similar threads

Part and Inventory Search

Sponsor