Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Not clustering on identity.

Status
Not open for further replies.

pdbowling

Programmer
Mar 28, 2003
267
US
Good day everyone.
My boss said to me 'Do not put a clustered index on the identity. I want to look at the data later and decide what to cluster on.'

I have a snippet of code. It functions, but I am not sure that I completely grasp what is happening when I run it.

Here is my snippet that I run in SQL Management Studio 2008 R2.

Code:
CREATE TABLE BendInspection (
       BendInspectionID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
       TemplateVersionID UNIQUEIDENTIFIER NOT NULL,
       ReportTypeID INT NOT NULL,
       ReportID UNIQUEIDENTIFIER NOT NULL,
       GeneralInformationInspectionDate DATETIME NOT NULL,
       GeneralInformationProjectNumber UNIQUEIDENTIFIER NOT NULL,
       GeneralInformationLineNumber UNIQUEIDENTIFIER NOT NULL);

I put in NONCLUSTERED to attempt to meet the request, but I am a little unclear what effect this is having behind the scenes.

First question: Does having and IDENTITY(1,1) PRIMARY KEY automatically make a field a clustered index?
Second question: Does adding NONCLUSTERED do anything? (ie. does it get ignored because the answer to the first question is yes or does it actually prevent clustering?)

Since the snippet runs and creates a table, I am not sure

Thank you all so much,
Patrick B
 
Did you create the table already?
Script out the table and look at what it created by the script generated to see if what you did had any effect.
 
Certainly looks like it worked.

Scripted table.
Code:
CREATE TABLE [dbo].[BendInspection](
	[BendInspectionID] [int] IDENTITY(1,1) NOT NULL,
	[TemplateVersionID] [uniqueidentifier] NOT NULL,
	[ReportTypeID] [int] NOT NULL,
	[ReportID] [uniqueidentifier] NOT NULL,
	[GeneralInformationInspectionDate] [datetime] NOT NULL,
	[GeneralInformationProjectNumber] [uniqueidentifier] NOT NULL,
	[GeneralInformationLineNumber] [varchar](128) NOT NULL,
	[GeneralInformationGrade] [varchar](32) NOT NULL,
	[GeneralInformationPipeDiameter] [float] NOT NULL,
	[GeneralInformationWallThickness] [float] NOT NULL,
PRIMARY KEY NONCLUSTERED 
(
	[BendInspectionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
 
By default, a primary key is clustered. This is just a long standing default. Unfortunately, because of this default, you will find a number of people who believe that clustered index is the same thing as a primary key constraint, and they are certainly not the same thing.

By its nature, you only get one clustered index on a table. If you find you do a lot of range queries on say a date field of some sort, you may want to save the clustered index for those fields.
 
yes only one one things I want to mention clustered index MUST be unique...
 
Well.... "Must" is a little strong there. It may be highly preferable, but it is not a technical requirement.

Code:
create table test1
(col1 int,
 col2 varchar(20))

create clustered index c_ind on test1 (col1)

insert into test1 values (1, 'hello'), (1, 'uh-oh')

select *
from test1
 

MS said:
If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier. This four-byte value is not visible to users.

It will have it's performance benefits to separate the decision on clustered index and primary key. A clustered index done on an identity integer is wasted, because it makes the table a heap, as the clustering is done by the sequential number the records have, you get chronological record order. A good example is clustering by a combination of keys which are indeed only halfway unique, eg cluster by customerid, orderdate might not be unique for customers making two orders at the same date, but then you have clustered together customer data, so it's fast to retrieve it for a single customer or do statistics grouped by customer.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top